Prometheus 2.18 incompatibility with 2.4

Johny

Jun 19, 2020, 9:55:22 PM
to Prometheus Users
I recently upgraded Prometheus from version 2.4 to 2.18. There seems to be some incompatibility in the API: when I query a metric, I get the same value across all time series in 2.18.

e.g.
query:   my_metric

2.4 API:
Element                                                             Value
my_metric{l1="a1",l2="b1",l3="x-y-z1"}                      3434
my_metric{l1="a2",l2="b2",l3="x-y-z2"}                      3.433
my_metric{l1="a3",l2="b3",l3="x-y-z3"}                      94344
my_metric{l1="a4",l2="b4",l3="x-y-z4"}                      1000

The 2.18 API returns the same value for all time series:
Element                                                             Value
my_metric{l1="a1",l2="b1",l3="x-y-z1"}                      1000
my_metric{l1="a2",l2="b2",l3="x-y-z2"}                      1000
my_metric{l1="a3",l2="b3",l3="x-y-z3"}                      1000
my_metric{l1="a4",l2="b4",l3="x-y-z4"}                      1000

The same problem occurs when I query across time. If I plot count(my_metric) on the graph, I get a flat line in 2.18, whereas in 2.4 the value changes over time.

However, when I query a single time series, I don't have this issue in 2.18.

Is this a known incompatibility issue between the two versions, and does it affect the storage layer as well?
How do I solve it?
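
For anyone who wants to check for this symptom programmatically, here is a minimal Python sketch. It assumes `result` has the shape of the `data.result` list returned by an instant query against `/api/v1/query`. The function name `all_values_identical`, the timestamps, and the trimmed label sets are illustrative only; the sample values are the ones from the two outputs above.

```python
from typing import Any, Dict, List


def all_values_identical(result: List[Dict[str, Any]]) -> bool:
    """Return True when more than one series is present and all share one value.

    `result` is assumed to be the `data.result` list of an instant-vector
    response, where each entry looks like
    {"metric": {...labels...}, "value": [timestamp, "value-as-string"]}.
    """
    values = {sample["value"][1] for sample in result}
    return len(result) > 1 and len(values) == 1


# Illustrative data modelled on the 2.4 output (timestamps are made up).
from_2_4 = [
    {"metric": {"l1": "a1"}, "value": [1592618122, "3434"]},
    {"metric": {"l1": "a2"}, "value": [1592618122, "3.433"]},
    {"metric": {"l1": "a3"}, "value": [1592618122, "94344"]},
    {"metric": {"l1": "a4"}, "value": [1592618122, "1000"]},
]

# Illustrative data modelled on the 2.18 output: every series reports 1000.
from_2_18 = [
    {"metric": {"l1": "a1"}, "value": [1592618122, "1000"]},
    {"metric": {"l1": "a2"}, "value": [1592618122, "1000"]},
    {"metric": {"l1": "a3"}, "value": [1592618122, "1000"]},
    {"metric": {"l1": "a4"}, "value": [1592618122, "1000"]},
]

print(all_values_identical(from_2_4))   # False
print(all_values_identical(from_2_18))  # True
```

A True result is only a hint, since distinct series can legitimately share a value.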



Julius Volz

Jun 20, 2020, 4:30:33 AM
to Johny, Prometheus Users
When I create a file "metrics" with your metrics from the 2.4 example block and serve it to Prometheus (using just "python -m SimpleHTTPServer 12345") and then query for "my_metric" from a 2.18 Prometheus, I don't see this problem.

Would it be feasible to share a minimal /metrics endpoint example that reproduces this behavior in 2.18?
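
A reproduction like the one described above can also be sketched with Python 3, where `SimpleHTTPServer` has become the `http.server` module (`python3 -m http.server 12345` is the direct equivalent). The version below is only a sketch under assumptions: the names `SAMPLE_METRICS`, `make_handler`, and `make_server` are made up for illustration, and the payload is the 2.4 example block from the original post.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# The four samples from the 2.4 example block, in the text exposition format.
SAMPLE_METRICS = """\
my_metric{l1="a1",l2="b1",l3="x-y-z1"} 3434
my_metric{l1="a2",l2="b2",l3="x-y-z2"} 3.433
my_metric{l1="a3",l2="b3",l3="x-y-z3"} 94344
my_metric{l1="a4",l2="b4",l3="x-y-z4"} 1000
"""


def make_handler(body: str):
    """Build a request handler class that serves `body` at /metrics."""
    payload = body.encode("utf-8")

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            self.send_response(200)
            # Content type of the Prometheus text exposition format.
            self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # keep the console quiet while Prometheus scrapes

    return MetricsHandler


def make_server(body: str = SAMPLE_METRICS, host: str = "127.0.0.1", port: int = 12345) -> HTTPServer:
    """Create (but do not start) an HTTP server for the given metrics text."""
    return HTTPServer((host, port), make_handler(body))
```

Calling `make_server().serve_forever()` starts serving on port 12345; a scrape job with the target `localhost:12345` can then be pointed at it.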

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/191deffd-faf7-4370-a907-15863e84f64ao%40googlegroups.com.


--
Julius Volz
PromLabs - promlabs.com

Ben Kochie

Jun 20, 2020, 4:43:32 AM
to Julius Volz, Johny, Prometheus Users
Yes, this sounds a lot like duplicate sample ingestion or some other non-compliant metrics endpoint.

It would also be helpful to have the prometheus.yml configuration to understand what else is going on.

Brian Candler

Jun 20, 2020, 4:48:35 AM
to Prometheus Users
Which exact version? Have you tried 2.18.2?  There were some bugs fixed between 2.18.0 and 2.18.2.

Johny

Jun 20, 2020, 11:18:56 AM
to Prometheus Users
The new version is 2.18.1.

Johny

Jun 20, 2020, 11:31:54 AM
to Prometheus Users
If it is a non-compliant endpoint, the problem should appear in both versions, shouldn't it? It is affecting more than one series. The setup is in a corporate org, so I cannot expose endpoints publicly.

I have a Prometheus front-end instance that remote reads from multiple Prometheus backends. The time series are sharded across multiple backends. The results are also inconsistent in 2.18.1. Sometimes I get fewer time series back, but what is consistent is that the last data point is duplicated on all time series. Just switching the front end to 2.4 with the same configuration file fixes the problem.

Julien Pivotto

Jun 20, 2020, 11:34:39 AM
to Johny, Prometheus Users
On 20 Jun 08:31, Johny wrote:
> If it is a non-compliant endpoint, the problem should appear in both
> versions, shouldn't it? It is affecting more than one series. The setup is in
> a corporate org, so I cannot expose endpoints publicly.
>
> I have a Prometheus front-end instance that remote reads from multiple
> Prometheus backends. The time series are sharded across multiple backends.
> The results are also inconsistent in 2.18.1. Sometimes I get fewer time
> series back, but what is consistent is that the last data point is duplicated
> on all time series. Just switching the front end to 2.4 with the same
> configuration file fixes the problem.


Are you using remote read? A bug was fixed in 2.18.2.




--
Julien Pivotto
@roidelapluie

Christian Hoffmann

Jun 20, 2020, 11:42:37 AM
to Johny, Prometheus Users
On 6/20/20 5:31 PM, Johny wrote:
> If it is a non-compliant endpoint, the problem should appear in both
> versions, shouldn't it? It is affecting more than one series. The setup is
> in a corporate org, so I cannot expose endpoints publicly.
Maybe you can build a small reproducer: Grab your metrics via curl, set
up a webserver to serve the file and let a fresh Prometheus instance
scrape it. If the problem no longer occurs, this would be a chance to
look for differences.
If it does occur, try obfuscating the data as needed and providing the
obfuscated data points so that someone can look into it.
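
One way the obfuscation step could be done is sketched below in Python. This is only an illustration under assumptions: the helper names `obfuscate_line` and `obfuscate_text` and the `salt` parameter are made up, and the input is assumed to be the text exposition format as saved from a `curl` of the `/metrics` endpoint. Metric names, label names, and sample values are kept; each label value is replaced by a short salted hash, so equal values still map to equal tokens.

```python
import hashlib
import re

# Matches one label pair such as l1="a1", allowing escaped characters in the value.
LABEL_PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def obfuscate_line(line: str, salt: str = "change-me") -> str:
    """Replace every label value on one exposition line with a short hash."""
    if line.startswith("#") or "{" not in line:
        # Comment lines (# HELP, # TYPE) and label-free samples are left
        # untouched; review HELP text by hand if it may be sensitive.
        return line
    name, rest = line.split("{", 1)
    labels, tail = rest.rsplit("}", 1)

    def replace(match):
        digest = hashlib.sha256((salt + match.group(2)).encode("utf-8")).hexdigest()
        # Eight hex characters keep the output readable; collisions are
        # unlikely for small data sets but not impossible.
        return f'{match.group(1)}="{digest[:8]}"'

    return name + "{" + LABEL_PAIR.sub(replace, labels) + "}" + tail


def obfuscate_text(text: str, salt: str = "change-me") -> str:
    """Apply obfuscate_line to every line of a scraped metrics file."""
    return "\n".join(obfuscate_line(line, salt) for line in text.splitlines())


print(obfuscate_line('my_metric{l1="a1",l2="b1",l3="x-y-z1"} 3434'))
```

Because the series structure and values survive, the obfuscated file should still reproduce a problem that depends on them.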

> I have a Prometheus front-end instance that remote reads from multiple
> Prometheus backends. The time series are sharded across multiple
> backends. The results are also inconsistent in 2.18.1. Sometimes I get
> fewer time series back, but what is consistent is that the last data point
> is duplicated on all time series. Just switching the front end to 2.4 with
> the same configuration file fixes the problem.
As you are using remote read, try updating to at least 2.18.2 as Brian
suggested.

Kind regards,
Christian

Johny

Jun 20, 2020, 12:08:44 PM
to Prometheus Users
Thanks everyone. Upgrading to Prometheus 2.18.2 fixes the issue.

Johny

Jun 20, 2020, 1:00:50 PM
to Prometheus Users
I am actually using version 2.17.2 now. I would appreciate it if you could let me know of any such issues in this version.

Julius Volz

Jun 20, 2020, 1:29:15 PM
to Johny, Prometheus Users
Given that https://github.com/prometheus/prometheus/pull/7005 introduced the bug on March 24, but 2.17.x was branched off of master on March 12, I think 2.17.x should be free of this bug.
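
The version reasoning in this thread can be written down as a small Python check. This is a sketch under stated assumptions: the function names are made up, and the affected range of 2.18.0 up to but not including 2.18.2 is inferred from the thread (the bug was merged after 2.17.x branched, was observed in 2.18.1, and was fixed in 2.18.2), not taken from an official advisory. Suffixes such as release-candidate tags are ignored.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse strings like "2.18.1" or "v2.18.1" into (major, minor, patch)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def in_affected_range(version: str) -> bool:
    """True for versions assumed to carry the remote read bug (2.18.0, 2.18.1)."""
    return (2, 18, 0) <= parse_version(version) < (2, 18, 2)


for candidate in ("2.4.0", "2.17.2", "2.18.1", "2.18.2"):
    print(candidate, in_affected_range(candidate))
```

The version string of a running binary can be obtained with `prometheus --version`.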


Julien Pivotto

Jun 20, 2020, 1:35:24 PM
to Julius Volz, Johny, Prometheus Users
On 20 Jun 19:28, Julius Volz wrote:
> Given that https://github.com/prometheus/prometheus/pull/7005 introduced
> the bug on March 24, but 2.17.x was branched off of master on March 12, I
> think 2.17.x should be free of this bug.

As 2.17 release shepherd, I confirm. 2.17 does not have the bug.

--
Julien Pivotto
@roidelapluie