Prometheus keeps last value when metric with label disappears


Andrej Dorinec

Feb 8, 2021, 12:53:28 AM
to Prometheus Users

I have run into an issue with metrics that sometimes disappear. I am using SQL exporter to extract data from a database and create metrics from it. I query the queue size, and a description of the metric is used as a label. The label changes when the queue size exceeds 200 (warning) or 1000 (critical).

Here is what happens when the label changes. Say the queue size is 125 (ok), then it suddenly increases to 250 (warning), so the label changes. In the Prometheus UI I can see two metrics (two lines), one with the label warning and one with the label ok. Now the warning metric disappears, but Prometheus still shows its last value, in our case 250. This persists for approximately 5 minutes. In this case that is fine: if I apply the max function, it catches the highest value. The problem is when the order is reversed. Say I had 250 messages in the queue for only one sample, and then the size dropped to 125. In Prometheus I can see both lines, warning with 250 and ok with 125. The 250 persists for 5 minutes, and my alert fires because max picks up the highest value even though the queue size is already back within the limit.
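
The behaviour described above can be reproduced in a toy model. This is only a sketch under stated assumptions, not Prometheus's actual implementation: it assumes an instant query returns the newest sample of each series from the last 300 seconds (matching the roughly 5 minutes observed), and it ignores staleness markers, which would normally end a vanished series sooner. The series names and the 15-second scrape interval are hypothetical.

```python
LOOKBACK_SECONDS = 300  # assumption: matches the ~5 minutes observed above


def instant_value(samples, t, lookback=LOOKBACK_SECONDS):
    """Return the newest sample value in the window (t - lookback, t], or None."""
    candidates = [(ts, v) for ts, v in samples if t - lookback < ts <= t]
    # Tuples sort by timestamp first, so max() picks the newest sample.
    return max(candidates)[1] if candidates else None


def max_over_series(series, t):
    """Toy equivalent of max() across all label variants at time t."""
    values = [instant_value(samples, t) for samples in series.values()]
    values = [v for v in values if v is not None]
    return max(values) if values else None


# One sample of 250 under the "warning" label, then the queue drops to 125
# and every later sample is written under the "ok" label (hypothetical names).
series = {
    'queue_size{status="warning"}': [(0, 250)],
    'queue_size{status="ok"}': [(t, 125) for t in range(15, 400, 15)],
}

for t in (60, 299, 300):
    print(t, max_over_series(series, t))
```

In this model the single 250 sample keeps winning the max until it falls out of the window, which is the false alert described above.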

I googled a bit and found out that metrics should not disappear just like that; I read about it here and here.

I tried to recreate this issue on localhost, but without success.

Questions:

  1. Could somebody explain what is going wrong when the metric stays there for 5 minutes?
  2. Is this an issue with Prometheus, or is it the expected behaviour?
  3. Do I understand correctly that this is the wrong way to implement labels, with series suddenly missing or present with different labels?
  4. How can I recreate this issue and test it properly?

Link to the Stack Overflow question here.

Matt Palmer

Feb 8, 2021, 5:36:21 PM
to Prometheus Users
On Sun, Feb 07, 2021 at 09:53:28PM -0800, Andrej Dorinec wrote:
> I encounter an issue with metrics that sometimes disappear. I am using SQL
> exporter <https://github.com/free/sql_exporter> for extracting data from
> database and create metrics from them. I am query the queue size and there
> is description of metric that is used as label. The label change when size
> of queue is more than 200 (warning) and more than 1000 (critical).

There's your problem. Stop doing that. If you absolutely feel you need to
have thresholds in your exporter (which you don't), then make them separate
time series (with 0/1 values), and leave the actual numeric data on a
constantly-existing time series.
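
As a rough sketch of that layout (not SQL exporter's actual configuration or output code), the Python below renders one always-present series for the queue size plus separate 0/1 series for the thresholds. The metric names queue_size, queue_size_warning and queue_size_critical and the function render_metrics are hypothetical; the 200 and 1000 thresholds come from the original post.

```python
WARNING_THRESHOLD = 200    # thresholds taken from the original post
CRITICAL_THRESHOLD = 1000


def render_metrics(queue_size):
    """Render text-format samples whose names never depend on the value.

    The metric names are hypothetical. The point is that every scrape
    exposes the same three series, so none of them appears or disappears.
    """
    samples = {
        "queue_size": queue_size,
        "queue_size_warning": int(queue_size > WARNING_THRESHOLD),
        "queue_size_critical": int(queue_size > CRITICAL_THRESHOLD),
    }
    return "".join(f"{name} {value}\n" for name, value in samples.items())


print(render_metrics(125), end="")
print(render_metrics(250), end="")
```

With a layout like this the threshold can also live in the alert expression itself, for example queue_size > 200, so the exporter does not need to encode it at all.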

- Matt
