Why not use interval/step as the lookback delta when doing a RangeQuery?


Dongxu Mei

Jun 21, 2022, 3:31:58 PM
to Prometheus Users
I recently ran a test scraping a target at a 15s interval, in which the target went away and the Prometheus agent stopped at the same time. I found that queries kept returning the last sampled value for 5 minutes.

examples.png
When I looked into the code, I found that the RangeQuery section uses the engine's lookbackDelta param. But why not use the interval instead?

code.png

David Leadbeater

Jun 21, 2022, 11:54:54 PM
to Dongxu Mei, Prometheus Users
Lookback exists to deal with missed scrapes or other gaps in data; 5 minutes is the arbitrarily chosen default value. The reasons mentioned at https://www.robustperception.io/what-range-should-i-use-with-rate/ apply here too (although if you know you're not calculating a rate you obviously don't need as many samples). It is hard for Prometheus to automatically know what (multiple of) interval to use as you suggest -- a query might return multiple targets which have different scrape intervals.

In this case you could potentially use your knowledge of the interval to use 2 * 15s as the effective lookback interval, by using a query like last_over_time(metric[30s])
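To illustrate the difference, here is a minimal Python sketch (not Prometheus code) that models both lookups as "newest sample inside a window ending at the evaluation time". The function name latest_within, the sample values, and the half-open window boundary are assumptions made for the illustration; the real engine's boundary and staleness handling are not modelled.

```python
from bisect import bisect_right

SCRAPE_INTERVAL = 15    # seconds, as in the question
DEFAULT_LOOKBACK = 300  # 5 minutes, the default lookback delta

def latest_within(samples, t, window):
    """Return the newest value with timestamp in (t - window, t], else None.

    `samples` is a list of (timestamp, value) pairs sorted by timestamp.
    The half-open boundary is an assumption of this sketch.
    """
    timestamps = [ts for ts, _ in samples]
    i = bisect_right(timestamps, t)  # samples[:i] have ts <= t
    if i == 0:
        return None
    ts, value = samples[i - 1]
    return value if ts > t - window else None

# Target scraped every 15s; the last sample lands at t=60, then everything stops.
samples = [(ts, 1.0) for ts in range(0, 61, SCRAPE_INTERVAL)]

for t in (70, 100, 350, 370):
    plain = latest_within(samples, t, DEFAULT_LOOKBACK)         # roughly `metric`
    windowed = latest_within(samples, t, 2 * SCRAPE_INTERVAL)   # roughly `last_over_time(metric[30s])`
    print(t, plain, windowed)
```

In this model the plain selector keeps returning the last value until about 5 minutes after the final sample, while the 30s window drops it after two missed scrapes.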

You could also change the default value via the command line flag (--query.lookback-delta), although personally I think it's nicer to be explicit in the query, as changing the lookback delta affects all queries (however last_over_time might be slightly more expensive if you used it for a lot of queries).


Brian Candler

Jun 22, 2022, 5:18:14 AM
to Prometheus Users
The value of a timeseries at some query time T is defined as the *most recent* stored value in that timeseries, at or before time T (up to "lookback-delta" in the past, which defaults to 5 minutes).

With normal scraping, this isn't an issue.  If scrape at T1 is successful and includes some timeseries S, and the next scrape at T2 does not include timeseries S, then timeseries S is immediately marked as "stale" (i.e. "no more data") and it vanishes from graphs.

But if you're using an agent with remote_write, and you kill the agent, there's no opportunity for this to happen.
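The two cases can be contrasted with a small Python sketch. This is a simplified model, not the engine's implementation: the STALE sentinel is a hypothetical stand-in for the staleness marker, value_at is an illustrative name, and the exact window boundary is an assumption.

```python
STALE = object()  # hypothetical stand-in for the staleness marker

def value_at(samples, t, lookback=300):
    """Most recent value at or before t, within `lookback` seconds.

    Returns None if there is no such sample or if the most recent
    sample is a staleness marker. `samples` is sorted by timestamp.
    """
    latest = None
    for ts, v in samples:
        if ts > t:
            break
        latest = (ts, v)
    if latest is None:
        return None
    ts, v = latest
    if v is STALE or ts <= t - lookback:
        return None
    return v

# Normal scraping: the scrape at t=30 no longer contains the series,
# so a staleness marker is written and the series vanishes at once.
scraped = [(0, 1.0), (15, 1.0), (30, STALE)]

# Agent killed after t=15: no marker is ever written, so the last
# value lingers until the lookback window has passed.
killed = [(0, 1.0), (15, 1.0)]

print(value_at(scraped, 40) is None)  # series already gone
print(value_at(killed, 40))           # still the last value
print(value_at(killed, 320) is None)  # gone only after the lookback
```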
