Prometheus time-dependent metric


Conrad Mukai

Sep 8, 2021, 5:18:34 PM
to Prometheus Users
I want to track a metric that depends on the Prometheus timestamp. I know the rate of change for the metric, but not the metric value. For example, suppose I have the following metric m:

m = k * (t1 - t0)

where t1 is the Prometheus timestamp associated with m, and t0 is the timestamp of the previous value of the metric. k is a constant that I know prior to pushing the metric to Prometheus.

Is it possible to post a metric that can be computed based on timestamp? An alternative would be to predetermine the timestamp and require that Prometheus use it when recording the metric. Any suggestions?

Thanks in advance.

Conrad Mukai

Sep 8, 2021, 6:01:51 PM
to Prometheus Users
Another idea is to do a two-phase commit: first create a time history record with some default value, and then use the resulting timestamp to update the value.

Brian Candler

Sep 9, 2021, 4:44:32 AM
to Prometheus Users
Something you could investigate is that the exposition format in principle allows you to provide your own timestamps for your metrics.  However I don't know if current versions of prometheus honour this at all, or just throw it away and use the scrape time.
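
As a rough illustration of the point above: in the text exposition format, a sample line may carry an optional trailing timestamp in milliseconds since the Unix epoch. The sketch below only formats such a line; the metric name `item_cost_dollars_total` and the `item` label are made up for the example, and whether the server keeps the supplied timestamp depends on its scrape settings (the `honor_timestamps` option, as far as I know).

```python
def exposition_line(name, value, timestamp_s=None, labels=None):
    """Render one sample in the Prometheus text exposition format.

    Hypothetical helper for illustration; it does not escape label values.
    """
    label_str = ""
    if labels:
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    line = f"{name}{label_str} {value}"
    if timestamp_s is not None:
        # The optional last field is an integer count of milliseconds
        # since the Unix epoch.
        line += f" {int(round(timestamp_s * 1000))}"
    return line


# Example: a sample stamped by the exporter rather than by the scrape.
print(exposition_line("item_cost_dollars_total", 12.5,
                      timestamp_s=1631146714.0, labels={"item": "vm-1"}))
```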

However, I don't really understand what's going on with your metric, and it sounds to me like this might be an X-Y problem.

In general, exporters don't "post" metrics, they are "scraped", and this scraping can take place by multiple clients concurrently.  For example, in a HA setup you may scrape the same exporter by multiple prometheus servers; or you might spin up a test prometheus server which scrapes from the same exporters as your live server.  Therefore the "prometheus timestamp associated with m" doesn't really make much sense, except in terms of the time when the value was actually scraped by a particular server.

If the problem is that you want to turn a rate gauge into a counter (why?), then it's possible in a recording rule, but the results may not be satisfactory.  You're only collecting samples of the rate at particular instants in time, so integrating over it is likely to lead to error accumulation.  If you know that your gauge is reporting "rate over the last 15 seconds", and you're scraping at 15 second intervals, it may be good enough.  But it would be much better to change the exporter into a native counter.

Conrad Mukai

Sep 9, 2021, 12:17:08 PM
to Prometheus Users
Thanks for the response. I am new to time series data, so many of my questions may not make sense. What I am trying to do is to track cost. I know the price of an item in USD/hr (price includes CPU and memory usage so it varies with time). I was using Postgres, but it turns out that Grafana templates/variables did not seem to work well with that data source. I then turned to Prometheus.

It appears you are saying that I should not worry about integrating the rate, and should just scrape the cost (price*dt). I assume you mean that the difference between the time I use to compute cost and the time recorded during the scrape is insignificant. One other thing is that I need to maintain two time series: one for monthly cost and the other for fiscal-year cost. Basically I need to reset the cost to 0 at the beginning of the month and of the fiscal year respectively. So I believe I should be using a Gauge and not a Counter to record costs.

Brian Candler

Sep 10, 2021, 3:27:24 AM
to Prometheus Users
If I understand you correctly, you say you want to record "absolute cost over the last polling interval" - you said "(price*dt)" but I think you mean "(charging_rate*dt)".  As I tried to explain, this doesn't work well for prometheus, where the same metric can be scraped concurrently by multiple clients at arbitrary points in time.

I think you should consider the following for Prometheus metrics:

1. A counter, like a taxi meter, which increments with absolute money (e.g. dollars) as you spend it
2. A gauge, which represents the current charging rate (e.g. dollars per hour)

Option 1 is by far the best.  In principle, if you want to find the amount spent in the current financial year, you simply subtract the value of this counter at the start of the financial year from the current value.  Some extra care is needed if there is any chance of the counter being reset; the prometheus functions rate() and increase() take account of that.
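
To make the "subtract the start value, but watch out for resets" idea concrete, here is a simplified sketch of reset-aware counter arithmetic. It is not how Prometheus implements increase() (the real function also extrapolates to the edges of the time window); the function name and the sample values are invented for the example.

```python
def counter_increase(samples):
    """Total increase across time-ordered counter samples.

    Any drop in value is treated as a counter reset to zero, so the
    post-reset value counts in full as new increase.
    """
    total = 0.0
    prev = None
    for value in samples:
        if prev is not None:
            if value >= prev:
                total += value - prev
            else:
                # Counter went backwards: assume the exporter restarted at 0.
                total += value
        prev = value
    return total


# Dollars spent, with one exporter restart between 160 and 5.
spent = counter_increase([100, 130, 160, 5, 25])
```

A plain "last minus first" on the same samples would give a negative number, which is why the reset handling matters.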

Setting aside money for the moment, this is how calculations around traffic bandwidth are done (from counters of bytes passing through an interface).  It also makes it easy to calculate usage over arbitrary periods.  You *don't* want this counter to reset at the start of the financial year, for example; that makes the data less useful.  You want it to reset as little as possible, preferably never.

The only problem with counters is that prometheus uses float64 values, which have reducing accuracy as the value gets larger.  But with 52 bits of precision, you should be able to get to $45 trillion with one-cent accuracy :-)
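
The arithmetic behind that estimate can be checked in a few lines. This is only a sanity check of the figure quoted above, assuming the counter is kept in dollars as an IEEE 754 double; `float64_spacing` is a helper written for this example, not a library function.

```python
import math


def float64_spacing(x):
    """Gap between adjacent float64 values near x (normal, positive x)."""
    _, exponent = math.frexp(x)  # x = mantissa * 2**exponent, mantissa in [0.5, 1)
    return 2.0 ** (exponent - 53)


# 2**52 distinct one-cent steps, expressed in trillions of dollars.
trillions = 2 ** 52 / 100 / 1e12

# Resolution of a dollar-valued float64 at 45 trillion dollars.
gap_at_45_trillion = float64_spacing(45e12)

print(f"{trillions:.2f} trillion dollars, spacing {gap_at_45_trillion} dollars")
```

The spacing near $45 trillion comes out just under one cent, which agrees with the estimate; at everyday amounts the resolution is far finer.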

Option 2 is problematic.  To find the absolute cost over a given time period, you have to integrate over that period.  If the rate changes, you may be wrongly apportioning the old and new rates to a given timeslot - although that may not be a major problem if the sampling interval is small.  It's possible to do integration using recording rules summing the previous value with the current value, but that's risky and you need to be extremely careful about missing scrapes.  If I wanted up-to-the-second cost estimates, and the supplier only gave an instantaneous "rate of charging" value, then I'd be inclined to integrate it myself in my own exporter - i.e. write a "taxi meter" exporter.
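
A minimal sketch of such a "taxi meter", assuming the exporter is told the charging rate in dollars per hour whenever it changes. The class, its method names, and the metric name `cost_dollars_total` are all hypothetical, and the HTTP serving side is left out; the point is only that the integration happens at the source, at the moment the rate changes, rather than from scraped samples.

```python
import time


class TaxiMeter:
    """Accumulates cost from an instantaneous charging rate (dollars/hour)."""

    def __init__(self, rate_per_hour=0.0, clock=time.monotonic):
        self._clock = clock          # injectable so the meter can be tested
        self._rate = rate_per_hour
        self._last = clock()
        self._total = 0.0

    def _advance(self):
        # Bill the time elapsed since the last update at the current rate.
        now = self._clock()
        self._total += self._rate * (now - self._last) / 3600.0
        self._last = now

    def set_rate(self, rate_per_hour):
        self._advance()              # charge the old rate up to this instant
        self._rate = rate_per_hour

    def total(self):
        self._advance()
        return self._total

    def render(self):
        """One line of text exposition output for the running total."""
        return f"cost_dollars_total {self.total()}\n"


# Example with a fake clock: 1000 s at $3.60/h, then 1000 s at $7.20/h.
fake_now = [0.0]
meter = TaxiMeter(rate_per_hour=3.6, clock=lambda: fake_now[0])
fake_now[0] = 1000.0
meter.set_rate(7.2)
fake_now[0] = 2000.0
print(meter.render())
```

The total lives in memory here, so it starts again from zero if the process restarts; that is the reset case rate() and increase() are meant to cope with.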

Really, I'd say the most important design consideration is where the data is coming from: you should keep it as close to the source of truth as possible.  In the case of AWS for example, they provide detailed daily reports with line items showing exactly how much you've used for each resource and how much it cost.  I wouldn't attempt to second-guess these using my own metrics.