> I tried setting step to 1m and Prometheus gave me metrics back per
> minute.
> How does Prometheus group data per minute?
> I mean, I set scrape_interval to 1h, so shouldn't there be just one
> sample per hour?
No, that's not right.
When you call the query_range API, you give it an *instant vector* query, a start time, an end time and a step. Prometheus evaluates this expression repeatedly at different times across the requested range: t, t+s, t+2s, t+3s etc.
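For example, a typical call looks something like this (a sketch; the server URL, metric name and times are placeholders):

    # Evaluate the instant query "foo" at 60s steps across a 1h window
    curl -sG 'http://localhost:9090/api/v1/query_range' \
      --data-urlencode 'query=foo' \
      --data-urlencode 'start=2024-01-01T00:00:00Z' \
      --data-urlencode 'end=2024-01-01T01:00:00Z' \
      --data-urlencode 'step=60s'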
The value of a metric at any particular time "t" is defined to be its most recent sample *at or before* time t, looking back up to --query.lookback-delta [default 5m] to find one.
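Note that this interacts with your proposed 1h scrape_interval: an instant query evaluated more than 5 minutes after the last scrape will find nothing within the default lookback window. If you really were to scrape hourly, you'd have to start prometheus with the lookback raised above the scrape interval, something like:

    # Sketch: lookback window slightly larger than the 1h scrape interval
    ./prometheus --config.file=prometheus.yml --query.lookback-delta=65m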
It's necessary to work this way if you think about it. Suppose you give a query which works across multiple timeseries, like sum(foo) where there are multiple timeseries of the metric "foo" (with different labels). Those values will almost certainly have been sampled at different points in time. To sum them, you have to pick all their values at a common point in time, which is the time of the result.
Hence the result of such a query_range is the data resampled at the step interval.
Alternatively, you can send a plain *instant* query whose expression is a range vector selector, e.g. foo[24h]. This will give you the raw data in that period (i.e. the 24 hours up to the evaluation time), and each data point will have its original timestamp. However, only a few queries can be built this way: basically just plain metric selectors. If you want a range vector from a more complex expression, you'll have to build a subquery, which again involves sweeping an instant query across a range at a fixed step: e.g. sum(foo)[24h:1m]
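Concretely, both forms go to the instant-query endpoint (again a sketch, with placeholder server URL and metric name):

    # Raw samples of "foo" over the last 24h, original timestamps preserved
    curl -sG 'http://localhost:9090/api/v1/query' \
      --data-urlencode 'query=foo[24h]'

    # The same sweep done server-side as a subquery, resampled at 1m steps
    curl -sG 'http://localhost:9090/api/v1/query' \
      --data-urlencode 'query=sum(foo)[24h:1m]'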
> Yeah, we don't worry about storage; actually we are worried about
> bandwidth cost, because we are trying to get Kubernetes metrics (mainly
> for cost) from customers' cloud clusters (GCP, AWS etc...), and
> generate a cost report for our customers.
> The report's minimum time step is an hour.
In that case I would be inclined to:
- set up an hourly scrape using a cronjob and curl
- get curl to write the data to a file
- get prometheus to scrape the contents of this file (e.g. using the node_exporter textfile collector, or just serving the file with a webserver like Apache); see the sketch below
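As a rough sketch of the cron side (paths and URL are placeholders; this assumes node_exporter was started with --collector.textfile.directory=/var/lib/node_exporter):

    # crontab entry: once an hour, fetch the metrics and rename them into
    # place atomically, so the collector never reads a half-written file
    0 * * * * curl -sf http://customer-cluster.example/metrics -o /var/lib/node_exporter/hourly.prom.tmp && mv /var/lib/node_exporter/hourly.prom.tmp /var/lib/node_exporter/hourly.prom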
Prometheus can then scrape this data at 2-minute intervals.
If you're using the textfile collector, it also exposes a metric giving the mtime of the file. This allows you to write alerting rules to detect when a file hasn't been updated for more than a certain amount of time (say, more than 90 minutes).
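The metric in question is node_textfile_mtime_seconds, labelled per file (depending on the node_exporter version, the "file" label holds the file name or its full path, hence the regex match below). You can test the expression by hand before turning it into an alerting rule (sketch; placeholder server URL):

    # Fires when the file hasn't been rewritten for more than 90 minutes
    curl -sG 'http://localhost:9090/api/v1/query' \
      --data-urlencode 'query=time() - node_textfile_mtime_seconds{file=~".*hourly.prom"} > 5400'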
Alternatively: perhaps prometheus isn't the right tool for the job here. You might be better off putting your hourly reports into a SQL database, or something which stores events like Loki or Elasticsearch.