Is there any way to measure disk space usage?


Peter Zaitsev

Sep 20, 2016, 9:01:30 PM
to Prometheus Developers
Hi,

I wonder whether there is a built-in way to check the disk space used by different time series in Prometheus?

Some of our users complain about how much space the monitoring uses, and it would be great to find out what takes up a lot of space in their environment so we can advise them on optimization.

--
Peter Zaitsev, CEO, Percona
Tel: +1 888 401 3401 ext 7360   Skype:  peter_zaitsev



Tobias Schmidt

Sep 20, 2016, 9:12:32 PM
to Peter Zaitsev, Prometheus Developers
We don't have detailed per-metric or per-time-series metrics on disk space usage at this point. A good approximation is to look at the number of time series per metric, as described in http://www.robustperception.io/which-are-my-biggest-metrics/. Querying such expressions at different timestamps might help to gain more insight.
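
As a rough illustration of querying the same expression at different timestamps, here is a minimal Python sketch. It only builds instant-query URLs for the Prometheus HTTP API (`/api/v1/query` with `query` and `time` parameters) and does not send them. The PromQL expression is an assumption modelled on the approach in the linked article, and the base URL and time range are hypothetical.

```python
from urllib.parse import urlencode

# Assumed PromQL, in the spirit of the linked article: count the time series
# per metric name and keep the ten largest.
BIGGEST_METRICS_QUERY = 'topk(10, count by (__name__)({__name__=~".+"}))'


def instant_query_url(base_url, query, unix_time):
    """Build an instant-query URL that evaluates `query` at `unix_time`."""
    params = urlencode({"query": query, "time": unix_time})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def urls_over_time(base_url, query, start, end, step):
    """Build one URL per timestamp from `start` to `end` (inclusive), every `step` seconds."""
    return [instant_query_url(base_url, query, t) for t in range(start, end + 1, step)]


if __name__ == "__main__":
    # Hypothetical server address and time range (one evaluation per hour).
    for url in urls_over_time("http://localhost:9090", BIGGEST_METRICS_QUERY,
                              1474329600, 1474340400, 3600):
        print(url)
```

Fetching each URL and comparing the results would show how the series count per metric changes over time.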

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-devel...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ben Kochie

Sep 21, 2016, 2:56:11 AM
to Peter Zaitsev, Prometheus Developers
Are you using varbit encoding (-storage.local.chunk-encoding-version 2)? It saves about 50% of the space at the cost of 20% more CPU use.


Peter Zaitsev

Sep 21, 2016, 10:59:30 AM
to Tobias Schmidt, Prometheus Developers
Hi,

Thank you for the suggestion. Indeed, this seems to work.

If I understand correctly, though, it shows the number of time series, not the amount of data they store or receive. In our case we use different scrape rates for different data. For example, performance schema events, even though they have the largest number of time series, are sampled less often than MySQL global status.




Ben Kochie

Sep 22, 2016, 6:24:22 AM
to Peter Zaitsev, Prometheus Developers, Tobias Schmidt

Yes, it will take a bit of math to figure it all out. You can get series counts by job, which should align with your scrape rates. Then factor in the number of bytes per sample: ~1.3 for format 2 (varbit), ~3.3 for format 1.
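
The arithmetic could be sketched in Python roughly as follows. The bytes-per-sample figures are the approximate ones quoted above; the job names, series counts, and scrape intervals are made-up placeholders, and the result is only an estimate of sample data written per day, not a measurement of actual disk usage.

```python
SECONDS_PER_DAY = 86400

# Approximate bytes per sample as quoted in this thread, keyed by
# chunk encoding version (format 1 and format 2).
BYTES_PER_SAMPLE = {1: 3.3, 2: 1.3}


def samples_per_day(series_count, scrape_interval_s):
    """Samples ingested per day for `series_count` series scraped every `scrape_interval_s` seconds."""
    return series_count * SECONDS_PER_DAY / scrape_interval_s


def estimate_bytes_per_day(jobs, encoding_version):
    """Estimate bytes per day for each job.

    `jobs` maps a job name to (series_count, scrape_interval_s).
    """
    per_sample = BYTES_PER_SAMPLE[encoding_version]
    return {
        job: samples_per_day(count, interval) * per_sample
        for job, (count, interval) in jobs.items()
    }


if __name__ == "__main__":
    # Hypothetical numbers: many series scraped rarely vs. fewer series scraped often.
    jobs = {
        "performance_schema": (5000, 60),
        "mysql_global_status": (500, 1),
    }
    for job, size in estimate_bytes_per_day(jobs, encoding_version=2).items():
        print(f"{job}: ~{size / 1e6:.1f} MB/day")
```

With numbers like these, a job with fewer series but a much shorter scrape interval can easily dominate the total.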



Björn Rabenstein

Sep 22, 2016, 6:36:53 AM
to Ben Kochie, Peter Zaitsev, Prometheus Developers, Tobias Schmidt
In principle, you could check the size of the series files in the data
directory. You have to figure out the fingerprint, though. That is
functionality that could go into the storagetool, see
https://github.com/prometheus/prometheus/tree/master/storage/local/storagetool,
i.e. you could add a feature there that would evaluate a match
expression (like the /federate endpoint) and then print metadata
and/or the sample data for a time series, including the total number
of chunks etc.
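
Until something like that exists, the file-size check could be approximated with a small script. The following Python sketch simply walks a directory and lists the largest files; it makes no assumption about how series files are named or laid out, and it does not map files back to metric names (that is the fingerprint step mentioned above). The data directory path in the usage example is hypothetical.

```python
import os


def largest_files(data_dir, top_n=10):
    """Return the `top_n` largest files under `data_dir` as (size_bytes, relative_path) tuples."""
    sizes = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The file may have been removed or rewritten while we were walking.
                continue
            sizes.append((size, os.path.relpath(path, data_dir)))
    sizes.sort(reverse=True)
    return sizes[:top_n]


if __name__ == "__main__":
    # Hypothetical data directory; adjust to the server's storage path.
    for size, path in largest_files("/var/lib/prometheus/data"):
        print(f"{size:>12}  {path}")
```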

--
Björn Rabenstein, Engineer
http://soundcloud.com/brabenstein

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B