Hi Lars,
On 2019-05-02 15:08, Lars.schotte via Prometheus Users wrote:
> I have watched that talk about Prometheus storage, but I cannot say
> that it is compressing at all, or at least I have no idea how to
> configure it so that it uses less space.
There is not really anything to configure there, AFAIK.
> For me it has accumulated ~700 MB of data in a week, and compressed it
> comes to about ~30 MB using bzip2 --best.
I find this rather surprising. While Prometheus does not use general-purpose
compression such as bzip2 or gzip, the samples in a compacted block are already
delta- and XOR-encoded (Gorilla-style), so I would expect such a block to be
nearly incompressible.
The space usage you are seeing may be dominated by the write-ahead log (WAL).
That part is not optimized for efficient storage, although it should not grow
endlessly (if it does, there may be another problem -- recent Prometheus
versions contain several optimizations and bug fixes there). I would not be
surprised if the WAL compressed very well.
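To see how much of your ~700 MB is WAL versus compacted blocks, something like
the following should give a quick breakdown (the path is just a guess -- adjust
it to your actual storage directory):
$ du -sh /var/lib/prometheus/data/wal /var/lib/prometheus/data/01*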
To check, I ran a few tests on a small Prometheus instance of my own (4k
metrics, still on 2.7.2), and I am a bit surprised by the compressibility:
$ du -sh .
678M .
$ tar c . | wc -c
708106240
$ tar cz . | wc -c
317304621
$ tar cj . | wc -c
231356989
Compression via gzip reduces the overall amount to roughly 45% of the original
size, bzip2 even to roughly 33%.
The compress-everything test is a bit unfair, as Prometheus needs to keep the
data in individually addressable chunks, and the WAL caveat from above applies
as well. However, tests on a single chunk file still yield unexpectedly good
results:
$ cat ./01D9V7R7BK6ZF0YVEMGQ23JXGP/chunks/000001 | wc -c
18211931
$ cat ./01D9V7R7BK6ZF0YVEMGQ23JXGP/chunks/000001 | gzip -9 | wc -c
9630865
$ cat ./01D9V7R7BK6ZF0YVEMGQ23JXGP/chunks/000001 | bzip2 -9 | wc -c
9477296
(both down to roughly 52-53% of the original size)
So there may indeed be some potential savings in storage space. The question
is whether implementing compression in Prometheus itself (or getting it via a
compressing filesystem) would be worth it, as data would have to be compressed
and decompressed on the fly, increasing CPU demands.
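If you want to experiment with the filesystem route Lars suggests, a rough,
untested sketch (assuming a dedicated BTRFS volume on a hypothetical device
/dev/sdX, mounted at the data directory) would be:
$ mount -o compress=zstd /dev/sdX /var/lib/prometheus
or, to make it permanent, an /etc/fstab entry along the lines of
/dev/sdX  /var/lib/prometheus  btrfs  compress=zstd,noatime  0 0
The compsize tool (if your distribution packages it) can then report the
compression ratio actually achieved:
$ compsize /var/lib/prometheus/data
That would at least let you measure the real-world CPU/space trade-off before
anyone implements compression in Prometheus itself.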
Back to your actual case -- could you provide some more details, such as:
- your Prometheus version
- whether you started fresh on this version or upgraded; if the latter, from
  which version
- the space distribution within your data directory, e.g.
  du -shc /var/lib/prometheus/data/*
- the number of time series currently present (e.g. curl -sG
  localhost:9090/api/v1/query --data-urlencode
  'query=count({__name__=~".+"})' | python -m json.tool)
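As an alternative to the last query, the head series count can also be read
from Prometheus' own metrics (assuming the instance scrapes itself, as in the
example configuration):
$ curl -sG localhost:9090/api/v1/query \
    --data-urlencode 'query=prometheus_tsdb_head_series' | python -m json.tool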
More information about the storage layout can be found here, btw:
https://prometheus.io/docs/prometheus/latest/storage/
Kind regards,
Christian
> So it does not seem to me that Prometheus is good at not wasting space;
> that's why I am proposing a filesystem with compression support, such as
> BTRFS, for storing the Prometheus database.
> Any ideas?
>
> On Thursday, 27 September 2018 at 15:30:00 UTC+2, squareoc...@gmail.com wrote:
>
> Hey all,
>
> We released a project using Prometheus last week, and I've been
> monitoring its disk space and noticed that occasionally the disk
> space will shrink.
> How does this work? Is Prometheus compressing the data? I've checked
> our metrics over Grafana and no data have been lost.
>
> last week: 106M, 216M, 128M
> this week: 323M, 510M, 450M, 383M
>
> I can also post the individual file distribution I recorded using du
> -ah if that can help figure out what's going on.
>
> I've skimmed through documentation, and googled a bit but can't seem
> to find much; most people were complaining about it using too much
> disk space. Any insight would be helpful. Thanks!
>