Changing storage.tsdb.max-block-duration on live system

mike....@replicon.com

Apr 23, 2018, 4:27:35 PM
to Prometheus Users
We have a Prometheus system where we're currently seeing what seem like long compactions (~5-10 min) that result in high CPU usage and a substantial drop in available memory on our machines.

We currently have a retention of 120d set up, with no override on storage.tsdb.max-block-duration.

We wanted to try limiting storage.tsdb.max-block-duration to the 3-5% range (4d) that Ben K has suggested in other threads, to limit compaction times and impact.

My question is what will happen to the existing blocks that exceed the new range on our existing storage system. Will they get de-compacted? Or will existing compacted blocks be left alone, with block sizes limited only going forward?

Thanks

Ben Kochie

Apr 23, 2018, 4:36:59 PM
to mike....@replicon.com, Prometheus Users
Existing blocks will stay untouched, no need to worry about "de-compaction".  They will just stick around till they expire on the retention time.
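
To see which existing blocks would exceed a new limit, one option is to read the meta.json file in each block directory, which records the block's minTime and maxTime in milliseconds. The following is only a minimal sketch: the function names and the data directory path are hypothetical, and it assumes the standard TSDB layout of one directory per block.

```python
import json
from pathlib import Path

MS_PER_DAY = 24 * 60 * 60 * 1000


def block_spans_days(data_dir):
    """Return (block directory name, span in days) for each block found.

    Assumes each block directory under data_dir holds a meta.json with
    integer "minTime" and "maxTime" fields in milliseconds.
    """
    spans = []
    for meta_path in sorted(Path(data_dir).glob("*/meta.json")):
        meta = json.loads(meta_path.read_text())
        span_days = (meta["maxTime"] - meta["minTime"]) / MS_PER_DAY
        spans.append((meta_path.parent.name, span_days))
    return spans


def blocks_over_limit(data_dir, max_block_days):
    """Return the blocks whose span is longer than max_block_days."""
    return [
        (name, days)
        for name, days in block_spans_days(data_dir)
        if days > max_block_days
    ]
```

For example, blocks_over_limit("/path/to/prometheus/data", 4) would list the blocks that are already larger than a 4d limit. Per the reply above, those blocks are left alone and simply age out with retention.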

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/57e95dd5-8770-4eee-9c4b-a36340c7701c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

mike....@replicon.com

Apr 23, 2018, 4:45:25 PM
to Prometheus Users
Great, thanks Ben!



Brian Brazil

Apr 24, 2018, 4:15:27 AM
to mike....@replicon.com, Prometheus Users
On 23 April 2018 at 21:27, <mike....@replicon.com> wrote:
We have a Prometheus system where we're currently seeing what seem like long compactions (~5-10 min) that result in high CPU usage and a substantial drop in available memory on our machines.

That doesn't sound right. Compaction shouldn't use more than one core. Does your machine not have enough resources?

The block size settings are only there for our benchmarking; you should not change them.

Brian
 


Ben Kochie

Apr 24, 2018, 5:17:06 AM
to Brian Brazil, Mike Roest, Prometheus Users
The problem is that the 10% max block duration default does not work well for long retention times. This flag is broken as-is and needs to be changed.
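
To put numbers on this for the 120d retention discussed in the thread, here is a rough sketch of the arithmetic. The helper name is made up for illustration, and the percentages are the ones mentioned in this thread (the 10% default and the suggested 3-5% range).

```python
def max_block_days(retention_days, percent):
    """Max block duration in days when set to a percentage of retention."""
    return retention_days * percent / 100


retention_days = 120  # retention from the original post

# Compare the 10% default with the 3-5% range suggested in the thread.
for label, percent in [("10% default", 10), ("5%", 5), ("3%", 3)]:
    print(f"{label}: {max_block_days(retention_days, percent):.1f}d")

# The 4d value proposed in the original post, as a share of retention.
print(f"4d is {4 / retention_days:.1%} of {retention_days}d")
```

With 120d retention the 10% default allows blocks of up to 12d, while 4d sits near the low end of the 3-5% range.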

Ben Kochie

Apr 24, 2018, 5:57:34 AM
to Brian Brazil, Mike Roest, Prometheus Users
I've filed https://github.com/prometheus/prometheus/issues/4110 to propose reducing the max block duration default.  This should help with larger Prometheus servers.
