Quick question about Prometheus retention time in Kubernetes


David B G.

Sep 6, 2021, 4:26:37 PM
to Prometheus Users

Hello everybody, nice to meet you all

I have a quick question about Prometheus deployed with the kube-prometheus-stack.

I recently ran into an issue with metric retention, similar to the one described in this post: my time series appear to be kept beyond the 10-day retention limit, and this is driving disk usage to 100%.

After exec'ing into the pod, I can see blocks that are more than 10 days old:

Aug 24 01FDVX28YSKFRCMDCRP4111VBJ

Aug 24 11:03 01FDVX4XK0ACRPPZ65KM2YFJRZ

Aug 24 13:00 01FDW3Y06JWQ81DG1XF59JR944
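
The block directory names are ULIDs, and the first 10 characters of a ULID encode its creation time in milliseconds since the Unix epoch (Crockford base32). A minimal Python sketch for decoding that timestamp is below. The function names are hypothetical and not part of any Prometheus tooling. Note that this is the time the block was written, not the time range of the samples it contains.

```python
from datetime import datetime, timezone

# Crockford base32 alphabet used by ULIDs (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid):
    """Return the millisecond timestamp stored in the first 10 ULID characters."""
    if len(ulid) != 26:
        raise ValueError("a ULID has exactly 26 characters")
    value = 0
    for ch in ulid[:10].upper():
        # index() raises ValueError for characters outside the alphabet.
        value = value * 32 + CROCKFORD.index(ch)
    return value


def ulid_created_at(ulid):
    """Return the ULID timestamp as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ulid_timestamp_ms(ulid) / 1000, tz=timezone.utc)
```

For example, ulid_created_at("01FDVX28YSKFRCMDCRP4111VBJ") decodes to 2021-08-24 11:00:00 UTC.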

 

So I have the following naive questions:

Is there any configuration I am missing?

Is it possible that, after the Prometheus pod restarts, the 10 days are counted again from the time of the restart?

Is it safe to manually delete my old blocks with a command like rm block_name?
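
Before removing anything by hand, it may help to check which blocks actually fall outside the retention window. Each block directory contains a meta.json with the block's ulid plus minTime and maxTime (milliseconds since the epoch) for the samples it holds. The sketch below reads those files and lists blocks whose newest sample trails the newest block by more than the retention period, which is roughly how time-based retention is evaluated as far as I understand it. The function names and the strict comparison are assumptions for illustration, not Prometheus code.

```python
import json
from pathlib import Path


def read_block_ranges(data_dir):
    """Return (ulid, min_time_ms, max_time_ms) for each block with a meta.json."""
    ranges = []
    for meta_path in sorted(Path(data_dir).glob("*/meta.json")):
        meta = json.loads(meta_path.read_text())
        ranges.append((meta["ulid"], meta["minTime"], meta["maxTime"]))
    return ranges


def blocks_past_retention(ranges, retention_ms):
    """Return ULIDs of blocks whose maxTime is more than retention_ms behind the newest block."""
    if not ranges:
        return []
    newest = max(max_t for _, _, max_t in ranges)
    return [ulid for ulid, _, max_t in ranges if newest - max_t > retention_ms]
```

Called with the Prometheus data directory and 10 * 24 * 60 * 60 * 1000 for a 10d retention, this only reports candidates and deletes nothing.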

I am using the following kube-prometheus configuration:

Metric retention

retention: 10d

retentionSize: ""

walCompression: false

 

I am using Prometheus version 2.28.1

 

I would appreciate any advice or help 😊

David B G.

Sep 7, 2021, 11:47:06 AM
to Prometheus Users
Hello, I fixed my issue.

For the record, I set a retentionSize limit of 45 GB (my persistent volume is 50 GB).

After that, Prometheus started working normally. I think the root cause of my issue was a corrupt chunk of data:

level=error caller=head.go:785 component=tsdb msg="Loading on-disk chunks failed" err="iterate on on-disk chunks: corruption in head chunk file /prometheus/chunks_head/000380: head chunk file has some unread data, but doesn't include enough bytes to read the chunk header - required:67104769, available:67104768, file:380"
level=info component=tsdb msg="Deleting mmapped chunk files"
level=info component=tsdb msg="Deletion of mmap chunk files successful, reattempting m-mapping the on-disk chunks"
level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=6.63822568s
level=info component=tsdb msg="Replaying WAL, this may take a while"
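
The required and available figures in the error line differ by a single byte, which would be consistent with a truncated head chunk file. A small Python sketch for pulling those numbers out of such a message is below. The function name is hypothetical, and the regular expression assumes only the wording shown in the log above.

```python
import re

# Excerpt of the error text from the log above.
MESSAGE = (
    "head chunk file has some unread data, but doesn't include enough bytes "
    "to read the chunk header - required:67104769, available:67104768, file:380"
)


def chunk_header_shortfall(message):
    """Return the file number and how many bytes are missing, or None if no match."""
    match = re.search(r"required:(\d+), available:(\d+), file:(\d+)", message)
    if match is None:
        return None
    required, available, file_no = (int(g) for g in match.groups())
    return {"file": file_no, "missing_bytes": required - available}
```

For the message above this reports file 380 with 1 missing byte.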