How to solve "Two hour in-memory prometheus data loss during upgrade/failover"

Mega Rajan

Sep 30, 2025, 9:55:35 AM

to Prometheus Users

Hi team,

I am running a Prometheus instance in a Docker container, with the data directory volume-mapped to the VM disk.

When I update the container image to a newer version and restart Prometheus, I see 2 hours of Prometheus data loss.

How can I prevent or fix this?

thanks

Megarajan

Brian Candler

Oct 1, 2025, 5:58:36 AM

to Prometheus Users

My guess is that your (unspecified) container environment is not allowing enough time for prometheus to shut down cleanly, and is doing a "kill -9" or equivalent. It may have a configuration setting to change this.

For running prometheus under systemd I do this:

[Service]
TimeoutStopSec=300

which gives Prometheus up to 5 minutes to shut down.

See: https://www.freedesktop.org/software/systemd/man/latest/systemd.service.html#TimeoutStopSec=
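
For plain Docker, the comparable setting is the stop timeout. Docker sends SIGTERM, waits 10 seconds by default, and then sends SIGKILL. `docker stop -t <seconds>` and `docker run --stop-timeout <seconds>` raise that limit, and in Compose the key is `stop_grace_period`. If the "kill -9" guess above is right, something like the following may help. This is only a sketch, assuming Compose is in use; the service name, image tag, and host path are made up for illustration:

```yaml
# docker-compose.yml (hypothetical layout; adjust names and paths to your setup)
services:
  prometheus:
    image: prom/prometheus             # pin the tag you actually run
    volumes:
      - /data/prometheus:/prometheus   # assumed host path mapped to the TSDB directory
    stop_grace_period: 5m              # wait up to 5 minutes after SIGTERM before SIGKILL
```

The 5 minute value simply mirrors the systemd example; a clean shutdown usually finishes much sooner, and the timeout is only an upper bound.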

Ben Kochie

Oct 1, 2025, 6:03:22 AM

to Brian Candler, Prometheus Users

The Prometheus Operator defaults to a terminationGracePeriodSeconds of 600 (10 min).
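
For a Prometheus pod that is not managed by the Operator, the same field can be set directly in the pod spec, where the Kubernetes default is 30 seconds. A minimal sketch, assuming a hand-written StatefulSet or Deployment; the container name and image are illustrative:

```yaml
# Fragment of a pod template (e.g. inside a StatefulSet); names are illustrative
spec:
  terminationGracePeriodSeconds: 600   # seconds allowed between SIGTERM and SIGKILL
  containers:
    - name: prometheus
      image: prom/prometheus
```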
