How to solve "Two-hour in-memory Prometheus data loss during upgrade/failover"

Mega Rajan

Sep 30, 2025, 9:55:35 AM
to Prometheus Users
Hi team,

I am running a Prometheus instance in a Docker container, with the data directory volume-mapped to the VM's disk.

When I update the container to a newer image version and restart Prometheus, I lose about 2 hours of Prometheus data.

How can I prevent or fix this?

Thanks,
Megarajan

Brian Candler

Oct 1, 2025, 5:58:36 AM
to Prometheus Users
My guess is that your (unspecified) container environment is not allowing enough time for Prometheus to shut down cleanly, and is doing a "kill -9" or equivalent. It may have a configuration setting to change this.

For running Prometheus under systemd I do this:

[Service]
TimeoutStopSec=300

which gives Prometheus up to 5 minutes to shut down.
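
For a plain Docker setup like the one described in the question, a rough equivalent might look like the Compose fragment below. This is only a sketch, assuming the container is managed with Docker Compose; the service name, image tag and host path are placeholders, not details from the original post. By default Docker waits 10 seconds after SIGTERM before sending SIGKILL, and stop_grace_period extends that wait.

```yaml
# Hypothetical docker-compose.yml; service name, image tag and host path are placeholders.
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      # Host directory (assumed) mapped to the data directory used by the official image.
      - /srv/prometheus-data:/prometheus
    # Time Docker waits after SIGTERM before sending SIGKILL (default 10s).
    stop_grace_period: 5m
```

If the container is stopped by hand instead, "docker stop -t 300 <container>" sets the same timeout in seconds for that one stop.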

Ben Kochie

Oct 1, 2025, 6:03:22 AM
to Brian Candler, Prometheus Users
The Prometheus Operator defaults to a terminationGracePeriodSeconds of 600 (10 min).
