What is the safest way to restart the Prometheus service quickly and without errors?

rs vas

Mar 16, 2020, 1:59:07 AM
to Prometheus Users
Hello. Whenever we try to restart the Prometheus service, we run into one of the issues below and the service either fails to start or starts slowly.

Issue1: After a restart, Prometheus complains that block time ranges overlap and does not start. After we delete the ULID-named block directories, the service starts successfully.
level=error ts=2020-03-16T05:46:32.648Z caller=main.go:736 err="opening storage failed: invalid block sequence: block time ranges overlap: [mint: 1584151200000, maxt: 1584158400000, range: 2h0m0s, blocks: 5]: <ulid: 01E3BPKK6345WJCJBDJ6MD54X1, mint: 1584151200000, maxt: 1584158400000, range: 2h0m0s>, <ulid: 01E3BPKSQ14ZX63XP7GNMYF9PB, mint: 1584151200000, maxt: 1584158400000, range: 2h0m0s>, <ulid: 01E3BPPBV6Q1M7VQTWC0A16BMR, mint: 1584151200000, maxt: 1584158400000, range: 2h0m0s>, <ulid: 01E3CCXVM5J72658E3YDK9WGZ8, mint: 1584151200000, maxt: 1584158400000, range: 2h0m0s>, <ulid: 01E3D099N14KRHFF5XA2H6J2A8, mint: 1584151200000, maxt: 1584158400000, range: 2h0m0s>"

Issue2: It complains that meta.json is not found. After we delete the affected directories, it starts fine.
level=error ts=2020-03-14T09:02:29.497Z caller=db.go:604 component=tsdb msg="compaction failed" err="plan compaction: open /v../l../prometheus/01E3C4BBFR0VCQWZDPGJMAT2NB/meta.json: no such file or directory"

Issue3: Even when we are lucky and the restart hits no errors, the service takes a few minutes to start, because it spends that time on the step below (WAL segment loaded).
level=info ts=2020-03-16T05:50:21.522Z caller=head.go:560 component=tsdb msg="WAL segment loaded" segment=405 maxSegment=534
level=info ts=2020-03-16T05:50:23.005Z caller=head.go:560 component=tsdb msg="WAL segment loaded" segment=406 maxSegment=534
level=info ts=2020-03-16T05:50:24.612Z caller=head.go:560 component=tsdb msg="WAL segment loaded" segment=407 maxSegment=534
level=info ts=2020-03-16T05:50:26.104Z caller=head.go:560 component=tsdb msg="WAL segment loaded" segment=408 maxSegment=534

Prometheus runs as a systemd service, and we restart it with "systemctl restart prometheus.service".

Any help or input is appreciated.
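
For Issue2, a minimal sketch for spotting block directories that have no meta.json. It assumes the usual layout where each block is a 26-character ULID-named directory directly under the data directory. The function name and the example path are illustrative, and the sketch only lists directories; it deletes nothing.

```shell
# List block directories under a Prometheus data directory that lack meta.json.
# Assumption: block directories are the 26-character ULID-named entries;
# anything else (wal, temporary directories, ...) is skipped.
list_blocks_without_meta() {
    data_dir="$1"
    for d in "$data_dir"/*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        # ULIDs are exactly 26 characters long
        [ "${#name}" -eq 26 ] || continue
        [ -f "$d/meta.json" ] || echo "$name"
    done
}

# Usage (hypothetical path):
#   list_blocks_without_meta /var/lib/prometheus
```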

rs vas

Mar 16, 2020, 2:19:41 AM
to Prometheus Users
Also, we are running Prometheus version 2.13.0.

Rahul Hada

Mar 16, 2020, 2:29:46 AM
to rs vas, Prometheus Users
Hi,
After changing the Prometheus YAML file, there is no need to restart the service every time. You can reload the configuration using either of the two methods below. Both use the HTTP reload endpoint, which is only available when Prometheus has been started with the --web.enable-lifecycle flag, so enabling it needs one restart.

1>  Use the Postman tool to send a POST request to the reload endpoint.

2>  Send the same request from the command line:
 curl --location --request POST 'http://<server-IP>:<port>/-/reload'

Hope this helps.

Thanks
Rahul Hada

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/CAPs_AfhF%3DR7_jCuCkJR4xaASgv1VBG5k1Kr%2BGynbj%2BziojGd1g%40mail.gmail.com.

Brian Candler

Mar 16, 2020, 4:03:53 AM
to Prometheus Users
And a third option (for picking up changes after modifying config files) is even simpler:

killall -HUP prometheus

This doesn't require setting any config flags.

If there's a problem with your configs, prometheus will keep running with the old ones.  You can check for errors in the log output, e.g. under systemd it would be "journalctl -eu prometheus".

Another useful command is:

promtool check config /path/to/prometheus.yaml

which can be used to check whether your new config and rules are valid *before* you send the reload signal.
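
The two commands above can be chained so that the HUP is only sent when the check passes. This is a minimal sketch, not a definitive script: the function name and the CONFIG, PROMTOOL and RELOAD_CMD variables are illustrative, and the config path is a placeholder.

```shell
# Validate the config first; send SIGHUP only if validation succeeds.
# CONFIG, PROMTOOL and RELOAD_CMD are illustrative shell variables,
# not Prometheus settings. Override them to match your setup.
CONFIG="${CONFIG:-/path/to/prometheus.yaml}"
PROMTOOL="${PROMTOOL:-promtool}"
RELOAD_CMD="${RELOAD_CMD:-killall -HUP prometheus}"

safe_reload() {
    if "$PROMTOOL" check config "$CONFIG"; then
        # Left unquoted on purpose so the command and its arguments are split
        $RELOAD_CMD
    else
        echo "config check failed; not reloading" >&2
        return 1
    fi
}

# Usage:
#   safe_reload && journalctl -eu prometheus
```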

Cameron Kerr

Mar 16, 2020, 4:15:43 AM
to Prometheus Users
In a similar vein, I use the following for prometheus running in a docker container (where prometheus is PID 1):

sudo docker exec monitoring_prometheus_1 kill -HUP 1

(where monitoring_prometheus_1 is the name given to the docker container)

rs vas

Mar 16, 2020, 2:34:09 PM
to Cameron Kerr, Prometheus Users
Thanks, all, for the options for easily reloading an updated configuration.

Does that mean we should avoid restarting the service, and that the issues I mentioned in my email are expected whenever we do restart?

Brian Candler

Mar 16, 2020, 2:48:08 PM
to Prometheus Users
Yes.  A full prometheus restart *will* take some minutes to re-read its WAL files.

Therefore, never restart prometheus - unless you need to change the command-line flags or upgrade to a new version.

Anything else (e.g. config or recording/alerting rules) can be picked up by a HUP.  Changing a file_sd targets file doesn't even need a HUP; it's picked up immediately.
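
As an illustration of the file_sd case, here is a minimal sketch that rewrites a targets file in one step by writing a temporary file and renaming it over the old one, so a half-written file is never read. The function name, path, target and label values are hypothetical, and the write-then-rename approach is a common precaution rather than something stated in this thread.

```shell
# Replace a file_sd targets file in one step: write a temp file, then rename it.
# All names and values here are hypothetical examples.
update_file_sd() {
    targets_file="$1"   # a file matched by file_sd_configs in prometheus.yaml
    target="$2"         # e.g. host1:9100
    job="$3"            # e.g. node
    tmp="${targets_file}.tmp.$$"
    printf '[{"targets": ["%s"], "labels": {"job": "%s"}}]\n' "$target" "$job" > "$tmp" &&
    mv "$tmp" "$targets_file"
}

# Usage (hypothetical path):
#   update_file_sd /etc/prometheus/targets/node.json host1:9100 node
```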

Christian Hoffmann

Mar 16, 2020, 4:24:37 PM
to rs vas, Prometheus Users
On 3/16/20 7:33 PM, rs vas wrote:
> Thanks, all, for the options for easily reloading an updated
> configuration.
>
> Does that mean we should avoid restarting the service, and that the
> issues I mentioned in my email are expected whenever we do restart?

You should also ensure that Prometheus shuts down cleanly. Check your
systemd timeouts -- maybe Prometheus needs longer than the configured
timeout. In this case, systemd might send a KILL signal to Prometheus,
leaving it with no chance to clean up.

Kind regards,
Christian
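
One way to check and raise the stop timeout is a systemd drop-in, sketched below. The function name, the drop-in file name and the 10min value are assumptions for illustration only; the thread does not recommend a specific timeout, so pick one based on how long your instance actually takes to shut down.

```shell
# Write a systemd drop-in that sets TimeoutStopSec for a unit.
# The file name stop-timeout.conf and the 10min default are assumptions,
# not values recommended in this thread.
write_stop_timeout_dropin() {
    dropin_dir="$1"        # e.g. /etc/systemd/system/prometheus.service.d
    timeout="${2:-10min}"
    mkdir -p "$dropin_dir" &&
    printf '[Service]\nTimeoutStopSec=%s\n' "$timeout" > "$dropin_dir/stop-timeout.conf"
}

# Usage (as root):
#   systemctl show prometheus.service -p TimeoutStopUSec   # current value
#   write_stop_timeout_dropin /etc/systemd/system/prometheus.service.d 10min
#   systemctl daemon-reload
```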

rs vas

Mar 18, 2020, 12:40:16 PM
to Christian Hoffmann, Prometheus Users
Thanks, everyone, for the input. Is there a recommended systemd timeout to allow Prometheus to shut down gracefully?