May 10 18:22:41 ip-10-232-20-108 systemd[1]: Reloading Prometheus Monitoring framework.
May 10 18:22:41 ip-10-232-20-108 prometheus[19893]: time="2017-05-10T18:22:41-04:00" level=info msg="Loading configuration file /etc/prometheus/prometheus.yaml" source="main.go:251"
May 10 18:22:41 ip-10-232-20-108 systemd[1]: Reloaded Prometheus Monitoring framework.
May 10 18:23:07 ip-10-232-20-108 prometheus[19893]: time="2017-05-10T18:23:07-04:00" level=warning msg="Error on ingesting results from rule evaluation with different value but same timestamp" numDropped=10 source="manager.go:313"
May 10 18:23:07 ip-10-232-20-108 prometheus[19893]: time="2017-05-10T18:23:07-04:00" level=error msg="Error sending alerts: bad response status 400 Bad Request" alertmanager="http:<omitted>.com:80/api/v1/alerts" count=10 source="notifier.go:370"
May 10 18:23:38 ip-10-232-20-108 prometheus[19893]: time="2017-05-10T18:23:38-04:00" level=error msg="Error sending alerts: bad response status 400 Bad Request" alertmanager="http:<omitted>.com:80/api/v1/alerts" count=10 source="notifier.go:370"
May 10 18:24:08 ip-10-232-20-108 prometheus[19893]: time="2017-05-10T18:24:08-04:00" level=error msg="Error sending alerts: bad response status 400 Bad Request" alertmanager="http:<omitted>.com:80/api/v1/alerts" count=10 source="notifier.go:370"
May 10 18:24:38 ip-10-232-20-108 prometheus[19893]: time="2017-05-10T18:24:38-04:00" level=error msg="Error sending alerts: bad response status 400 Bad Request" alertmanager="http:<omitted>.com:80/api/v1/alerts" count=10 source="notifier.go:370"
--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/71e28cfe-0e60-4c11-aea1-df71630a007e%40googlegroups.com.
May 15 13:51:13 ip-10-236-137-113 systemd[1]: Reloading Prometheus Monitoring framework.
May 15 13:51:13 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:51:13-04:00" level=info msg="Loading configuration file /etc/prometheus/prometheus.yaml" source="main.go:251"
May 15 13:51:13 ip-10-236-137-113 systemd[1]: Reloaded Prometheus Monitoring framework.
May 15 13:51:37 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:51:37-04:00" level=warning msg="Error on ingesting results from rule evaluation with different value but same timestamp" numDropped=1 source="manager.go:313"
May 15 13:51:38 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:51:38-04:00" level=error msg="Error sending alerts: bad response status 400 Bad Request" alertmanager="http://alertmanager.REDACTED:80/api/v1/alerts" count=1 source="notifier.go:370"
May 15 13:51:55 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:51:55-04:00" level=info msg="Done checkpointing in-memory metrics and chunks in 1m5.287294923s." source="persistence.go:665"
May 15 13:52:08 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:52:08-04:00" level=error msg="Error sending alerts: bad response status 400 Bad Request" alertmanager="http://alertmanager.REDACTED:80/api/v1/alerts" count=1 source="notifier.go:370"
May 15 13:52:38 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:52:38-04:00" level=error msg="Error sending alerts: bad response status 400 Bad Request" alertmanager="http://alertmanager.REDACTED:80/api/v1/alerts" count=1 source="notifier.go:370"
May 15 13:53:08 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:53:08-04:00" level=error msg="Error sending alerts: bad response status 400 Bad Request" alertmanager="http://alertmanager.REDACTED:80/api/v1/alerts" count=1 source="notifier.go:370"
May 15 13:30:38 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:30:38-04:00" level=info msg="Loading configuration file /etc/prometheus/prometheus.yaml" source="main.go:251"
May 15 13:31:07 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:31:07-04:00" level=warning msg="Error on ingesting results from rule evaluation with different value but same timestamp" numDropped=1 source="manager.go:313"
May 15 13:31:38 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:31:38-04:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
May 15 13:32:43 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:32:43-04:00" level=info msg="Done checkpointing in-memory metrics and chunks in 1m5.296435504s." source="persistence.go:665"
May 15 13:33:55 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:33:55-04:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
May 15 13:34:53 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:34:53-04:00" level=info msg="Done checkpointing in-memory metrics and chunks in 58.450524385s." source="persistence.go:665"
May 15 13:35:01 ip-10-236-137-113 CRON[533]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 15 13:35:59 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:35:59-04:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
May 15 13:36:50 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:36:50-04:00" level=info msg="Done checkpointing in-memory metrics and chunks in 50.722972084s." source="persistence.go:665"
May 15 13:38:04 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:38:04-04:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
May 15 13:39:09 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:39:09-04:00" level=info msg="Done checkpointing in-memory metrics and chunks in 1m4.270110301s." source="persistence.go:665"
May 15 13:40:14 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:40:14-04:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
May 15 13:41:17 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:41:17-04:00" level=info msg="Done checkpointing in-memory metrics and chunks in 1m2.580461724s." source="persistence.go:665"
May 15 13:42:29 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:42:29-04:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
May 15 13:43:27 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:43:27-04:00" level=info msg="Done checkpointing in-memory metrics and chunks in 57.761827228s." source="persistence.go:665"
May 15 13:44:35 ip-10-236-137-113 prometheus[17350]: time="2017-05-15T13:44:35-04:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
Ah. I ran service prometheus restart to restart the service and remove the errors and it went back into crash-recovery for an hour. Will go check the /alerts on Prometheus (but 99% sure we had active alerts, yes).

We run 2x Prometheus instances and I've been fiddling on the secondary instance thankfully. Ready for more confusion?

/etc/systemd/system/multi-user.target.wants$ cat prometheus.service
[Unit]
Description=Prometheus Monitoring framework
Wants=basic.target
After=basic.target network.target

[Service]
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/prometheus \
  -config.file=/etc/prometheus/prometheus.yaml \
  -storage.local.path=/data/prometheus \
  -web.console.templates=/usr/local/share/prometheus/consoles \
  -web.console.libraries=/usr/local/share/prometheus/console_libraries \
  -alertmanager.url=http://alertmanager.REDACTED.com -storage.local.memory-chunks=6290432 -storage.local.retention=336h0m0s -storage.local.series-file-shrink-ratio=0.1 -storage.local.max-chunks-to-persist=3145216 -web.external-url=http://REDACTED.com
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=always
RestartSec=42s

[Install]
WantedBy=multi-user.target

Unless I'm understanding this incorrectly, the service prometheus reload command should result in the same /bin/kill -HUP $MAINPID. Huh.