Alert on no alerts?


pin...@hioscar.com

Jan 30, 2021, 9:50:04 PM
to Prometheus Users
Is there a way for Alertmanager to fire an alert when it has received no alerts in the last 10 minutes or so? We recently had an issue where Prometheus was misconfigured and stopped sending alerts, including alerts about itself. To detect this scenario, we cannot rely on Prometheus alert rules. I am hoping Alertmanager can fire an alert on its own when it receives no alerts for some period of time. Please advise. Thanks

Julius Volz

Jan 31, 2021, 3:56:23 AM
to pin...@hioscar.com, Prometheus Users
Yep! What you want is an alerting heartbeat: https://www.youtube.com/watch?v=RsigFUMUHZ0

That is, you use an external service like https://deadmanssnitch.com/ that sends you a notification when it does *not* receive a message within a given time.

If you are using the Prometheus Operator to deploy Prometheus to Kubernetes using kube-prometheus, you already get a default alerting rule for that: https://github.com/prometheus-operator/kube-prometheus/blob/8588e30bd02c60ca0ed75f450014a2b1c1e8ff5a/manifests/kube-prometheus-prometheusRule.yaml#L23-L33
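
For anyone not on kube-prometheus, a heartbeat rule of that kind can be sketched roughly as follows. This is a minimal sketch, not a copy of the linked manifest: the group name, label value, and annotation text are placeholders I picked for illustration. The one essential part is the `vector(1)` expression, which always returns a sample, so the alert fires for as long as Prometheus is evaluating rules.

```yaml
# Prometheus rule file (sketch). Group name, labels and annotation
# text are illustrative placeholders.
groups:
  - name: heartbeat
    rules:
      - alert: Watchdog
        # vector(1) always yields a result, so this alert is always firing.
        expr: vector(1)
        labels:
          severity: none
        annotations:
          summary: "Always-firing heartbeat alert. If it stops arriving, the alerting pipeline is broken."
```

The alert carries no information of its own. What matters is that whatever receives it notices when it stops arriving.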

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/e8fee1c3-6e37-4b1c-8042-0fa066b6052en%40googlegroups.com.


--
Julius Volz
PromLabs - promlabs.com

Peter S

Feb 2, 2021, 5:10:59 PM
to Prometheus Users
Thanks! Really amazing. Is there a way to do this without using a third-party service? I'm hoping Alertmanager can detect that the heartbeat alert has stopped and fire off an alert. Thanks!

Julius Volz

Feb 2, 2021, 5:47:10 PM
to Peter S, Prometheus Users
No, Alertmanager cannot do this by itself. In any case, it's a good idea to use a third party for this, because the whole point is that the check should be as independent from your infrastructure as possible, so that you still get an alert even if your Alertmanager has a problem too.
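
To make the Alertmanager side concrete, here is a rough sketch of routing an always-firing heartbeat alert to such an external service. It assumes the heartbeat alert is named `Watchdog` and that the service accepts a plain HTTP POST as a check-in. The receiver names, the URL, and the interval values are hypothetical placeholders, not recommendations.

```yaml
# Alertmanager configuration fragment (sketch). Receiver names, the
# webhook URL and the interval values are hypothetical placeholders.
route:
  receiver: default
  routes:
    # Send only the heartbeat alert to the external dead man's switch.
    # Older Alertmanager versions use `match:` instead of `matchers:`.
    - matchers:
        - alertname = "Watchdog"
      receiver: heartbeat
      group_wait: 0s
      group_interval: 1m
      # Re-send often, so the external service sees a steady check-in.
      repeat_interval: 5m

receivers:
  - name: default
  - name: heartbeat
    webhook_configs:
      - url: https://heartbeat.example.com/ping/REPLACE-ME
        send_resolved: false
```

The timeout on the external service should be comfortably longer than `repeat_interval`, so that one delayed notification does not page anyone.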

Ben Kochie

Feb 3, 2021, 2:22:52 AM
to Julius Volz, Peter S, Prometheus Users

l.mi...@gmail.com

Feb 3, 2021, 6:37:59 AM
to Prometheus Users
If you use karma, have a look at:
https://github.com/prymitive/karma/releases/tag/v0.78
It's not really an alert, just a warning in the UI, but depending on your exact needs it might be good enough.

Matthias Rampke

Feb 3, 2021, 9:53:46 AM
to l.mi...@gmail.com, Prometheus Users
This time, it was Prometheus that was misconfigured. In our case, it was an iptables rule that prevented Alertmanager from reaching PagerDuty. By putting the final check all the way after PagerDuty, we know when anything in the chain stops working, not just the thing that broke last time.

/MR

Bharanidharan M

May 17, 2021, 1:05:33 PM
to Prometheus Users
Hi,

I have installed karma as a deployment in minikube using this command (helm install --generate-name -f values.yaml stable/karma), with the following values.yaml.

I have the Prometheus Operator running, port-forwarded to localhost:9090, but I still get this error. Do I need to create an ingress or a private URL for the Prometheus Operator and set that as the uri in values.yaml? Please help. Thanks.

alertmanager:
  interval: 60s
  servers:
    - name: local
      uri: http://localhost:9090
      timeout: 10s
      proxy: true
      readonly: false
annotations:
  default:
    hidden: false
  hidden:
    - help
  visible: []
custom:
  css: /custom.css
  js: /custom.js
debug: false
filters:
  default:
    - "@receiver=by-cluster-service"
karma:
  name: karma-prod
labels:
  color:
    static:
      - job
    unique:
      - cluster
      - instance
      - "@receiver"
  keep: []
  strip: []
listen:
  address: "0.0.0.0"
  port: 8080
  prefix: /
log:
  config: false
  level: info
silences:
  comments:
    linkDetect:
      rules:
        - regex: "(DEVOPS-[0-9]+)"
          uriTemplate: https://jira.example.com/browse/$1
receivers:
  keep: []
  strip: []
sentry:
  private: secret
  public: 123456789
silenceForm:
  strip:
    labels:
      - job
ui:
  refresh: 30s
  hideFiltersWhenIdle: true
  colorTitlebar: false
  minimalGroupWidth: 420
  alertsPerGroup: 5
  collapseGroups: collapsedOnMobile

Bharanidharan M

May 18, 2021, 10:05:19 AM
to Prometheus Users
Could any of you please help me with installing karma through Helm?
