Hello, Prometheus Users!
I'm wondering if someone might be able to help me out with something I'm currently struggling with.
I have a metric based on Alertmanager active alerts. It looks like "active_alerts{alert_name='cpu_load', node='node01'}". This metric is intermittent: while an alert is active, the series is exposed on the scrape target; otherwise, it is absent. This makes it a bit complicated to handle.
I would like to count how many times an alert has triggered in a certain period of time.
The total number of samples of this metric in a given interval can be found with count_over_time(). What I want to know, though, is how many times an alert has triggered, so I want to count only the times when the metric appears. That is, given two consecutive scrapes, if the first scrape does not return the metric and the second one does, it should count; otherwise it should not.
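To make the counting rule concrete, here is a minimal Python sketch of what I mean. It is not PromQL, and the function name count_activations and the scrape data are made up purely for illustration:

```python
from typing import Sequence


def count_activations(present: Sequence[bool]) -> int:
    """Count absent -> present transitions between consecutive scrapes."""
    count = 0
    # Pair each scrape with the one that follows it.
    for previous, current in zip(present, present[1:]):
        if current and not previous:
            count += 1
    return count


# Hypothetical scrape results for one alert series: True = metric present.
scrapes = [False, True, True, False, False, True, False, True, True]
print(count_activations(scrapes))  # prints 3
```

For the same data, counting samples the way count_over_time() does would give 5, while the number I am after is 3. Note that under this rule a series that is already present at the first scrape of the interval is not counted.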
I tried various approaches, but none of them worked. changes() and delta() are hard to use because of the intermittent nature of the metric.
Does anybody have a hint for me?