Custom Threshold for Particular Instance.


yagyans...@gmail.com

Jun 24, 2020, 2:11:25 PM
to Prometheus Users
Hi. I currently use custom thresholds for my memory alerts. Every node exporter target has two main labels: cluster and component.
So far my custom thresholds have been keyed on the component, because each threshold applied to all servers of that component. Now I have 5 instances, each from a different component, that all need a threshold of 97. How do I approach this?

My typical node exporter job.
  - job_name: 'node_exporter_JOB-A'
    static_configs:
    - targets: [ 'x.x.x.x:9100' , 'x.x.x.x:9100']
      labels:
        cluster: 'Cluster-A'
        env: 'PROD'
        component: 'Comp-A'
    scrape_interval: 10s

Recording rule for custom thresholds.
  - record: abcd_critical
    expr: 99.9
    labels:
      component: 'Comp-A'

  - record: xyz_critical
    expr: 95
    labels:
      component: 'Comp-B'

The expression for Memory Alert.
(
  (node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Cached_bytes)
    / node_memory_MemTotal_bytes * 100
)
  * on(instance) group_left(nodename) node_uname_info
  > on(component) group_left()
(
    abcd_critical
  or
    xyz_critical
  or on(component)
    count by (component) (
      (node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Cached_bytes)
        / node_memory_MemTotal_bytes * 100
    ) * 0 + 90
)

Now I have 5 servers from different components. What is the cleanest way to include them?

Thanks in advance.

sayf eddine Hammemi

Jun 24, 2020, 2:22:46 PM
to Yagyansh S. Kumar, Prometheus Users
You can define another recording rule with the same metric name but a different value and different labels. That way, one recorded metric can carry multiple values depending on its labels.
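
As a rough sketch of this suggestion, the two threshold rules from the original post could share a single metric name and differ only in their labels. The group name and the metric name memory_critical_threshold are made up for illustration; the values and component labels are taken from the post above.

```yaml
groups:
  - name: memory_thresholds                  # hypothetical group name
    rules:
      # Same metric name in every rule; the labels tell the series apart.
      - record: memory_critical_threshold    # hypothetical metric name
        expr: 99.9
        labels:
          component: 'Comp-A'

      - record: memory_critical_threshold
        expr: 95
        labels:
          component: 'Comp-B'
```

With this layout, the alert expression would refer to one metric name instead of chaining abcd_critical or xyz_critical.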

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/58c7d88e-6538-4039-a5ae-4bd092cd8087n%40googlegroups.com.

yagyans...@gmail.com

Jun 24, 2020, 2:27:59 PM
to Prometheus Users
Hi. Thanks for such a quick response.

Do you mean defining 5 recording rules with the same name, one for each of those 5 servers?
Also, each recording rule would now need 2 labels, correct? One is the instance itself and the other is the component, because my whole memory expression matches on the component given in the recording rule.

sayf eddine Hammemi

Jun 24, 2020, 3:36:02 PM
to yagyans...@gmail.com, Prometheus Users
Correct, that is my idea
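
Under that reading, a per-instance override might look like the sketch below. It assumes a hypothetical metric name memory_critical_threshold, a hypothetical helper series instance:memory_used:percent, and a hypothetical alert name; the target address is the placeholder from the original post, and the node_uname_info join is left out for brevity. One caveat: if an instance-level threshold also carried the component label, matching with on(component) group_left() alone would find two thresholds for the same component and the query would fail. This sketch therefore puts only the instance label on the overrides and matches them on instance first.

```yaml
groups:
  - name: memory_alerts                        # hypothetical group name
    rules:
      # Helper series so the usage formula is written once (hypothetical name).
      - record: instance:memory_used:percent
        expr: |
          (node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Cached_bytes)
            / node_memory_MemTotal_bytes * 100

      # Component-wide threshold, as in the original post.
      - record: memory_critical_threshold      # hypothetical metric name
        expr: 99.9
        labels:
          component: 'Comp-A'

      # Per-instance override; one such rule per special server.
      - record: memory_critical_threshold
        expr: 97
        labels:
          instance: 'x.x.x.x:9100'

      - alert: HighMemoryUsage                 # hypothetical alert name
        expr: |
          (
            # Instances that have their own override.
            instance:memory_used:percent
              > on(instance) group_left()
            memory_critical_threshold{instance!=""}
          )
          or
          (
            # All other instances: component threshold, else the default of 90.
            (
              instance:memory_used:percent
                unless on(instance)
              memory_critical_threshold{instance!=""}
            )
              > on(component) group_left()
            (
                memory_critical_threshold{instance=""}
              or on(component)
                count by (component) (instance:memory_used:percent) * 0 + 90
            )
          )
```

This still needs one small rule per special server, but the alert expression itself stays the same as more overrides are added.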

Yagyansh S. Kumar

Jun 24, 2020, 3:38:34 PM
to Prometheus Users
Thanks for the solution.
But creating 5 different recording rules for the same custom threshold doesn't seem like the best idea. It is good as a last resort, I guess.

Any better suggestions for approaching this?