Hi. I currently use custom thresholds for my memory alerts. Every node
exporter target has two main labels: cluster and component.
Until now, my custom thresholds have been based on the component, because
each threshold had to apply to all the servers of that component. But now
I have 5 instances, all from different components, and I need to set the
threshold to 97 for just those instances. How do I approach this?
My typical node exporter job:

    - job_name: 'node_exporter_JOB-A'
      static_configs:
        - targets: [ 'x.x.x.x:9100' , 'x.x.x.x:9100']
          labels:
            cluster: 'Cluster-A'
            env: 'PROD'
            component: 'Comp-A'
      scrape_interval: 10s
Recording rules for the custom thresholds:

    - record: abcd_critical
      expr: 99.9
      labels:
        component: 'Comp-A'
    - record: xyz_critical
      expr: 95
      labels:
        node: 'Comp-B'
The expression for the memory alert:

    (
      (node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Cached_bytes)
        / node_memory_MemTotal_bytes * 100
    )
      * on(instance) group_left(nodename) node_uname_info
    > on(component) group_left()
    (
        abcd_critical
      or
        xyz_critical
      or on(component)
        count by (component) (
          (node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Cached_bytes)
            / node_memory_MemTotal_bytes * 100
        ) * 0 + 90
    )
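
To make the intent of that expression easier to follow, here is a simplified sketch of the same pattern. `mem_used_percent` is a hypothetical placeholder for the memory usage calculation above, not a metric that exists in my setup. The right-hand side is meant to use a component's own threshold when a recording rule provides one, and to fall back to a default of 90 for every other component.

```promql
# mem_used_percent is a hypothetical stand-in for the
# (MemTotal - MemFree - Cached) / MemTotal * 100 calculation.
mem_used_percent
> on(component) group_left()
(
    abcd_critical                 # custom threshold from a recording rule
  or
    xyz_critical                  # custom threshold from a recording rule
  or on(component)
    # default of 90 for any component not already covered above
    count by (component) (mem_used_percent) * 0 + 90
)
```

Because the comparison is matched with `on(component)`, all thresholds are currently looked up per component only, which is why a per-instance value does not fit in as it stands.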
So, given these 5 servers from different components, how can I include their threshold in the most efficient way?
Thanks in advance.