Query for composite metric

Igor Barsukov

May 31, 2018, 8:08:26 AM

to Prometheus Users

I'm trying to create a composite metric for the overall health status of a microservice.

I want to calculate it from three other metrics: CPU usage, memory usage, and heap usage of the service, each in percent.

Basically, if the CPU, memory, or heap usage of a service is above 50%, I need to show a 'Warning' health status (on a Grafana dashboard);

if the CPU, memory, or heap usage of a service is above 80%, I need to show a 'Critical' health status.

I've decided to implement this with Prometheus recording rules.

I've created the following rules to calculate the basic metrics:

service:cpu_usage:percent,

service:memory_usage:percent,

service:heap_usage:percent

and based on them I've created these rules:

service:health_warning:bool = ((service:cpu_usage:percent > bool 50 and service:cpu_usage:percent < 80) or (service:memory_usage:percent > bool 50 and service:memory_usage:percent < bool 80) or (service:heap_usage:percent > bool 50 and service:heap_usage:percent < bool 80))

service:health_critical:bool = (service:cpu_usage:percent > bool 80 or service:memory_usage:percent > bool 80 or service:heap_usage:percent > bool 80)

But I can't come up with the final step: how can I combine all the results in a single recording rule?

It would be great if I could calculate 'health_critical' and 'health_warning' as numeric values, e.g. 'health_warning' takes the values 0 ('not warning') and 1 ('warning'), and 'health_critical' takes 0 ('not critical') and 2 ('critical'). Then I could simply sum 'health_warning' and 'health_critical' and get a suitable result.

But I'm not sure whether it is possible to implement this idea. Or maybe I've chosen the wrong approach and the task could be implemented differently?

Brian Brazil

May 31, 2018, 8:18:30 AM

to Igor Barsukov, Prometheus Users

On 31 May 2018 at 13:08, Igor Barsukov <igor.s....@gmail.com> wrote:

But I can't come up with the final step: how can I combine all the results in a single recording rule?

You're most of the way there. https://www.robustperception.io/booleans-logic-and-math/ covers how to do a boolean OR via "a + b > bool 0".

Brian
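
For readers following along, here is a minimal sketch of the "a + b > bool 0" pattern applied to the critical rule from the question. In PromQL, `and` and `or` are set operators that match on label sets and ignore the 0/1 sample values, so a boolean OR over `bool` results has to be expressed with arithmetic. The sketch assumes the three input series share identical label sets (for example a single `service` label), which the thread does not state.

```promql
# 1 if any of the three usages is above 80, otherwise 0.
# Each "> bool" comparison yields 0 or 1 and drops the metric name,
# so "+" can match the three results on their remaining labels.
(
    (service:cpu_usage:percent    > bool 80)
  + (service:memory_usage:percent > bool 80)
  + (service:heap_usage:percent   > bool 80)
) > bool 0
```

The same idea gives a boolean AND by multiplying 0/1 values, for example `(x > bool 50) * (x < bool 80)` for a 50 to 80 band.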

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/78a4d2ed-a2af-4de8-91bc-a5f469a2750a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--

Brian Brazil

www.robustperception.io

Igor Barsukov

May 31, 2018, 9:09:24 AM

to Prometheus Users

Brian, thanks for the reply.

I'm familiar with this page and I've been thinking in that direction.

But I guess I can't solve the problem with boolean algebra alone, because my final metric should take 3 possible values (like 0 for 'ok', 1 for 'warning', 2 for 'critical'), not 2.

Do you have any other suggestions?

Thank you in advance.

On Thursday, May 31, 2018, 15:18:30 UTC+3, Brian Brazil wrote:

You're most of the way there. https://www.robustperception.io/booleans-logic-and-math/ covers how to do a boolean OR via "a + b > bool 0".

Brian Brazil

May 31, 2018, 9:11:20 AM

to Igor Barsukov, Prometheus Users

On 31 May 2018 at 14:09, Igor Barsukov <igor.s....@gmail.com> wrote:

Brian, thanks for the reply.
I'm familiar with this page and I've been thinking in that direction.
But I guess I can't solve the problem with boolean algebra alone, because my final metric should take 3 possible values (like 0 for 'ok', 1 for 'warning', 2 for 'critical'), not 2.
Do you have any other suggestions?

A tri-state value is a bit hard to work with in Prometheus, but you could do something like:

(critical == 1) * 2 or warning

where critical and warning are bools.

Brian
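
Spelled out with the rule names from the question, the suggestion above could look like the following sketch. It assumes `service:health_critical:bool` and `service:health_warning:bool` are genuine 0/1 series with matching label sets; the recorded name `service:health:status` mentioned afterwards is hypothetical.

```promql
# Tri-state health: 0 = ok, 1 = warning, 2 = critical.
# "== 1" (without bool) keeps only the services that are critical,
# "* 2" turns their value into 2, and "or" fills in every other
# service with its warning value (0 or 1).
(service:health_critical:bool == 1) * 2
  or
service:health_warning:bool
```

Because the critical branch wins in the `or`, the warning rule only needs to answer "is anything above 50%" and does not have to exclude the range above 80%. Recorded under a name such as `service:health:status`, the three values could then be mapped to status text on the dashboard side.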

--

Brian Brazil

www.robustperception.io

Igor Barsukov

May 31, 2018, 3:18:20 PM

to Prometheus Users

Thank you, that really helped!

