Prom QL

BHARATH KUMAR

Jul 20, 2022, 3:49:20 AM
to Prometheus Users
Hello all,

I installed node exporters on many servers (around 300). A few of the servers are unreachable, so we are unable to get the CPU and memory values for those servers.

Now I want to add a filter in the Grafana dashboard to find the servers with the least and the most CPU usage. But because some servers are unreachable, we are not getting values for them.

My question is:
"How do I check whether the output of a Prometheus query is NULL?"

Generally, I compare the output of the PromQL query like this:
i) if the CPU usage is at most 10%, I compare like
query >=0<=10
ii) if the CPU usage is greater than 10% and at most 30%, I compare like
query >10<=30
Similarly, how do I check for null values using a Prometheus query?

Thanks & regards,
Bharath Kumar.

Stuart Clark

Jul 20, 2022, 4:38:58 AM
to BHARATH KUMAR, Prometheus Users

For servers which can't be scraped there will be no metrics, so any queries won't have any data to query.

However Prometheus itself creates certain metrics for all scrape targets, including one called "up" which is either 0 or 1 - where 0 means the scrape failed. You can therefore create dashboards and alerts that list the servers which aren't accessible (up == 0).
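
As a rough sketch of the queries this suggests (the job="node" label is an assumption about how the node exporter targets are named in the scrape config; substitute your own job name, and run each expression as a separate query):

```promql
# Targets whose most recent scrape failed
up{job="node"} == 0

# Number of targets that are currently unreachable
count(up{job="node"} == 0)
```

Note that the count() query returns an empty result, not 0, when every target is reachable, because there is nothing left to count after the filter.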

-- 
Stuart Clark

Brian Candler

Jul 20, 2022, 4:52:17 AM
to Prometheus Users
And just to clarify slightly, there aren't really "null values" in Prometheus. A query like "node_blah" returns a *vector* of results, that is, a variable number of values, e.g.

[
node_blah{instance="foo"} 123
node_blah{instance="bar"} 456
node_blah{instance="baz"} 789
]

If node "baz" goes down, then a query at a later point in time may return

[
node_blah{instance="foo"} 124
node_blah{instance="bar"} 457
]

If you want to test for this specific condition, i.e. there is no "node_blah" metric present for a specific instance "baz", then you can form a rather awkward join query using absent() in conjunction with the "up" metric as Stuart described.

But usually, you just want to query the "up" metric itself.
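
A sketch of what such a check might look like, keeping the placeholder metric name node_blah from above and assuming the node exporter job is called "node" (both expressions are illustrations and separate queries, not something tested against a particular setup):

```promql
# Returns a single series with value 1 if no node_blah series exists for instance "baz"
absent(node_blah{instance="baz"})

# Join form: every scrape target that currently has no node_blah series
up{job="node"} unless on(instance) node_blah
```

The join form also lists targets that are up but simply do not export node_blah, which is one reason querying "up" directly is usually simpler.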

BHARATH KUMAR

Jul 20, 2022, 5:28:58 AM
to Prometheus Users
Thanks, Clark and Brian, for your replies.

I am using two data sources in my case, i.e. Prometheus and Postgres.

In my dashboard, there is a table that contains both Prometheus and Postgres data. In this table, there is a column named %cpu, which is obtained from Prometheus.

As Brian said, if a server goes down, we will not get its node-level metrics. For that particular server, we will have Postgres data but no Prometheus data, because the server was down.

For example, my dashboard table is as follows:

IP       CPU  %cpu  memory  memory_used  column1  column2  column3
1.1.1.1  4    0.4%  40gb    60%          a        b        c
1.1.1.2  8    10%   80gb    30%          d        e        f
1.1.1.3                                  h        i        j


The third server went down, so we are not able to see its CPU and memory values. I want to add a filter so that I can see which servers are the least used or the most used.

The CPU usage for the third server will be "no data" because that server is down. Can we do any comparison for servers that are down, so that I can filter for servers whose value is null/no data?

Thanks & regards,
Bharath Kumar.

Brian Candler

Jul 20, 2022, 9:26:58 AM
to Prometheus Users
No idea.  You haven't said what dashboard software you're using, nor what queries you're using to build that dashboard.

> I want to add one filter so that I can be able to know which servers are least used or most used.

Not sure what you mean by a "filter" in this context.  A PromQL query using min() or max() will work over all the values which are present in the instant vector, which as I said before, is of variable size.  It doesn't have to have a fixed number of inputs.

e.g. given this data

[
node_blah{instance="foo"} 123
node_blah{instance="bar"} 456
node_blah{instance="baz"} 789
]

then

min(node_blah) => 123
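
Because min() and max() discard the labels, a dashboard that needs to show which server is least or most used could use bottomk() and topk() instead. With the same example data:

```promql
bottomk(1, node_blah)   # => node_blah{instance="foo"} 123
topk(1, node_blah)      # => node_blah{instance="baz"} 789
```

If "baz" stops being scraped and its series goes stale, topk(1, node_blah) simply returns the "bar" series instead. There is no null entry for "baz" to compare against.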

BHARATH KUMAR

Jul 25, 2022, 2:30:03 AM
to Prometheus Users
I am talking about a Grafana dashboard. I created a custom variable as follows:

CPU:
All : <=100,
lt 10 && gt 0 : >0<=10,
lt 30 && gt 10 : >10<=30

This CPU filter is added at the top of the Grafana dashboard. If I select All in the CPU filter, I cannot find the unreachable servers in the dashboard, because those servers are not present in the Prometheus data source (they are in an unreachable state).

My query:
((1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance)) * 100)  $CPU 

Here we are comparing the value against $CPU.
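
For illustration, with the "lt 30 && gt 10" option selected, Grafana would substitute the variable so that the query sent to Prometheus looks roughly like this ($ip is left unexpanded here):

```promql
((1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance)) * 100) >10<=30
```

Each comparison acts as a filter: series whose value falls outside the range are dropped from the result, and servers that have no series at all never enter the comparison.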

Is there any other way to handle values that are not present in the Prometheus data source, so that when I select the All option I can see data for all servers?

Thanks & regards,
Bharath Kumar.

Stuart Clark

Jul 25, 2022, 2:56:43 AM
to BHARATH KUMAR, Prometheus Users
On 25/07/2022 07:30, BHARATH KUMAR wrote:
> I am talking about grafana dashboard. I created a custom variable as
> follows:
>
> CPU:
> All : <=100,
> lt 10 && gt 0 : >0<=10,
> lt 30 && gt 10 : >10<=30
>
> So this CPU filter will be added at the top of grafana dashboard. Now
> If I select ALL in CPU filter I am not able to find the unreachable
> servers list in the grafana dashboard as those servers are not present
> in prometheus Data Source(since these are in unreachable state).
>
> My query:
> ((1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance)) * 100)
> $CPU
>
> Here we are comparing the value in $cpu
>
> Is there any other way to compare the values that are not present in
> prometheus data source so that when I click ALL option I can able to
> see all servers data.

If you are not scraping something (because the scrape is failing) then
no data exists about that server.

For the drop down selector in Grafana you could use the "up" metric,
which will still have values for servers that weren't able to be
scraped, but selecting one of those would then result in an empty
dashboard as there wouldn't be any CPU values.
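
A sketch of how this might look, assuming the node exporter job is named "node". For the variable itself, a query such as label_values(up{job="node"}, instance) in Grafana's Prometheus data source should list every target, reachable or not. For the panel, one possible way to append the unreachable targets to the CPU result is to join on "up" and give them a sentinel value of -1. This is an illustration under those assumptions rather than a tested recommendation:

```promql
# CPU usage for reachable servers (filtered by $CPU), plus one series
# with the sentinel value -1 for each server whose last scrape failed
(
  ((1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance)) * 100) $CPU
)
or on(instance)
((up{job="node",instance=~"$ip"} == 0) - 1)
```

As written, the unreachable servers are appended whichever $CPU option is selected, not only for All, and the -1 is an arbitrary marker rather than a real CPU value.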

--
Stuart Clark

BHARATH KUMAR

Jul 26, 2022, 12:23:55 PM
to Prometheus Users
Thanks Sir