Using a query to identify if a docker container was last seen using 'count'


Marco Pas

Jul 1, 2016, 5:56:05 AM
to Prometheus Developers
I am trying to write a query that returns 1 or 0 to indicate whether a Docker container is actually running. To achieve this I used the following query:

count(container_last_seen{name=~"(?i)ubuntu"}) OR vector(0)

With this query I only see that a container has gone offline after a long delay, so I tried to restrict the container_last_seen metric to the latest 15 seconds using:

count(container_last_seen[15s]{name=~"(?i)ubuntu"}) OR vector(0)

But this results in an error in the query. 

Error executing query: parse error at char 31: unexpected "{" in aggregation, expected ")"

I tried several combinations, but I seem to be stuck on this query.

Is there a way to retrieve the container_last_seen and do a count in a certain timeframe?

- Marco

Matthias Rampke

Jul 1, 2016, 6:10:50 AM
to Marco Pas, Prometheus Developers
The problem is that the `container_last_seen` time series will be kept
around for the duration of the staleness timeout (this is configurable
via a flag, but hold on).

The `[15s]` syntax won't help with count().
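
For reference, the parse error quoted above comes from the selector order: in PromQL the range goes after the label matchers, and count() aggregates an instant vector, so it cannot take a range vector directly. A minimal sketch of a form that does parse, assuming count_over_time is available in the Prometheus version in use and that the scrape interval is shorter than 15 seconds:

```promql
# Range selector placed after the label matchers; count_over_time turns the
# range vector into an instant vector (number of samples per series in 15s).
count(count_over_time(container_last_seen{name=~"(?i)ubuntu"}[15s])) or vector(0)
```

Note that this only counts how many samples were scraped in the window. It does not look at the timestamp stored in the sample value, so a series that is still exported with an old last-seen value would still be counted.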

This will give you all containers that have not been seen in the last
15 seconds but have not fallen off the staleness cliff:

container_last_seen < (time() - 15)

It will however not return exactly 1 or 0 – why do you need that?

To check only for a single container, and also return something if it
is completely missing (staleness timeout has hit):

(container_last_seen{name=~"(?i)ubuntu"} < (time() - 15)) OR
absent(container_last_seen{name=~"(?i)ubuntu"})

/MR

Marco Pas

Jul 1, 2016, 6:25:54 AM
to Matthias Rampke, Prometheus Developers
Hi Matthias, by returning 0 or 1 I can identify in Grafana whether the service is up or not. If there are better ways of doing this, then I am more than happy to learn.

So my ultimate goal for this query is to identify whether a service is up or not, so that I can use the output of the query to drive a status panel.

- Marco

Matthias Rampke

Jul 1, 2016, 12:04:41 PM
to Marco Pas, prometheus-developers

I see, Grafana is a bit peculiar about missing data.

I think you can force a 0/1 from the query I gave with

( (container_last_seen{name=~"(?i)ubuntu"} < (time() - 15)) OR 
absent(container_last_seen{name=~"(?i)ubuntu"}) ) *0 OR vector(1)

So whatever value we get is multiplied by zero, but if we don't get a value (everything is fine), we return 1.
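
One caveat worth checking: `or` keeps every element of the right-hand side whose label set has no match on the left. When the first branch fires, its result still carries the container's labels, while vector(1) has none, so both a 0 and a 1 could come back. A sketch that avoids this by aggregating the labels away first (the count() wrapper is an addition to the query above and has not been tested against a live server):

```promql
# count() drops all labels, so the left side (if present) has the same empty
# label set as vector(1) and suppresses it; multiplying by 0 forces the value.
(
  count(
    (container_last_seen{name=~"(?i)ubuntu"} < (time() - 15))
    or
    absent(container_last_seen{name=~"(?i)ubuntu"})
  ) * 0
)
or vector(1)
```

If the container is stale or missing, the inner expression is non-empty and the result should be a single 0. If everything is fine, the inner expression is empty, count() returns nothing, and vector(1) supplies the 1.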

Hope that helps!
MR
