24h limitation for metric alerting policies?


Nico_Ho

Jul 15, 2024, 10:52:24 PM
to Reliability discussion group
Hi,
I am trying to create an AlertPolicy to get notified when my cluster has been up for X amount of time over Z period.
Currently it seems we cannot create an alert that reads a metric over more than 24h.

What is the workaround if I need to measure uptime over a whole month?

Do I need to create a new metric descriptor using the API?

I know we can create a log-based metric, but that would not really measure uptime; it would only capture discrete events, such as when a VM was spun up.

Here is my MQL query:

fetch instance_group
| metric 'compute.googleapis.com/instance_group/size'
| filter resource.instance_group_name =~ '.*pool-test.*'
| group_by 1w, [value_size_max: max(value.size)]
| every 1h

It returns data, but if I save it as an alert policy I receive this error:

"This MQL-based alert policy condition requires 168h of data, but alerting can only query for up to 24h of data."
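One workaround I am considering: periodically compute the uptime ratio myself (e.g. from hourly samples of the group size) and write it back as a custom gauge metric, which an alert can then evaluate with a short lookback window. Below is a minimal stdlib-only sketch that computes the ratio and builds the JSON body for a `projects.timeSeries.create` REST call; the metric type `custom.googleapis.com/uptime/monthly_ratio` and the `pool` label are made-up names for illustration, and actually sending the request would require an authenticated HTTP client.

```python
import time


def uptime_ratio(samples):
    """Fraction of samples where the instance group had at least one VM.

    `samples` is a list of group sizes, one per sampling interval
    (e.g. one per hour over the month).
    """
    if not samples:
        return 0.0
    return sum(1 for size in samples if size > 0) / len(samples)


def build_timeseries_body(project_id, ratio, end_time=None):
    """Build the JSON body for a Cloud Monitoring timeSeries.create call.

    Writes `ratio` as a single point of a custom gauge metric.
    Metric type and labels are illustrative, not an existing metric.
    """
    if end_time is None:
        end_time = time.time()
    return {
        "timeSeries": [
            {
                "metric": {
                    # Hypothetical custom metric name for this sketch.
                    "type": "custom.googleapis.com/uptime/monthly_ratio",
                    "labels": {"pool": "pool-test"},
                },
                "resource": {
                    "type": "global",
                    "labels": {"project_id": project_id},
                },
                "points": [
                    {
                        "interval": {
                            "endTime": time.strftime(
                                "%Y-%m-%dT%H:%M:%SZ", time.gmtime(end_time)
                            )
                        },
                        "value": {"doubleValue": ratio},
                    }
                ],
            }
        ]
    }


# Example: 3 of 4 hourly samples had at least one VM.
ratio = uptime_ratio([1, 0, 1, 1])
body = build_timeseries_body("my-project", ratio, end_time=0)
```

The alert policy would then compare `custom.googleapis.com/uptime/monthly_ratio` against a threshold, which only needs the latest point and stays well inside the 24h limit.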