Hi, I have a set of nginx reverse proxies that I start every weekday at 8am UTC and stop every weekday at 5pm UTC.
I'd like to do some behavioural analysis of the time nginx spends processing requests so that, for example, if
there is an anomalous spike (i.e. a spike not expected at that time of day) I get alerted. I'd like
to base this analysis on the last 30d of data.
That way, if there is a spike in the nginx response time that is more or less expected (because of the traffic profile, for example), I won't receive a false positive.
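To make the idea concrete, the check I have in mind is roughly: take the values seen at the same time of day on previous days, and flag the current value if it exceeds their mean plus three standard deviations (which is what the query further down computes, apart from the extra offset term). Here is a minimal sketch in Python. The function name is_anomalous and the sample numbers are made up for illustration, and I'm assuming the population standard deviation, which is what PromQL's stddev uses:

```python
from statistics import fmean, pstdev


def is_anomalous(current: float, history: list[float], sigmas: float = 3.0) -> bool:
    """Return True if `current` exceeds mean(history) + sigmas * stddev(history).

    `history` holds the values observed at the same time of day on
    previous days. pstdev is the population standard deviation.
    """
    threshold = fmean(history) + sigmas * pstdev(history)
    return current > threshold


# Made-up response times (seconds) at the same time of day on five earlier days.
history = [0.20, 0.22, 0.19, 0.21, 0.20]

print(is_anomalous(0.21, history))  # within the usual range for this time of day
print(is_anomalous(0.60, history))  # far above the usual range
```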
Right now I'm using the same approach as in https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/prometheus-users/sgqHb6m4z2c/bXXUv6c2AwAJ
but for 30d the query gets rather long:
avg(
label_replace(nginx_http_request_sec{n_instance=~"localhost"} offset 1d, "offset", "1d", "__name__", ".*") or
label_replace(nginx_http_request_sec{n_instance=~"localhost"} offset 2d, "offset", "2d", "__name__", ".*") or
[...]
label_replace(nginx_http_request_sec{n_instance=~"localhost"} offset 29d, "offset", "29d", "__name__", ".*") or
label_replace(nginx_http_request_sec{n_instance=~"localhost"} offset 30d, "offset", "30d", "__name__", ".*")
) without (instance) + 3 * stddev (
label_replace(nginx_http_request_sec{n_instance=~"localhost"} offset 1d, "offset", "1d", "__name__", ".*") or
label_replace(nginx_http_request_sec{n_instance=~"localhost"} offset 2d, "offset", "2d", "__name__", ".*") or
[...]
label_replace(nginx_http_request_sec{n_instance=~"localhost"} offset 29d, "offset", "29d", "__name__", ".*") or
label_replace(nginx_http_request_sec{n_instance=~"localhost"} offset 30d, "offset", "30d", "__name__", ".*")
) without (instance) + $offset
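For what it's worth, the full expression doesn't have to be typed by hand. Below is a minimal sketch of how it could be generated, assuming Python is available. The helper names build_operand and build_query and their parameters are hypothetical, and the output simply mirrors the query above, including the trailing $offset placeholder, which is left untouched:

```python
def build_operand(selector: str, days: int) -> str:
    """Build one label_replace(...) term per daily offset, joined with `or`."""
    terms = [
        f'label_replace({selector} offset {d}d, "offset", "{d}d", "__name__", ".*")'
        for d in range(1, days + 1)
    ]
    return " or\n".join(terms)


def build_query(selector: str, days: int = 30, sigmas: int = 3,
                grouping: str = "without (instance)") -> str:
    """Assemble avg(...) + sigmas * stddev(...) + $offset over `days` daily offsets.

    `grouping` is the aggregation clause copied from the query above;
    `$offset` is emitted literally, as in the original expression.
    """
    operand = build_operand(selector, days)
    return (
        f"avg(\n{operand}\n) {grouping} + {sigmas} * stddev (\n{operand}\n) {grouping} + $offset"
    )


if __name__ == "__main__":
    # Prints the 30-day expression for the selector used in this post.
    print(build_query('nginx_http_request_sec{n_instance=~"localhost"}'))
```

This only saves typing; the resulting query is exactly as long as before when Prometheus evaluates it.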
My question: is there a better way to do this?
Sorry for the double post.
Thanks,
d.