Hey,
I was just wondering if there's a way to write a unit test for an alert that includes missing data?
The context is that I'm trying to figure out whether an alert we have behaves correctly if data is missing at certain points in time. The alert is for a sync job, and it should fire if the job hasn't succeeded in more than 24 hours. However, it's been written using a 'for: 24h' clause instead of our usual pattern, which is to include the threshold in the expression, e.g. 'time() - max_over_time(last_success_time[24h])'.
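To make the two shapes concrete, here is a rough sketch of what I mean. The alert names, the group name, the 'sync_job_last_run_success' metric and the exact comparison are placeholders for illustration, not our real rules; only 'last_success_time' and the 'max_over_time' form come from our actual setup.

```yaml
groups:
  - name: sync-job-example            # placeholder group name
    rules:
      # Variant A: the 24h duration lives in the 'for' clause.
      # The metric and expression are hypothetical stand-ins.
      - alert: SyncJobFailingForClause
        expr: sync_job_last_run_success == 0
        for: 24h

      # Variant B: our usual pattern, with the threshold in the expression.
      # 86400 seconds = 24 hours (assumed threshold).
      - alert: SyncJobFailingThresholdInExpr
        expr: time() - max_over_time(last_success_time[24h]) > 86400
```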
I know that the max_over_time pattern is resilient to missing data at any point during the interval. What I'm unsure about is the 'for' case: if the data goes missing for 20 minutes, will that reset the 'for' timer and make the alert useless?
Rather than just getting an answer to this specific question, I'm more interested in knowing how I could write a unit test that demonstrates the problem.
Cheers and thanks in advance for your advice,
Alex