In notify/notify.go I see:
```go
for {
	i++
	// Always check the context first to not notify again.
	select {
	case <-ctx.Done():
		if iErr == nil {
			iErr = ctx.Err()
		}
		return ctx, nil, errors.Wrapf(iErr, "%s/%s: notify retry canceled after %d attempts", r.groupName, r.integration.String(), i)
```
That is: it keeps retrying at exponentially increasing intervals until the overall context expires, which according to your measurements is about 1 minute.
I'm not entirely sure where this limit comes from, but it might be the group_interval - see dispatch/dispatch.go:
```go
// Give the notifications time until the next flush to
// finish before terminating them.
ctx, cancel := context.WithTimeout(ag.ctx, ag.timeout(ag.opts.GroupInterval))
```
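If that reading is right, the retry window is governed by the route's `group_interval` setting, whose default of 5 minutes is often overridden. An illustrative fragment (not your actual config):

```yaml
route:
  receiver: webhook
  group_interval: 1m  # notifications for a group get roughly this long to retry
```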
I don't think it's designed to be a long-term queue. If the webhook endpoint really could be down for hours on end and you don't want to lose alerts, then I think you should run a local webhook receiver on the same server, which queues the requests and then delivers them to the *real* webhook when it becomes available.
Of course, you'd also have to be happy that you may get a splurge of alerts, many of which may already have been resolved.
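Such a local queueing proxy could be sketched as below. Everything here is an assumption, not an existing tool: `upstream` is a hypothetical URL for the real webhook, the queue is in-memory (so it does not survive a restart; persist to disk if that matters), and the retry interval is arbitrary.

```go
package main

import (
	"bytes"
	"io"
	"log"
	"net/http"
	"sync"
	"time"
)

// upstream is a hypothetical URL for the *real* webhook; adjust as needed.
var upstream = "http://real-webhook.internal:9000/alerts"

// queueProxy acks Alertmanager webhook POSTs immediately (so Alertmanager's
// bounded retry succeeds) and delivers them upstream in the background.
type queueProxy struct {
	mu    sync.Mutex
	queue [][]byte // in-memory only; lost on restart
}

func (q *queueProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	q.mu.Lock()
	q.queue = append(q.queue, body)
	q.mu.Unlock()
	w.WriteHeader(http.StatusOK)
}

// drainOnce forwards queued payloads in order, stopping at the first failure
// so delivery is retried on the next tick. Note the eventual flush may be a
// burst of alerts, many of which were already resolved.
func (q *queueProxy) drainOnce() {
	for {
		q.mu.Lock()
		if len(q.queue) == 0 {
			q.mu.Unlock()
			return
		}
		body := q.queue[0]
		q.mu.Unlock()
		resp, err := http.Post(upstream, "application/json", bytes.NewReader(body))
		if err != nil {
			return // upstream still down; try again next tick
		}
		resp.Body.Close()
		if resp.StatusCode/100 != 2 {
			return
		}
		q.mu.Lock()
		q.queue = q.queue[1:]
		q.mu.Unlock()
	}
}

func main() {
	p := &queueProxy{}
	go func() {
		for range time.Tick(10 * time.Second) {
			p.drainOnce()
		}
	}()
	// Point Alertmanager's webhook_configs url at this address.
	log.Fatal(http.ListenAndServe(":9095", p))
}
```

You would then point the Alertmanager webhook receiver at this proxy instead of the real endpoint.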