Thank you both for taking the time to answer my questions!
The main use case I've been thinking about is being able to differentiate between flapping alerts in the alert generator (Prometheus) and flapping alerts in the alert receiver (Alertmanager). In the former case, the alert is flapping because the data is alternating around the condition without stabilizing. In the latter case, the alert generator is failing to keep the alert receiver informed about the state of the alert before its expiration time (EndsAt).
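To illustrate the second case, here's a rough sketch of the expiry contract as I understand it: the generator keeps an alert alive at the receiver by resending it with a fresh EndsAt, and if the resends stop, the receiver resolves the alert once EndsAt passes. This is my own toy model, not the actual Prometheus/Alertmanager code, and the 4x-resend-delay validity window is an assumption based on commonly cited defaults:

```go
// Toy model (not actual Prometheus source): the generator keeps an alert
// "alive" at the receiver by resending it with a fresh EndsAt before the
// previous EndsAt passes. If resends stop, the receiver expires the alert.
package main

import (
	"fmt"
	"time"
)

type Alert struct {
	Labels   map[string]string
	StartsAt time.Time
	EndsAt   time.Time // the receiver treats the alert as resolved after this
}

// refresh extends the alert's lifetime; the 4x resend-delay window here is
// an illustrative assumption, not a value taken from the source.
func refresh(a *Alert, now time.Time, resendDelay time.Duration) {
	a.EndsAt = now.Add(4 * resendDelay)
}

// expiredAt reports whether the receiver would consider the alert resolved.
func expiredAt(a *Alert, now time.Time) bool {
	return now.After(a.EndsAt)
}

func main() {
	now := time.Now()
	a := &Alert{Labels: map[string]string{"alertname": "HighLatency"}, StartsAt: now}
	refresh(a, now, time.Minute)

	// If the generator misses its resends (e.g. it is overloaded or
	// partitioned from the receiver), the alert expires and later reappears,
	// which the receiver cannot distinguish from genuine flapping.
	fmt.Println(expiredAt(a, now.Add(3*time.Minute))) // false: still within EndsAt
	fmt.Println(expiredAt(a, now.Add(5*time.Minute))) // true: resends stopped, alert "resolves"
}
```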
In either case, I'm not proposing that alerts become more responsive to flapping. However, based on what I've learned about Prometheus and Alertmanager so far, and on the answers above, differentiating between the two is not a goal of Prometheus; rather, the goal is the opposite: to make them look the same.
> For example, this is important for the Alertmanager to see alerts from multiple Prometheus servers as identical if they have the same label set, even if they began and were resolved at slightly different times.
Indeed! The other use case is making it easier to debug flapping alerts, including when there are multiple Prometheus servers sending alerts to an Alertmanager. The motivation here is that I've been debugging a number of flapping alerts, and it can be hard to understand where the flapping is coming from.
> An "alert" in that sense is different from an "incident" or particular time-based instance of an alert, which Prometheus does not explicitly model. The closest thing to that is the Alertmanager taking in varying alert states over time and turning them into discrete notifications while applying throttling and grouping mechanisms. Those can prevent some flapping on the notification front, and careful alerting rules (averaging over large enough durations, using "for" durations, etc.) can do their part as well.
Thanks for the explanation here! I think this was the main design choice I wanted to understand.
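To make sure I understand the "for" duration part of that: here's a toy state machine showing how a continuous-hold requirement absorbs short flaps in the underlying data. This is my own illustrative logic, not Prometheus internals:

```go
// Illustrative sketch (an assumption of mine, not Prometheus internals):
// a "for" duration requires the condition to hold continuously before the
// alert fires, which absorbs short flaps in the underlying data.
package main

import (
	"fmt"
	"time"
)

type State int

const (
	Inactive State = iota // 0
	Pending                // 1
	Firing                 // 2
)

type rule struct {
	forDuration time.Duration
	activeSince time.Time
	state       State
}

// eval is called once per evaluation interval with the condition's result.
func (r *rule) eval(now time.Time, conditionTrue bool) State {
	switch {
	case !conditionTrue:
		r.state = Inactive // any false evaluation resets the pending timer
	case r.state == Inactive:
		r.state, r.activeSince = Pending, now
	case r.state == Pending && now.Sub(r.activeSince) >= r.forDuration:
		r.state = Firing
	}
	return r.state
}

func main() {
	r := &rule{forDuration: 5 * time.Minute}
	t := time.Now()
	// A brief flap (true, false, true) never reaches Firing, because the
	// false evaluation resets the timer. Prints 0=Inactive, 1=Pending, 2=Firing.
	for i, cond := range []bool{true, false, true, true, true, true, true, true} {
		fmt.Println(i, r.eval(t.Add(time.Duration(i)*time.Minute), cond))
	}
}
```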
But I have to ask: what is the purpose of Prometheus sending a StartsAt time to Alertmanager? This creates a time-based instance of an alert, because alerts end up with definitive StartsAt times, so Prometheus is, in a way, both modelling and not modelling time-based alerts at the same time.
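For context, this is roughly the shape of what gets posted. The `/api/v2/alerts` endpoint and the `startsAt`/`endsAt` field names come from the Alertmanager v2 API; everything else in this standalone sketch (alert names, label values, timings) is made up for illustration:

```go
// Sketch of pushing an alert to Alertmanager's v2 API by hand. The endpoint
// (/api/v2/alerts) and the startsAt/endsAt fields are part of the real API;
// the specific labels and durations below are illustrative assumptions.
package main

import (
	"bytes"
	"encoding/json"
	"log"
	"net/http"
	"time"
)

type postableAlert struct {
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations,omitempty"`
	StartsAt    time.Time         `json:"startsAt,omitempty"`
	EndsAt      time.Time         `json:"endsAt,omitempty"`
}

func main() {
	now := time.Now()
	alerts := []postableAlert{{
		Labels:   map[string]string{"alertname": "HighLatency", "severity": "page"},
		StartsAt: now.Add(-10 * time.Minute), // a definitive start time...
		EndsAt:   now.Add(4 * time.Minute),   // ...but a rolling expiry
	}}

	body, err := json.Marshal(alerts)
	if err != nil {
		log.Fatal(err)
	}
	// localhost:9093 is the conventional Alertmanager address; adjust as needed.
	resp, err := http.Post("http://localhost:9093/api/v2/alerts", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status)
}
```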
I think the StartsAt time of an alert can also go backwards when running Prometheus in HA, because different Prometheus servers will have different offsets for the same evaluation group depending on when the Prometheus process first started.
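Here's a toy illustration of the offset effect I mean, under the assumption that each replica anchors its evaluation schedule differently. The replica whose tick lands later records a later StartsAt for the same underlying condition, so the deduplicated alert's StartsAt can shift depending on which replica's notification arrives:

```go
// Toy illustration (assumption: each HA replica evaluates the same rule
// group on its own offset schedule). Two replicas observing the same
// condition at different ticks record different StartsAt values.
package main

import (
	"fmt"
	"time"
)

// firstEvalAtOrAfter returns the first evaluation timestamp >= t on a
// schedule with the given interval and per-server offset.
func firstEvalAtOrAfter(t time.Time, interval, offset time.Duration) time.Time {
	anchor := t.Truncate(interval).Add(offset)
	for anchor.Before(t) {
		anchor = anchor.Add(interval)
	}
	return anchor
}

func main() {
	interval := time.Minute
	conditionBecameTrue := time.Date(2024, 1, 1, 12, 0, 5, 0, time.UTC)

	// Offsets of 10s and 40s are arbitrary; the point is only that they
	// differ between replicas, yielding different StartsAt for the "same" alert.
	for _, offset := range []time.Duration{10 * time.Second, 40 * time.Second} {
		startsAt := firstEvalAtOrAfter(conditionBecameTrue, interval, offset)
		fmt.Printf("offset %v -> StartsAt %v\n", offset, startsAt)
	}
}
```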