Hi,
I am using the amtool client in a Job inside my cluster.
An alert fired and we got a notification in our Slack channel. I used the CLI (in code that runs inside a Docker image from the Job) to create a silence with an `alertname` matcher, and there was no failure.
Looking at the Alertmanager UI, no silence was created, and I got a resolved notification 5 minutes after the firing notification.
After ~10 minutes the alert fired and resolved again (5 minutes apart).
Why wasn't the silence created? (This is not the first time it has happened.)
Maybe it's some kind of race condition? We can't silence alerts that are not in the firing state, right? (Although the alert was firing when I tried to create the silence.)
The Alert rule:
annotations:
  dashboard_url: p-R7Hw1Iz
  runbook_url: extension-orchestrator-dashboard
  summary: Failed gRPC calls detected in the Envoy External Processor within the last 5 minutes. <!subteam^S06E0CPPC5S>
The code for creating the silence:
func postSilence(amCli amclient.Client, matchers []*models.Matcher) error {
	startsAt := strfmt.DateTime(silenceStart)
	endsAt := strfmt.DateTime(silenceStart.Add(silenceDuration))
	createdBy := creatorType
	comment := silenceComment
	silenceParams := silence.NewPostSilencesParams().WithSilence(
		&models.PostableSilence{
			Silence: models.Silence{
				Matchers:  matchers,
				StartsAt:  &startsAt,
				EndsAt:    &endsAt,
				CreatedBy: &createdBy,
				Comment:   &comment,
			},
		},
	)
	err := amCli.PostSilence(silenceParams)
	if err != nil {
		return fmt.Errorf("failed on post silence: %w", err)
	}
	log.Print("Silence posted successfully")
	return nil
}
Thanks in advance,
Saar Zur SAP Labs