Nope, it was certainly down - the machine had become unresponsive, so I gave it a hard reset and it started working again (yay stateless VMs).
It is puzzling why it was down, though. I haven't had a chance to dig into the logs, but when I first checked there wasn't anything noteworthy there.
Maybe something is wrong with how the machine is configured - the keyserver runs in a Docker container on Container-Optimized OS.
I'm not sure why it should lock up like that. I think there might be some permissions issue exporting metrics to the cloud monitoring stuff, so maybe that's piling up and choking the machine? Seems a poor default though.
I will investigate more this evening.
In the meantime I did set up some more alerting so that I will be notified immediately if the service stops responding. Apologies for the outage.
Andrew