Elevated frequency of Host Maintenance events on GCE instances with attached GPU(s) and SSD(s)


Google Cloud Platform Status

Nov 10, 2020, 5:33:36 PM
to gce-ope...@googlegroups.com
Description: We are experiencing an issue with Google Compute Engine that began in 2020-08. A firmware rollout that should address the issue is being prepared.

The rollout is currently expected to complete next week, but mitigation efforts are still ongoing.

We will provide more information by Tuesday, 2020-11-10 16:30 US/Pacific.

Diagnosis: Affected customers will experience an elevated frequency of Host Maintenance events on GCE instances with attached GPU(s) and SSD(s).

Workaround: Temporarily switch to V100 GPUs, which are unaffected by this issue.
https://cloud.google.com/compute/docs/gpus
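
For anyone trying to work out which instances fit the diagnosis above, here is a minimal sketch, not an official tool. It assumes you already have instance descriptions as dictionaries shaped like the Compute Engine instance resource (a `guestAccelerators` list with `acceleratorType` URLs, and a `disks` list with a `type` field). It also assumes that "SSD(s)" in the notice refers to local SSDs, which appear as disks of type `SCRATCH`, and that V100 accelerator type names contain "v100". The notice confirms neither assumption, and the function names are hypothetical.

```python
from typing import Any, Dict, List

# Assumption: V100 accelerator type names contain this marker
# (for example "nvidia-tesla-v100"). The notice only says V100 is unaffected.
UNAFFECTED_GPU_MARKER = "v100"


def accelerator_names(instance: Dict[str, Any]) -> List[str]:
    """Return the short accelerator type names attached to an instance."""
    names = []
    for acc in instance.get("guestAccelerators", []):
        # acceleratorType is typically a URL or partial path; keep the last segment.
        names.append(acc.get("acceleratorType", "").rsplit("/", 1)[-1])
    return names


def has_local_ssd(instance: Dict[str, Any]) -> bool:
    """Assumption: 'SSD' in the notice means local SSD, i.e. a SCRATCH disk."""
    return any(d.get("type") == "SCRATCH" for d in instance.get("disks", []))


def may_be_affected(instance: Dict[str, Any]) -> bool:
    """True if the instance has at least one non-V100 GPU and a local SSD."""
    non_v100 = [
        name for name in accelerator_names(instance)
        if UNAFFECTED_GPU_MARKER not in name.lower()
    ]
    return bool(non_v100) and has_local_ssd(instance)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = {
        "name": "example-instance",
        "guestAccelerators": [
            {"acceleratorType": "zones/us-central1-a/acceleratorTypes/nvidia-tesla-t4",
             "acceleratorCount": 1}
        ],
        "disks": [{"type": "PERSISTENT"}, {"type": "SCRATCH"}],
    }
    print(sample["name"], may_be_affected(sample))
```

A match only means the instance fits the description in the notice. It does not confirm that the instance is actually impacted.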

Google Cloud Platform Status

Nov 10, 2020, 7:04:39 PM
to gce-ope...@googlegroups.com
Description: Mitigation work is still underway by our engineering team. Further investigation of current impact and mitigation timeline is ongoing.

We will provide more information by Wednesday, 2020-11-11 13:00 US/Pacific.

Google Cloud Platform Status

Nov 11, 2020, 2:01:27 PM
to gce-ope...@googlegroups.com
The issue with Google Compute Engine instances with attached GPU(s) and SSD(s) is believed to be affecting a very small number of projects, and our Engineering Team continues to work on it.

If you have questions or are impacted, please open a case with the Support Team and we will work with you until this issue is resolved.

No further updates will be provided here.

Thank you for your patience while we work to resolve the issue.