GCP Instance - constant disconnections

Youcef DJEDDAR

Mar 26, 2020, 8:20:45 PM
to gce-discussion
Hi all,

My GCP instance for DL has been failing constantly for the last 3-4 months. It disconnects every now and then, which considerably hampers my work. It used to be reliable, but that no longer seems to be the case, at least for me.

Help, please.

Thank you.

Cameron Thomas Otway

Mar 27, 2020, 3:43:11 PM
to gce-discussion
Hello Youcef,

Can you provide a few more details? I'm assuming "DL" refers to Deep Learning. What are the implementation details? For example, did you deploy this from Marketplace or via the command line? [1] If from Marketplace, did you try updating the packages? What machine type did you choose? Please elaborate on "disconnecting". Have you checked Cloud Logging and the instance's internal logs? If you see something, let us know.

Steve Lorimer

Mar 27, 2020, 6:10:02 PM
to Youcef DJEDDAR, gce-discussion
Is it possible you have preemption enabled?

Digil (Google Cloud Platform Support)

Mar 30, 2020, 3:38:35 PM
to gce-discussion
As the other community member suggested, you may need to check whether the instance has preemptibility enabled. You can find this option under 'Availability policy' of the VM.

Under the availability policy, you should also check whether the instance is set to 'Terminate VM instance' when a maintenance event occurs. The instance's maintenance behavior could also have caused the disconnections.
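
If you prefer to check these settings from a script rather than the console, here is a minimal Python sketch. It assumes the gcloud CLI is installed and authenticated, and it reads the `scheduling` block (`preemptible`, `onHostMaintenance`, `automaticRestart`) from the instance description. The function names are only illustrative and are not part of any Google tooling.

```python
import json
import subprocess


def describe_instance(name, zone):
    """Return the instance description as a dict, using the gcloud CLI."""
    result = subprocess.run(
        ["gcloud", "compute", "instances", "describe", name,
         "--zone", zone, "--format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def availability_findings(instance):
    """List availability-policy settings that can explain unexpected stops.

    `instance` is the dict produced by describe_instance(); only its
    "scheduling" block is inspected.
    """
    scheduling = instance.get("scheduling", {})
    findings = []
    if scheduling.get("preemptible") is True:
        findings.append(
            "preemptible is enabled: the VM can be stopped at any time")
    if scheduling.get("onHostMaintenance") == "TERMINATE":
        findings.append(
            "onHostMaintenance is TERMINATE: the VM is stopped, not "
            "migrated, during maintenance events")
    if scheduling.get("automaticRestart") is False:
        findings.append(
            "automaticRestart is off: the VM stays down after a "
            "platform-initiated stop")
    return findings
```

For example, `availability_findings(describe_instance("my-dl-vm", "us-central1-a"))` returns an empty list when none of these settings applies. The instance name and zone here are placeholders, so substitute your own.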

Another approach is to check the logs (i.e. the Stackdriver logs of the affected GCE VM) around the time of the incident to see whether any errors were reported. From the logs, you can determine whether the issue is related to the platform itself.
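
As a rough illustration of narrowing the logs to platform-initiated events, the Python sketch below builds a Cloud Logging filter for one instance and one time window. The method names in `SYSTEM_EVENT_METHODS` and the system event log name are my assumptions about what Compute Engine writes for preemption, host errors, automatic restarts and maintenance migrations, so please verify them against the current documentation. `build_log_filter` is a hypothetical helper, not an existing API.

```python
# Assumed system event method names; confirm against the Compute Engine docs.
SYSTEM_EVENT_METHODS = (
    "compute.instances.preempted",
    "compute.instances.hostError",
    "compute.instances.automaticRestart",
    "compute.instances.migrateOnHostMaintenance",
)


def build_log_filter(instance_id, start, end):
    """Build a Cloud Logging filter for platform events on one VM.

    instance_id is the numeric instance ID as a string; start and end
    are RFC 3339 timestamps such as "2020-03-26T00:00:00Z".
    Lines in a filter are implicitly ANDed together.
    """
    methods = " OR ".join(f'"{m}"' for m in SYSTEM_EVENT_METHODS)
    return "\n".join([
        'resource.type="gce_instance"',
        f'resource.labels.instance_id="{instance_id}"',
        'logName:"cloudaudit.googleapis.com%2Fsystem_event"',
        f"protoPayload.methodName=({methods})",
        f'timestamp>="{start}"',
        f'timestamp<="{end}"',
    ])
```

The resulting string can be pasted into the logs viewer or passed to `gcloud logging read`. If it returns nothing for the time of a disconnection, the cause is more likely inside the guest (for example, the process running out of memory) or on the network path than in the platform.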

Furthermore, you can refer to this help center article, which describes troubleshooting steps that you might find helpful if you run into problems using Compute Engine instances.
