Cloudlab Utah unable to terminate experiments


Bird, Sam

Jan 15, 2026, 9:55:17 AM
to Digest recipients
Good morning:

I have 10 experiments in the expired state in Cloudlab Utah. When I attempt to terminate them manually, I get the error message 'Unable to terminate experiment at this time: The Cloudlab Utah cluster is currently offline, please try again later'. Is the cluster back online after the power outage? I would like to terminate these experiments and create new ones, but I need to release the nodes.

Thanks
Sam

ajma...@gmail.com

Jan 15, 2026, 10:33:12 AM
to cloudlab-users
Hi Sam,

There are some lingering issues at the CloudLab Utah cluster that we are still working on resolving.  I'll send an update here as soon as the cluster is back online.  Apologies for the inconvenience.

Best,
 - Aleks

ajma...@gmail.com

Jan 15, 2026, 11:40:22 AM
to cloudlab-users
We have worked out the cluster issues and are re-enabling the portal now. Apologies again for the delay.

Sam Bird

Jan 15, 2026, 2:03:02 PM
to cloudlab-users
Hi Aleks,

Thanks for your reply. Unfortunately, there is still an issue with DNS name resolution after the restart: https://www.cloudlab.us/status.php?uuid=aee9b4e7-f23c-11f0-bc80-e4434b2381fc

I am getting "fatal: unable to access 'https://github.com/<repo>/<repo>': Could not resolve host: github.com"
as well as "sudo: unable to resolve host index-tuner.cl-5rep-1100-1.qdina-pg0.utah.cloudlab.us: Temporary failure in name resolution".

This doesn't seem to be an issue on all nodes, just a few. I am not sure if this is an issue on my end or on CloudLab's, but it has not happened before. I can still ping 8.8.8.8, so I don't think it's a wider network issue.
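
The check described above (a raw IP address is reachable while hostnames fail to resolve) can be sketched as a small script to run on an affected node. This is only a rough illustration, not a CloudLab tool: the function names are made up for this example, and a TCP connection to port 53 stands in for ping, since sending ICMP from Python normally needs raw-socket privileges.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the system resolver can map hostname to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def can_connect(ip: str, port: int = 53, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to a literal IP succeeds.

    Because ip is a literal address, no DNS lookup is involved.
    Port 53 and the 3 second timeout are assumptions for this sketch.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(ip_reachable: bool, name_resolves: bool) -> str:
    """Classify the two probe results (hypothetical labels)."""
    if name_resolves:
        return "resolver ok"
    if ip_reachable:
        # Packets get out, but names do not resolve: points at DNS.
        return "dns problem"
    return "network problem"
```

For example, `diagnose(can_connect("8.8.8.8"), can_resolve("github.com"))` would return "dns problem" in the situation described in this message.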

Thanks
Sam

Robert Ricci

Jan 16, 2026, 10:50:29 AM
to cloudla...@googlegroups.com
I don't know if this is the issue you ran into, but I also ran into some
intermittent errors from GitHub yesterday; they clearly had some
flakiness. Are things okay today?
> --
> You received this message because you are subscribed to the Google Groups "cloudlab-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to cloudlab-user...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/cloudlab-users/0f7ad7df-d03c-4bfa-ad44-552bc4792fafn%40googlegroups.com.

Bird, Sam

Jan 16, 2026, 10:54:02 AM
to cloudla...@googlegroups.com
Yes, the issue seems to have cleared up. Thank you!


Mike Hibler

Jan 16, 2026, 10:57:41 AM
to cloudla...@googlegroups.com
I think this really was a transient issue due to a confluence of events.

At the time we were still recovering from the power outage (that never
actually happened). This involved running a bunch (100s) of nodes through the
disk reloading process which uses UDP multicast to distribute images.

At the same time, you were instantiating your experiments, which are interesting
in that they fire up VMs, each running one of two very large custom OS images.
Those images are downloaded in parallel to each of the physical experiment
nodes, again using UDP multicast.

The gist is that there was a lot of bursty, small-ish (1024-byte packets) UDP
traffic flooding the control network at the time, and DNS also uses unreliable
UDP datagrams. It was a toxic time to be a UDP packet on the control net.
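
To illustrate why dropped UDP packets show up as "Temporary failure in name resolution", here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that each packet is lost independently with the same probability and that the resolver makes a fixed number of attempts; the function and its parameters are hypothetical, and no loss rates were actually measured on the control network.

```python
def lookup_failure_probability(loss_rate: float, attempts: int) -> float:
    """Estimate the chance that a UDP DNS lookup fails outright.

    Assumptions (illustrative only):
      - each packet is dropped independently with probability loss_rate
      - one attempt succeeds only if both the query and the reply arrive
      - the resolver gives up after `attempts` tries
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    per_attempt_success = (1.0 - loss_rate) ** 2
    return (1.0 - per_attempt_success) ** attempts
```

Under these assumptions, a 50% loss rate with two attempts gives `lookup_failure_probability(0.5, 2) == 0.5625`, so more than half of all lookups would fail even though most individual nodes still have working connectivity.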

But let us know if you are still experiencing this problem.

