Custom image stuck booting on Utah c6620 nodes

Jaylen Wang

May 17, 2026, 11:15:08 AM
to cloudlab-users

Hi CloudLab team,

I’m having trouble booting a custom image on reserved c6620 nodes in the Utah cluster.

For many months, I’ve been able to boot this image successfully on these nodes. Starting sometime in the past week, however, new experiments using this image get stuck in Booting and never finish before the initial 16-hour experiment lifetime expires. Because the experiment remains in Booting, I’m also unable to extend it.

Current stuck experiment:

https://www.cloudlab.us/status.php?uuid=0dd50881-c452-49ef-97d0-aa19c36f51ed

Relevant details:

  • User: jaylenw

  • Project: sustainable-comp

  • Experiment: jaylenw-306068

  • Cluster: Utah

  • Node type: c6620

  • Custom image: urn:publicid:IDN+wisc.cloudlab.us+image+sustainable-comp-PG0:ubuntu24-v6.18-patched

  • Image size: about 6914 MB

From the portal logs, it looks like the experiment resolves and starts imaging, but Frisbee progress appears to be slow enough that the nodes never finish booting before the experiment expires. Could someone take a look at the server-side logs and let me know what may be going wrong? Please also let me know if you need any information from me.
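As a rough sanity check on how slow imaging would have to be for this to happen, the sketch below computes the minimum average transfer rate needed to move the image within the initial experiment lifetime. It assumes the full 6914 MB must be transferred, and it ignores compression, multicast behavior, and the time needed to boot after imaging. The function name is hypothetical and not part of any CloudLab or Frisbee tooling.

```python
def min_throughput_mb_per_s(image_mb: float, window_hours: float) -> float:
    """Minimum average rate (MB/s) needed to move image_mb within window_hours."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return image_mb / (window_hours * 3600.0)


IMAGE_MB = 6914      # approximate image size reported in this post
WINDOW_HOURS = 16    # initial experiment lifetime reported in this post

rate = min_throughput_mb_per_s(IMAGE_MB, WINDOW_HOURS)
print(f"{rate:.3f} MB/s ({rate * 8:.2f} Mbit/s)")  # prints "0.120 MB/s (0.96 Mbit/s)"
```

Under these assumptions, imaging would have to average below roughly 0.12 MB/s to miss the 16-hour window, which points more toward a stall than toward a transfer that is merely slow.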

Thanks very much,
Jaylen

Mike Hibler

May 17, 2026, 12:56:02 PM
to cloudla...@googlegroups.com
Something seems to have gotten stuck along the image reload path, which in
your case involved NFS between a couple of our servers. I restarted the
NFS services and it seems to be moving again.

And it appears your experiment is ready.

Jaylen Wang

May 17, 2026, 6:11:54 PM
to cloudlab-users

Hi,

Thank you very much. The first experiment did come up successfully after the NFS restart.

I’m now trying to start a second experiment with the same profile and image, since I need two separate 2-node experiments. This second one appears to be stuck in the same way during booting/imaging: it has been in Booting for about 3 hours so far.

Relevant details:

  • Experiment: jaylenw-306082

  • Project: sustainable-comp

  • Profile: core-sched-experiment

  • Cluster: Utah

  • Node type: c6620

  • Custom image: urn:publicid:IDN+wisc.cloudlab.us+image+sustainable-comp-PG0:ubuntu24-v6.18-patched

  • Started: May 17, 2026 3:31 PM

  • Expires: May 18, 2026 7:31 AM
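For reference, the timestamps above work out as follows. This is a small sketch that assumes the portal's start and expiry times and this message's posting time (6:11:54 PM) are all in the same timezone, which the thread does not confirm.

```python
from datetime import datetime

# Times taken from this post; a shared timezone is an assumption.
started = datetime(2026, 5, 17, 15, 31, 0)   # Started: May 17, 2026 3:31 PM
expires = datetime(2026, 5, 18, 7, 31, 0)    # Expires: May 18, 2026 7:31 AM
posted = datetime(2026, 5, 17, 18, 11, 54)   # time this message was posted

window = expires - started     # total experiment lifetime
elapsed = posted - started     # time spent in Booting so far
remaining = expires - posted   # time left before expiry

print("window:   ", window)     # 16:00:00
print("elapsed:  ", elapsed)    # 2:40:54
print("remaining:", remaining)  # 13:19:06
```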

Could you check whether this one is hitting the same image reload / NFS issue? Please let me know if there is anything I should do differently when launching the second experiment.

Thanks again,
Jaylen
