Wisconsin Cluster Down?

Jaylen Wang

Dec 21, 2025, 11:33:13 AM
to cloudlab-users
Hi Cloudlab Admins,

I am having trouble reaching the nodes in my experiment on the Wisconsin cluster (experiment: https://www.cloudlab.us/status.php?uuid=0534976b-dceb-11f0-bc80-e4434b2381fc). Their status shows as unknown and they are unreachable. The reservation page also says the cluster is currently unavailable (I've attached a screenshot). Apologies if this is planned, though I could not find any information about a planned outage in Wisconsin for today.

Attachment: Screenshot 2025-12-21 at 11.30.46 AM.png

Thanks,
Jaylen

Johannes Freischuetz

Dec 21, 2025, 2:03:30 PM
to cloudlab-users
I am having the same issue. I cannot SSH into my machines, and I am unable to power cycle the nodes.

Thanks,
Johannes

ajma...@gmail.com

Dec 21, 2025, 3:14:38 PM
to cloudlab-users
Hi Jaylen, Johannes,

This was not planned (at least to my knowledge). I'm not sure what happened, but I am trying to get more information from our contacts at Wisconsin. From what I can tell, there was a power outage from around 10 AM CST until around 11:20 AM CST. While many nodes appear to have come back after this, some did not. For example, Johannes, your node did not come back up, and I am unable to reach it even over the management interface. Jaylen, ditto for node0 in your experiment, but your node1 is fully reachable. I'll send an update here once I have more information. Apologies for the inconvenience.

Best,
 - Aleks 

ajma...@gmail.com

Dec 22, 2025, 12:41 AM
to cloudlab-users
Just as an update, a handful of switches have still not come back up and require in-person investigation. We hope to have somebody there tomorrow to bring them back online. This appears to be affecting access to at least some of the c220g5 nodes, as well as some experiment network connectivity. The switch state on the rest of the switches should have been restored this afternoon.

Johannes Freischuetz

Dec 22, 2025, 1:13 AM
to cloudlab-users
My node is back up, but my drive seems to have been reset to the original image. Is this related to the outage?
If it isn't, is there any way to recover the previous contents?
