Node is not connecting anymore after running setup-grow-rootfs.sh


maddie sal

Jan 31, 2025, 1:28:35 PM
to cloudlab-users
Hello,

After I ran the following:

    # Then in the remote host do the following to execute the script:
    chmod +x setup-grow-rootfs.sh
    sudo RESIZEROOT=0 ./setup-grow-rootfs.sh
    sudo reboot

the shell doesn't connect anymore (neither the online one nor ssh).


   

David M Johnson

Jan 31, 2025, 2:28:45 PM
to cloudla...@googlegroups.com
By the time I was able to look, your node had booted fine (based on
console output), but it was unresponsive. I power-cycled it and looked
at the previous boot's system log (`sudo journalctl -b -1 -x`) and discovered
the node had suspended (last log: `kernel: PM: suspend entry`). I think
we've seen this as a side effect of installing nvidia tools; see
https://groups.google.com/g/cloudlab-users/c/Dyn1HYUEkqc/m/IkYcovRBAgAJ
for a workaround (`sudo systemctl mask sleep.target suspend.target
hibernate.target hybrid-sleep.target`). No problem from
setup-grow-rootfs.sh on clnode119 (which I believe is your experiment).
The only issue is that there was a stale swap device in /etc/fstab
slowing down the boot. It looks like NetworkManager also got installed,
and that was further slowing down the boot (NetworkManager itself was
already masked, so it didn't stop the node from becoming connected,
but its companion NetworkManager-wait-online service was not masked, and
it was waiting to no purpose). To disable NetworkManager, see
https://groups.google.com/g/cloudlab-users/c/B6rNj7Vhltk/m/rwkHf_kwAgAJ
(`sudo systemctl disable NetworkManager NetworkManager-wait-online`).
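
For anyone hitting the same symptoms, the checks and workarounds above could be collected roughly as in the sketch below. This is only an illustration: the function names (`log_shows_suspend`, `list_swap_entries`, `apply_workarounds`) are made up for this example and are not part of any CloudLab tooling. It also only lists the swap entries in /etc/fstab; deciding which one is stale is left to whoever inspects the node.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any CloudLab tooling.

# Succeeds if the log text on stdin records a suspend, matching the
# "PM: suspend entry" kernel message quoted above.
log_shows_suspend() {
    grep -q 'PM: suspend entry'
}

# Prints the active (non-comment) fstab lines whose filesystem type is swap.
# $1 is the fstab path, defaulting to /etc/fstab.
list_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap"' "${1:-/etc/fstab}"
}

# Applies the two workarounds from this thread. Requires root via sudo.
apply_workarounds() {
    sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
    sudo systemctl disable NetworkManager NetworkManager-wait-online
}

# Example use on the affected node (nothing here runs automatically):
#   sudo journalctl -b -1 -k | log_shows_suspend && echo "previous boot suspended"
#   list_swap_entries
#   apply_workarounds
```

The functions are defined but never called, so sourcing the file changes nothing on the node until `apply_workarounds` is invoked explicitly.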

I ran all these commands on your node and rebooted, so you should be
good now.

David