On 3/2/22 11:24 AM, Yueying Li wrote:
> Hi Leigh,
>
> Thank you in advance for taking a look. I was recently trying to
> install CUDA on c240g5-110225.wisc.cloudlab.us. The install requires a
> reboot, and since rebooting we cannot connect to the machines. The
> console shows
> Trying 127.0.0.1... Connected to localhost. Escape character is 'off'.
>
> Could you help take a look?
If you look at the console log, you will see that NetworkManager has
taken over the network init path, instead of our systemd-networkd path.
This is a known problem with the package dependencies of various NVIDIA
toolkits (e.g. see search results on this list,
https://groups.google.com/g/cloudlab-users/search?q=networkmanager).
The fix is either to `systemctl disable NetworkManager`, or to do
something like
`sudo ln -s /dev/null /etc/systemd/system/NetworkManager.service`
before installing.
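As a rough sketch of the symlink approach, something like the following
could be run as root before installing the toolkit. The function name
`mask_unit` and its optional second argument (a root prefix, empty for
the live system) are made up for illustration and are not part of any
CloudLab tooling.

```shell
#!/bin/sh
# Sketch only: mask a systemd unit by pointing its unit file at /dev/null,
# so a later package install cannot enable it.
# mask_unit and the root-prefix argument are hypothetical helpers.
mask_unit() {
    unit="$1"        # e.g. NetworkManager.service
    root="${2:-}"    # optional prefix; empty means the running system
    mkdir -p "$root/etc/systemd/system"
    # -f replaces an existing file or link at that path
    ln -sf /dev/null "$root/etc/systemd/system/$unit"
}
```

For example, `mask_unit NetworkManager.service` as root, then install
CUDA and reboot.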
If you want to rescue these particular machines, you will need to boot
them into the Recovery MFS via the node popup menus in the Topology
View, then mount the disks, chroot into the on-disk root, and disable
NetworkManager that way. See
https://gitlab.flux.utah.edu/emulab/emulab-devel/-/wikis/faq/Using-the-Testbed/Using-the-Recovery-MFS
for instructions.
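Once you are in the Recovery MFS, the mount/chroot/disable steps might
look roughly like the sketch below. This is not taken from the wiki
page. The device name and mount point are placeholders (check the real
root partition first, e.g. with `lsblk`), and `run`,
`disable_nm_on_disk` and `DRY_RUN` are hypothetical names used only
here.

```shell
#!/bin/sh
# Sketch only: disable NetworkManager on the on-disk root from a rescue
# environment. All names below are illustrative assumptions.

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_nm_on_disk() {
    root_dev="$1"       # placeholder, e.g. /dev/sda1; verify before use
    mnt="${2:-/mnt}"    # placeholder mount point
    run mount "$root_dev" "$mnt" || return 1
    # enable/disable only edits unit symlinks, so it can run in a chroot
    run chroot "$mnt" systemctl disable NetworkManager
    rc=$?
    run umount "$mnt"
    return "$rc"
}
```

Setting `DRY_RUN=1` before calling `disable_nm_on_disk /dev/sda1` prints
the three commands without touching the disk, which is a reasonable
check before running it for real.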
David
> <http://c4130-110133.wisc.cloudlab.us>. It goes down once in a while.
> > The icon shows that the machine is ready and ISUP, but I am unable
> to connect to the console or ssh.
> > After I manually reboot it, it goes back to normal for a while,
> but then the same problem happens again.
>
> Hi. Next time this happens, please do not reboot the machine.
> Instead, send us email so that we can diagnose the problem.
>
> Thanks
> Leigh