This generally indicates that the processor on the BlueField2 smart NIC
has become inaccessible to the system, and reboots can take tens of minutes
as a result. The solution is to reinitialize the BF2 card, which is itself
a lengthy process. We have a process that involves loading a custom disk
image with all the necessary packages installed and a startup script that
runs the required scripts. We generally only do this between experiments
since it wipes out the contents of the local disk.
If it is not affecting your experiment now, then we can just let it be,
though any reboot of the node will take a long time.