On 11/14/24 18:43, 'Nurlan Nazaraliyev' via cloudlab-users wrote:
> Hello,
>
> I am running 2 xl170 nodes. Every time I update the driver or
> downgrade the Linux version, the connection drops (I need to reload
> in this case). I believe this is because of a network mismatch on the
> control link. Is there any way to have a control link that is
> independent of the link between the nodes?
>
> I don't want my control network to go down as I make modifications to
> the system. Is that possible?
I'm not sure exactly which link you mean. What we call the "control
net" is the set of interfaces with public IPs; the "experiment net" is
collectively any links you define in your profile (in this case, a
single 25Gbps link between two nodes). On the xl170s, there are two
dual-port Mellanox ConnectX-4s: one with a 10Gbps control net port and a
10Gbps port for experiment networks, and another with a 25Gbps port that
can be used for experiment networks. So the control net already runs
over a separate port from your experiment link. However, the same
Ethernet driver is used for both NICs, so you do have to be careful
updating it.
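If it helps to see which interfaces share a driver before you touch it,
here is a minimal Python sketch. It assumes a Linux node with sysfs
mounted in the usual place; the function name drivers_by_interface and
its sysfs_root parameter are made up for illustration. On ConnectX-4
hardware I would expect the driver to show up as mlx5_core, but check
the output on your own node rather than taking that as given.

```python
import os

def drivers_by_interface(sysfs_root="/sys/class/net"):
    """Map each network interface name to the kernel driver bound to it.

    Virtual interfaces (lo, bridges, and so on) have no device/driver
    link in sysfs and are skipped.
    """
    result = {}
    if not os.path.isdir(sysfs_root):
        return result  # not a Linux host, or sysfs is not mounted
    for iface in sorted(os.listdir(sysfs_root)):
        link = os.path.join(sysfs_root, iface, "device", "driver")
        if os.path.islink(link):
            # The symlink target ends in the driver's name.
            result[iface] = os.path.basename(os.readlink(link))
    return result

if __name__ == "__main__":
    for iface, driver in drivers_by_interface().items():
        print(f"{iface}: {driver}")
```

Any interfaces that print the same driver name will all be affected
when that driver is replaced or fails to load.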
Losing the control net connection after an upgrade or reboot can have
many causes, e.g. an old kernel with insufficient driver support, or
NetworkManager getting installed as a package dependency. For the
latter, please search the forum for old posts that explain how to
disable NetworkManager, e.g.
https://groups.google.com/g/cloudlab-users/c/B6rNj7Vhltk/m/rwkHf_kwAgAJ .
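A quick way to tell whether NetworkManager has crept in after a package
change is to ask systemd for the unit's state. The sketch below only
reports the state and does not disable anything; the linked thread
covers that part. It assumes a systemd-based image where the unit is
named NetworkManager.service, and the helper name unit_state and its
run parameter are hypothetical.

```python
import subprocess

def unit_state(unit="NetworkManager.service", run=subprocess.run):
    """Return the one-word state printed by `systemctl is-active`.

    Returns "unavailable" if systemctl itself is not installed, and
    "unknown" if systemctl printed nothing.
    """
    try:
        proc = run(["systemctl", "is-active", unit],
                   capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return "unavailable"
    return proc.stdout.strip() or "unknown"

if __name__ == "__main__":
    print("NetworkManager:", unit_state())
```

If this prints "active" on a node where it did not before, that is a
good hint the control net configuration is being overridden.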
> status page:
> https://www.cloudlab.us/status.php?uuid=610603da-a2b7-11ef-af1a-e4434b2381fc
Sorry we didn't get to look at this before it expired during the night.
> Nurlan
David