Ubuntu 16 & 18 NIC renaming


Jason Boles

Mar 13, 2020, 1:45:48 PM
to emulab-admins
Hello,
  We've been seeing some quirks with Ubuntu on a new node type (Intel S2600JF server boards).
Each board has 2 on-board GbE NICs (Intel I350), but only 1 interface is connected.

During boot there's a long (30-45s) delay. I suspect that systemd has a race condition between services, though `systemd-analyze` didn't show anything obvious.
After boot, the first igb NIC is called "rename2".

Adding GRUB_CMDLINE_LINUX_DEFAULT="net.ifnames=0 biosdevname=0" to the grub config solves the problem: the node boots fast and the NIC is called "eth0". However, on our other node types the additional NICs, such as 40GbE, then also get "ethX" names, and we'd rather keep those interfaces distinguishable.

Booting Ubuntu from USB on the same node type doesn't show this behavior.

Is Emulab running any startup scripts or inserting any udev rules that would cause the "rename2" name to show up?

Thanks in advance!
-Jason Boles

David M. Johnson

Mar 13, 2020, 2:54:15 PM
to emulab...@googlegroups.com, Jason Boles
Interesting, thanks for the report. The network init path is different
on 16 (the old ifup /etc/network/interfaces style) vs 18 (udev rules
that create a dynamic systemd-networkd configuration).

Overall, how we bring up the control net for different linux
distro/versions is always a bit complicated. Few (or no!) network
managers support what we need, which is a
dhcp-on-all-ethernet-ifaces-with-carrier-until-one-recvs-address search,
so that we can dynamically find the control net device.
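
To illustrate the kind of search described above, here is a minimal Python sketch. It is not the Emulab implementation: the function names are made up for this example, `try_dhcp` is a hypothetical callback standing in for whatever DHCP client a system uses, and interfaces are tried one at a time for simplicity.

```python
import os
from typing import Callable, Iterable, List, Optional

SYS_NET = "/sys/class/net"  # standard Linux sysfs location


def ethernet_ifaces(sys_net: str = SYS_NET) -> List[str]:
    """List interfaces whose sysfs 'type' is 1 (ARPHRD_ETHER)."""
    names = []
    for name in sorted(os.listdir(sys_net)):
        try:
            with open(os.path.join(sys_net, name, "type")) as f:
                if f.read().strip() == "1":
                    names.append(name)
        except OSError:
            continue  # not a regular interface entry; skip it
    return names


def has_carrier(name: str, sys_net: str = SYS_NET) -> bool:
    """True if the link reports carrier; reading fails while the iface is down."""
    try:
        with open(os.path.join(sys_net, name, "carrier")) as f:
            return f.read().strip() == "1"
    except OSError:
        return False


def find_control_iface(
    ifaces: Iterable[str],
    carrier: Callable[[str], bool],
    try_dhcp: Callable[[str], bool],
) -> Optional[str]:
    """Return the first interface with carrier on which DHCP gets an address."""
    for name in ifaces:
        if not carrier(name):
            continue  # no link, so DHCP cannot succeed here
        if try_dhcp(name):  # hypothetical: True once a lease is obtained
            return name
    return None
```

A caller would pass `ethernet_ifaces()`, `has_carrier`, and its own DHCP function to `find_control_iface`; the result is the candidate control net device, or `None` if no interface got an address.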

We don't do any renaming via udev. On 18, you can see our rules in
/etc/udev/rules.d/99-emulab-networkd.rules . All they do is run a
script (/usr/local/etc/emulab/emulab-networkd-udev-helper.sh) that tells
systemd-networkd about a new interface; there are comments in that file
that explain more of what happens (e.g., how we find the correct
interface).

I could certainly believe a udev race in 18, but because this happens on
both 16 and 18, it's not so clear. We have seen more recently that
onboard IPMI devices can badly confuse this search, so the newest
versions of our 18 image include code to handle that. If you don't have
the latest version, can you please run something like
`/usr/testbed/sbin/wap /usr/testbed/sbin/image_import -gr emulab-ops/UBUNTU18-64-STD`,
then try again and see if the problem remains?

I'd be happy to poke at these nodes in an expt on Monday if you can get
me both remote ssh and serial console access. Or if that's not
possible, maybe send me logs: tarball up /var/emulab, and the output of
`journalctl -b0`, right after boot.

> Thanks in advance!
> -Jason Boles

David

Jason Boles

Mar 13, 2020, 3:14:14 PM
to emulab-admins

Thanks David,
  I think we found the culprit - this bug:  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1578141 
It's actually caused by BIOS having 2 or more identical biosdevnames for the network devices.
Chuck found that our problem nodes have 2 called "em1" whereas other node types were all unique.

I'll try an Intel BIOS update (the release notes mention fixing the ID_NET_NAME_ONBOARD variable), and respond again here.
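
For anyone who wants to check their own node types for the same condition, the comparison can be sketched roughly as below. This is only an illustration: the function names are invented for the example, and it assumes the firmware-provided name shows up in the ID_NET_NAME_ONBOARD udev property, which may not be where a given system exposes it.

```python
from collections import defaultdict
from typing import Dict, List


def parse_udev_properties(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, the format of udev property listings."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without '='
            props[key.strip()] = value.strip()
    return props


def duplicate_onboard_names(names_by_iface: Dict[str, str]) -> Dict[str, List[str]]:
    """Group interfaces by onboard name and keep names claimed by 2 or more."""
    groups = defaultdict(list)
    for iface, name in names_by_iface.items():
        if name:  # interfaces without an onboard name cannot collide
            groups[name].append(iface)
    return {name: sorted(ifaces) for name, ifaces in groups.items() if len(ifaces) > 1}
```

The property text for each interface can be collected with `udevadm info --query=property --path=/sys/class/net/<iface>` and fed through `parse_udev_properties`; any name returned by `duplicate_onboard_names` is shared by more than one NIC.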

regards,
--Jason

Jason Boles

Mar 13, 2020, 8:16:24 PM
to emulab-admins

I can confirm that the updated BIOS fixed this issue - now the NICs are eno1 and eno2 in both Ubuntu 18 & 16.

Regards,
Jason