Bug#1031839: network devices do not work after kernel upgrade

Thorsten Alteholz

Feb 23, 2023, 6:40:05 PM
Package: linux-image-amd64
Version: 6.1.8-1
Severity: important

Hi,

After upgrading from kernel 5.18.5-1 to 6.1.8-1 (and 6.1.12-1), network devices using driver module ixgbe stop working.
The devices are recognized and can be configured, but "ip a" shows "no-carrier" for them.
With kernel 5.18.5-1 everything works fine.

Thorsten
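
For anyone trying to reproduce this, here is a minimal sketch of how the carrier state that "ip a" reports can be read from sysfs for every interface bound to one driver. The function name list_carrier and its optional directory argument are assumptions made for illustration; they do not come from the report.

```shell
# Print "<interface> <carrier>" for every interface bound to a given driver.
# $1: driver name (e.g. ixgbe)
# $2: sysfs net directory (optional, defaults to /sys/class/net)
# carrier is 1 (link detected), 0 (no carrier) or "unknown" when the
# attribute cannot be read, which is the case while the interface is down.
list_carrier() (
    driver=$1
    netdir=${2:-/sys/class/net}
    for dev in "$netdir"/*; do
        # Virtual interfaces such as lo have no device/driver link.
        [ -e "$dev/device/driver" ] || continue
        # device/driver is a symlink; the basename of its target is the driver.
        bound=$(basename "$(readlink -f "$dev/device/driver")")
        [ "$bound" = "$driver" ] || continue
        carrier=$(cat "$dev/carrier" 2>/dev/null) || carrier=unknown
        echo "$(basename "$dev") $carrier"
    done
)
```

Calling `list_carrier ixgbe` on the affected machine would be expected to print a 0 next to each eno* interface under the 6.1 kernels, matching the "no-carrier" flag described above.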

Daniel Baumann

Feb 24, 2023, 5:10:04 AM
Hi Thorsten,

On 2/24/23 00:28, Thorsten Alteholz wrote:
> The devices are recognized and can be configured but "ip a" shows
> "no-carrier" for them.

I can't reproduce it; the NICs work fine for me with both 6.1.8 and
6.1.12.

Can you rule out that you were hit by the "interface names
changed"-problem (here, enp* changed to ens*)? Between 5.x and 6.x this
caught us by surprise a couple of times too.

Regards,
Daniel

Thorsten Alteholz

Feb 24, 2023, 7:40:04 PM
Hi Daniel,

On Fri, 24 Feb 2023, Daniel Baumann wrote:
> Can you rule out that you were hit by the "interface names
> changed"-problem (here, enp* changed to ens*)? Between 5.x and 6.x this
> caught us by surprise a couple of times too.

yes, the enp*/ens* interfaces are handled by driver igc.
They are "normal" 1 Gigabit interfaces and are working fine, independent
of the kernel version.

The problem is related to the 10 Gigabit interfaces that are handled by module
ixgbe. They are called eno*.

The first log shows the non-working kernel 6.x and the second log shows
the working kernel 5.x. In the first case "ip a" shows "no-carrier" and
all LEDs of that interface remain off.

Thorsten


root@tor-tc:~# uname -a
Linux tor-tc 6.1.0-3-rt-amd64 #1 SMP PREEMPT_RT Debian 6.1.8-1 (2023-01-29) x86_64 GNU/Linux

[ 1.742034] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[ 1.742039] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 2.390796] ixgbe 0000:0a:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[ 2.525760] ixgbe 0000:0a:00.0: MAC: 6, PHY: 0, PBA No: 000700-000
[ 2.525766] ixgbe 0000:0a:00.0: 00:f0:cb:ee:aa:49
[ 2.793547] ixgbe 0000:0a:00.0: Intel(R) 10 Gigabit Network Connection
[ 3.194989] ixgbe 0000:0a:00.1: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[ 3.324206] ixgbe 0000:0a:00.1: MAC: 6, PHY: 17, SFP+: 6, PBA No: 000700-000
[ 3.324211] ixgbe 0000:0a:00.1: 00:f0:cb:ee:aa:4a
[ 3.458876] ixgbe 0000:0a:00.1: Intel(R) 10 Gigabit Network Connection
[ 4.108945] ixgbe 0000:0b:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[ 4.238135] ixgbe 0000:0b:00.0: MAC: 6, PHY: 0, PBA No: 000700-000
[ 4.238140] ixgbe 0000:0b:00.0: 00:f0:cb:ee:aa:4b
[ 4.505385] ixgbe 0000:0b:00.0: Intel(R) 10 Gigabit Network Connection
[ 4.909029] ixgbe 0000:0b:00.1: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[ 5.038230] ixgbe 0000:0b:00.1: MAC: 6, PHY: 17, SFP+: 6, PBA No: 000700-000
[ 5.038235] ixgbe 0000:0b:00.1: 00:f0:cb:ee:aa:4c
[ 5.172833] ixgbe 0000:0b:00.1: Intel(R) 10 Gigabit Network Connection
[ 5.175251] ixgbe 0000:0a:00.1 eno2: renamed from eth1
[ 5.219307] ixgbe 0000:0a:00.0 eno1: renamed from eth0
[ 5.243600] ixgbe 0000:0b:00.1 eno4: renamed from eth3
[ 5.263641] ixgbe 0000:0b:00.0 eno3: renamed from eth2
[ 7.714775] ixgbe 0000:0a:00.1: registered PHC device on eno2
[ 7.787883] ixgbe 0000:0b:00.1: registered PHC device on eno4
[ 7.821476] ixgbe 0000:0a:00.1 eno2: detected SFP+: 6
[ 8.851220] ixgbe 0000:0b:00.1 eno4: detected SFP+: 6


root@tor-tc:~# uname -a
Linux tor-tc 5.19.0-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.19.11-1 (2022-09-24) x86_64 GNU/Linux

[ 1.565523] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[ 1.565531] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 2.209787] ixgbe 0000:0a:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[ 2.337322] ixgbe 0000:0a:00.0: MAC: 6, PHY: 0, PBA No: 000700-000
[ 2.337327] ixgbe 0000:0a:00.0: 00:f0:cb:ee:aa:49
[ 2.606593] ixgbe 0000:0a:00.0: Intel(R) 10 Gigabit Network Connection
[ 3.001721] ixgbe 0000:0a:00.1: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[ 3.131202] ixgbe 0000:0a:00.1: MAC: 6, PHY: 17, SFP+: 6, PBA No: 000700-000
[ 3.131207] ixgbe 0000:0a:00.1: 00:f0:cb:ee:aa:4a
[ 3.264814] ixgbe 0000:0a:00.1: Intel(R) 10 Gigabit Network Connection
[ 3.909727] ixgbe 0000:0b:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[ 4.039202] ixgbe 0000:0b:00.0: MAC: 6, PHY: 0, PBA No: 000700-000
[ 4.039207] ixgbe 0000:0b:00.0: 00:f0:cb:ee:aa:4b
[ 4.306556] ixgbe 0000:0b:00.0: Intel(R) 10 Gigabit Network Connection
[ 4.705443] ixgbe 0000:0b:00.1: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[ 4.833320] ixgbe 0000:0b:00.1: MAC: 6, PHY: 17, SFP+: 6, PBA No: 000700-000
[ 4.833325] ixgbe 0000:0b:00.1: 00:f0:cb:ee:aa:4c
[ 4.966519] ixgbe 0000:0b:00.1: Intel(R) 10 Gigabit Network Connection
[ 4.968319] ixgbe 0000:0a:00.1 eno2: renamed from eth1
[ 4.981671] ixgbe 0000:0b:00.1 eno4: renamed from eth3
[ 5.017351] ixgbe 0000:0b:00.0 eno3: renamed from eth2
[ 5.033434] ixgbe 0000:0a:00.0 eno1: renamed from eth0
[ 7.492177] ixgbe 0000:0a:00.1: registered PHC device on eno2
[ 7.595273] ixgbe 0000:0a:00.1 eno2: detected SFP+: 6
[ 7.604203] ixgbe 0000:0b:00.1: registered PHC device on eno4
[ 8.241189] ixgbe 0000:0a:00.1 eno2: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 8.343194] ixgbe 0000:0b:00.1 eno4: detected SFP+: 6
[ 8.997172] ixgbe 0000:0b:00.1 eno4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
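
The visible difference between the two logs is that only the 5.x boot contains "NIC Link is Up" lines. A small sketch for pulling that out of a kernel log follows. The function name ixgbe_links_up is made up for illustration, and the pattern only assumes the message format seen in the logs above.

```shell
# Read kernel log lines on stdin and print, one per line, the interfaces
# for which ixgbe reported "NIC Link is Up".
ixgbe_links_up() {
    # Match "ixgbe <pci address> <interface>: NIC Link is Up ..." and keep
    # only the interface name; sort -u removes repeats from link flaps.
    sed -n 's/^.*ixgbe [0-9a-f:.]* \([^ :]*\): NIC Link is Up.*$/\1/p' | sort -u
}
```

It can be fed from `dmesg` or `journalctl -k -b`. For the logs in this message it would list eno2 and eno4 on the 5.x boot and nothing on the 6.x boot.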

Daniel Baumann

Feb 25, 2023, 12:50:04 AM
On 2/25/23 00:43, Thorsten Alteholz wrote:
> The problem is related to the 10 Gigabit interfaces that are handled by module
> ixgbe. They are called eno*.

jftr, that depends on the board/card configuration (in the majority of
our Supermicro servers, the ixgbe interfaces are usually enp175* and ens2*)

> The first log shows the non-working kernel 6.x and the second log shows
> the working kernel 5.x. In the first case "ip a" shows "no-carrier" and
> all LEDs of that interface remain off.

what specific nic or mainboard do you have (onboard or discrete nic)?

do you use the correct branded sfps, did you try
"allow_unsupported_sfp=1" already?

Regards,
Daniel

Thorsten Alteholz

Feb 25, 2023, 6:10:03 AM


On 25.02.23 06:42, Daniel Baumann wrote:
> jftr, that depends on the board/card configuration (in the majority of
> our Supermicro servers, the ixgbe interfaces are usually enp175* and ens2*)

oh, thanks, I thought the interface names depended only on the driver module in use.

> what specific nic or mainboard do you have (onboard or discrete nic)?

It is an onboard one:
root@tor-tc:~# lspci|grep Ethernet
02:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 03)
03:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 03)
0a:00.0 Ethernet controller: Intel Corporation Ethernet Connection X553 10 GbE SFP+ (rev 11)
0a:00.1 Ethernet controller: Intel Corporation Ethernet Connection X553 10 GbE SFP+ (rev 11)
0b:00.0 Ethernet controller: Intel Corporation Ethernet Connection X553 10 GbE SFP+ (rev 11)
0b:00.1 Ethernet controller: Intel Corporation Ethernet Connection X553 10 GbE SFP+ (rev 11)


The I225-V are working fine, the other four make trouble.
I am using transceiver modules AXS85-192-M3 from 10Gtek.



> do you use the correct branded sfps, did you try
> "allow_unsupported_sfp=1" already?

As they work with kernel 5.x I assumed they are fine.
The allow_unsupported_sfp=1 does not make a difference. Shouldn't there be at least a syslog message if an unsupported sfp is detected?

  Thorsten
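
Since the brand of the transceiver matters here, a sketch of how the vendor data stored in the module could be read. It assumes that `ethtool -m <interface>` works with this driver and prints "Vendor name" and "Vendor PN" fields; the function name sfp_vendor is hypothetical.

```shell
# Read `ethtool -m <interface>` output on stdin and print the module's
# vendor name and part number as "<vendor> / <part number>".
sfp_vendor() {
    awk -F': ' '
        # Fields look like "<tab>Vendor name   : <value>"; $2 is the value.
        /^[ \t]*Vendor name[ \t]*:/ { name = $2 }
        /^[ \t]*Vendor PN[ \t]*:/   { pn = $2 }
        END {
            sub(/[ \t]+$/, "", name)   # drop trailing padding, if any
            sub(/[ \t]+$/, "", pn)
            print name " / " pn
        }
    '
}
```

Usage would be `ethtool -m eno2 | sfp_vendor`. If the kernel cannot read the module EEPROM at all, ethtool reports an error instead, which would be a useful data point on its own.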

Daniel Baumann

Feb 27, 2023, 9:40:04 AM
Hi Thorsten,

On 2/25/23 11:56, Thorsten Alteholz wrote:
> The I225-V are working fine, the other four make trouble.

right, but those are copper interfaces.

> I am using transceiver modules AXS85-192-M3 from 10Gtek.

It looks like they are not flashable (as Flexoptix and others are), so I
presume these are "non-Intel"-branded.

> The allow_unsupported_sfp=1 does not make a difference. Shouldn't there
> be at least a syslog message if an unsupported sfp is detected?

Just to be extra sure, you've added:

ixgbe.allow_unsupported_sfp=1

to your kernel cmdline, right?


I've checked with an up-to-date bookworm test server and kernel 6.1.12-1.

When inserting an Intel-branded Flexoptix SFP, I get this message:

Feb 27 14:59:06 xxx kernel: ixgbe 0000:5e:00.0 ens2f0: detected SFP+: 5

whereas with an Arista-branded one, I get the same:

Feb 27 15:00:30 xxx kernel: ixgbe 0000:5e:00.0 ens2f0: detected SFP+: 5

Just to document it: this is dmesg from after rebooting the machine,
there's no allow_unsupported_sfp set at all, and there's an
Intel-branded SFP in one slot (ens2f0), an Arista-branded one in the
other (ens2f1):

[ 3.178235] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[ 3.178385] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 3.927189] ixgbe 0000:5e:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
[ 3.927726] ixgbe 0000:5e:00.0: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
[ 3.928058] ixgbe 0000:5e:00.0: MAC: 2, PHY: 19, SFP+: 5, PBA No: 0210FF-0FF
[ 3.928320] ixgbe 0000:5e:00.0: 0c:c4:7a:8f:48:f6
[ 3.940393] ixgbe 0000:5e:00.0: Intel(R) 10 Gigabit Network Connection
[ 4.121469] ixgbe 0000:5e:00.1: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
[ 4.122320] ixgbe 0000:5e:00.1: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
[ 4.122938] ixgbe 0000:5e:00.1: MAC: 2, PHY: 20, SFP+: 6, PBA No: 0210FF-0FF
[ 4.123483] ixgbe 0000:5e:00.1: 0c:c4:7a:8f:48:f7
[ 4.134330] ixgbe 0000:5e:00.1: Intel(R) 10 Gigabit Network Connection
[ 4.315894] ixgbe 0000:5e:00.0 ens2f0: renamed from eth2
[ 4.411504] ixgbe 0000:5e:00.1 ens2f1: renamed from eth3
[ 8.043579] ixgbe 0000:5e:00.0: registered PHC device on ens2f0
[ 8.227099] ixgbe 0000:5e:00.0 ens2f0: detected SFP+: 5

it's not obvious from the messages that one SFP is working and the other
one is not.

the only difference I can see is that the one with the Intel-branded SFP
has this line:

[ 8.043579] ixgbe 0000:5e:00.0: registered PHC device on ens2f0

there's also no difference wrt/ the debug level (I've tested with printk
set to 7 and 8, no additional messages are shown).

My guess would be to try and verify with an Intel or Intel-flashed SFP
to rule this out.

Hope that helps.

Regards,
Daniel

Daniel Baumann

Feb 27, 2023, 9:50:05 AM
On 2/27/23 15:25, Daniel Baumann wrote:
> ixgbe.allow_unsupported_sfp=1

I've rebooted the server again just to check and indeed, the above
override doesn't work anymore (I also tried ...=1,1 because it's a
two-slot adapter, but that doesn't make any difference).

So it seems the reason for your trouble is that allow_unsupported_sfp
broke sometime between 5.x and 6.x, at least for ixgbe. I didn't
check with i40e, but could do so if needed/wanted.

Any chance you could confirm it by testing with an Intel-branded SFP?

Regards,
Daniel
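
When testing an override like the one quoted above, it helps to first confirm that the option really is on the command line of the running kernel. A minimal sketch follows; the function name cmdline_value is an assumption for illustration. Note that it only shows what was passed to the kernel, not whether the driver honours it.

```shell
# Print the value of a "module.param=value" option found in a kernel command
# line, or nothing if the option is absent.
# $1: option name (e.g. ixgbe.allow_unsupported_sfp)
# $2: command line text (optional, defaults to the contents of /proc/cmdline)
cmdline_value() (
    name=$1
    line=${2-$(cat /proc/cmdline)}
    set -f                      # no globbing while splitting into words
    for word in $line; do
        case $word in
            "$name"=*) echo "${word#*=}" ;;   # keep everything after "="
        esac
    done
)
```

On a machine booted with the override, `cmdline_value ixgbe.allow_unsupported_sfp` would print 1 (or 1,1 for the per-port form tried above).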

Diederik de Haas

Feb 27, 2023, 12:20:05 PM
On Friday, 24 February 2023 00:28:33 CET Thorsten Alteholz wrote:
> upgrading from kernel 5.18.5-1 to 6.1.8-1 (and 6.1.12-1) network devices
> using driver module ixgbe stop working. The devices are recognized and can
> be configured but "ip a" shows "no-carrier" for them.
> With kernel 5.18.5-1 everything works fine.

There have been quite a few Debian kernels between those versions and it would
be useful to narrow that range. You can find various deb files for amd64 here:
https://snapshot.debian.org/binary/linux-image-amd64/
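
Narrowing the range is a binary search over the package versions in between. A rough sketch follows, assuming the candidate versions are already sorted (for example with `sort -V`), the first one is known to work and the last one is known to fail. The function name first_bad and the callback argument are hypothetical; in practice the callback stands for "install that kernel, reboot, check the link".

```shell
# Binary-search an ascending list of versions for the first one that fails.
# $1: name of a command that takes a version and exits 0 if it works
# remaining arguments: versions, first one known good, last one known bad
first_bad() (
    works=$1; shift
    lo=1        # position of a version known to work
    hi=$#       # position of a version known to fail
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"       # fetch the mid-th argument
        if "$works" "$candidate"; then lo=$mid; else hi=$mid; fi
    done
    eval "echo \${$hi}"
)
```

Each step halves the range, so even a few dozen uploads between the two versions need only a handful of reboots.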

Thorsten Alteholz

Feb 28, 2023, 1:10:03 PM
Hi Daniel,

On 27.02.23 15:45, Daniel Baumann wrote:
> On 2/27/23 15:25, Daniel Baumann wrote:
>> ixgbe.allow_unsupported_sfp=1

yes, I tried it this way and as option in the module configuration.

>
> Any chance you could confirm it by testing with an Intel-branded SFP?

I don't have one yet. As they look a bit expensive, which one would you
recommend?

  Thorsten
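
For the module-configuration variant mentioned above, the usual form is an "options ixgbe allow_unsupported_sfp=1" line in a file under /etc/modprobe.d/. A small sketch for checking what modprobe would actually apply follows; the function name modprobe_option is made up for illustration.

```shell
# Read modprobe configuration text on stdin and print the value that
# "options <module> ... <param>=<value>" lines assign to one parameter.
# $1: module name, $2: parameter name
modprobe_option() {
    awk -v mod="$1" -v param="$2" '
        $1 == "options" && $2 == mod {
            for (i = 3; i <= NF; i++) {
                split($i, kv, "=")          # kv[1] = name, kv[2] = value
                if (kv[1] == param) print kv[2]
            }
        }
    '
}
```

Usage would be `modprobe -c | modprobe_option ixgbe allow_unsupported_sfp`. If the driver is loaded from the initramfs, the file probably also has to be included there (update-initramfs -u) before it takes effect; that depends on the setup and is an assumption here.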

Thorsten Alteholz

Feb 28, 2023, 1:20:04 PM


On 27.02.23 18:05, Diederik de Haas wrote:
>
> There have been quite a few Debian kernels between those versions and it would
> be useful to narrow that range. You can find various deb files for amd64 here:
> https://snapshot.debian.org/binary/linux-image-amd64/

Ok, I will try other kernels ...

Daniel Baumann

Feb 28, 2023, 1:40:04 PM
On 2/28/23 18:58, Thorsten Alteholz wrote:
> which one would you recommend?

we use those extensively (several thousand), ymmv:
https://www.flexoptix.net/de/p-8596-02.html

Regards,
Daniel