Re: LOR on em in HEAD ( was Re: em driver regression

5 views
Skip to first unread message

Jack Vogel

unread,
Apr 9, 2010, 3:09:24 PM4/9/10
to Mike Tancsa, pyu...@gmail.com, freebsd...@freebsd.org
Someone else also pointed this out. I'm dubious about its claim.
This happens because there is an RX lock taken in rxeof, its held
thru the call into the stack, it then encounters another lock there
and hence this complaint. I've had the RX hold as it is for a long
while and would rather not have to give it up, can someone look
at it and advise?

Cheers,

Jack


On Fri, Apr 9, 2010 at 12:00 PM, Mike Tancsa <mi...@sentex.net> wrote:

> While testing an i5 box with HEAD checked out from this morning, bringing
> up the second NIC generated this LOR on the console
>
> em1: link state changed to UP
> lock order reversal:
> 1st 0xc5dc7c10 em1:rx(1) (em1:rx(1)) @
> /usr/HEAD/src/sys/modules/em/../../dev/e1000/if_em.c:4089
> 2nd 0xc0f7e88c udp (udp) @ /usr/HEAD/src/sys/netinet/udp_usrreq.c:454
> KDB: stack backtrace:
> db_trace_self_wrapper(c0cb4d33,c5b72a70,c08e4d65,c08d50db,c0cb7d58,...) at
> db_trace_self_wrapper+0x26
> kdb_backtrace(c08d50db,c0cb7d58,c5d31a98,c5d2cb60,c5b72acc,...) at
> kdb_backtrace+0x29
> _witness_debugger(c0cb7d58,c0f7e88c,c0c9cf0b,c5d2cb60,c0cd04ca,...) at
> _witness_debugger+0x25
> witness_checkorder(c0f7e88c,1,c0cd04ca,1c6,0,...) at
> witness_checkorder+0x839
> _rw_rlock(c0f7e88c,c0cd04ca,1c6,c5d33088,c5e8be24,...) at _rw_rlock+0x9c
> udp_input(c67faa00,14,0,c5e8bd80,c0dfa860,...) at udp_input+0x246
> ip_input(c67faa00,c5f2f380,c5b72be8,c07463b6,c0dfa860,...) at
> ip_input+0x606
> netisr_dispatch_src(1,0,c67faa00,c5b72c20,c0954bc1,...) at
> netisr_dispatch_src+0xcd
> netisr_dispatch(1,c67faa00,c6018c00,c6018c00,c6852800,...) at
> netisr_dispatch+0x20
> ether_demux(c6018c00,c67faa00,3,0,3,...) at ether_demux+0x1a1
> ether_input(c6018c00,c67faa00,c11a0e17,ff9,64,...) at ether_input+0x365
> em_rxeof(c5e8bd80,109,c6016180,0,c5b72cc8,...) at em_rxeof+0x13c
> em_msix_rx(c5dc7c00,c5b72cc8,c088eb14,c0e133c0,c60342b8,...) at
> em_msix_rx+0x25
> intr_event_execute_handlers(c5d807f8,c6034280,c0cacd7e,533,c60342f0,...) at
> intr_event_execute_handlers+0x125
> ithread_loop(c603b4a0,c5b72d38,c0cacaed,343,c5d807f8,...) at
> ithread_loop+0x9f
> fork_exit(c0877800,c603b4a0,c5b72d38) at fork_exit+0xb8
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0, eip = 0, esp = 0xc5b72d70, ebp = 0 ---
>
> 0(i5b)# uname -a
> FreeBSD i5b.sentex.ca 9.0-CURRENT FreeBSD 9.0-CURRENT #2: Fri Apr 9
> 11:56:25 EDT 2010 mdta...@ich10.sentex.ca:/usr/HEAD/obj/usr/HEAD/src/sys/GENERIC
> i386
> 0(i5b)#
>
> em0@pci0:0:25:0: class=0x020000 card=0x34ec8086 chip=0x10ef8086
> rev=0x05 hdr=0x00
> vendor = 'Intel Corporation'
> class = network
> subclass = ethernet
> cap 01[c8] = powerspec 2 supports D0 D3 current D0
> cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> cap 13[e0] = PCI Advanced Features: FLR TP
> em1@pci0:3:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 rev=0x00
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> class = network
> subclass = ethernet
> cap 01[c8] = powerspec 2 supports D0 D3 current D0
> cap 05[d0] = MSI supports 1 message, 64 bit
> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>
>
> 0(i5b)# vmstat -i
> interrupt total rate
> irq4: uart0 6156 3
> irq8: rtc 224879 127
> irq21: ehci0 2662 1
> irq23: ehci1 3674 2
> cpu0: timer 1754210 998
> irq256: em0 10778 6
> irq257: em1 331 0
> irq258: em1 4 0
> irq260: em1 4 0
> irq261: em1 8 0
> irq262: ahci0 69 0
> cpu3: timer 1753938 998
> cpu2: timer 1753932 998
> cpu1: timer 1753886 998
> Total 7264531 4134
> 0(i5b)#
>
> CPU: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz (2666.65-MHz 686-class
> CPU)
> Origin = "GenuineIntel" Id = 0x106e5 Family = 6 Model = 1e Stepping =
> 5
>
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>
> Features2=0x98e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
> AMD Features=0x28100000<NX,RDTSCP,LM>
> AMD Features2=0x1<LAHF>
> TSC: P-state invariant
> real memory = 4294967296 (4096 MB)
> avail memory = 2577711104 (2458 MB)
> ACPI APIC Table: <INTEL S3420GPC>
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>
>
>
>
> --------------------------------------------------------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mi...@sentex.net
> Providing Internet since 1994 www.sentex.net
> Cambridge, Ontario Canada www.sentex.net/mike
>
>

Mike Tancsa

unread,
Apr 9, 2010, 3:00:30 PM4/9/10
to Jack Vogel, pyu...@gmail.com, freebsd...@freebsd.org

Jack Vogel

unread,
Apr 9, 2010, 4:16:40 PM4/9/10
to pyu...@gmail.com, freebsd...@freebsd.org, Mike Tancsa
On Fri, Apr 9, 2010 at 1:13 PM, Pyun YongHyeon <pyu...@gmail.com> wrote:

> On Fri, Apr 09, 2010 at 12:09:24PM -0700, Jack Vogel wrote:
> > Someone else also pointed this out. I'm dubious about its claim.
>

> I can't reproduce the LOR with latest em(4)(r206429).
>
>
Hmmm, wonder what changed that effected that, oh well that's good :)


Jack

Pyun YongHyeon

unread,
Apr 9, 2010, 4:13:07 PM4/9/10
to Jack Vogel, freebsd...@freebsd.org, Mike Tancsa
On Fri, Apr 09, 2010 at 12:09:24PM -0700, Jack Vogel wrote:
> Someone else also pointed this out. I'm dubious about its claim.

I can't reproduce the LOR with latest em(4)(r206429).

> This happens because there is an RX lock taken in rxeof, its held

Jack Vogel

unread,
Apr 9, 2010, 5:11:54 PM4/9/10
to Mike Tancsa, pyu...@gmail.com, freebsd...@freebsd.org
Don't know, but I would just ignore it, I think its a false warning anyway.

Jack


On Fri, Apr 9, 2010 at 2:05 PM, Mike Tancsa <mi...@sentex.net> wrote:

> At 04:13 PM 4/9/2010, Pyun YongHyeon wrote:
>
>> On Fri, Apr 09, 2010 at 12:09:24PM -0700, Jack Vogel wrote:
>> > Someone else also pointed this out. I'm dubious about its claim.
>>
>> I can't reproduce the LOR with latest em(4)(r206429).
>>
>
>

> I still get it for some reason
>
> 1st 0xc5dc7610 em1:rx(1) (em1:rx(1)) @
> /usr/HEAD/src/sys/modules/em/../../dev/e1000/if_em.c:4087
> 2nd 0xc0f7df0c udp (udp) @ /usr/HEAD/src/sys/netinet/udp_usrreq.c:454
> KDB: stack backtrace:
> db_trace_self_wrapper(c0cb4313,c5b72a64,c08e4305,c08d467b,c0cb7338,...) at
> db_trace_self_wrapper+0x26
> kdb_backtrace(c08d467b,c0cb7338,c5d31a98,c5d2cb60,c5b72ac0,...) at
> kdb_backtrace+0x29
> _witness_debugger(c0cb7338,c0f7df0c,c0c9c4eb,c5d2cb60,c0ccfaf7,...) at
> _witness_debugger+0x25
> witness_checkorder(c0f7df0c,1,c0ccfaf7,1c6,0,...) at
> witness_checkorder+0x839
> _rw_rlock(c0f7df0c,c0ccfaf7,1c6,c5d33088,c5e8be24,...) at _rw_rlock+0x9c
> udp_input(c6905900,14,0,c5e8bd80,c0df9ee0,...) at udp_input+0x246
> ip_input(c6905900,c5f39240,c5b72bdc,c0745956,c0df9ee0,...) at
> ip_input+0x606
> netisr_dispatch_src(1,0,c6905900,c5b72c14,c0954161,...) at
> netisr_dispatch_src+0xcd
> netisr_dispatch(1,c6905900,c6018c00,c6018c00,c6925800,...) at
> netisr_dispatch+0x20
> ether_demux(c6018c00,c6905900,3,0,3,...) at ether_demux+0x1a1
> ether_input(c6018c00,c6905900,c119f2d7,ff7,63,...) at ether_input+0x365
> em_rxeof(c5e8bd80,109,c6016180,0,c5b72cc8,...) at em_rxeof+0x133
> em_msix_rx(c5dc7600,c5b72cc8,c088e0b4,c0e13740,c60342b8,...) at
> em_msix_rx+0x25
> intr_event_execute_handlers(c5d807f8,c6034280,c0cac35e,533,c60342f0,...) at
> intr_event_execute_handlers+0x125
> ithread_loop(c603b4a0,c5b72d38,c0cac0cd,343,c5d807f8,...) at
> ithread_loop+0x9f
> fork_exit(c0876da0,c603b4a0,c5b72d38) at fork_exit+0xb8


>
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0, eip = 0, esp = 0xc5b72d70, ebp = 0 ---
>

> Perhaps because multi-queue is still enabled ?
>
>
> 0(i5b)% vmstat -i
> interrupt total rate
> irq4: uart0 6637 28
> irq21: ehci0 382 1
> irq23: ehci1 615 2
> cpu0: timer 462573 1993
> irq256: em0 7640 32
> irq257: em1 5 0
> irq258: em1 6 0
>
> irq261: em1 2 0
> irq262: ahci0 69 0
> cpu3: timer 461507 1989
> cpu2: timer 460985 1987
> cpu1: timer 461545 1989
> Total 1861966 8025
> 0(i5b)%
>
>
> em0: <Intel(R) PRO/1000 Network Connection 7.0.3> port 0x3040-0x305f mem
> 0xc1b00000-0xc1b1ffff,0xc1b25000-0xc1b25fff irq 20 at device 25.0 on pci0
> em0: Using MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:15:17:c8:4b:99
> ehci0: <Intel PCH USB 2.0 controller USB-B> mem 0xc1b22000-0xc1b223ff irq
> 21 at device 26.0 on pci0
> ehci0: [ITHREAD]
> usbus0: EHCI version 1.0
> usbus0: <Intel PCH USB 2.0 controller USB-B> on ehci0
> pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> pcib3: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
> pci3: <ACPI PCI bus> on pcib3
> em1: <Intel(R) PRO/1000 Network Connection 7.0.3> port 0x1000-0x101f mem
> 0xc1900000-0xc191ffff,0xc1920000-0xc1923fff irq 16 at device 0.0 on pci3
> em1: Using MSIX interrupts with 5 vectors
> em1: [ITHREAD]
> em1: [ITHREAD]
> em1: [ITHREAD]
> em1: [ITHREAD]
> em1: [ITHREAD]
> em1: Ethernet address: 00:15:17:c8:4b:98

Mike Tancsa

unread,
Apr 9, 2010, 5:05:22 PM4/9/10
to pyu...@gmail.com, Jack Vogel, freebsd...@freebsd.org
At 04:13 PM 4/9/2010, Pyun YongHyeon wrote:
>On Fri, Apr 09, 2010 at 12:09:24PM -0700, Jack Vogel wrote:
> > Someone else also pointed this out. I'm dubious about its claim.
>
>I can't reproduce the LOR with latest em(4)(r206429).

I still get it for some reason

1st 0xc5dc7610 em1:rx(1) (em1:rx(1)) @
/usr/HEAD/src/sys/modules/em/../../dev/e1000/if_em.c:4087
2nd 0xc0f7df0c udp (udp) @ /usr/HEAD/src/sys/netinet/udp_usrreq.c:454
KDB: stack backtrace:


db_trace_self_wrapper(c0cb4313,c5b72a64,c08e4305,c08d467b,c0cb7338,...)
at db_trace_self_wrapper+0x26
kdb_backtrace(c08d467b,c0cb7338,c5d31a98,c5d2cb60,c5b72ac0,...) at
kdb_backtrace+0x29
_witness_debugger(c0cb7338,c0f7df0c,c0c9c4eb,c5d2cb60,c0ccfaf7,...)
at _witness_debugger+0x25
witness_checkorder(c0f7df0c,1,c0ccfaf7,1c6,0,...) at witness_checkorder+0x839
_rw_rlock(c0f7df0c,c0ccfaf7,1c6,c5d33088,c5e8be24,...) at _rw_rlock+0x9c
udp_input(c6905900,14,0,c5e8bd80,c0df9ee0,...) at udp_input+0x246
ip_input(c6905900,c5f39240,c5b72bdc,c0745956,c0df9ee0,...) at ip_input+0x606
netisr_dispatch_src(1,0,c6905900,c5b72c14,c0954161,...) at
netisr_dispatch_src+0xcd
netisr_dispatch(1,c6905900,c6018c00,c6018c00,c6925800,...) at
netisr_dispatch+0x20
ether_demux(c6018c00,c6905900,3,0,3,...) at ether_demux+0x1a1
ether_input(c6018c00,c6905900,c119f2d7,ff7,63,...) at ether_input+0x365
em_rxeof(c5e8bd80,109,c6016180,0,c5b72cc8,...) at em_rxeof+0x133
em_msix_rx(c5dc7600,c5b72cc8,c088e0b4,c0e13740,c60342b8,...) at em_msix_rx+0x25
intr_event_execute_handlers(c5d807f8,c6034280,c0cac35e,533,c60342f0,...)
at intr_event_execute_handlers+0x125
ithread_loop(c603b4a0,c5b72d38,c0cac0cd,343,c5d807f8,...) at ithread_loop+0x9f

fork_exit(c0876da0,c603b4a0,c5b72d38) at fork_exit+0xb8


fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xc5b72d70, ebp = 0 ---

Perhaps because multi-queue is still enabled ?

0(i5b)% vmstat -i
interrupt total rate


irq4: uart0 6637 28
irq21: ehci0 382 1
irq23: ehci1 615 2
cpu0: timer 462573 1993
irq256: em0 7640 32
irq257: em1 5 0
irq258: em1 6 0

irq261: em1 2 0
irq262: ahci0 69 0

Mike Tancsa

unread,
Apr 10, 2010, 10:46:21 AM4/10/10
to Jack Vogel, pyu...@gmail.com, freebsd...@freebsd.org
At 05:11 PM 4/9/2010, Jack Vogel wrote:
>Don't know, but I would just ignore it, I think its a false warning anyway.

OK. I updated to HEAD as of this AM, but now I get a panic at bootup

odule pci/rl failed to register: 17
Timecounter "i8254" frequency 1193182 Hz quality 0


CPU: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz (2666.65-MHz
686-class CPU)
Origin = "GenuineIntel" Id = 0x106e5 Family = 6 Model = 1e Stepping = 5
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x98e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
AMD Features=0x28100000<NX,RDTSCP,LM>
AMD Features2=0x1<LAHF>
TSC: P-state invariant
real memory = 4294967296 (4096 MB)
avail memory = 2577711104 (2458 MB)
ACPI APIC Table: <INTEL S3420GPC>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs

FreeBSD/SMP: 1 package(s) x 4 core(s)
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 2
cpu2 (AP): APIC ID: 4
cpu3 (AP): APIC ID: 6
ioapic0 <Version 2.0> irqs 0-23 on motherboard
lapic0: Forcing LINT1 to edge trigger
kbd1 at kbdmux0
acpi0: <INTEL S3420GPC> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 5.0 on pci0
pci1: <ACPI PCI bus> on pcib1
vgapci0: <VGA-compatible display> port 0x2000-0x20ff mem
0xb0000000-0xbfffffff,0xc1a00000-0xc1a0ffff irq 16 at device 0.0 on pci1
pci1: <multimedia, HDA> at device 0.1 (no driver attached)
pci0: <base peripheral> at device 8.0 (no driver attached)
pci0: <base peripheral> at device 8.1 (no driver attached)
pci0: <base peripheral> at device 8.2 (no driver attached)
pci0: <base peripheral> at device 8.3 (no driver attached)
pci0: <base peripheral> at device 16.0 (no driver attached)
pci0: <base peripheral> at device 16.1 (no driver attached)
em0: <Intel(R) PRO/1000 Network Connection 7.0.4> port 0x3040-0x305f

mem 0xc1b00000-0xc1b1ffff,0xc1b25000-0xc1b25fff irq 20 at device 25.0 on pci0
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:15:17:c8:4b:99
ehci0: <Intel PCH USB 2.0 controller USB-B> mem 0xc1b22000-0xc1b223ff
irq 21 at device 26.0 on pci0
ehci0: [ITHREAD]
usbus0: EHCI version 1.0
usbus0: <Intel PCH USB 2.0 controller USB-B> on ehci0
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
pci3: <ACPI PCI bus> on pcib3

em1: <Intel(R) PRO/1000 Network Connection 7.0.4> port 0x1000-0x101f

mem 0xc1900000-0xc191ffff,0xc1920000-0xc1923fff irq 16 at device 0.0 on pci3
em1: Using MSIX interrupts with 5 vectors
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: Ethernet address: 00:15:17:c8:4b:98

pcib4: <ACPI PCI-PCI bridge> irq 18 at device 28.6 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 19 at device 28.7 on pci0
pci5: <ACPI PCI bus> on pcib5
ehci1: <Intel PCH USB 2.0 controller USB-A> mem 0xc1b21000-0xc1b213ff
irq 23 at device 29.0 on pci0
ehci1: [ITHREAD]
usbus1: EHCI version 1.0
usbus1: <Intel PCH USB 2.0 controller USB-A> on ehci1
pcib6: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci6: <ACPI PCI bus> on pcib6
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel PCH AHCI SATA controller> port
0x3068-0x306f,0x3074-0x3077,0x3060-0x3067,0x3070-0x3073,0x3020-0x303f
mem 0xc1b20000-0xc1b207ff irq 18 at device 31.2 on pci0
ahci0: [ITHREAD]
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich0: [ITHREAD]
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich1: [ITHREAD]
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich2: [ITHREAD]
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich3: [ITHREAD]
ahcich4: <AHCI channel> at channel 4 on ahci0
ahcich4: [ITHREAD]
ahcich5: <AHCI channel> at channel 5 on ahci0
ahcich5: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Sleep Button> on acpi0
acpi_button0: enable wake failed
atrtc0: <AT realtime clock> port 0x70-0x71,0x74-0x77 irq 8 on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart0: console (115200,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart1: [FILTER]
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xd3000-0xd3fff,0xd4000-0xd4fff
pnpid ORM0000 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ata0 at port 0x1f0-0x1f7,0x3f6 irq 14 on isa0
ata0: [ITHREAD]
ata1 at port 0x170-0x177,0x376 irq 15 on isa0
ata1: [ITHREAD]
ppc0: parallel port not found.
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
Timecounters tick every 1.000 msec
usbus0: 480Mbps High Speed USB v2.0
usbus1: 480Mbps High Speed USB v2.0
ugen0.1: <Intel> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <ST3250318AS CC35> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada1 at ahcich3 bus 0 scbus3 target 0 lun 0
ada1: <ST3250318AS CC35> ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
lapic4: Forcing LINT1 to edge trigger
SMP: AP CPU #2 Launched!
lapic2: Forcing LINT1 to edge trigger
SMP: AP CPU #1 Launched!
lapic6: Forcing LINT1 to edge trigger
SMP: AP CPU #3 Launched!
WARNING: WITNESS option enabled, expect reduced performance.
Root mount waiting for: usbus1 usbus0
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
ugen0.2: <vendor 0x8087> at usbus0
ugen1.2: <vendor 0x8087> at usbus1
uhub2: <vendor 0x8087 product 0x0020, class 9/0, rev 2.00/0.00, addr
2> on usbus0
uhub3: <vendor 0x8087 product 0x0020, class 9/0, rev 2.00/0.00, addr
2> on usbus1
Root mount waiting for: usbus1 usbus0
uhub2: 6 ports with 6 removable, self powered
uhub3: 6 ports with 6 removable, self powered
ugen1.3: <American Megatrends Inc.> at usbus1
ukbd0: <Keyboard Interface> on usbus1
kbd0 at ukbd0
ums0: <Mouse Interface> on usbus1
ums0: 3 buttons and [Z] coordinates ID=0
Root mount waiting for: usbus1
ugen1.4: <GASIA> at usbus1
ukbd1: <GASIA PS2toUSB Adapter, class 0/0, rev 1.10/1.00, addr 4> on usbus1
kbd2 at ukbd1
ums1: <GASIA PS2toUSB Adapter, class 0/0, rev 1.10/1.00, addr 4> on usbus1
ums1: 5 buttons and [XYZ] coordinates ID=1
Trying to mount root from nfs:
NFS ROOT: 10.255.255.1:/usr/home/pxe9/
panic: mutex em0:rx(0) not owned at
/usr/HEAD/src/sys/modules/em/../../dev/e1000/if_em.c:4093
cpuid = 3
KDB: enter: panic
[ thread pid 0 tid 100032 ]
Stopped at kdb_enter+0x3a: movl $0,kdb_why
db> bt
Tracing pid 0 tid 100032 td 0xc5f5bb40
kdb_enter(c0cb0e9d,c0cb0e9d,c0caf56e,c5b2ac28,3,...) at kdb_enter+0x3a
panic(c0caf56e,c6002024,c11a0357,ffd,c5b2ac7c,...) at panic+0x136
_mtx_assert(c6002010,4,c11a0357,ffd,64,...) at _mtx_assert+0x87
em_rxeof(246,c5ff7d98,c5b2aca8,c088e194,c5ff7d98,...) at em_rxeof+0x3b
em_handle_que(c6006000,1,c0cb5c9c,4f,c5ff7d98,...) at em_handle_que+0x38
taskqueue_run(c5ff7d80,c5ff7d98,c0ca6410,0,c0caf5f6,...) at taskqueue_run+0x103
taskqueue_thread_loop(c600a520,c5b2ad38,c0cac192,343,c0e0ce20,...) at
taskqueue_thread_loop+0x68
fork_exit(c08dcde0,c600a520,c5b2ad38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xc5b2ad70, ebp = 0 ---
db>

><mailto:mi...@sentex.net>mi...@sentex.net
>Providing Internet since
>1994 <http://www.sentex.net>www.sentex.net
>Cambridge, Ontario
>Canada <http://www.sentex.net/mike>www.sentex.net/mike

Bjoern A. Zeeb

unread,
Apr 10, 2010, 12:05:31 PM4/10/10
to Mike Tancsa, pyu...@gmail.com, freebsd...@freebsd.org, Jack Vogel
On Sat, 10 Apr 2010, Mike Tancsa wrote:

Hi Mike,

> At 05:11 PM 4/9/2010, Jack Vogel wrote:
>> Don't know, but I would just ignore it, I think its a false warning anyway.
>
> OK. I updated to HEAD as of this AM, but now I get a panic at bootup
>

...


> Trying to mount root from nfs:
> NFS ROOT: 10.255.255.1:/usr/home/pxe9/
> panic: mutex em0:rx(0) not owned at
> /usr/HEAD/src/sys/modules/em/../../dev/e1000/if_em.c:4093
> cpuid = 3
> KDB: enter: panic
> [ thread pid 0 tid 100032 ]
> Stopped at kdb_enter+0x3a: movl $0,kdb_why
> db> bt
> Tracing pid 0 tid 100032 td 0xc5f5bb40
> kdb_enter(c0cb0e9d,c0cb0e9d,c0caf56e,c5b2ac28,3,...) at kdb_enter+0x3a
> panic(c0caf56e,c6002024,c11a0357,ffd,c5b2ac7c,...) at panic+0x136
> _mtx_assert(c6002010,4,c11a0357,ffd,64,...) at _mtx_assert+0x87
> em_rxeof(246,c5ff7d98,c5b2aca8,c088e194,c5ff7d98,...) at em_rxeof+0x3b
> em_handle_que(c6006000,1,c0cb5c9c,4f,c5ff7d98,...) at em_handle_que+0x38
> taskqueue_run(c5ff7d80,c5ff7d98,c0ca6410,0,c0caf5f6,...) at
> taskqueue_run+0x103
> taskqueue_thread_loop(c600a520,c5b2ad38,c0cac192,343,c0e0ce20,...) at
> taskqueue_thread_loop+0x68
> fork_exit(c08dcde0,c600a520,c5b2ad38) at fork_exit+0xb8
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0, eip = 0, esp = 0xc5b2ad70, ebp = 0 ---
> db>


This is a bug that seems to only happen in the Kitchener area as I hit
it with a machine there just a few minutes ago as well.

This one has fixed it for me:
http://lists.freebsd.org/pipermail/svn-src-head/2010-April/016249.html

/bz

--
Bjoern A. Zeeb It will not break if you know what you are doing.

Jack Vogel

unread,
Apr 10, 2010, 3:29:35 PM4/10/10
to Bjoern A. Zeeb, pyu...@gmail.com, freebsd...@freebsd.org, Mike Tancsa
Added the missing locks around calls to rxeof and checked it in a minute
ago.

Sorry guys!

Jack

Mike Tancsa

unread,
Apr 10, 2010, 6:07:52 PM4/10/10
to Jack Vogel, Bjoern A. Zeeb, pyu...@gmail.com, freebsd...@freebsd.org
At 03:29 PM 4/10/2010, Jack Vogel wrote:
>Added the missing locks around calls to rxeof and checked it in a minute ago.
>
>Sorry guys!

Looks good for me now. BTW, I thought the multi-queue was supposed
to be disabled on the em nic ?


em0: <Intel(R) PRO/1000 Network Connection 7.0.4> port 0x3040-0x305f
mem 0xc1b00000-0xc1b1ffff,0xc1b25000-0xc1b25fff irq 20 at device 25.0 on pci0
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:15:17:c8:4b:99

em1: <Intel(R) PRO/1000 Network Connection 7.0.4> port 0x1000-0x101f
mem 0xc1900000-0xc191ffff,0xc1920000-0xc1923fff irq 16 at device 0.0 on pci3
em1: Using MSIX interrupts with 5 vectors
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: Ethernet address: 00:15:17:c8:4b:98

0(i5b)% vmstat -i
interrupt total rate
irq4: uart0 6285 13
irq21: ehci0 728 1
irq23: ehci1 1078 2
cpu0: timer 924321 1992
irq256: em0 9375 20
irq257: em1 127 0
irq258: em1 7 0


irq261: em1 2 0
irq262: ahci0 69 0

cpu3: timer 923686 1990
cpu1: timer 923651 1990
cpu2: timer 923597 1990
Total 3712926 8001
0(i5b)%

em0@pci0:0:25:0: class=0x020000 card=0x34ec8086
chip=0x10ef8086 rev=0x05 hdr=0x00
vendor = 'Intel Corporation'
class = network
subclass = ethernet
cap 01[c8] = powerspec 2 supports D0 D3 current D0
cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
cap 13[e0] = PCI Advanced Features: FLR TP

em1@pci0:3:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086
rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
class = network
subclass = ethernet
cap 01[c8] = powerspec 2 supports D0 D3 current D0
cap 05[d0] = MSI supports 1 message, 64 bit
cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled

>Jack

><http://lists.freebsd.org/pipermail/svn-src-head/2010-April/016249.html>http://lists.freebsd.org/pipermail/svn-src-head/2010-April/016249.html


>
>/bz
>
>--
>Bjoern A. Zeeb It will not break if you know what you are doing.
>

--------------------------------------------------------------------

Jack Vogel

unread,
Apr 10, 2010, 8:09:01 PM4/10/10
to Mike Tancsa, pyu...@gmail.com, Bjoern A. Zeeb, freebsd...@freebsd.org
What's disabled is the drbr stuff in the stack, that will not keep the 82574
from
initializing MSIX and multiple internal queues, if you have that adapter I
would
suggest you #define EM_MULTIQUEUE somewhere (Makefile, if_em.h or if_em.c)
since I believe its the one place where you will benefit. At least try it
and see.

Jack

John Baldwin

unread,
Apr 12, 2010, 10:52:42 AM4/12/10
to freebsd...@freebsd.org, pyu...@gmail.com, Jack Vogel, Mike Tancsa
On Friday 09 April 2010 3:09:24 pm Jack Vogel wrote:
> Someone else also pointed this out. I'm dubious about its claim.
> This happens because there is an RX lock taken in rxeof, its held
> thru the call into the stack, it then encounters another lock there
> and hence this complaint. I've had the RX hold as it is for a long
> while and would rather not have to give it up, can someone look
> at it and advise?

I've seen it happen with igb. I suspect it is a transitive lock order. That
is, you probably never have the UDP lock acquired before an em/igb RX lock.
However, if you have an em/igb adapter TX lock held when you acquire an em/igb
RX lock in one place, and in if_start() you acquire the TX lock while the UDP
lock is held, that can trigger the LOR. Specifically, those two paths would
give you these two orders:

TX -> RX
UDP -> TX

which implies the order

UDP -> RX

(lock order relationsips are transitive, just like a > b and b > c implies
a > c).

However, I haven't been able to track down what the raw orders are that might
lead to this transitive order. Attilio added some sysctls to dump all the raw
lock orders in one of the debug.witness sysctls. You can also try hardcoding
the 'RX -> UDP' order using WITNESS_DEFINEORDER() before any of the em/igb
RX/TX locks are acquired to see what different LOR is triggered. If that LOR
looks valid then you can keep hardcoding valid orders until you find the
invalid one.

> _______________________________________________
> freebsd...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-curre...@freebsd.org"
>

--
John Baldwin

Jack Vogel

unread,
Apr 12, 2010, 12:26:06 PM4/12/10
to John Baldwin, pyu...@gmail.com, freebsd...@freebsd.org, Mike Tancsa

> Do you think releasing the RX lock before the stack entry would get rid of
the problem?

Other ideas?

Jack

John Baldwin

unread,
Apr 12, 2010, 12:56:12 PM4/12/10
to Jack Vogel, pyu...@gmail.com, freebsd...@freebsd.org, Mike Tancsa

Well, while that might quiet the LOR, I suspect it would be masking another
problem that is the "real" LOR.

--
John Baldwin

Reply all
Reply to author
Forward
0 new messages