FreeBSD 7.3/i386 libalias related panic

Artem Kim

Apr 5, 2010, 4:37:51 PM
to freebsd...@freebsd.org
Hi,
I have a machine that acts as a NAS (mpd5 PPPoE).

The same machine also does NAT (ipfw + ng_nat).

Recently it hit two identical kernel panics within one hour:


FreeBSD nas3.xxx.ru 7.3-RELEASE FreeBSD 7.3-RELEASE #0: Sun Mar 21 17:55:26
MSK 2010 i386


nas3# kgdb kernel.debug /var/crash/vmcore.1
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x7d4c
fault code = supervisor read, page not present
instruction pointer = 0x20:0x8069ac41
stack pointer = 0x28:0xd259a8b0
frame pointer = 0x28:0xd259a8c8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 27 (irq17: bge1)
trap number = 12
panic: page fault
cpuid = 1
Uptime: 1h14m2s
Physical memory: 1014 MB
Dumping 103 MB: 88 72 56 40 24bge1: watchdog timeout -- resetting
8
<5>bge1: link state changed to DOWN

Reading symbols from /boot/kernel/acpi.ko...Reading symbols from
/boot/kernel/acpi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/acpi.ko
#0 doadump () at pcpu.h:196
196 __asm __volatile("movl %%fs:0,%0" : "=r" (td));


(kgdb) list *0x8069ac41
0x8069ac41 is in DeleteLink (/usr/src/sys/netinet/libalias/alias_db.c:857).
852 {
853 struct libalias *la = lnk->la;
854
855 LIBALIAS_LOCK_ASSERT(la);
856 /* Don't do anything if the link is marked permanent */
857 if (la->deleteAllLinks == 0 && lnk->flags & LINK_PERMANENT)
858 return;
859
860 #ifndef NO_FW_PUNCH
861 /* Delete associated firewall hole, if any */
(kgdb)

(kgdb) bt
#0 doadump () at pcpu.h:196
#1 0x8059ce94 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0x8059d31a in panic (fmt=0x104 <Address 0x104 out of bounds>) at
/usr/src/sys/kern/kern_shutdown.c:574
#3 0x807855dd in trap_fatal (frame=0xd259a870, eva=40) at
/usr/src/sys/i386/i386/trap.c:950
#4 0x8078595a in trap_pfault (frame=0xd259a870, usermode=0, eva=32076) at
/usr/src/sys/i386/i386/trap.c:863
#5 0x80786277 in trap (frame=0xd259a870) at /usr/src/sys/i386/i386/trap.c:541
#6 0x8076b0eb in calltrap () at /usr/src/sys/i386/i386/exception.s:166
#7 0x8069ac41 in DeleteLink (lnk=0x84e0f980) at
/usr/src/sys/netinet/libalias/alias_db.c:853
#8 0x8069ae3e in HouseKeeping (la=0x84874000) at
/usr/src/sys/netinet/libalias/alias_db.c:843
#9 0x8069947b in LibAliasInLocked (la=0x84874000, ptr=0x8458e810 "E",
maxpacketsize=2032) at /usr/src/sys/netinet/libalias/alias.c:1246
#10 0x8069a225 in LibAliasIn (la=0x84874000, ptr=0x8458e810 "E",
maxpacketsize=2032) at /usr/src/sys/netinet/libalias/alias.c:1228
#11 0x8065fd91 in ng_nat_rcvdata (hook=0x84842900, item=0x84cebba0) at
/usr/src/sys/netgraph/ng_nat.c:707
#12 0x80658606 in ng_apply_item (node=0x847de780, item=0x84cebba0, rw=1) at
/usr/src/sys/netgraph/ng_base.c:2336
#13 0x80657607 in ng_snd_item (item=0x84cebba0, flags=Variable "flags" is not
available.
) at /usr/src/sys/netgraph/ng_base.c:2254
#14 0x8067e4b6 in ipfw_check_in (arg=0x0, m0=0xd259aba8, ifp=0x84179800,
dir=1, inp=0x0) at /usr/src/sys/netinet/ip_fw_pfil.c:189
#15 0x8064af6f in pfil_run_hooks (ph=0x80847c00, mp=0xd259ac00,
ifp=0x84179800, dir=1, inp=0x0) at /usr/src/sys/net/pfil.c:78
#16 0x806812bd in ip_input (m=0x87135900) at
/usr/src/sys/netinet/ip_input.c:416
#17 0x8063efba in ether_demux (ifp=0x84179800, m=0x87135900) at
/usr/src/sys/net/if_ethersubr.c:834
#18 0x8063f1d6 in ether_input (ifp=0x84179800, m=0x87135900) at
/usr/src/sys/net/if_ethersubr.c:692
#19 0x80490c8f in bge_rxeof (sc=0x84187000, rx_prod=465, holdlck=1) at
/usr/src/sys/dev/bge/if_bge.c:3392
#20 0x80492d67 in bge_intr (xsc=0x84187000) at
/usr/src/sys/dev/bge/if_bge.c:3653
#21 0x8057c7bb in ithread_loop (arg=0x84180500) at
/usr/src/sys/kern/kern_intr.c:1181
#22 0x80578f25 in fork_exit (callout=0x8057c698 <ithread_loop>,
arg=0x84180500, frame=0xd259ad38) at /usr/src/sys/kern/kern_fork.c:811
#23 0x8076b160 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:271


Thanks for any help!

Peter Jeremy

Apr 6, 2010, 3:24:52 PM
to Artem Kim, freebsd...@freebsd.org
On 2010-Apr-06 00:37:51 +0400, Artem Kim <arte...@inbox.ru> wrote:
>Fatal trap 12: page fault while in kernel mode
>cpuid = 1; apic id = 01
>fault virtual address = 0x7d4c

This suggests an offset from a NULL pointer.

>0x8069ac41 is in DeleteLink (/usr/src/sys/netinet/libalias/alias_db.c:857).
>852 {
>853 struct libalias *la = lnk->la;
>854
>855 LIBALIAS_LOCK_ASSERT(la);
>856 /* Don't do anything if the link is marked permanent */
>857 if (la->deleteAllLinks == 0 && lnk->flags & LINK_PERMANENT)
>858 return;

>(kgdb) bt


>#7 0x8069ac41 in DeleteLink (lnk=0x84e0f980) at /usr/src/sys/netinet/libalias/alias_db.c:853
>#8 0x8069ae3e in HouseKeeping (la=0x84874000) at /usr/src/sys/netinet/libalias/alias_db.c:843

In the absence of someone who's seen this before, my initial guess is
that lnk->la is corrupted in frame #7. I'd start with 'print *lnk' at
frame #7 to confirm this. If so, you could go up to frame #8 and work
through the linkTableOut chain to find which entry is corrupt - but
actually finding _why_ it's corrupt will take a lot more work.

If this is repeatable, I'd suggest adding WITNESS, WITNESS_SKIPSPIN
and INVARIANTS to the kernel to see if you can get the problem to show
up closer to its cause.

--
Peter Jeremy

Artem Kim

Apr 8, 2010, 12:32:44 PM
to freebsd...@freebsd.org, Peter Jeremy

I have three nearly identical machines (two HP DL-140G3 and one HP
DL-160G5) with approximately the same configuration.

The problem occurred on only one of them (a 140G3).

The two panics occurred within an hour of each other. The last one happened
three days ago, and the problem has not recurred since.
Adding debugging options to the kernel would be very difficult because the
machine is under heavy load, and I cannot reproduce the problem on a test bench.

(kgdb) f 7


#7 0x8069ac41 in DeleteLink (lnk=0x84e0f980) at
/usr/src/sys/netinet/libalias/alias_db.c:853

853 struct libalias *la = lnk->la;

(kgdb) print *lnk
$1 = {la = 0x0, src_addr = {s_addr = 1}, dst_addr = {s_addr = 0}, alias_addr =
{s_addr = 0}, proxy_addr = {s_addr = 0}, src_port = 0, dst_port = 0,
alias_port = 0, proxy_port = 0, server = 0x0, link_type = 0, flags = 0,
pflags = 0, timestamp = 0, expire_time = 0, list_out = {le_next = 0x0,
le_prev = 0x853dcdb4}, list_in = {le_next = 0x0, le_prev = 0x84861c48},
data = {frag_ptr = 0x0, frag_addr = {s_addr = 0}, tcp = 0x0}}


I'm sorry, but I do not understand what I should do next.
