hi folks.
i've got all the SMP fixes merged into the netbsd-5 branch now.
anyone running a UP kernel on netbsd-5 for this problem is now
encouraged to switch back to MP.
the daily builds also have these fixes now, so you can grab a
fresh build to test from:
http://nyftp.netbsd.org/pub/NetBSD-daily/netbsd-5/
i'd love to hear reports of success or failure.
thanks.
.mrg.
> i've got all the SMP fixes merged into the netbsd-5 branch now.
> anyone running a UP kernel on netbsd-5 for this problem is now
> encouraged to switch back to MP.
> i'd love to hear reports of success or failure.
A failure report unfortunately :(
I did a fresh install of the 201103090800Z daily build on a Sparc 20
with dual SM51 CPUs and have seen two panics so far. The first one,
the dma0 error, happened when unpacking a gzipped tar file and the
second, tp->t_lastm, when compiling Tcl 8.5.7. Console capture is
included below.
I don't know if it matters, but the two SM51s are not identical: CPU0 is
a 501-2607 and CPU1 is a 501-2352.
George
== 1st panic ==
login: panic: dma0: cannot allocate DVMA address
Begin traceback...
0x0(0xf3d5ac00, 0xf3d82148, 0xf3d8214c, 0x0, 0xf03898c4, 0xb5ffd110) at
netbsd:esp_dma_setup+0x1c esp_dma_setup(0xf3d82000, 0xf3d82148,
0xf3d8214c, 0x0, 0xf03898c4, 0xf03dcfc8) at netbsd:ncr53c9x_intr+0x12b8
ncr53c9x_intr(0x0, 0xf00cc184, 0x700, 0x408000e6, 0xf03efcb4, 0x2) at
netbsd:sparc_interrupt44c+0x160 sparc_interrupt44c(0xf04ae000,
0xf04ae000, 0xf03eb800, 0x0, 0xf03eb914, 0x2) at
netbsd:sched_curcpu_runnable_p+0x10 sched_curcpu_runnable_p(0xf0002000,
0x0, 0x1, 0xf3ba4e80, 0xf03efcb4, 0xf00027a0) at netbsd:idle_loop+0x138
idle_loop(0xf3ba7c80, 0xf3ba7c80, 0xf03d9c00, 0xf03d9c00, 0xf03d9c00,
0xf03d9c00) at netbsd:lwp_setfunc_trampoline End traceback... Frame
pointer is at 0xf0389630 Call traceback: pc = 0xf02e2f9c args = (0x11,
0x5, 0x0, 0x0, 0xf0389750, 0x700, 0xf0389698) fp = 0xf0389698 pc =
0xf01f8a1c args = (0x104, 0x0, 0x1, 0xf3b9df20, 0xf01f8730, 0x700,
0xf0389708) fp = 0xf0389708 pc = 0xf00d8198 args = (0xf034e630,
0xf3d6a87c, 0xf03eb800, 0xf038fc00, 0x104, 0xf0397000, 0xf0389778) fp =
0xf0389778 pc = 0xf02a2c70 args = (0xf3d5ac00, 0xf3d82148, 0xf3d8214c,
0x0, 0xf03898c4, 0xb5ffd110, 0xf03897e0) fp = 0xf03897e0 pc =
0xf00cd43c args = (0xf3d82000, 0xf3d82148, 0xf3d8214c, 0x0,
0xf03898c4, 0xf03dcfc8, 0xf0389848) fp = 0xf0389848 pc = 0xf0008bdc
args = (0x0, 0xf00cc184, 0x700, 0x408000e6, 0xf03efcb4, 0x2,
0xf03898d0) fp = 0xf03898d0 pc = 0xf01d6b08 args = (0xf04ae000,
0xf04ae000, 0xf03eb800, 0x0, 0xf03eb914, 0x2, 0xf3b9de80) fp =
0xf3b9de80
dump to dev 7,1 not possible
sd1(esp0:0:3:0): esp0: timed out [ecb 0xf0ae2b98 (flags 0x3, dleft 0,
stat 0)], <state 5, nexus 0xf0ae2cb0, phase(l 10, c 0, p 7), resid
1000, msg(q 0,o 0) > sd1(esp0:0:3:0): esp0: timed out [ecb 0xf0ae2b98
(flags 0x43, dleft 0, stat 0)], <state 5, nexus 0xf0ae2cb0, phase(l 10,
c 0, p 7), resid 1000, msg(q 0,o 0) > AGAIN sd0: async, 8-bit transfers
sd1: async, 8-bit transfers sd1(esp0:0:3:0): polling command not done
panic: scsipi_execute_xs Begin traceback...
0x0(0xf0ae3348, 0xf03895b6, 0xa, 0x0, 0x0, 0x4) at netbsd:sd_flush+0x84
sd_flush(0xf464ad10, 0x103, 0x1, 0xf04ae000, 0x0, 0xf03896e8) at
netbsd:sd_shutdown+0x14 sd_shutdown(0xf464ad10, 0x0, 0x0, 0x0, 0x75,
0xffffffff) at netbsd:doshutdownhooks+0x38 doshutdownhooks(0xf03d6800,
0x7, 0x1, 0x0, 0xf039f400, 0xb00) at netbsd:cpu_reboot+0x1c cpu_reboot
(0x104, 0x0, 0x1, 0xf3b9df20, 0xf01f8730, 0x700) at netbsd:panic+0x238
panic(0xf034e630, 0xf3d6a87c, 0xf03eb800, 0xf038fc00, 0x104,
0xf0397000) at netbsd:lsi64854_setup+0x3cc lsi64854_setup(0xf3d5ac00,
0xf3d82148, 0xf3d8214c, 0x0, 0xf03898c4, 0xb5ffd110) at
netbsd:esp_dma_setup+0x1c esp_dma_setup(0xf3d82000, 0xf3d82148,
0xf3d8214c, 0x0, 0xf03898c4, 0xf03dcfc8) at netbsd:ncr53c9x_intr+0x12b8
ncr53c9x_intr(0x0, 0xf00cc184, 0x700, 0x408000e6, 0xf03efcb4, 0x2) at
netbsd:sparc_interrupt44c+0x160 sparc_interrupt44c(0xf04ae000,
0xf04ae000, 0xf03eb800, 0x0, 0xf03eb914, 0x2) at
netbsd:sched_curcpu_runnable_p+0x10 sched_curcpu_runnable_p(0xf0002000,
0x0, 0x1, 0xf3ba4e80, 0xf03efcb4, 0xf00027a0) at netbsd:idle_loop+0x138
idle_loop(0xf3ba7c80, 0xf3ba7c80, 0xf03d9c00, 0xf03d9c00, 0xf03d9c00,
0xf03d9c00) at netbsd:lwp_setfunc_trampoline End traceback... Frame
pointer is at 0xf0389390 Call traceback: pc = 0xf02e2f9c args = (0x11,
0x5, 0x0, 0x0, 0xf03894b0, 0xb00, 0xf03893f8) fp = 0xf03893f8 pc =
0xf01f8a1c args = (0x104, 0x0, 0x1, 0xf3b9df20, 0xf01f8730, 0xb00,
0xf0389468) fp = 0xf0389468 pc = 0xf02c0da4 args = (0xf0369db8,
0xf3d8287c, 0xf3d6a90c, 0xf038fc00, 0x104, 0xf0397000, 0xf03894d8) fp =
0xf03894d8 pc = 0xf02ca38c args = (0xf0ae3348, 0xf03895b6, 0xa, 0x0,
0x0, 0x4, 0xf0389540) fp = 0xf0389540 pc = 0xf02cafb8 args =
(0xf464ad10, 0x103, 0x1, 0xf04ae000, 0x0, 0xf03896e8, 0xf03895c8) fp =
0xf03895c8 pc = 0xf01de434 args = (0xf464ad10, 0x0, 0x0, 0x0, 0x75,
0xffffffff, 0xf0389630) fp = 0xf0389630 pc = 0xf02e2e6c args =
(0xf03d6800, 0x7, 0x1, 0x0, 0xf039f400, 0xb00, 0xf0389698) fp =
0xf0389698 pc = 0xf01f8a1c args = (0x104, 0x0, 0x1, 0xf3b9df20,
0xf01f8730, 0x700, 0xf0389708) fp = 0xf0389708 pc = 0xf00d8198 args =
(0xf034e630, 0xf3d6a87c, 0xf03eb800, 0xf038fc00, 0x104, 0xf0397000,
0xf0389778) fp = 0xf0389778 pc = 0xf02a2c70 args = (0xf3d5ac00,
0xf3d82148, 0xf3d8214c, 0x0, 0xf03898c4, 0xb5ffd110, 0xf03897e0) fp =
0xf03897e0 pc = 0xf00cd43c args = (0xf3d82000, 0xf3d82148, 0xf3d8214c,
0x0, 0xf03898c4, 0xf03dcfc8, 0xf0389848) fp = 0xf0389848 pc =
0xf0008bdc args = (0x0, 0xf00cc184, 0x700, 0x408000e6, 0xf03efcb4,
0x2, 0xf03898d0) fp = 0xf03898d0 pc = 0xf01d6b08 args = (0xf04ae000,
0xf04ae000, 0xf03eb800, 0x0, 0xf03eb914, 0x2, 0xf3b9de80) fp =
0xf3b9de80
dump to dev 7,1 not possible
sd0(esp0:0:1:0): polling command not done
panic: scsipi_execute_xs
Begin traceback...
End traceback...
Frame pointer is at 0xf03890f0
Call traceback:
pc = 0xf02e2f9c args = (0x11, 0x5, 0x0, 0x0, 0xf0389210, 0xf0389280,
0xf0389158) fp = 0xf0389158 pc = 0xf01f8a1c args = (0x104, 0x0, 0x1,
0xf0340411, 0xf01f8730, 0xb00, 0xf03891c8) fp = 0xf03891c8 pc =
0xf02c0da4 args = (0xf0369db8, 0xf3d827ec, 0xf3d6a90c, 0xf038fc00,
0x104, 0xf0397000, 0xf0389238) fp = 0xf0389238 pc = 0xf02ca38c args =
(0xf0ae3e70, 0xf0389316, 0xa, 0x0, 0x0, 0x4, 0xf03892a0) fp =
0xf03892a0 pc = 0xf02cafb8 args = (0xf464ac08, 0x103, 0xf0002000, 0x1,
0x0, 0xf0389448, 0xf0389328) fp = 0xf0389328 pc = 0xf01de434 args =
(0xf464ac08, 0x0, 0x0, 0x0, 0x75, 0xffffffff, 0xf0389390) fp =
0xf0389390 pc = 0xf02e2e6c args = (0xf03d6800, 0x7, 0x1, 0x0,
0xf039f400, 0xb00, 0xf03893f8) fp = 0xf03893f8 pc = 0xf01f8a1c args =
(0x104, 0x0, 0x1, 0xf3b9df20, 0xf01f8730, 0xb00, 0xf0389468) fp =
0xf0389468 pc = 0xf02c0da4 args = (0xf0369db8, 0xf3d8287c, 0xf3d6a90c,
0xf038fc00, 0x104, 0xf0397000, 0xf03894d8) fp = 0xf03894d8 pc =
0xf02ca38c args = (0xf0ae3348, 0xf03895b6, 0xa, 0x0, 0x0, 0x4,
0xf0389540) fp = 0xf0389540 pc = 0xf02cafb8 args = (0xf464ad10, 0x103,
0x1, 0xf04ae000, 0x0, 0xf03896e8, 0xf03895c8) fp = 0xf03895c8 pc =
0xf01de434 args = (0xf464ad10, 0x0, 0x0, 0x0, 0x75, 0xffffffff,
0xf0389630) fp = 0xf0389630 pc = 0xf02e2e6c args = (0xf03d6800, 0x7,
0x1, 0x0, 0xf039f400, 0xb00, 0xf0389698) fp = 0xf0389698 pc =
0xf01f8a1c args = (0x104, 0x0, 0x1, 0xf3b9df20, 0xf01f8730, 0x700,
0xf0389708) fp = 0xf0389708 pc = 0xf00d8198 args = (0xf034e630,
0xf3d6a87c, 0xf03eb800, 0xf038fc00, 0x104, 0xf0397000, 0xf0389778) fp =
0xf0389778 pc = 0xf02a2c70 args = (0xf3d5ac00, 0xf3d82148, 0xf3d8214c,
0x0, 0xf03898c4, 0xb5ffd110, 0xf03897e0) fp = 0xf03897e0 pc =
0xf00cd43c args = (0xf3d82000, 0xf3d82148, 0xf3d8214c, 0x0,
0xf03898c4, 0xf03dcfc8, 0xf0389848) fp = 0xf0389848 pc = 0xf0008bdc
args = (0x0, 0xf00cc184, 0x700, 0x408000e6, 0xf03efcb4, 0x2,
0xf03898d0) fp = 0xf03898d0 pc = 0xf01d6b08 args = (0xf04ae000,
0xf04ae000, 0xf03eb800, 0x0, 0xf03eb914, 0x2, 0xf3b9de80) fp =
0xf3b9de80
dump to dev 7,1 not possible
rebooting
=========
== 2nd panic ==
sparc20c# panic: tp->t_lastm == NULL
Begin traceback...
0x0(0xc, 0x3c, 0x48, 0x4000, 0x4000, 0x8000) at netbsd:tcp_usrreq+0x404
tcp_usrreq(0xf0afaab0, 0x9, 0xf0b9ad00, 0x0, 0xf474d5a0, 0x0) at
netbsd:tcp_usrreq_wrapper+0x4a39c tcp_usrreq_wrapper(0xf0bc3b80, 0x9,
0xf0b9ad00, 0x0, 0x0, 0xf474d5a0) at netbsd:sosend+0x49c sosend(0x0,
0xc8, 0xf48ade20, 0x0, 0x0, 0x0) at netbsd:soo_write+0x2c soo_write
(0xf4a06c00, 0xf4a06c00, 0xf48ade20, 0xf4a00840, 0x1, 0x2067fa37) at
netbsd:dofilewrite+0x5c dofilewrite(0x16, 0xf4a06c00, 0x20637000, 0x5c,
0xf4a06c00, 0xf0002000) at netbsd:sys_write+0x54 sys_write(0xf474d5a0,
0xf48adf20, 0xf48adf40, 0x3, 0x0, 0x6) at netbsd:syscall_plain+0xe0
syscall_plain(0x404, 0xf48adfb0, 0x204e7894, 0x3, 0x61984, 0xf04ae000)
at netbsd:memfault_sun4m+0x404 End traceback... Frame pointer is at
0xf48ad9d8 Call traceback: pc = 0xf02e2f9c args = (0x11, 0x5, 0x0,
0x0, 0xf48adaf8, 0x100, 0xf48ada40) fp = 0xf48ada40 pc = 0xf01f8a1c
args = (0x104, 0x0, 0x1, 0xf48adef8, 0xf01f8730, 0x100, 0xf48adab0) fp
= 0xf48adab0 pc = 0xf0025554 args = (0xf033ebc8, 0x14c, 0xf0af9b00,
0xf038fc00, 0x104, 0xf0397000, 0xf48adb20) fp = 0xf48adb20 pc =
0xf002be30 args = (0xc, 0x3c, 0x48, 0x4000, 0x4000, 0x8000,
0xf48adc00) fp = 0xf48adc00 pc = 0xf005e99c args = (0xf0afaab0, 0x9,
0xf0b9ad00, 0x0, 0xf474d5a0, 0x0, 0xf48adc68) fp = 0xf48adc68 pc =
0xf0220244 args = (0xf0bc3b80, 0x9, 0xf0b9ad00, 0x0, 0x0, 0xf474d5a0,
0xf48adcd0) fp = 0xf48adcd0 pc = 0xf0208c7c args = (0x0, 0xc8,
0xf48ade20, 0x0, 0x0, 0x0, 0xf48add50) fp = 0xf48add50 pc = 0xf01ff848
args = (0xf4a06c00, 0xf4a06c00, 0xf48ade20, 0xf4a00840, 0x1,
0x2067fa37, 0xf48addc0) fp = 0xf48addc0 pc = 0xf01ff964 args = (0x16,
0xf4a06c00, 0x20637000, 0x5c, 0xf4a06c00, 0xf0002000, 0xf48ade50) fp =
0xf48ade50 pc = 0xf02f1274 args = (0xf474d5a0, 0xf48adf20, 0xf48adf40,
0x3, 0x0, 0x6, 0xf48adec0) fp = 0xf48adec0 pc = 0xf0008838 args =
(0x404, 0xf48adfb0, 0x204e7894, 0x3, 0x61984, 0xf04ae000, 0xf48adf50)
fp = 0xf48adf50 pc = 0x203e541c args = (0x3, 0x20637000, 0x5c,
0x61800, 0x61800, 0x61984, 0xefffdd78) fp = 0xefffdd78
dump to dev 7,1 not possible
rebooting
==========
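For anyone trying to read the captures above: the "Call traceback" sections were re-wrapped by the mail client, so several frames run together on each line. A minimal Python sketch for splitting them back into one frame per line follows. The names `FRAME_RE`, `split_frames` and `format_frames` are made up for illustration, and the sample text is copied from the start of the 2nd panic. It assumes the hex values themselves are intact, so it will not repair a value that has a space wrapped into the middle of it.

```python
import re

# One frame looks like: pc = 0xf02e2f9c args = (0x11, 0x5, ...) fp = 0xf48ada40
FRAME_RE = re.compile(
    r"pc\s*=\s*(0x[0-9a-fA-F]+)\s*"
    r"args\s*=\s*\(([^)]*)\)\s*"
    r"fp\s*=\s*(0x[0-9a-fA-F]+)"
)


def split_frames(text):
    """Return a list of (pc, args, fp) tuples from a re-wrapped call traceback."""
    flat = " ".join(text.split())  # undo the line wrapping
    frames = []
    for pc, args, fp in FRAME_RE.findall(flat):
        arg_list = [a.strip() for a in args.split(",") if a.strip()]
        frames.append((pc, arg_list, fp))
    return frames


def format_frames(frames):
    """Render one frame per line, in the same style the kernel prints."""
    return "\n".join(
        f"pc = {pc} args = ({', '.join(args)}) fp = {fp}"
        for pc, args, fp in frames
    )


# First two frames of the 2nd panic, wrapped as they appear in the mail.
SAMPLE = """Call traceback: pc = 0xf02e2f9c args = (0x11, 0x5, 0x0,
0x0, 0xf48adaf8, 0x100, 0xf48ada40) fp = 0xf48ada40 pc = 0xf01f8a1c
args = (0x104, 0x0, 0x1, 0xf48adef8, 0xf01f8730, 0x100, 0xf48adab0) fp
= 0xf48adab0"""

frames = split_frames(SAMPLE)
print(format_frames(frames))
```

In each frame the last value inside `args` matches the `fp` printed after it, which gives a quick sanity check that a frame was split correctly.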
> netbsd:esp_dma_setup+0x1c esp_dma_setup(0xf3d82000, 0xf3d82148, 0xf3d8214c, 0x0, 0xf03898c4, 0xf03dcfc8) at netbsd:ncr53c9x_intr+0x12b8
> ncr53c9x_intr(0x0, 0xf00cc184, 0x700, 0x408000e6, 0xf03efcb4, 0x2) at netbsd:sparc_interrupt44c+0x160
OK, i'm convinced i need to put a disk into my ss20. :-) i usually
run everything diskless..
> == 2nd panic ==
> sparc20c# panic: tp->t_lastm == NULL
> Begin traceback...
> 0x0(0xc, 0x3c, 0x48, 0x4000, 0x4000, 0x8000) at netbsd:tcp_usrreq+0x404
> tcp_usrreq(0xf0afaab0, 0x9, 0xf0b9ad00, 0x0, 0xf474d5a0, 0x0) at netbsd:tcp_usrreq_wrapper+0x4a39c
> tcp_usrreq_wrapper(0xf0bc3b80, 0x9, 0xf0b9ad00, 0x0, 0x0, 0xf474d5a0) at netbsd:sosend+0x49c
> sosend(0x0, 0xc8, 0xf48ade20, 0x0, 0x0, 0x0) at netbsd:soo_write+0x2c
[ ... ]
this one i don't know about at all, and have never seen it.
i'm curious if you have or can test -current kernels? thanks.
in other news, the rmind-uvmplock branch appears to be very stable
for SMP sparc machines, more so than -current. we can't backport that
branch to netbsd-5, but maybe we can use it to figure out the missing
parts.
.mrg.
> failure is expected :-) thanks for testing.
I replaced the SM51s with a dual HyperSPARC module (two 125MHz
HyperSPARCs on a single MBUS card). That changed the symptoms but it's
still unstable. Hangs or panics on disk activity are common and I've
also crashed it with network activity (ttcp receive). Symptoms include:
+ System hangs with no error messages.
+ SCSI errors followed by a hang.
+ Various panics, console capture included below.
The SCSI errors look very much like a bad disk, but they go away when I
re-run the test with a UP kernel, so I don't think it's the disk.
> i'm curious if you have or can test -current kernels? thanks.
No, I don't track -current. However, I'm willing to try pre-built
kernels as long as they will work with a 5.1 userland.
George
== 'biodone' panic during boot ==
Configuring network interfaces: le0.
Adding interface aliases:.
add net default: gateway 10.0.1.164
Building databases: dev, utmp, utmpx, services panic: biodone2 already
Begin traceback...
0x0(0xf0bfedb0, 0xf0bfedb0, 0x0, 0xf3ca4ec0, 0x30299d87, 0x731db7f9) at
netbsd:biointr+0x44
biointr(0x0, 0xf3ca4ec0, 0x1, 0xf03e9340, 0xf03efcb4, 0xf00027a0) at
netbsd:softint_thread+0x74
softint_thread(0xf3d47008, 0xf3ca7780, 0x1001, 0xf03ea680, 0x1e4010e1,
0xf03dcd2e) at netbsd:lwp_setfunc_trampoline
End traceback...
Frame pointer is at 0xf3d4ccd0
Call traceback:
pc = 0xf02e2f9c args = (0x11, 0x5, 0x0, 0x0, 0xf3d4cdf0, 0x0,
0xf3d4cd38) fp = 0xf3d4cd38
pc = 0xf01f8a1c args = (0x104, 0x0, 0x1, 0xf3d4cf20, 0xf01f8730,
0x0, 0xf3d4cda8) fp = 0xf3d4cda8
pc = 0xf0227058 args = (0xf035e540, 0x0, 0x1, 0xf038fc00, 0x104,
0xf0397000, 0xf3d4ce18) fp = 0xf3d4ce18
pc = 0xf02270a8 args = (0xf0bfedb0, 0xf0bfedb0, 0x0, 0xf3ca4ec0,
0x30299d87, 0x731db7f9, 0xf3d4ce80) fp = 0xf3d4ce80
pc = 0xf01dc450 args = (0x0, 0xf3ca4ec0, 0x1, 0xf03e9340,
0xf03efcb4, 0xf00027a0, 0xf3d4cee8) fp = 0xf3d4cee8
pc = 0xf000a3e4 args = (0xf3d47008, 0xf3ca7780, 0x1001, 0xf03ea680,
0x1e4010e1, 0xf03dcd2e, 0xf3d4cf50) fp = 0xf3d4cf50
pc = 0x0 args = (0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0) fp = 0x0
dumping to dev 7,1 offset 395477
dump Async registers (mid 8): afsr=3800<SE,UC,TO,AFA=0>; afva=0x00
Async registers (mid 9): afsr=1400<UC,BE,AFA=0>; afva=0x00
cpu0: NMI: system interrupts: 100c0000<VME=0,SBUS=0,SC,T,M>
memory error:
EFSR: 10002<DW=0,SYNDROME=0,ME>
MBus transaction: fcc4d30<VAH=0,TYPE=3,SIZE=5,C,VA=31,S,MID=0>
address: 0x0f02eb000
module location: ?
Type 'go' to resume
Type help for more information
<#0> ok
==========================
== 'NULL fpstate' panic during boot ==
Updating motd.
postfix: rebuilding /etc/mail/aliases (missing /etc/mail/aliases.db)
panic: cpu1: NULL fpstate
Begin traceback...
End traceback...
Frame pointer is at 0xf3e4aba8
Call traceback:
xcall(cpu0,0xf000ab84): couldn't ping cpus: cpu1
pc = 0xf02e2f9c args = (0x11, 0x5, 0x0, 0x0, 0xf3e4acc8, 0xffffffff,
0xf3e4ac10) fp = 0xf3e4ac10
pc = 0xf01f8a1c args = (0x104, 0x0, 0x1, 0xf0340411, 0xf01f8730,
0xd00, 0xf3e4ac80) fp = 0xf3e4ac80
pc = 0xf000ac20 args = (0xf000ab70, 0x1, 0x0, 0xf038fc00, 0x104,
0xf0397000, 0xf3e4acf0) fp = 0xf3e4acf0
pc = 0xf00089ec args = (0xf3e4adb8, 0xf02e02f0, 0xd00, 0x1e8000e4,
0xffff, 0xf0582000, 0xf3e4ad58) fp = 0xf3e4ad58
pc = 0xf01e3774 args = (0x863, 0xf0290214, 0x407, 0x0, 0x0, 0xb00,
0xf3e4ae08) fp = 0xf3e4ae08
pc = 0xf01df1cc args = (0xf3e4aed0, 0xf0582000, 0xf03eb800, 0x0,
0xf03eb914, 0x2, 0xf3e4ae70) fp = 0xf3e4ae70
pc = 0xf01c63b4 args = (0xf3caa2a0, 0xf3ca4dc0, 0xf0002000,
0xf3ca4d80, 0x0, 0xd, 0xf3e4aee8) fp = 0xf3e4aee8
pc = 0xf0009fb4 args = (0xf3e4d000, 0x0, 0x0, 0x0, 0x0, 0x0,
0xf3e4af50) fp = 0xf3e4af50
pc = 0x0 args = (0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xefffffa0) fp =
0xefffffa0
dumping to dev 7,1 offset 395477
dump Async registers (mid 9): afsr=1400<UC,BE,AFA=0>; afva=0x00
Async registers (mid 8): afsr=3800<SE,UC,TO,AFA=0>; afva=0x00
cpu0: NMI: system interrupts: 100c0000<VME=0,SBUS=0,SC,T,M>
memory error:
EFSR: 10002<DW=0,SYNDROME=0,ME>
MBus transaction: fcc4d30<VAH=0,TYPE=3,SIZE=5,C,VA=31,S,MID=0>
address: 0x0f02eb000
module location: ?
xcall(cpu1,0xf02e6270): couldn't ping cpus: cpu0
xcall(cpu1,0xf02e6270): couldn't ping cpus: cpu0
xcall(cpu1,0xf02e6270): couldn't ping cpus: cpu0
xcall(cpu1,0xf02e6270): couldn't ping cpus: cpu0
xcall(cpu1,0xf02e6270): couldn't ping cpus: cpu0
xcall(cpu1,0xf02e6270): couldn't ping cpus: cpu0
xcall(cpu1,0xf02e6270): couldn't ping cpus: cpu0
xcall(cpu1,0xf02e6270): couldn't ping cpus: cpu0
xcall(cpu1,0xf02df650): couldn't ping cpus: cpu0
xcall(cpu1,0xf02df650): couldn't ping cpus: cpu0
Async registers (mid 9): afsr=1400<UC,BE,AFA=0>; afva=0x00
===========================================================
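The 'NULL fpstate' capture has an empty symbolic traceback, so only raw pc values are left to go on. A rough Python sketch of turning such a pc back into symbol+offset from a numerically sorted symbol listing (the kind `nm -n` prints for a kernel image) follows. The helper names `load_symbols` and `symbolize` are made up. The three addresses in the sample table are not real `nm` output: they are back-computed from frames in the 1st panic, where the same call shows up both symbolically and as a raw pc (for example `panic+0x238` alongside `pc = 0xf01f8a1c`). They are for illustration only; a real table has to come from the exact kernel that panicked.

```python
import bisect


def load_symbols(nm_text):
    """Parse "address type name" lines into a sorted list of (addr, name)."""
    symbols = []
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # e.g. undefined symbols, which have no address column
        addr, _kind, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def symbolize(symbols, pc):
    """Return "name+0xoff" for the nearest symbol at or below pc."""
    addrs = [addr for addr, _name in symbols]
    i = bisect.bisect_right(addrs, pc) - 1
    if i < 0:
        return hex(pc)  # below every known symbol, leave it raw
    base, name = symbols[i]
    return f"{name}+{hex(pc - base)}"


# ASSUMPTION: illustrative table, back-computed from the 1st panic above.
SAMPLE_NM = """\
f00d7dcc T lsi64854_setup
f01f87e4 T panic
f02a2c54 T esp_dma_setup
"""

symbols = load_symbols(SAMPLE_NM)
for pc in (0xf00d8198, 0xf01f8a1c, 0xf02a2c70):
    print(hex(pc), "->", symbolize(symbols, pc))
```

With only three symbols in the table, any pc that belongs to some other function gets attributed to the nearest lower entry, so results are only meaningful with the full symbol list.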
== panic during ttcp receive ==
sparc20c# ttcp -r
ttcp-r: buflen=8192, nbuf=2048, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from 10.0.1.170
missing buffer, no_td = 3, last_td = 0
missing buffer, no_td = 3, last_td = 0
missing buffer, no_td = 5, last_td = 0
missing buffer, no_td = 7, last_td = 0
missing buffer, no_td = 1, last_td = 0
missing buffer, no_td = 1, last_td = 0
missing buffer, no_td = 1, last_td = 0
missing buffer, no_td = 1, last_td = 0
Mar 11 15:51:04 sparc20c /netbsd: le0: device timeout
missing buffer, no_td = 3, last_td = 0
missing buffer, no_td = 1, last_td = 1
missing buffer, no_td = 1, last_td = 1
missing buffer, no_td = 1, last_td = 1
missing buffer, no_td = 3, last_td = 0
missing buffer, no_td = 3, last_td = 0
missing buffer, no_td = 5, last_td = 0
missing buffer, no_td = 7, last_td = 0
missing buffer, no_td = 9, last_td = 0
ttcpanic: sbdrop
Begin traceback...
0x0(0xf0c97230, 0x55a, 0x0, 0xf03ea3c0, 0xf4ada7c8, 0xf4a6114c) at
p-r:netbsd:sbflush+0x14 167sbflush(0xf0c97230, 0xf0c97230, 0x1, 0x0,
0x1e4010e0, 0xf03dcfa8) at 75netbsd:tcp_disconnect+0x50 846
tcp_disconnect(0xf0cd0dd0, 0xf0c97178, 0xf4aabdc0, 0xf03ea5c0,
0x1e4010a1, 0xf03dcd28) at bnetbsd:tcp_usrreq+0x168 ytestcp_usrreq(0x0,
0x6, 0x0, 0x0, 0x0, 0xf0bf8a80) at inetbsd:tcp_usrreq_wrapper+0x20 n
24tcp_usrreq_wrapper(0xf0c97178, 0x6, 0x0, 0x0, 0x0, 0x0)
at .1netbsd:sodisconnect+0x40 6 resodisconnect(0x25, 0xf3ca4f80, 0x0,
0x0, 0x1e0010e4, 0xf03dcca5) at alnetbsd:soclose+0x1e4 secsoclose
(0xf0c97178, 0x0, 0xf49a75c0, 0x0, 0x1e4010e5, 0xf03dcc6b) at
onnetbsd:soo_close+0x18 ds =soo_close(0x0, 0x0, 0xf49a75c0, 0xf03ea620,
0x1e4010e6, 0xf03dce18) at 6netbsd:closef+0x6c 78.0closef(0xf3eca340,
0x4, 0x0, 0xf03ea3c0, 0xf4ada7c8, 0xf4a6114c) at 0 netbsd:fd_close+0xc0
KB/sfd_close(0x4, 0xf4aabe40, 0xffffffff, 0x0, 0x1e4010e0, 0xf03dcfa8)
at ecnetbsd:fd_free+0x7c +++fd_free(0xf4a61000, 0x4, 0xf4aabdc0,
0xf03ea5c0, 0x1e4010a1, 0xf03dcd28) at netbsd:exit1+0x10c exit1
(0xf49a75c0, 0x0, 0x200334b8, 0xf499dfb0, 0x20024600, 0xf0000000) at
netbsd:sys_exit+0x38 sys_exit(0xf49a75c0, 0xf499df20, 0xf499df40, 0x0,
0x20024600, 0x6) at netbsd:syscall_plain+0xe0 syscall_plain(0x401,
0xf499dfb0, 0x20262378, 0x1058c, 0xefffe058, 0x1274a) at
netbsd:memfault_sun4m+0x404 End traceback... Frame pointer is at
0xf499d7c8 Call traceback: pc = 0xf02e2f9c args = (0x11, 0x5, 0x0,
0x0, 0xf499d8e8, 0x100, 0xf499d830) fp = 0xf499d830 pc = 0xf01f8a1c
args = (0x104, 0x0, 0x1, 0xf499def8, 0xf01f8730, 0x100, 0xf499d8a0) fp
= 0xf499d8a0 pc = 0xf02211fc args = (0xf035e2d8, 0x0, 0xf49a75c0,
0xf038fc00, 0x104, 0xf0397000, 0xf499d910) fp = 0xf499d910 pc =
0xf022121c args = (0xf0c97230, 0x55a, 0x0, 0xf03ea3c0, 0xf4ada7c8,
0xf4a6114c, 0xf499d978) fp = 0xf499d978 pc = 0xf002b774 args =
(0xf0c97230, 0xf0c97230, 0x1, 0x0, 0x1e4010e0, 0xf03dcfa8, 0xf499d9e0)
fp = 0xf499d9e0 pc = 0xf002bb94 args = (0xf0cd0dd0, 0xf0c97178,
0xf4aabdc0, 0xf03ea5c0, 0x1e4010a1, 0xf03dcd28, 0xf499da48) fp =
0xf499da48 pc = 0xf0014620 args = (0x0, 0x6, 0x0, 0x0, 0x0,
0xf0bf8a80, 0xf499dab0) fp = 0xf499dab0 pc = 0xf021d53c args =
(0xf0c97178, 0x6, 0x0, 0x0, 0x0, 0x0, 0xf499db18) fp = 0xf499db18 pc =
0xf021eba8 args = (0x25, 0xf3ca4f80, 0x0, 0x0, 0x1e0010e4, 0xf03dcca5,
0xf499db80) fp = 0xf499db80 pc = 0xf0208ccc args = (0xf0c97178, 0x0,
0xf49a75c0, 0x0, 0x1e4010e5, 0xf03dcc6b, 0xf499dbe8) fp = 0xf499dbe8 pc
= 0xf01bfbcc args = (0x0, 0x0, 0xf49a75c0, 0xf03ea620, 0x1e4010e6,
0xf03dce18, 0xf499dc50) fp = 0xf499dc50 pc = 0xf01bfcf4 args =
(0xf3eca340, 0x4, 0x0, 0xf03ea3c0, 0xf4ada7c8, 0xf4a6114c, 0xf499dcd0)
fp = 0xf499dcd0 pc = 0xf01bfe48 args = (0x4, 0xf4aabe40, 0xffffffff,
0x0, 0x1e4010e0, 0xf03dcfa8, 0xf499dd50) fp = 0xf499dd50 pc =
0xf01c5224 args = (0xf4a61000, 0x4, 0xf4aabdc0, 0xf03ea5c0,
0x1e4010a1, 0xf03dcd28, 0xf499ddb8) fp = 0xf499ddb8 pc = 0xf01c5910
args = (0xf49a75c0, 0x0, 0x200334b8, 0xf499dfb0, 0x20024600,
0xf0000000, 0xf499de58) fp = 0xf499de58 pc = 0xf02f1274 args =
(0xf49a75c0, 0xf499df20, 0xf499df40, 0x0, 0x20024600, 0x6, 0xf499dec0)
fp = 0xf499dec0 pc = 0xf0008838 args = (0x401, 0xf499dfb0, 0x20262378,
0x1058c, 0xefffe058, 0x1274a, 0xf499df50) fp = 0xf499df50 pc =
0x2024147c args = (0x0, 0x0, 0x800, 0x202973d4, 0x202971a4, 0x800,
0xefffe940) fp = 0xefffe940
dump to dev 7,1 not possible
rebooting
===============================
Hello,
On Mar 11, 2011, at 4:20 PM, George Harvey wrote:
> On Fri, 11 Mar 2011 18:13:55 +1100
> matthew green <m...@eterna.com.au> wrote:
>
>> failure is expected :-) thanks for testing.
>
> I replaced the SM51s with a dual HyperSPARC module (two 125MHz
> HyperSPARCs on a single MBUS card). That changed the symptoms but it's
> still unstable. Hangs or panics on disk activity are common and I've
> also crashed it with network activity (ttcp receive).
I use two 125MHz/256kB HyperSPARCs with -current and lately the
rmind-uvmplock branch.
> Symptoms include:
>
> + System hangs with no error messages.
I've seen that.
> + SCSI errors followed by a hang.
> + Various panics, console capture included below.
I haven't seen any SCSI errors at all, and no panics lately. What I
usually get is a watchdog reset.
Also, for some weird reason my SS20 keeps going for a LOT longer when
booted cold. I confirmed this several times (when warm booted I'd
get a watchdog reset within minutes of running build.sh; cold booted, the
same kernel kept going for more than 30 hours with load avg. >4), but
I don't have the faintest idea what could cause that.
For the record, I have the OS, $OBJDIR etc. on local disks, sources
over nfs.
have fun
Michael