The machine has 4GB of RAM, although only a little over 3.5GB is
actually visible, naturally. As such, I only defined a token 2GB
of swap space.
I have the RAID in a single filesystem defined using GPT and wedges.
All file systems are mounted with the "log" option.
Prior to having the RAID initialized, newfs'ed and mounted, nothing
appeared amiss.
Once the big filesystem is online, the system panics when shutting
down (e.g. shutdown -r now) with "mutex lock error: locking against
myself". Unfortunately it all goes by too fast to read. It saves the
crashdump, but I think the subsequent savecore has problems--the memory
image is there, but the kernel image is only 10 bytes in size. It
displays "(null) bad address".
This appears to happen AFTER the big filesystem is successfully unmounted
but BEFORE the system disk's filesystems have been unmounted. Console
and dmesg indicate that the OS partitions get the log-replay treatment
when starting up again. The RAID and its filesystem are intact.
I've determined that if I manually unmount the GPT/RAID filesystem
before shutting down it doesn't panic.
Any clues?
Need more data?
--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]mylinuxisp[flyspeck]com OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
Can you 'sysctl -w ddb.onpanic=1; shutdown -r now' and type 'bt' for a
backtrace?
Dave
--
David Young OJC Technologies
dyo...@ojctech.com Urbana, IL * (217) 344-0444 x24
It may not be related, but I suspect you may be the first person to be
testing the RF_RAID5_RS bits :) It wouldn't surprise me if there were a
few wrinkles with that code and NetBSD.
> Need more data?
Probably... (But I won't have a chance to look at this for at least 3
weeks... )
Later...
Greg Oster
You probably only need to boot a current kernel, the installed
userspace should be good enough.
David
--
David Laight: da...@l8s.co.uk
> It may not be related, but I suspect you may be the first person to be
> testing the RF_RAID5_RS bits :) It wouldn't surprise me if there were a
> few wrinkles with that code and NetBSD.
I've used it for years. On my previous file server, I actually got it to
work once (as in failed component replaced and reconstructed in place).
The other times, a second unit failed during reconstruction but the
raid continued to run OK in degraded mode.
The current file server is built with it, but hasn't been put to the
test (no failed components since being placed into service in 2009).
The current file server is running NetBSD-4.0_STABLE. It has never
lost the RAID, but its system disk keeps eating itself and causing
panics (mostly freeing free block and mangled directory entry).
Also, it takes 17+ hours to check/reinitialize parity, so that's why I'm
building another one on hardware that likes the netbsd-5 branch.
In response to David Young's posting:
Jul 5 13:21:25 yggdrasil shutdown: reboot by sysop:
Jul 5 13:21:33 yggdrasil syslogd: Exiting on signal 15
syncing disks... 7 done
unmounting file systems...Mutex error: mutex_vector_enter: locking against myself
lock address : 0x00000000c4d5a71c
current cpu : 0
current lwp : 0x00000000cf9557c0
owner field : 0x00000000cf9557c0 wait/spin: 0/0
panic: lock error
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c057dc0c cs 8 eflags 246 cr2 bbb9890c ilevel 0
Stopped in pid 485.1 (reboot) at netbsd:breakpoint+0x4: popl %ebp
db{0}> bt
breakpoint(c0a89ee6,cfa097a8,c0ab9440,c04c7cbf,0,1,0,0,cfa097a8,cf9557c0) at netbsd:breakpoint+0x4
panic(c0a4c51d,c0a4a16f,c0852252,c0a4a13e,c4d5a71c,0,cf9557c0,cf418730,c4d5a71c,cf9557c0) at netbsd:panic+0x1b0
lockdebug_abort(c4d5a71c,c0ab62f0,c0852252,c0a4a13e,cf418730,1,cfa0987c,c049ce72,c4d5a71c,c0852252) at netbsd:lockdebug_abort+0x2d
mutex_abort(c4d5a71c,c0852252,c0a4a13e,cf418730,a800,0,cfa0981c,c0514314,c0b7ac20,cdeb59c0) at netbsd:mutex_abort+0x2e
mutex_vector_enter(c4d5a71c,0,0,4,7,c4bfe000,cfa098cc,cdeb5740,a8,a8) at netbsd:mutex_vector_enter+0x262
dkwedge_del(cfa098d0,cf35de48,10,306b64,c0b15b8c,cdeb5740,1,c081019d,cdeb5748,c4d6c488) at netbsd:dkwedge_del+0x188
dkwedge_delall(c4cb2828,c0aa3220,0,c0826120,1203,cf9557c0,cfa099fc,c04bdbc4,1203,3) at netbsd:dkwedge_delall+0x61
raidclose(1203,3,6000,cf9557c0,6000,3,6,3,cf6cc5c0,0) at netbsd:raidclose+0x12f
bdev_close(1203,3,6000,cf9557c0,0,0,cfa09a4c,1203,6000,0) at netbsd:bdev_close+0x84
spec_close(cfa09a58,20002,cfa09a6c,c0509038,cf6cc5c0,c08537a0,cf6cc5c0,3,ffffffff,3) at netbsd:spec_close+0x237
VOP_CLOSE(cf6cc5c0,3,ffffffff,c04f7335,c4d5a600,0,cfa09aac,c046a9e6,cf6cc5c0,3) at netbsd:VOP_CLOSE+0x6c
vn_close(cf6cc5c0,3,ffffffff,c0850ac0,a800,cf9557c0,cfa09adc,c04bdbc4,a800,3) at netbsd:vn_close+0x4e
dkclose(a800,3,6000,cf9557c0,6000,3,6,3,cf4187e8,0) at netbsd:dkclose+0xc6
bdev_close(a800,3,6000,cf9557c0,0,0,cfa09b2c,a800,6000,0) at netbsd:bdev_close+0x84
spec_close(cfa09b38,20002,cfa09b4c,c0509038,cf4187e8,c08537a0,cf4187e8,3,ffffffff,c57ed000) at netbsd:spec_close+0x237
VOP_CLOSE(cf4187e8,3,ffffffff,c03b9066,0,c4d6c8cc,c4d6c880,cf701000,cf701000,cf701024) at netbsd:VOP_CLOSE+0x6c
ffs_unmount(cf701000,80000,cf8d63c0,0,cf701000,0,cfa09bcc,c050732f,cf701000,80000) at netbsd:ffs_unmount+0x1c9
VFS_UNMOUNT(cf701000,80000,cf8d63c0,0,2001018,cf6cc398,1,cf701000,cf700000,cf9557c0) at netbsd:VFS_UNMOUNT+0x26
dounmount(cf701000,80000,cf9557c0,0,cfa09c08,7,0,cf9557c0,cfa09d00,0) at netbsd:dounmount+0x13f
vfs_unmountall(cf9557c0,0,0,c048976d,cde8b630,0,cfa09c3c,c058456b,0,cf9557c0) at netbsd:vfs_unmountall+0x63
vfs_shutdown(0,cf9557c0,0,0,cfa09d00,0,cfa09cec,c04b83c4,0,0) at netbsd:vfs_shutdown+0x8d
cpu_reboot(0,0,0,0,0,0,c0a4b571,0,23,fffffffe) at netbsd:cpu_reboot+0x13b
sys_reboot(cf9557c0,cfa09d00,cfa09d28,0,0,bfbfeed8,8049144,2,1,1) at netbsd:sys_reboot+0x74
syscall(cfa09d48,b3,ab,1f,1f,1,d,bfbfeed8,0,256) at netbsd:syscall+0xbd
db{0}>
Hope this helps. It'll be a couple of days before I can poke at the
machine again.