A box of mine running RELENG_7_0 and ZFS over a couple of disks (6
disks, 3 mirrors) seems to have gotten stuck. From Ctrl-T:
load: 0.50 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
load: 0.43 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
load: 0.10 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
load: 0.10 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
load: 0.11 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
Ctrl-T worked for a while, then that stopped working too (this was over
ssh). When I tried a local login, I only got:
load: 0.09 cmd: login 1611 [zfs] 0.00u 0.00s 0% 208k
I found one post like this earlier (by Xin LI), but nobody seemed to
have replied...
In my current config, I think kmem/kmem_max is at 512MB (not sure
though, since I edited my file yesterday for the next reboot), with 2G
of system RAM. Normally I'd run kmem(max) at 1G with an arcsize of 512M
(currently it is at the default), but since I only just got back to 2G
total mem after some hardware problems, I've been running at those low
values (1G total is kind of tight with ZFS).
Well, just wanted to report. The box is not totally dead yet, i.e. I
can still do Ctrl-T on the console, but that's it. I don't really know
what more I can do. I don't have KDB/DDB.
I'll wait another hour or so before I hard reboot it, unless it
"unlocks" or anyone has any suggestions.
Thanks
--
Johan Ström
Stromnet
jo...@stromnet.se
http://www.stromnet.se/
_______________________________________________
freeb...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-...@freebsd.org"
--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-...@muc.de
I don't think there are any suggestions left to give. Many people,
including myself, have experienced this kind of problem. It's
well-documented both on my Common Issues page and on the official
FreeBSD ZFS Wiki.
ZFS is still considered highly experimental, so if your data is at all
important to you, perform backups or switch to another filesystem
provider.
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
The key is to increase your kmem and prevent it from being exhausted. I
think more recent OpenSolaris's ZFS code has some improvements but I do
not have spare devices at hand to test and debug :(
Maybe pjd@ would get a new import at some point? I have cc'ed him.
Cheers,
--
Xin LI <del...@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!
Ah.. I guess I was just too restrictive with the googling on
"zfs:&buf_hash_table.ht_locks[i].ht_lock".
>
>
> ZFS is still considered highly experimental, so if your data is at all
> important to you, perform backups or switch to another filesystem
> provider.
That I am aware of.
Thanks.
This situation is not recoverable and you can trust ZFS that you will
not lose data if they are already sync'ed.
> The key is to increase your kmem and prevent it from being
> exhausted. I think more recent OpenSolaris's ZFS code has some
> improvements but I do not have spare devices at hand to test and
> debug :(
Yep, I never had the problem when I was running with 2G total mem, but
then one stick (damn consumer crap) failed and I was left with 1G, and
I started to have random problems. Going to tune kmem back up now that
I have more mem again; thinking about putting in 4G too.
>
>
> Maybe pjd@ would get a new import at some point? I have cc'ed him.
>
> Cheers,
> --
> Xin LI <del...@delphij.net> http://www.delphij.net/
> FreeBSD - The Power to Serve!
>
> For your question: just reboot would be fine, you may want to tune
> your arc size (to be smaller) and kmem space (to be larger), which
> would reduce the chance that this would happen, or eliminate it,
> depending on your workload.
Back online now, with kmem/kmem_max set to 1G and arcsize set to 512M.
Are those reasonable on a 2G machine? I think I read that somewhere,
but I cannot find it (the arc part at least) in the TuningGuide now.
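For reference, settings like these normally go in /boot/loader.conf and
take effect at the next boot. A minimal sketch, assuming the values
mentioned above (1G kmem, 512M ARC) and the usual FreeBSD tunable names;
whether these values suit a given workload is not established in this
thread:

```
# /boot/loader.conf (sketch; values taken from this thread, not a recommendation)
vm.kmem_size="1024M"        # kernel memory size
vm.kmem_size_max="1024M"    # upper bound for kernel memory
vfs.zfs.arc_max="512M"      # cap on the ZFS ARC ("arcsize" above)
```

After the reboot, the effective values can be read back with sysctl(8)
using the same names.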
>
> This situation is not recoverable and you can trust ZFS that you
> will not lose data if they are already sync'ed.
>
Actually, I've had a lot of hard crashes lately on this machine (bad
hw) but not a single time have I lost data (to my knowledge at
least...). In that regard, compared to UFS, ZFS is waaay better! :)
> --
> Xin LI <del...@delphij.net> http://www.delphij.net/
> FreeBSD - The Power to Serve!
>
Depending on your work load you are just buying more time, so
"reasonable" is a matter of perspective. :( I didn't see if you said
you are on 32bit or 64bit? Keep in mind the kmem max is 1.5-2G on amd64
regardless of how much memory you have. If 512M arcsize crashes too soon
for your tastes you can always lower it down to 256M, or 128M, etc.
I tried for several weeks to get ZFS stable on a 64bit system with a
1.5G kernel. The best uptime I ever got was 72 hours, the worst was 2,
the average about 24. Interestingly, most of the hangs were at off
hours, when the system was lightly loaded, had lots of free memory, etc.
That suggests to me a slow leak of some sort.
Anyway, ZFS is not ready for production. Some people may get lucky, but
you can't count on it.
Spike