
Strange ZFS performance


Mikle

Apr 4, 2010, 3:18:45 PM
to freeb...@freebsd.org
Hello, list!
I've got a strange problem with a single-disk ZFS pool: read/write performance for files on the filesystem (dd if=/dev/zero of=/mountpoint/file bs=4M count=100) gives me only 2 MB/s, while reading from the raw disk (dd if=/dev/disk of=/dev/zero bs=4M count=100) gives ~70 MB/s.
The pool is about 80% full; the PC with the pool has 2 GB of RAM (1.5 of which is free); I've done no ZFS tuning in loader.conf or sysctl.conf. There are no error messages related to the disk in dmesg (dmesg | grep ^ad12), and SMART looks OK.
The disk was fine until recently, and nothing in software or hardware has changed since then.
Any ideas what could have happened to the disk?
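For completeness, the file-read side of the comparison can be measured the same way (path and size below are just examples); reading back a file bigger than RAM keeps the ARC from serving the whole read:

# write a test file larger than the 2 GB of RAM, then read it back through ZFS,
# so the numbers reflect the disk rather than the cache
dd if=/dev/zero of=/mountpoint/testfile bs=4M count=1024
dd if=/mountpoint/testfile of=/dev/null bs=4M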

Wbr,

Jeremy Chadwick

Apr 4, 2010, 4:41:27 PM
to Mikle, freeb...@freebsd.org

Please provide the following output:

1) uname -a
2) sysctl kstat.zfs.misc.arcstats
3) smartctl -a /dev/ad12

Also, does rebooting the box restore write speed (yes, this is a serious
question/recommendation)?
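If it's easier, something like this captures everything into files you can attach to a reply (the file names are just suggestions):

# collect the requested diagnostics into attachable files
uname -a > uname.txt
sysctl kstat.zfs.misc.arcstats > sysctl.kstat.txt
smartctl -a /dev/ad12 > smartctl.ad12.txt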

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |

Mikle

Apr 4, 2010, 5:25:36 PM
to Jeremy Chadwick, freeb...@freebsd.org
On Sun, Apr 04, 2010 at 01:41:27PM -0700, Jeremy Chadwick wrote:
> Please provide the following output:
>
> 1) uname -a
> 2) sysctl kstat.zfs.misc.arcstats
> 3) smartctl -a /dev/ad12
FreeBSD takino.zet 8.0-STABLE FreeBSD 8.0-STABLE #0: Mon Mar 8 06:25:34 MSK 2010 ro...@takino.zet:/usr/obj/usr/src/sys/TAKINO amd64
(TAKINO is a pretty basic untuned kernel config: GENERIC plus some ipfw-related options, minus the '-g' debug flag)
The sysctl and SMART outputs are attached.

> Also, does rebooting the box restore write speed (yes, this is a serious
> question/recommendation)?
Yes, slightly: after the reboot I got 6 MB/s.

Also, one more possibly related thing: there was a power failure some time ago.

Wbr,

smartctl.ad12.txt
sysctl.kstat.txt
sysctl.kstat.after_reboot.txt

Wes Morgan

Apr 4, 2010, 11:08:21 PM
to Mikle, freeb...@freebsd.org

Has it ever been close to 100% full? How long has it been at 80%, and
what kind of files are on it, size-wise?

Mikle Krutov

Apr 5, 2010, 2:55:00 AM
to Wes Morgan, freeb...@freebsd.org
No, it was never full. It has been at 80% for about a week. Most of the files are video, 200 MB to 1.5 GB each.

--
Wbr,
Krutov Mikle

Wes Morgan

Apr 5, 2010, 6:59:47 AM
to Mikle Krutov, freeb...@freebsd.org

I'm wondering if your pool is fragmented. What does the gstat or iostat -x
output for the device look like when you're accessing the raw device versus
going through the filesystem? A very interesting experiment (to me) would be
to try these things:

1) using dd to replicate the disk to another disk, block for block
2) zfs send to a newly created, empty pool (could take a while!)

Then, without rebooting, compare the performance of the "new" pools. For
#1 you would need to export the pool first and detach the original device
before importing the duplicate.
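
Roughly, the two experiments could look like this (disk, partition, and pool
names below are hypothetical placeholders, and it assumes a zfs version recent
enough for 'zfs send -R'):

# 1) block-for-block copy to a spare disk of equal or larger size;
#    the pool must be exported while the copy runs
zpool export tank
dd if=/dev/ad12 of=/dev/ad14 bs=1M
# detach/remove the original ad12, then import from the copy
zpool import tank

# 2) replicate the datasets into a freshly created pool on the spare disk
zpool create newtank ad14
zfs snapshot -r tank@migrate
zfs send -R tank@migrate | zfs receive -F newtank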

There might be a script out there somewhere to parse the output from zdb
and turn it into a block map to identify fragmentation, but I'm not aware
of one. If you did find that was the case, currently the only fix is to
rebuild the pool.
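
As a very rough sketch only (the exact zdb output format differs between
versions, so the field positions here are an assumption): dump the level-0
block pointers and look at how scattered the on-disk offsets are.

# print the DVA (vdev:offset:size) of each L0 block pointer for a dataset;
# widely scattered offsets for a single file hint at fragmentation
zdb -ddddd tank/somedataset | awk '/ L0 / { split($3, d, ":"); print d[2] }' > offsets.txt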

Mikle Krutov

Apr 5, 2010, 8:25:32 AM
to Wes Morgan, freeb...@freebsd.org
For cp'ing from one pool to another, iostat -x shows:
device     r/s   w/s     kr/s    kw/s wait svc_t  %b
ad12      18.0   0.0   2302.6     0.0    4 370.0 199
and the gstat line is:
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    3     22     22   2814   69.0      0      0    0.0    71.7| gpt/pool2
For dd (now performance is crappy, too):
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    1     99     99  12658   14.2      0      0    0.0   140.4| gpt/pool2
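(For reference, those numbers come from roughly this kind of capture; the
label is this pool's GPT label and will obviously differ elsewhere.)

# per-device latency and queue numbers, refreshed every second
iostat -x -w 1 ad12
# GEOM-level view limited to the pool's partition
gstat -f 'gpt/pool2'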

Unfortunately, I have no spare HDD of the same size, so the experiment will have to wait.

Also, the ZFS FAQ from Sun tells me:
>Q: Are ZFS file systems shrinkable? How about fragmentation? Any need to defrag them?
>A: <...> The allocation algorithms are such that defragmentation is not an issue.
Is that just marketing crap?


P.S. There was some mailing-list issue and we ended up with a second thread:

Also, I forgot to post the 'atacontrol cap ad12' output to that thread, so here it is:
Protocol              SATA revision 2.x
device model          WDC WD10EADS-00M2B0
serial number         WD-WMAV50024981
firmware revision     01.00A01
cylinders             16383
heads                 16
sectors/track         63
lba supported         268435455 sectors
lba48 supported       1953525168 sectors
dma supported
overlap not supported

Feature                        Support  Enable  Value     Vendor
write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes      -       31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      no       no      0/0x00
automatic acoustic management  yes      no      254/0xFE  128/0x80

http://permalink.gmane.org/gmane.os.freebsd.devel.file-systems/8876
>On Mon, Apr 05, 2010 at 12:30:59AM -0700, Jeremy Chadwick wrote:
>> I'm not sure why this mail didn't make it to the mailing list (I do see
>> it CC'd). The attachments are included inline.
>>
>> SMART stats for the disk look fine, so the disk is unlikely to be
>> responsible for this issue. OP, could you also please provide the
>> output of "atacontrol cap ad12"?
>>
>> The arcstats entry that interested me the most was this (prior to the
>> reboot):
>>
>> > kstat.zfs.misc.arcstats.memory_throttle_count: 39958287
>>
>> The box probably needs tuning in /boot/loader.conf to relieve this
>> problem.
>>
>> Below are values I've been using on our production systems for a month
>> or two now. These are for machines with 8GB RAM installed. The OP may
>> need to adjust the first two parameters (I tend to go with RAM/2 for
>> vm.kmem_size and then subtract a bit more for arc_max (in this case
>> 512MB less than kmem_size)).
>>
>> # Increase vm.kmem_size to allow for ZFS ARC to utilise more memory.
>> vm.kmem_size="4096M"
>> vfs.zfs.arc_max="3584M"
>>
>> # Disable ZFS prefetching
>> # http://southbrain.com/south/2008/04/the-nightmare-comes-slowly-zfs.html
>> # Increases overall speed of ZFS, but when disk flushing/writes occur,
>> # system is less responsive (due to extreme disk I/O).
>> # NOTE: 8.0-RC1 disables this by default on systems <= 4GB RAM anyway
>> # NOTE: System has 8GB of RAM, so prefetch would be enabled by default.
>> vfs.zfs.prefetch_disable="1"
>>
>> # Decrease ZFS txg timeout value from 30 (default) to 5 seconds. This
>> # should increase throughput and decrease the "bursty" stalls that
>> # happen during immense I/O with ZFS.
>> # http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007343.html
>> # http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007355.html
>> vfs.zfs.txg.timeout="5"
>I've tried that tuning; now I have:
>vm.kmem_size="1024M"
>vfs.zfs.arc_max="512M"
>vfs.zfs.txg.timeout="5"
>No change in performance. Also, reading directly from the HDD is now slow too (22-30 MB/s), which suggests
>this could be some hardware problem (SATA controller? but then the other disks would be in the same situation
>too. I also thought it could be the SATA cable, so I changed it - no speed change after that).
>Additional information for dd:
>dd if=/dev/zero of=./file bs=4M count=10
>41943040 bytes transferred in 0.039295 secs (1067389864 bytes/sec)
>
>dd if=/dev/zero of=./file bs=4M count=20
>83886080 bytes transferred in 0.076702 secs (1093663943 bytes/sec)
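
(Those last two dd runs are only 40-80 MB, which fits comfortably in memory,
so they mostly measure the ARC and async write buffering rather than the disk.
A fairer check, with the file size only as an example and using the same
sysctls quoted above, would be:)

# write more data than the 2 GB of RAM so the result reflects the disk
dd if=/dev/zero of=./file bs=4M count=1024
# then see whether ZFS is still throttling memory and how big the ARC is
sysctl kstat.zfs.misc.arcstats.memory_throttle_count
sysctl kstat.zfs.misc.arcstats.size
sysctl vfs.zfs.arc_max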
--
Wbr,
Krutov Mikle

Mikle Krutov

Apr 6, 2010, 1:34:59 PM
to freeb...@freebsd.org
On Sun, Apr 04, 2010 at 11:18:45PM +0400, Mikle wrote:
Well, list, somehow after moving some important files (about 50 GB)
to another disk, performance became OK. I still don't understand why it
was so bad; if anyone can give me any ideas, that would be great.
Thanks to everyone for the replies.
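
For anyone who hits the same thing: the pool's fill level is easy to keep an
eye on (the pool name below is just a placeholder), and ZFS allocation is
known to get noticeably slower once free space becomes scarce:

# watch how full the pool is
zpool list tank
zfs get used,available tank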


--
Wbr,
Krutov Mikle
