On 10/8/2009 2:01 PM, sghe...@hotmail.com wrote:
> if anyone notices that bs=1m is illegal for GNU dd... you are right. I
> actually used sdd, but no one uses it, and neither should I, since its
> speed/progress reporting sucks. pv (pipe viewer) to the rescue!
>
> By the way, I ran another test with compression off:
>
> root@intrepid:/tmp# pv blob.bin > /ssd/faster
>  1GB 0:00:14 [72.1MB/s] [====>] 100%
> root@intrepid:/tmp# pv blob.bin > /ssd/nocompress/fast
> 1GB 0:00:16 [63.5MB/s] [====>] 100%
> root@intrepid:/tmp# zfs list -o name,compression,compressratio
> NAME            COMPRESS  RATIO
> ssd             on        1.00x
> ssd/nocompress  off       1.00x
>
> Obviously pure randomness should be incompressible. However, the
> apparent timing difference did surprise me a little. It could be that it
> is entirely within the margin of measurement error...
>
> sghe...@hotmail.com wrote:
>> I'm so happy. Just received and installed the two Corsair Extreme
>> 64GB SSDs (manufacturer-rated raw r/w 240/135 MB/s).
>>
>> Guess what... first thing I did:
>>
>> ------
>> PS. Worth noting I'm on the OS and package versions stated below.
>> Anyone have experience tuning ZFS for SSDs?
>> ------
>>
>> $ zpool create ssd /dev/disk/by-id/ata-Corsair_CMFSSD64-D1_[ST]*
>> $ zpool status
>>   pool: ssd
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME                                                       STATE   READ WRITE CKSUM
>>         ssd                                                        ONLINE     0     0     0
>>           disk/by-id/ata-Corsair_CMFSSD64-D1_S92O8T9T1F66W3MZ3O11  ONLINE     0     0     0
>>           disk/by-id/ata-Corsair_CMFSSD64-D1_T30O5KI2747KJJ7DZG91  ONLINE     0     0     0
>>
>> errors: No known data errors
>>
>> $ zfs set compression=on ssd
>> $ cd /tmp # my tmp is on tmpfs
>>
>> $ # sorry, didn't time:
>> $ dd if=/dev/urandom of=blob.bin bs=1m count=1024
>>
>> $ # now for the fun:
>> $ pv blob.bin > /ssd/fast
>> Guess what: steady write throughput (synthetic test) yields 68 MB/s
>> _on average_. Using zpool iostat I've spotted a peak of 91 MB/s (whoa).
>>
>> This is totally acceptable for me! Needless to say, I'll run a config
>> on ext4/BIOS RAID0 and a setup on NILFS (dual linear SATA mode) for
>> comparisons before I make up my mind, but this is no longer a showstopper
>> for me in terms of performance.
>>
>> Now for some even less representative fun:
>>
>> root@intrepid:/tmp/work# pv /ssd/fast | md5sum
>>  1GB 0:00:03 [ 260MB/s] [=============>] 100%
>> f4a8bb6266eed950e8c76163fa092a3b -
>>
>> root@intrepid:/tmp/work# echo 1 > /proc/sys/vm/drop_caches
>> root@intrepid:/tmp/work# pv /ssd/fast | md5sum
>>  1GB 0:00:05 [ 184MB/s] [============>] 100%
>> f4a8bb6266eed950e8c76163fa092a3b -
>>
>> Not bad AT ALL (sorry for shouting)!
>>
>> VERSIONS USED:
>> root@intrepid:/tmp/work# dpkg --status zfs-fuse
>> Package: zfs-fuse
>> Status: install ok installed
>> Priority: optional
>> Section: multiverse/admin
>> Installed-Size: 4160
>> Maintainer: Filip Brcic <br...@gna.org>
>> Architecture: i386
>> Version: 0.5.1-1ubuntu5
>>
>> root@intrepid:/tmp/work# lsb_release -a
>> No LSB modules are available.
>> Distributor ID: Ubuntu
>> Description: Ubuntu 9.04
>> Release: 9.04
>> Codename: jaunty
>> root@intrepid:/tmp/work# uname -a
>> Linux intrepid.sehe.nl 2.6.28-15-server #52-Ubuntu SMP Wed Sep 9
>> 11:50:50 UTC 2009 i686 GNU/Linux
>>
>> Yes, I know my hostname is a misnomer. Note I'm using a server
>> kernel image because it will access 8GB of RAM without tweaking.
>>
>> Seth
>>
>>
>
>
I'll keep you posted
- ZFS shouldn't be faster than anything at full sequential reading. I
agree -- that's suspicious, but I can't see anything obviously wrong
with your setup. Well, maybe one thing. You should try to run tests
that take minutes or longer to complete. The difference between 6
seconds and 7 seconds is technically over 10%, but probably
insignificant. Increase your data size by 10x or 100x if possible (see the sketch after these notes).
- Have you tried any random reads/writes or mixed traffic?
- ISO is not a compressed format, but often the data inside an ISO is
compressed. Checking compressratio at 1.00x is enough to prove that the
data is incompressible, so that's fine.
- "fake-RAID" is usually not worth it, as you have shown below.
- I was previously suspicious that md5sum was limiting your performance,
but I didn't mention it. /dev/null is pretty fast though :D
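
For example, something along these lines (reusing the 1GB blob from your
earlier test as a seed; /ssd/fast_big is just a placeholder name) would
stretch each run into minutes, and as far as I know repeating the seed won't
make it compressible, since compression works per 128k record:

  # ~10GB stream built from the 1GB random blob, written to the pool
  for i in $(seq 10); do cat /tmp/blob.bin; done | pv > /ssd/fast_big

  # read it back with cold caches
  echo 1 > /proc/sys/vm/drop_caches
  pv /ssd/fast_big > /dev/null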
Jonathan
On 10/9/2009 6:51 AM, sghe...@hotmail.com wrote:
> I repeated some of the striping performance tests using onboard Intel
> (fake)raid instead of lvm striping, i.e. I created a zfs pool on the BIOS
> raidset (single vdev) and a linear lvm volume on the same device. I
> kept the stripe size at the same recommended 128k and the fs for lvm at
> ext2/4k blocks/32-block stripe-width (see before). [1] Raw figures below.
>
> CONCLUSIONS:
> Biggest observation: I may have done an incorrect measurement of the
> max throughput of md5sum on my CPU ... I switched the read test to >
> /dev/null instead of | md5sum. To my horror/surprise I got some read
> speeds far in excess of my md5sum speed... eek. It seems I'll have to
> check the accuracy of my reported read speeds... :(
>
> It appears that write performance on ZFS suffers a bit compared to
> letting zfs-fuse manage the striping internally: a drop from ~68 to ~57
> MB/s (when writing). The (tweaked) ext2 approach suffers _more_,
> dropping from ~199 to ~151 MB/s. The takeaway from this test is that
> I'll probably end up ditching fake-raid, since it loses both performance
> and flexibility of management. The only upside would be that I would be
> able to dual-boot a windows install on it in the future. Nice, but not
> enough to make these sacrifices for....
>
> On the read side of things I'm a bit confused now (see 'observation'
> above). Read speed of ZFS on fakeraid might be anything from _twice as
> fast_ (sic?!) to _slower_ by an unknown amount. On the ext2 side of things
> the same thing goes. I can't explain ext2 on fakeraid0 being faster
> (~264 vs ~148 MB/s) compared to lvm striping with the same specs... I'll
> go back to lvm stripes on non-fakeraid0 for verification without the
> md5sum goof-up.
>
> The good news is: ZFS read performance really champions everything,
> averaging ~355 MB/s. (!)
> This was a bit too awesome for my taste; I didn't quite trust that things
> weren't just cached/buffered somewhere. I tried to minimize that chance
> by exporting/importing the pool before performing the read tests. No
> difference however... So unless someone points me to the obvious mistake...?
>
> By comparison, linear lvm2 on the same fake-raid0 looks meager: 'just'
> 264 MB/s read speed :)
>
> Also note that the asymmetry of read vs write performance is extreme
> with ZFS (single vdev on fake-raid0): ~57 vs ~355 MB/s. Whoa: roughly a
> factor of 6.
> The gap is nowhere near as extreme using ext2 on linear lvm2 on
> fake-raid0: ~151 vs 264 MB/s, not even a factor of 2?
>
> It seems to me things could be tuned on the ZFS write side. Note that
> these are new SSDs and the blocks used for these test have never been
> written to before. So the classic 'MLC write degradation' arguments
> should be ruled out for these tests.
> I'll simply have to go and see in karmic beta using the 0.6.x versions
> in order to test the big_writes patch. Somehow, I expect big wins
> especially for this test scenario... Does anyone know whether I still
> have to upgrade the version of fuse in order to do so? Or does karmic
> come with a modern-enough fuse module?
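>
> (For what it's worth, I suppose something like the following would already
> tell me what I'm running; package names are from memory, so possibly
> slightly off:
>
> fusermount -V                          # userspace fuse library version
> uname -r                               # the kernel fuse module ships with the kernel
> apt-cache policy libfuse2 fuse-utils
>
> As far as I understand, big_writes wants a reasonably recent kernel plus a
> 2.8-ish libfuse, so on jaunty the library is probably the limiting factor.)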
>
> ZFS WRITE PERFORMANCE:
>
> root@intrepid:/home/sehe/custom# ls *.iso | cpio -ov | pv >
> /ssd/uncompressed/fast
> 2.53GB 0:00:45 [57.5MB/s]
>
> root@intrepid:/home/sehe/custom# ls *.iso | cpio -ov | pv >
> /ssd/compressed/fast
> 2.53GB 0:00:45 [56.4MB/s]
>
> ZFS READ PERFORMANCE:
>
> root@intrepid:~/SSDTEST# zpool export ssd
> root@intrepid:~/SSDTEST# zpool import -d /dev/mapper/ ssd
>
> root@intrepid:~/SSDTEST# echo 1 > /proc/sys/vm/drop_caches
> root@intrepid:~/SSDTEST# pv /ssd/uncompressed/fast > /dev/null
> 2.53GB 0:00:07 [ 359MB/s]
>
> root@intrepid:~/SSDTEST# echo 1 > /proc/sys/vm/drop_caches
> root@intrepid:~/SSDTEST# pv /ssd/compressed/fast > /dev/null
> 2.53GB 0:00:07 [ 355MB/s]
>
> LVM WRITE PERFORMANCE:
>
> root@intrepid:/home/sehe/custom# ls *.iso | cpio -ov | pv >
> ~/SSDTEST/bios_striped/fast
> 2.53GB 0:00:17 [ 151MB/s]
>
> LVM READ PERFORMANCE:
>
> root@intrepid:~/SSDTEST# echo 1 > /proc/sys/vm/drop_caches
> root@intrepid:~/SSDTEST# pv bios_striped/fast > /dev/null
> 2.53GB 0:00:09 [ 264MB/s]
>
>
> NOTES ON SETUP:
>
> [1] I changed the data policy from 'randomly generated blob' to 'assorted
> isos'. For convenience I pipe directly from cpio. The cpio overhead is
> negligible:
>
> root@intrepid:/home/sehe/custom# pv *.iso > /ssd/uncompressed/fast
> 2.53GB 0:00:43 [ 60MB/s]
>
> I picked ISOs because, although not random, they should not normally be
> easily compressed because the ISO format is compressed by default. Check
> (after copying the data, of course):
>
> root@intrepid:~/SSDTEST# zfs list -o name,compression,compressratio
> NAME              COMPRESS  RATIO
> ssd               off       1.00x
> ssd/compressed    on        1.00x
> ssd/uncompressed  off       1.00x
>
> Full details of zpool/fs and lvm options:
>> # write to linear volume
>> root@intrepid:~/SSDTEST# pv /tmp/largeblob > linear/fast
>> 1.31GB 0:00:13 [97.1MB/s]
>>
>> # write to striped volume
>> root@intrepid:~/SSDTEST# pv /tmp/largeblob > striped/fast
>> 1.31GB 0:00:06 [ 199MB/s]
>>
>> # read from linear volume
>> root@intrepid:~/SSDTEST# echo 1 > /proc/sys/vm/drop_caches
>> root@intrepid:~/SSDTEST# pv linear/fast | md5sum
>> 1.31GB 0:00:11 [ 116MB/s]
>> bdb210cf6a38d7df726d759b50853afa -
>>
>> # read from striped volume
>> root@intrepid:~/SSDTEST# echo 1 > /proc/sys/vm/drop_caches
>> root@intrepid:~/SSDTEST# pv striped/fast | md5sum
>> 1.31GB 0:00:09 [ 148MB/s]
>> bdb210cf6a38d7df726d759b50853afa -
>>
>> The setup was made as follows:
>>
>> # (obviously: zpool destroy ssd [1])
>> # mark the ssd disks for lvm2 now:
>> $ pvcreate /dev/sd[bd]
>> $ vgcreate ssd /dev/sd[bd]
>> $ lvcreate -n linear -L 32g ssd
>> $ lvcreate -i2 -I128 -n striped -L 32g ssd
>> $ mkfs.ext2 -E stripe-width=32 -b 4096 -L ssd_lin /dev/ssd/linear
>> $ mkfs.ext2 -E stripe-width=32 -b 4096 -L ssd_stripe /dev/ssd/striped
>> $ mkdir -p SSDTEST/{striped,linear}
>> $ cd SSDTEST/
>> $ mount LABEL=ssd_lin -o noatime linear/
>> $ mount LABEL=ssd_stripe -o noatime striped/
You can run an ATA "Secure Erase" command to refresh the entire SSD back
to (like-)new conditions.
Partitioning the drive into 20GB chunks actually does nothing, due to
the way that the internal flash indirection system works. Writing to
the same LBA 100 times over, to 100 sequential LBAs, or to 100 random
LBAs has roughly the same effect internally.
>> - Have you tried any random reads/writes or mixed traffic?
> For the same reasons, no. I'm looking into maybe running Bonnie++ tests
> in read-only mode later.
See above. Sequential or random I/O will cause the same sort of wear on
the SSD. All this benchmarking you are doing won't significantly reduce
the overall lifespan of your drives, but I understand your hesitance
anyway. Note that writing wears out the drive much quicker than
reading, so perhaps focus your tests on read performance :)
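
Something like this (paths taken from your earlier mails; the loop count is
arbitrary) would exercise reads only, with cold caches on every pass:

  for run in 1 2 3; do
      echo 1 > /proc/sys/vm/drop_caches
      pv /ssd/uncompressed/fast > /dev/null
  done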
>> - ISO is not a compressed format, but often the data inside an ISO is
>> compressed. Checking compressratio at 1.00x is enough to prove that
>> the data is incompressible, so that's fine.
> Hmmm. I'll need to check that on Wikipedia. I was pretty sure it is
> inherently compressed.
"They are stored in an uncompressed format."
- http://en.wikipedia.org/wiki/ISO_image
No harm done: your ISOs obviously *contained* compressed data, even if
the format itself isn't compressed.
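
If you ever want to sanity-check compressibility outside ZFS, something like
this gives a rough idea (the .iso name is a placeholder; gzip -1 is plenty
for a quick test):

  stat -c %s some.iso            # original size in bytes
  gzip -1 -c some.iso | wc -c    # a similar number means effectively incompressible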
>> - "fake-RAID" is usually not worth it, as you have shown below.
>>
>> - I was previously suspicious that md5sum was limiting your performance,
>> but I didn't mention it.
> I did! Apparently, though, I was wrong when I measured an md5 throughput
> of ~260 MB/s. I don't remember how I got that number, but it came out
> pretty consistently at 184 MB/s the last time I checked, which pretty
> much invalidates those earlier read speeds.
> Now for an interesting thought to entertain: maybe md5sum's performance
> characteristic is wildly dependent on the input values? In that case,
> there could be a security hole in there, as it may be possible to
> deduce certain information about the input just by measuring how much
> time (relatively) is spent calculating its checksum :) In other
> words: I don't suppose this is actually the case.
Timing attacks are common in security systems. If you want a challenge,
try looking into differential power analysis methods of plaintext/key
recovery from smart cards, etc. It's crazy stuff!
>> /dev/null is pretty fast though :D
> Have you mounted /dev/null on tmpfs :) Hehehe
Yikes!
http://ata.wiki.kernel.org/index.php/ATA_Secure_Erase
That page covers both the hdparm way, which won't work if your BIOS freezes
the disk's security features, and a link to HDDErase, a DOS tool that can
deal with that (unfreezing requires a reboot, though, so you'll have to boot
into HDDErase twice).
BIOSes usually don't have an option to disable the freezing, but HDDErase
somehow manages to make the BIOS skip it once at the next boot.
I usually just hot-replug the drives once Linux is booted (use a Live-CD
to be safe). Then the BIOS can't get its mucky paws into the ATA
security features and hdparm works beautifully.
- boot system from Ubuntu live CD
- "hdparm -I /dev/sdX" will show "frozen"
- unplug SATA data cable from drive
- unplug power from drive
- plug power into drive
- plug SATA data cable back in
- wait a few seconds
- "hdparm -I /dev/sdX" will now show "not frozen"
- hdparm --user-master u --security-set-pass test /dev/sdX
- hdparm --user-master u --security-erase test /dev/sdX
BOOM!
To be safe, I usually unplug all disks from the system except the one I
intend to erase.
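
For reference, the hdparm part of those steps as one little script (a sketch
only: /dev/sdX is a placeholder, and the erase step destroys everything on
that drive, so read the hdparm -I output before letting it run):

  DEV=/dev/sdX
  hdparm -I "$DEV" | grep -i frozen     # must say "not frozen"; hot-replug otherwise
  hdparm -I "$DEV" | grep -i erase      # sanity check: drive supports security erase
  hdparm --user-master u --security-set-pass test "$DEV"
  hdparm --user-master u --security-erase test "$DEV"
  hdparm -I "$DEV" | grep -i enabled    # security should read "not enabled" again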
HTH,
Jonathan
Yeah, I read that you could hot-replug disks to unfreeze them, except that
it's somehow not practical to do on a laptop...
Mike
I don't see a problem with that. Unless your laptop was designed by
sadists (or Apple?), the HDD bay should be externally accessible and you
can hot-plug the drive nearly as easily as on a desktop. At most you
should just have to flip the laptop over briefly.
You nailed it: to access the HDD, I need to remove the keyboard and
the palmrest. (and FWIW, some Apple models have very accessible HDDs)
Mike