GOOD News: performance no longer biggest issue! (by a long way?)


sgheeren

Jan 19, 2010, 6:24:46 PM
to zfs-...@googlegroups.com
Hello all.

Do you still believe zfs-fuse is unbelievably slow? Do you secretly
think, it is unacceptably slow, but, we gotta have data integrity and
backups after all, now do we?

Well, sit yourself down and have a gawk at my latest comparison:


http://spreadsheets.google.com/ccc?key=0AlX1n8WSRNJWdGNQME5BMXVlMnlydS1Gd19wWXZuOEE

I added a comparison of my newest linux zfs-fuse figures to a brand new
(i.e. clean) Open Solaris box (running bleeding edge, matching almost
exactly what I have running on my linux box). Everything is on the same
metal, the pools are exported/imported into the other OS, but residing
physically on the same disks.

I must say I'm more than a little surprised, especially by the _large
gap_ between Solaris's sucky performance on SSDs and ZFS-fuse's.
Perhaps someone with a ton of Solaris-fu can point me at my obvious
mistake? I cannot think of one, as I haven't done any tuning [1]

I'm seeing headlines: "flying colours!" More headlines: "hands down!"
"comprehensively beaten"?
I'm also thinking I must be wrong somewhere... but I'm probably not.
Unless Solaris cripples SSD performance out of the box for some reason
unbeknownst to mankind [3]?

Now here comes the limited warranty:
1. everything is readonly (opportunism: I started out looking at
fuse's max_readahead=0 option; I don't wish to clobber my pools today)
2. in a way the tests are synthetic (in that zfs-fuse will probably
be relatively 'slowest' when doing traversals, stats, lookups and
readdirs). I picked two different datasets: Mubi, containing some
larger files plus many small ones, and MyPictures, containing
(obviously) only 2-4 MB JPEGs.

You could say that even on a single Seagate, zfs-fuse is the winner for
the larger files (45.6 MB/s vs. Solaris's 45.1 MB/s); by a tiny bit [2][3],
but nonetheless a good feat for FUSE vs. the Solaris kernel!

Anyone care to shoot at these figures? I'm still a little reluctant to
shout victory myself. Still, the signs are good, the signs are very good
indeed.
Thanks Emmanuel, Rudd-O, Mike, anyone contributing bug reports,
technical hints and test feedback! It seems we are getting somewhere.

Now, who wants to start porting ZFS Timeslider to the linux desktop!!!

Seth Heeren

[1] other than adding elevator=noop to the kernel args. Which reminds
me, I still have to re-enable cfq for the rotating platters in my box;
none of this should have any impact on _read_ performance whatsoever.
[2] undoubtedly in the noise margin
[3] me, for starters
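
P.S. re [1]: switching the scheduler for a single disk at runtime is just
a sysfs write, e.g.

echo cfq > /sys/block/sdb/queue/scheduler    # device name is a placeholder

whereas elevator=noop on the kernel command line sets the boot-time
default for all disks.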

Rudd-O

Jan 19, 2010, 6:37:09 PM
to sgheeren, zfs-...@googlegroups.com
I'll say it again.

You are a NINJA. You have mastered the ninjitsu of filesystems.

sgheeren

Jan 19, 2010, 6:42:34 PM
to zfs-...@googlegroups.com
PS. refer to my prior post for a little more detail on the test procedure:

http://groups.google.com/group/zfs-fuse/browse_thread/thread/57090db4ff8ebe43

Basically, the exact same commands were entered on Solaris with the
following exceptions:

1. needed to specify pool on the bulk mount command line (root pool
was already mounted)
2. needed to specify the path to a custom build of cpio (to support
the -0 option): /usr/local/bin/cpio (see the sketch below)
3. pool was exported, imported and remounted instead of the
'drop_caches' thingie in linux
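
For reference, the read bench itself is essentially of this shape (the
dataset path is a placeholder from my setup, /usr/local/bin/cpio being
the Solaris-side custom build; see the linked post for the exact
invocations):

find /perf/Mubi -type f -print0 | /usr/local/bin/cpio -o -0 > /dev/null

i.e. walk the dataset and stream every file through cpio to /dev/null,
so only read throughput is measured.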

For bonus points: who spotted the missing option to zfs list in the
mount command :)

Ricardo M. Correia

Jan 19, 2010, 7:34:25 PM
to zfs-...@googlegroups.com
Hi sgheeren,

The results that you got are very interesting.
There is, however, at least one minor methodology flaw that I could
spot :)

I think your exception nr. 3 caused different behavior between Solaris
and zfs-fuse.
The reason is that (AFAIK) doing 'echo 3 > /proc/sys/vm/drop_caches'
won't cause zfs-fuse's ARC cache to be flushed, so zfs-fuse will be at a
slight advantage: it can return data/metadata from its cache instead
of having to read from disk like Solaris had to.

So I think you should at least repeat those tests, but this time
exporting/importing the pool in zfs-fuse like you did on Solaris.
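
Roughly, on the Linux side that would mean doing something like this
between runs (the pool name is just an example):

zpool export tank     # tearing the pool down drops what zfs-fuse has cached for it
zpool import tank

rather than 'echo 3 > /proc/sys/vm/drop_caches', which only drops the
kernel's page/dentry/inode caches and leaves the ARC inside the
zfs-fuse process untouched.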

Cheers,
Ricardo

Nicholas Lee

Jan 19, 2010, 11:47:50 PM
to zfs-fuse
This is pretty interesting.

Given that performance is now getting close to Solaris, how reliable is
zfs-fuse? I'd be happy to see a 10% or even 20% drop in performance going
from Solaris to Linux - just so I could get rid of the PITA Solaris
upgrades. However, it is not worth it if there are still chances of
data loss.


Nicholas


On Jan 20, 12:24 pm, sgheeren <sghee...@hotmail.com> wrote:
> Hello all.
>
> Do you still believe zfs-fuse is unbelievably slow? Do you secretly
> think, it is unacceptably slow, but, we gotta have data integrity and
> backups after all, now do we?

...

Mike Hommey

Jan 20, 2010, 3:24:38 AM
to zfs-...@googlegroups.com
On Wed, Jan 20, 2010 at 12:24:46AM +0100, sgheeren wrote:
> Hello all.
>
> Do you still believe zfs-fuse is unbelievably slow? Do you secretly
> think, it is unacceptably slow, but, we gotta have data integrity and
> backups after all, now do we?
>
> Well, sit yourself down and have a gawk at my latest comparison:
>
>
> http://spreadsheets.google.com/ccc?key=0AlX1n8WSRNJWdGNQME5BMXVlMnlydS1Gd19wWXZuOEE
>
> I added a comparison of my newest linux zfs-fuse figures to a brand new
> (i.e. clean) Open Solaris box (running bleeding edge, matching almost
> exactly what I have running on my linux box). Everything is on the same
> metal, the pools are exported/imported into the other OS, but residing
> physically on the same disks.

I think you should give it a try with a 64-bit Ubuntu system; you may have
more surprises to share.

Mike

sgheeren

Jan 20, 2010, 6:00:55 AM
to zfs-...@googlegroups.com
Thanks gents,

Much appreciated reactions. As you know, I was struck by the results
myself. I still think something obvious should turn up once I google
for "Solaris SSD slow"... I highlight two suggestions, duly noted:

Mike Hommey wrote:
> I think you should give it a try with a 64-bit Ubuntu system; you may have
> more surprises to share.
>
> Mike
>

Yeah, that had my interest as well. I'm thinking my earlier stress tests
hit a ceiling at about 300 active pools precisely because of a limit on
per-process locked memory (fragmentation due to rebased shared libs,
perhaps?). I was going to try 64-bit to see whether that limit is lifted
(or else to debug the cause).

Ricardo M. Correia wrote:
> So I think you should at least repeat those tests, but instead
> exporting/importing the pool in zfs-fuse like you did in Solaris.
>
> Cheers,
> Ricardo
>

I will do that (working on Linux I can move pretty quickly). However, I
can tell you that at least the 'crushing' results on SSD already fit your
description perfectly. The sequence of events was:
shutdown Solaris,
reboot into Linux,
remove zpool.cache,
configure zfsrc, launch zfs-fuse, import the pool with -R /perf,
run the cpio (1. and 2.) bench

That way, it is pretty much guaranteed that nothing was even in
zfs-fuse's caches. To be certain, I'll simply replay the Linux tests in
a controlled fashion, explicitly exporting the pools between each test
run.


Cheers,
Seth Heeren

zfs-tarepanda

Jan 20, 2010, 7:48:21 AM
to zfs-...@googlegroups.com
On Wed, 20 Jan 2010 00:24:46 +0100
sgheeren <sghe...@hotmail.com> wrote:

Hello sgheeren, and all.

I have great interest in your comparison.
By the way, why don't you use bonnie++ to evaluate HDD performance?
Don't get me wrong, this is NOT a complaint, just technical interest.

I am now setting up an environment to evaluate zfs-fuse performance
under Vine Linux (a Japanese distribution), and I plan to use bonnie++.

Thanks.

--
zfs-tarepanda <zfs_on...@yahoo.co.jp>


sgheeren

Jan 20, 2010, 9:31:02 AM
to zfs-...@googlegroups.com
zfs-tarepanda wrote:
> I have great interest in your comparison.
> By the way, why don't you use bonnie++ to evaluate HDD performance?
>
Well, I don't want to raise expectations as if I (can) conduct very
high-quality benchmarks. I just wanted some quick figures to see whether
passing max_readahead=0 to libfuse (fusermount) would make a "real life"
difference. It was suggested somewhere[1] that this would improve
performance since ZFS does its own readahead (prefetch).

Now the preliminary results show that performance ever so slightly
decreases when passing max_readahead=0. For this test, full-blown bonnie++
seems like overkill. If someone wants to do this properly... be my
guest. I cannot currently find the motivation, because I'm probably not
going to change my daily workload just because certain block sizes
maximize throughput [2].

[1] URL lost... sry
[2] which, IMHO, is the main strength of bonnie++: it charts relative
performance against block size. That's apart from the convenience that
bonnie++ allocates all required temp storage and by default watches for
common pitfalls. I'm quite aware of the pitfalls, so I'm pretty
confident that I didn't commit any hilarious mistakes (apart from the
worry that Ricardo Correia already reported).


> Don't get me wrong, this is NOT claim, but just technical interest.
>
> Now I try to make a environment for evaluation of zfs-fuse performance
> under Vine Linux. Vine Linux is Japanese distribution. And I have a plan
> to use bonnie++.
>

That would be great. The latest bonnie++ figures have (I think) been
published by Dave Abrahams and Chris Samuels, if I remember correctly.
Chris has a ton of experience benchmarking a plethora of
filesystems, but I think Dave had access to a slightly more recent
version of zfs-fuse. I still think there are no valid benchmarks
for 0.6.0 with and without tuning. Be sure to read up on
/etc/zfs/zfsrc (under contrib/ in the repo).

Now, I also know there are some benchmark scripts on zfs-fuse.net
(the homepage), but I'm not quite sure whether they are bonnie-based.

$0.02

Seth


> Thanks.
>
>
>> Hello all.
>>
>> Do you still believe zfs-fuse is unbelievably slow? Do you secretly
>> think, it is unacceptably slow, but, we gotta have data integrity and
>> backups after all, now do we?
>>
>> Well, sit yourself down and have a gawk at my latest comparison:
>>
>>
>> http://spreadsheets.google.com/ccc?key=0AlX1n8WSRNJWdGNQME5BMXVlMnlydS1Gd19wWXZuOEE
>>

>> [.....]

Rudd-O

Jan 20, 2010, 4:52:11 PM
to zfs-...@googlegroups.com
ZFS is really reliable now. I have not seen ZFS fail in a loooong time.



Nicholas Lee

unread,
Jan 20, 2010, 9:42:55 PM1/20/10
to zfs-fuse
When you say this, I assume you mean ZFS on FUSE.

I've got some ZFS pools on Solaris that are a couple of years old, without
any problems. I'd like to switch to Linux, as Solaris is a PITA even
with Nexenta.

Nicholas

Piotr Pawłow

Jan 21, 2010, 6:16:47 AM
to zfs-...@googlegroups.com
Nicholas Lee wrote:

> Given performance is now getting close to Solaris, how reliable is zfs-
> fuse?

Another data point:

I have been using ZFS-Fuse since version 0.4.0-beta1 (March 2007) for
rsyncing several Linux machines over the net and for imaging local Windows
workstations over SMB. I keep 10 to 20 snapshots for each machine.

I had a problem with ZFS-Fuse using more and more virtual memory over the
course of several days, until it reached 2GB and crashed. I added a cron job
restarting ZFS every night, and it never crashed since then.
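
For the record, the workaround is nothing fancy - a nightly crontab entry
along these lines (the time and the init script path are just an example;
adjust to your distro):

30 3 * * * /etc/init.d/zfs-fuse restart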

I am currently using 0.6.0, and changed the OS to 64-bit, so that old
problem may already be gone, but I need to delete the cron job to find
out :)

> I'd be happy to see 10% or even 20% drop in performance going
> from Solaris to Linux - just so I could rid of PITA Solaris
> upgrades.

It doesn't seem very fast, but it's acceptable for me, so I haven't done
any benchmarks.

> However, it is not worth it if there are still chances of
> data loss.

I have never had any data loss or detectable corruption. On Saturdays and
Sundays I rsync my LVM snapshots to ZFS in 2 runs: first normally, then a
second run with the --checksum option to find corruption. It has never
found any that could be caused by ZFS. It did, however, find an rsync
problem (fixed now) that corrupted files when run with --sparse and
--inplace, so the check itself seems to work :)
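
The weekend check boils down to two passes per machine, roughly like this
(paths are placeholders):

rsync -a /mnt/snap/host/ /pool/backup/host/                 # normal pass
rsync -a -v --checksum /mnt/snap/host/ /pool/backup/host/   # re-verify by checksum

Anything the second pass lists was different on one of the two sides,
i.e. a candidate corruption.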

My biggest issues ATM are:
- no xattrs support, so I can't use --fake-super with rsync
- no ACL support. I would try to use it more with my Samba installations,
but I need finer permission control.

Cheers

sgheeren

Jan 21, 2010, 7:28:24 PM
to zfs-...@googlegroups.com
Peeps, we need write benchmarks.

In light of my recent shufflings and 'stress testing' I revisited this
old thread:

http://groups.google.com/group/zfs-fuse/browse_thread/thread/68138bd00ad57ce3

It might just be that read performance is OK, but write performance is
still atrocious in zfs-fuse. I remember having found signs of a big
asymmetry on a 0.5.x version before. I never use any raidz or mirror
setups: I use ZFS purely for ease of management plus checksum error
signalling. With striping, both read _and_ write speeds should benefit
from more spindles per pool.
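
(To be clear, by 'striping' I just mean a plain pool of several top-level
vdevs with no redundancy, e.g.

zpool create tank /dev/sdb /dev/sdc     # device names are examples

so ZFS spreads the load over both disks.)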

I'll certainly be doing benchmarks, but I'll wait until my NAS box
arrives [1]. I'll throw in two brand new WD 1.5 TB drives and have at it.
Obviously, I'll try to slam OSol nv_130 onto it just for fun/comparison.

Seth

[1] my post with 'BBS2' in the subject

Stefano Z.

Jan 22, 2010, 4:18:50 AM
to zfs-...@googlegroups.com
What is the latest release that I can download to achieve that performance boost?
Is this OK:

or do I need the git one?
Thanks



sgheeren

Jan 22, 2010, 5:27:26 AM
to zfs-...@googlegroups.com
I'm assuming you refer to my positive _read_ benchmark results.
I achieved them with an unstable version (mainly because my pools use pool version 21).

I see no real reason why it would be slower when using 0.6.0 (but refer to the issue tracker for the known issues).

Be sure to check out the following config options:

*    deploy a custom zfsrc in /etc/zfs/zfsrc (your packaging choice might already include the file; it is under contrib/ in the repo tree)
*    check that you are running fuse 2.8.* for enhanced performance (through the new big_writes feature) - see the quick check below
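
A quick way to check which libfuse you are running (package names differ per distro):

fusermount -V                  # prints something like "fusermount version: 2.8.x"
pkg-config --modversion fuse   # if the fuse development package is installed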

GL

sgheeren

Jan 23, 2010, 10:27:17 AM
to zfs-...@googlegroups.com
FWIW,

my 'default everything' run on a single SATA disk (320 GB, unknown brand). The branch is still 0.6.x (mubi in my repo), patched with the threading patch posted just earlier.

# bonnie++ -d /testpool -u sehe

Using uid:1000, gid:1000.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03c       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
karmic          16G 35634  44 40483   6 23349   4 51302  65 71939   6 101.0   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  7331  12 24372  11  8355   6  6362   7 +++++ +++ 10419   9
karmic,16G,35634,44,40483,6,23349,4,51302,65,71939,6,101.0,0,16,7331,12,24372,11,8355,6,6362,7,+++++,+++,10419,9

zfs-tarepanda

Jan 23, 2010, 10:46:05 PM
to zfs-...@googlegroups.com
On Sat, 23 Jan 2010 16:27:17 +0100
sgheeren <sghe...@hotmail.com> wrote:

Hi, sgheeren.

Wow, you have managed to run bonnie++ with zfs-fuse.
It is a very good and informative guide for my work.
In the near future I will report my evaluation results on my blog,
and then I will share the information here.

Thanks.


Stone

Jan 22, 2010, 11:19:17 AM
to zfs-fuse
Hi,

This benchmarking got me interested as well, so I did my own, comparing
zfs-fuse with ext4 on software RAID, using bonnie++.

The comparison is at: http://midway.hu/zfs-bonnie.html

It seems that input is nearly the same, but for output zfs-fuse is
behind. Despite this, I think that zfs-fuse is a great filesystem,
and I'll use it in many places in the near future (I already use it
on my laptop).

Cheers: Stone

zfs-tarepanda

Jan 22, 2010, 6:12:25 AM
to zfs-...@googlegroups.com
On Wed, 20 Jan 2010 15:31:02 +0100
sgheeren <sghe...@hotmail.com> wrote:

Hi, sgheeren.

Thank you for your reply.
I understand the purpose of your benchmark work. It is very useful
information for people who want to use ZFS on Linux.


> zfs-tarepanda wrote:
> > I have great interest in your comparison.
> > By the way, why don't you use bonnie++ to evaluate HDD performance?
> >
> Well, I don't want to raise expectations as if I (can) conduct very
> high-quality benchmarks. I just wanted some quick figures to see whether
> passing max_readahead=0 to libfuse (fusermount) would make a "real life"
> difference. It was suggested somewhere[1] that this would improve
> performance since ZFS does its own readahead (prefetch).

> ....

Piotr Pawłow

Feb 7, 2010, 11:39:01 AM
to zfs-...@googlegroups.com
Piotr Pawłow wrote:

> I had a problem with ZFS-Fuse using more and more virtual memory over the
> course of several days, until it reached 2GB and crashed. I added a cron
> job restarting ZFS every night, and it never crashed since then.
>
> I am currently using 0.6.0, and changed the OS to 64-bit, so that old
> problem may already be gone, but I need to delete the cron job to find
> out :)

Replying to myself, as I deleted the cron job after posting that message. It
seemed fine, and I stopped monitoring it after a few days. But today it
crashed, in the middle of a long-running rsync job. I'll try version 0.6.1
to see if the stability fixes improve the situation. I'll also try to get a
backtrace to file a bug report in case it crashes again.

Cheers

Emmanuel Anne

Feb 7, 2010, 2:50:35 PM
to zfs-...@googlegroups.com
I suppose you don't use a zfsrc file (I don't use 0.6.0 or 0.6.1, so I don't know if they install a config file by default).
Anyway, from what I know the fixes in 0.6.1 are not related to memory usage, so I'd suggest to either use the default zfsrc file, or at least pass the --zfs-disable-prefetch command line argument (to disable the zfs prefetch cache, which makes it eat crazy amounts of memory).
With the default configuration (all the default parameters in /etc/zfs/zfsrc) you should never see memory usage rising above 250 MB; actually, even above 200 MB is a rarity.
So try this and report back (and don't use a cron job to kill the daemon; it should be able to run 24/7 now, but I can't test it like that on my side).

I'll attach the default zfsrc file in case you don't have it. Just place it in /etc/zfs and the daemon should use it when you restart it (unless the config file patch was not included in 0.6.0, but I think it has this patch).
Good luck!

2010/2/7 Piotr Pawłow <p...@siedziba.pl>



--
zfs-fuse git repository : http://rainemu.swishparty.co.uk/cgi-bin/gitweb.cgi?p=zfs;a=summary

Piotr Pawłow

Feb 7, 2010, 4:30:12 PM
to zfs-...@googlegroups.com
> Anyway, from what I know the fixes in 0.6.1 are not related to memory
> usage,

It might have been something else, as there is no OOM killer message in the
kernel log, and the binary is 64-bit, so it should not be limited to 2 GB of
virtual memory.

> pass the --zfs-disable-prefetch command line argument (to disable the zfs
> prefetch cache, which makes it eat crazy amounts of memory).

I'll try that if it crashes again. I don't want to change too many things at
once.

> I'll attach the default zfsrc file in case you don't have it. Just place
> it in /etc/zfs and the daemon should use it when you restart it (unless
> the config file patch was not included in 0.6.0, but I think it has this
> patch).

No mention of "zfsrc" in the sources. I guess not.

Cheers

Emmanuel Anne

Feb 7, 2010, 5:02:48 PM
to zfs-...@googlegroups.com
If you don't find zfsrc in these sources, you'll have to do it manually, or use my repository.
To do it manually, just arrange to pass the command line argument I gave you to zfs-fuse when running it - and don't wait for a crash to do it; it will simply limit its memory usage.
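In practice that means launching the daemon with something like:

zfs-fuse --zfs-disable-prefetch

(however your distro normally starts it - init script, rc.local - just add the flag there). The default zfsrc from my repository covers the same ground.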
2 GB of memory usage is a disaster from my point of view.
And you can always check that you don't lose performance (you should not; this prefetch cache creates more trouble than improvement because of FUSE).
But of course it's just a suggestion...

2010/2/7 Piotr Pawłow <p...@siedziba.pl>

Cheers


sgheeren

Feb 7, 2010, 5:07:46 PM
to zfs-...@googlegroups.com
Emmanuel Anne wrote:
> I suppose you don't use a zfsrc file (I don't use 0.6.0 or 0.6.1, so I don't know if they install a config file by default).
> Anyway, from what I know the fixes in 0.6.1 are not related to memory usage, so I'd suggest to either use the default zfsrc file, or at least pass the --zfs-disable-prefetch command line argument (to disable the zfs prefetch cache, which makes it eat crazy amounts of memory).
> With the default configuration (all the default parameters in /etc/zfs/zfsrc) you should never see memory usage rising above 250 MB; actually, even above 200 MB is a rarity.
> So try this and report back (and don't use a cron job to kill the daemon; it should be able to run 24/7 now, but I can't test it like that on my side).

Erm, Emmanuel, the way _I_ read the crash report was that it was actually /not/ related to memory. So telling users to shut the memory-impacting subsystems down is a bit... well, not to the point.

> I'll attach the default zfsrc file in case you don't have it. Just place it in /etc/zfs and the daemon should use it when you restart it (unless the config file patch was not included in 0.6.0, but I think it has this patch).

The rsync failures reek of issue #21. This has been addressed in 0.6.1 (the critical branch) and in the master branch (albeit in different ways). Both ways have proven pretty solid. I'm positive you will not see stress/concurrency-related failures even without the 'magic' zfsrc [1].

$0.02

[1] (a zfsrc that will possibly impact zfs-fuse performance)

Emmanuel Anne

Feb 8, 2010, 3:43:53 PM
to zfs-...@googlegroups.com
OK, so the problem is already fixed!

2010/2/7 sgheeren <sghe...@hotmail.com>