compilation benchmarks for current version of zfs-fuse

Emmanuel Anne

Nov 7, 2009, 2:27:34 PM
to zfs-...@googlegroups.com
Interesting results. I like this kind of test because it's something I do often, so it's more useful than a benchmark of an extreme situation. This is the compilation of the zfs-fuse source by running time scons just after mounting the directory (and after running scons -c before unmounting it):

1) on a jfs filesystem: 1:43
2) on a compressed zfs filesystem, debug build (scons debug=2): 2:29
3) same thing with an optimized build (scons): 2:17
4) same thing but running with the "-a 1 -e 1" command line options: 1:52
5) same thing but with the prefetch cache enabled: 1:55

The last result is particularly interesting: usually I disable the prefetch cache (the "zfs-prefetch-disable" option is uncommented in my default zfsrc), mainly for memory reasons, because this cache tends to eat RAM in an uncontrolled way. Well, apparently this test is slower with the prefetch cache enabled than with it disabled! The difference is not big, but it is there. It's probably because the kernel cache and the zfs cache compete with each other, but in any case disabling the prefetch cache still seems to be a good idea for now.

Also, these tests were run with my default zfsrc, that is:
vdev-cache-size = 10
max-arc-size = 100
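
plus, except for test 5, the prefetch line I mentioned above (in zfsrc it is just a bare option name):

zfs-prefetch-disable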

1:52 is acceptable (at least for me).

sghe...@hotmail.com

Nov 7, 2009, 3:49:10 PM
to zfs-...@googlegroups.com
Emmanuel Anne wrote:
Interesting results. I like this kind of test because it's something I do often, so it's more useful than a benchmark of an extreme situation. This is the compilation of the zfs-fuse source by running time scons just after mounting the directory (and after running scons -c before unmounting it):

The unmounting can probably be (better) replaced by doing:

echo 1 > /proc/sys/vm/drop_caches

This will ensure proper results even if some things are cached on tmpfs and the like.

Mike Hommey

Nov 7, 2009, 4:00:42 PM
to zfs-...@googlegroups.com
On Sat, Nov 07, 2009 at 09:49:10PM +0100, sghe...@hotmail.com wrote:
> Emmanuel Anne wrote:
> > Interesting results. I like this kind of test because it's something I
> > do often, so it's more useful than a benchmark of an extreme situation.
> > This is the compilation of the zfs-fuse source by running time scons
> > just after mounting the directory (and after running scons -c before
> > unmounting it):
> >
> The unmounting can probably be (better) replaced by doing:
>
> echo 1 > /proc/sys/vm/drop_caches

3 would be better, as it also drops dentries and inodes, on top of
pagecache.
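
(As the usual caveat: drop_caches only frees clean cache, so it's worth
running sync first to write back dirty pages before dropping them:)

$ sync
$ echo 3 > /proc/sys/vm/drop_caches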

sghe...@hotmail.com

Nov 7, 2009, 4:02:48 PM
to zfs-...@googlegroups.com
Woooey, who are you? You never say much, but when you do, it's always
these nuggets of knowledge. Thanks!
I suppose I could RTFM of course :)

Emmanuel Anne

Nov 7, 2009, 4:22:44 PM
to zfs-...@googlegroups.com
errr, god? ;-)

Emmanuel Anne

Nov 7, 2009, 4:37:04 PM
to zfs-...@googlegroups.com
On second thought, the echo is not enough, because zfs-fuse has its own internal cache too; if you want to clear it, the easiest (and maybe the only) solution is to unmount/remount.

sghe...@hotmail.com

Nov 7, 2009, 4:43:21 PM
to zfs-...@googlegroups.com
You have a point there. However, I did 'zfs umount -a; killall zfs-fuse;
zfs-fuse' in between my tests. Results in a few minutes.

Emmanuel Anne wrote:
> On second thought, the echo is not enough, because zfs-fuse has its own
> internal cache too; if you want to clear it, the easiest (and maybe the
> only) solution is to unmount/remount.
>
> 2009/11/7 Emmanuel Anne <emmanu...@gmail.com>
>
>     errr, god? ;-)
>
>     2009/11/7 sghe...@hotmail.com <sghe...@hotmail.com>

sghe...@hotmail.com

Nov 7, 2009, 5:27:06 PM
to zfs-...@googlegroups.com
Hi all,

Here are my benchmark figures for compiling zfs-fuse (the new-solaris branch), obviously running on that same version of zfs where applicable. I highlighted some (random) points of interest in red (in case you have the HTML view). I'd say these results are pretty interesting, mildly astonishing, and amusing :)

Be sure to check the main observation at the end (bragging rights). I question the value of this benchmark (at least for my system).

Notes:
1. All disks are on SSD RAID0 (i.e. both my ext4 filesystems are on RAID0, and my zpool is striped across two SSD vdevs).

2. Command sequence:

# for pool creation (sda and sdc have equal partitioning and are OCZ X64 SSDs)
$ zpool create -O mountpoint=/home/sehe/zfs -O compression=on zfs /dev/sd[ac]6

# for each build:
$ scons -c
$ echo 3 > /proc/sys/vm/drop_caches
$ time scons -j 5

# for the benchmarks _with_ zfsrc:
$ grep -v '#' /etc/zfs/zfsrc_ | sort -u
fuse-attr-timeout = 1
fuse-entry-timeout = 1
max-arc-size = 100
vdev-cache-size = 10

Processor is a Q9550, with 8 GB of RAM.
The SATA hard disks are not involved in these tasks (although my system has them). No swap, and /tmp is on tmpfs.
Running a full GNOME desktop with graphics effects (compiz, dual head).
Note that I leave prefetch enabled, since I don't care about memory usage.

Because my debug builds consistently come in faster than the 'optimized' ones, I'll specify my compiler and platform (I'm guessing yours is different, because for you it seems to be the other way around):

$ lsb_release -a
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 9.10
Release:    9.10
Codename:    karmic
$ uname -a
Linux karmic 2.6.31-14-generic-pae #48-Ubuntu SMP Fri Oct 16 15:22:42 UTC 2009 i686 GNU/Linux
$ g++ -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.1-4ubuntu8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-targets=all --disable-werror --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu8)


Emmanuel Anne wrote:
> Interesting results. I like this kind of test because it's something I
> do often, so it's more useful than a benchmark of an extreme situation.
> This is the compilation of the zfs-fuse source by running time scons
> just after mounting the directory (and after running scons -c before
> unmounting it):
>
> 1) on a jfs filesystem: 1:43
I'm deffo spoilt. My ext4 home does it in 0:21 (and with -j 50 in 18.6s on average, just for fun).

On my compressed zpool:

zfsrc, 'scons debug=2 -j 5': 0:16

zfsrc, 'scons -j 5': 0:25

no zfsrc, 'scons debug=2 -j 5': 0:31

no zfsrc, 'scons -j 5': 0:40


> 5) same thing but with the prefetch cache enabled: 1:55
Okay, so I _had_ to try with prefetch disabled (echo zfs-prefetch-disable >> /etc/zfs/zfsrc):

zfsrc, zfs-prefetch-disable, 'scons debug=2 -j 5': 0:16

zfsrc, zfs-prefetch-disable, 'scons -j 5': 0:26

I'd say: no significant difference.


> 1:52 is acceptable (at least for me).
No comment :)

For absolutely no reason but bragging rights:

zfsrc, 'scons -j 500': 0:24.1
ext4, 'scons -j 500': 0:19.7
tmpfs, 'scons -j 500': 19.3s

How is that for depressing?! This benchmark (on my system) appears to be testing _nothing_ about filesystem performance. The speed is essentially the same when running on ext4 or zfs or even tmpfs (gasp!!!).
The next interesting point, though, is that running 'vanilla' zfs-fuse (no zfsrc) does make things a lot slower... In a way, this benchmark is only able to demonstrate when a filesystem's performance sucks; it doesn't do much to show where it is above average. The datapoints in the 'good performance' band get completely swamped by the compilation time per se.

Food for thought. I venture the main thing is the dramatically different latency on (my) SSD 'drives'. There is virtually no access time for SSD (relative to magnetic disks). In that sense, running (read tasks) off a non-fragmented SSD with a good controller is much like running off a loop-device on tmpfs anyway.
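
A quick way to check that, if anyone wants to reproduce it (just a sketch, not commands I included in the timings above; it assumes GNU time and sysstat are installed):

$ /usr/bin/time -v scons -j 5   # look at the "Percent of CPU this job got" line
$ iostat -x 1                   # in a second terminal; %iowait near zero means the disks are not the bottleneck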

Cheers,
Seth

PS. Of course, to the devs who have been plowing away at speeding ZFS up: the take-away is that ZFS 0.6.* seems to suck less performance-wise when using the default zfsrc. That is a bonus.

Emmanuel Anne

Nov 8, 2009, 12:27:18 AM
to zfs-...@googlegroups.com
Well, what can I say, except that you have a monster system, especially compared to my aging amd64? ;-)

Obviously the numbers on your system are too low to be really useful... you run into system overhead that can make a difference of a second, and your numbers are already at that level.

-j 500? I thought that meant you wanted to compile as if you had 500 cores, and that it would make things slower if you had fewer real CPUs available?

The reason the zfsrc file makes any difference is very probably the fuse_entry_timeout and fuse_attr_timeout arguments; the others are less important.
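
For reference, in zfsrc form those are the two lines from your listing (I believe they are also what I was setting on the command line with "-a 1 -e 1" in my test 4):

fuse-attr-timeout = 1
fuse-entry-timeout = 1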

At least it's good news for zfs usage: it shows that if your disks are really fast, then fuse doesn't slow things down noticeably! ;-)

Impressive anyway, thanks for taking the time to do that!

sghe...@hotmail.com

Nov 8, 2009, 6:01:49 AM
to zfs-...@googlegroups.com
Emmanuel Anne wrote:
> Well, what can I say, except that you have a monster system, especially
> compared to my aging amd64? ;-)
Well, I didn't consider my desktop a 'monster' system until now... If I
can recommend something: don't replace your PC; instead get oodles of
RAM (8 GB was EUR 150 or thereabouts...) and an SSD disk.
The SSD + Karmic makes my system boot in 12 seconds (bootchart time).
For the SSD, be sure to tweak your volume manager and filesystem layout
options.
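
(To give one example of the kind of tweak I mean, and only as a sketch,
since the right options depend on your setup: mounting with noatime saves
a metadata write on every file read, e.g. an fstab line like

/dev/sdX5   /home   ext4   noatime,errors=remount-ro   0   2

where the device and mount point are of course placeholders.)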

>
> -j 500? I thought that meant you wanted to compile as if you had 500
> cores, and that it would make things slower if you had fewer real CPUs
> available?
Yes. -j 500 is outrageous. The overhead _would_ increase if you ran out
of memory _or_ if the number of _active_ processes created a scheduling
bottleneck. Neither is true for this task.

I don't think I ever saw more than 11 child processes spawned at any
given time (due to dependencies), and of those a fair number will usually
have been waiting on disk IO (especially write IO, which is significantly
asymmetric on SSD).

Seeing that I have 4 CPU cores, -j 4, or maybe -j 5 to allow for
coordinating tasks, would make enough sense. However, I wanted to see if
I could push it, e.g. make as much read IO happen up front for maximum
parallel reads, even though the subsequent compile/link CPU cycles would
obviously have to wait for an available CPU scheduler slot. :) It turns
out that this indeed squeezes about 3 seconds out of the real elapsed
time. Quite possibly the exact same win would occur with, say, -j 11. But
then again, this way (-j 500) I let the computer work out the available
processing power by itself.
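
(If you want the -j factor to follow the core count automatically,
something like 'scons -j $(getconf _NPROCESSORS_ONLN)' should do it, but
as said, the exact number hardly matters for this workload.)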

Now, if I were to schedule 500 mp3 encoding jobs, I certainly would not
raise my -j factor above 5 for GNU make (I use makefiles for this type of
job). I frequently use xargs -P and xjobs (of Solaris fame) if I don't
have makefiles handy :)
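
(A trivial illustration of the xargs form, with encode.sh standing in for
whatever hypothetical wrapper drives your encoder on a single input file:

$ find . -name '*.wav' -print0 | xargs -0 -n 1 -P 4 ./encode.sh

-P 4 caps it at four parallel jobs, and -print0/-0 keep odd filenames
safe.)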

Oh, and 'sudo apt-get install ccontrol ccache distcc distcc-monitor-gnome;
ccontrol-init' will give you a lot of insight into build parallelization
and optimization, even on a single compile host. Often
parallelization/distribution completely breaks down on link times and
library dependencies. But I'm getting off-topic.

> The reason the zfsrc file makes any difference is very probably the
> fuse_entry_timeout and fuse_attr_timeout arguments; the others are less
> important.

I reckoned so. Infallible permission logic is not a concern on my desktop :)


>
> At least it's good news for zfs usage: it shows that if your disks are
> really fast, then fuse doesn't slow things down noticeably! ;-)
>
> Impressive anyway, thanks for taking the time to do that!

It was kind of fun to do, especially since I didn't have to wait _that_
long (unlike, e.g., the popular kernel compile benchmarks).
