
ZFS performance weirdness on a Netra X1


Tim Bradshaw

Aug 17, 2006, 4:12:23 PM
I have a Netra X1 with a pair of 120GB (non-Sun) disks, and 1152MB of
memory (so enough, I hope). It's running 10 6/06, and the disks are
split between SVM and zfs: /, /var and swap are UFS filesystems on SVM
mirrors, and the rest is taken up by zfs:
# zpool status
  pool: export
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        export        ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s7  ONLINE       0     0     0
            c0t2d0s7  ONLINE       0     0     0

errors: No known data errors
# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
export            520M  91.5G  27.5K  /export
export/home       364M  91.5G   364M  /export/home
export/zones      156M  91.5G    28K  /export/zones
export/zones/ts   156M  91.5G  77.9M  /export/zones/ts

I've been making zones with their roots in this space (for instance the
ts zone has its root in export/zones/ts). I'm aware this isn't
supported, but it works on other machines and it's great for
development stuff as you can snapshot zones.

It's very, very slow. For instance installing ts, which is a
completely sparse zone, took over an hour. It's a slow machine, of
course, but still this is a really long time.

I did some rudimentary experimentation with something like:

dd if=/dev/zero of=file bs=1024 count=102400

with resulting write rates of about 3MB/s through zfs, and about 14
through UFS and SVM, onto the same pair of disks, obviously. 3MB/s is
pretty crippling. Obviously these aren't realistic FS benchmarks, but
they are indicative of something.
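
Something along these lines will reproduce the comparison (the two
target directories below are only illustrative, one on the zfs side
and one on the UFS/SVM side; timex reports the elapsed time):

# one zfs target and one UFS-on-SVM target; adjust the paths to suit
for dir in /export/home /var/tmp; do
    timex dd if=/dev/zero of=$dir/ddtest bs=1024 count=102400
    rm $dir/ddtest
done

(100MB fits easily into 1152MB of RAM, so appending a sync makes the
numbers more honest.)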

What am I doing wrong?

Thanks

--tim

Daniel Rock

Aug 17, 2006, 4:54:11 PM
Tim Bradshaw <t...@tfeb.org> wrote:
> I have a Netra X1 with a pair of 120GB (non-Sun) disks, and 1152MB of
> memory (so enough, I hope). It's running 10 6/06, and the disks are
> split between SVM and zfs: /, /var and swap are UFS filesystems on SVM
> mirrors, and the rest is taken up by zfs:

That way ZFS cannot enable the write cache on the disks. But these
are IDE disks and on non-Sun IDE disks the write cache is normally enabled
by default. So this type of setup should cause the performance
degradation.

> with resulting write rates of about 3MB/s through zfs, and about 14
> through UFS and SVM, onto the same pair of disks, obviously. 3MB/s is
> pretty crippling. Obviously these aren't realistic FS benchmarks, but
> they are indicative of something.

How did you set up the individual zfs filesystems? Did you change the
checksum algorithm to sha256? This will kill performance - especially on
such a slow machine. What is the output of "zfs get all export/zones/ts"
(or the other names)?
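
A quick way to list just the properties that were changed from their
defaults, assuming your zfs supports source filtering, is:

zfs get -s local all export/zones/ts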

I have a similar setup (two 120 GB disks mirrored) and also did a small
write test (1 GB in total):

timex dd if=/dev/zero of=bigfile bs=1024k count=1024
1024+0 records in
1024+0 records out

real 42.15
user 0.00
sys 2.26

So I achieved a write performance of ~25 MB/s. Although my machine is much
faster than a Netra X1, this shouldn't matter much: During the test the
system was ~95% idle.

--
Daniel

Tim Bradshaw

Aug 17, 2006, 5:37:36 PM
On 2006-08-17 21:54:11 +0100, "Daniel Rock" <v20...@deadcafe.de> said:
>
> That way ZFS cannot enable the write cache on the disks. But these
> are IDE disks and on non-Sun IDE disks the write cache is normally enabled
> by default. So this type of setup should cause the performance
> degradation.

Yes, they are non-Sun disks as you guessed. I think you mean `should
not', not `should'?

> How did you set up the individual zfs filesystems? Did you change the
> checksum algorithm to sha256? This will kill performance - especially on
> such a slow machine. What is the output of "zfs get all export/zones/ts"
> (or the other names)?

I just did zfs create export/... with no special options.

For export/zones/ts:
# zfs get all export/zones/ts
NAME             PROPERTY       VALUE                  SOURCE
export/zones/ts  type           filesystem             -
export/zones/ts  creation       Tue Aug 8 21:42 2006   -
export/zones/ts  used           24.5K                  -
export/zones/ts  available      91.7G                  -
export/zones/ts  referenced     24.5K                  -
export/zones/ts  compressratio  1.00x                  -
export/zones/ts  mounted        yes                    -
export/zones/ts  quota          none                   default
export/zones/ts  reservation    none                   default
export/zones/ts  recordsize     128K                   default
export/zones/ts  mountpoint     /export/zones/ts       default
export/zones/ts  sharenfs       off                    default
export/zones/ts  checksum       on                     default
export/zones/ts  compression    off                    default
export/zones/ts  atime          on                     default
export/zones/ts  devices        on                     default
export/zones/ts  exec           on                     default
export/zones/ts  setuid         on                     default
export/zones/ts  readonly       off                    default
export/zones/ts  zoned          off                    default
export/zones/ts  snapdir        hidden                 default
export/zones/ts  aclmode        groupmask              default
export/zones/ts  aclinherit     secure                 default

(I've deleted the zone, hence the space usage is different now.)

It's really annoying because we were planning to buy a couple more used
X1s as service machines (they're small, cheap, cheap to run, and SPARC,
which matters to us), but performance like this is crippling. My
experiences with zfs on other machines have been fine.

--tim

Daniel Rock

Aug 17, 2006, 5:48:58 PM
Tim Bradshaw <t...@tfeb.org> wrote:
> On 2006-08-17 21:54:11 +0100, "Daniel Rock" <v20...@deadcafe.de> said:
>>
>> That way ZFS cannot enable the write cache on the disks. But these
>> are IDE disks and on non-Sun IDE disks the write cache is normally enabled
>> by default. So this type of setup should cause the performance
>> degradation.
>
> Yes, they are non Sun disks as you guessed. I think you mean `should
> not' not `should'?

Yes, sorry for the typo.


> I just did zfs create export/... with no special options.

Hmm, tomorrow I might be able to do some tests on old hardware as well
(E3500 with 400 MHz CPUs). I don't think that checksumming (default
checksum is a simple xor) should be a performance bottleneck, but you
could give it a try and turn checksumming off:

zfs set checksum=off export/zones/ts
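
To check that the change took, and to put the default back later:

zfs get checksum export/zones/ts
zfs inherit checksum export/zones/ts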


What is the load of the machine during the I/O tests (top, prstat, etc.)?
How busy are the disks, and what is the I/O pattern (iostat -xnz 5)?
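
For example, something like this in a second window while the dd runs
(device names will differ on your box):

iostat -xnz 5    # per-disk throughput, service times and %b (busy)
prstat -m 5      # microstate accounting, shows where the CPU time goes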


--
Daniel

Tim Bradshaw

Aug 17, 2006, 7:52:54 PM
On 2006-08-17 22:48:58 +0100, "Daniel Rock" <v20...@deadcafe.de> said:
>
> zfs set checksum=off export/zones/ts

I'll try that.

>
>
> What is the load of the machine during I/O tests (top, prstat, etc.)?
> How busy are the disks, what I/O pattern (iostat -xnz 5)?

... but there is something a bit weird here. During a first boot of a
zone the load goes up to over 6 and it becomes hard to type at the
machine. I think something may be screwed at the hardware level. No
errors are logged (and iostat -E etc. shows no errors ever).

--tim

Daniel Rock

Aug 18, 2006, 12:43:27 PM
Daniel Rock <v20...@deadcafe.de> wrote:
> Hmm, tomorrow I might be able to do some tests on old hardware as well
> (E3500 with 400 MHz CPUs).

Ok, I have done some tests on this ancient brick with mixed results.
It is an E3500 with 4 x 400 MHz CPUs. I turned off 3 of them (via
psradm) to get comparable results. The disks were FC-attached A5200
disks (since ZFS doesn't reset the write cache when you destroy a pool,
the UFS tests also benefited from the still-enabled write cache).

With ZFS I achieved an average write throughput of ~15 MB/s using

dd if=/dev/zero of=/pool/bigfile bs=1024k count=1024
sync

but during the test the system was almost unusable (sluggish), with a
kernel time of up to 99%.

With UFS, in contrast, I achieved the same throughput but the system
stayed responsive during the test (kernel time at ~35%).

So the CPU is likely too slow for ZFS. Turning checksums off didn't make any
significant difference.

--
Daniel

Tim Bradshaw

Aug 18, 2006, 1:59:15 PM
On 2006-08-18 17:43:27 +0100, "Daniel Rock" <v20...@deadcafe.de> said:
>
> So the CPU is likely too slow for ZFS. Turning checksums off didn't make any
> significant difference.

Hum. That sucks. Why is ZFS so expensive I wonder?

--tim

Dave Miner

Aug 18, 2006, 2:45:25 PM
Tim Bradshaw wrote:
...

> What am I doing wrong?
>

Nothing, actually. I ran into this on my X1 a while back and it's
reported as CR 6421427. A fix is in the works; I've been running the
test binary for a week or so and it's working well.

Dave

Frank Cusack

Aug 18, 2006, 3:51:51 PM

The OP indicated local filesystems only; CR 6421427 (at least the
publicly accessible bits) indicates a bad interaction with NFS,
and specifically says "Local access to the filesystem performs well."

-frank

Tim Bradshaw

Aug 18, 2006, 4:09:54 PM
On 2006-08-18 19:45:25 +0100, Dave Miner <dave....@sun.com> said:

> Nothing, actually. I ran into this on my X1 a while back and it's
> reported as CR 6421427. A fix is in the works; I've been running the
> test binary for a week or so and it's working well.

Is there anything I can do to get hold of the fix, or will it appear as
a patch soon? I'd really like not to have to go back to UFS, even
temporarily.

Thanks

--tim

Bruno Delbono

Aug 19, 2006, 3:42:41 AM

I've actually been following zfs threads here on comp.unix.solaris,
opensolaris-discuss@ and zfs-discuss@ and been brooding on deploying ZFS
in production. With the current reports (some interesting threads) I
think I'll wait a few more years before I feel more comfortable
deploying it. Meanwhile, I'll stick with UFS and Veritas VxFS 4.1

My two cents.

--
--------------------------------------------------------------------
Bruno Delbono | Systems Engineer | Open-Systems Group
Websites: www.mail.ac www.sendmail.tv www.open-systems.org
--------------------------------------------------------------------

Dave Miner

Aug 21, 2006, 2:07:41 PM

Good point, I had missed that. The problem appears to be especially
aggravated by NFS writes, but there's a good chance that usage patterns
other than the simple test cases we were using to demonstrate it would
also cause it.

Dave

Dave Miner

Aug 21, 2006, 2:08:52 PM

If you have a Solaris support contract, then an escalation is your best bet.

Dave

Dave Miner

Aug 21, 2006, 2:16:35 PM
Bruno Delbono wrote:
> Tim Bradshaw wrote:
>> On 2006-08-18 19:45:25 +0100, Dave Miner <dave....@sun.com> said:
>>
>>> Nothing, actually. I ran into this on my X1 a while back and it's
>>> reported as CR 6421427. A fix is in the works; I've been running the
>>> test binary for a week or so and it's working well.
>>
>> Is there anything I can do to get hold of the fix, or will it appear
>> as a patch soon? I'd really like not to have to go back to UFS, even
>> temporarily.
>
> I've actually been following zfs threads here on comp.unix.solaris,
> opensolaris-discuss@ and zfs-discuss@ and been brooding on deploying ZFS
> in production. With the current reports (some interesting threads) I
> think I'll wait a few more years before I feel more comfortable
> deploying it. Meanwhile, I'll stick with UFS and Veritas VxFS 4.1
>

I would hope that this thread wouldn't have much bearing on forming such
an opinion; it's really a bug in the SPARC IDE driver that is being
exposed by the usage patterns of ZFS. You probably never would see it
at all as there aren't that many systems using it: I think there are
only two platforms that we've sold in the last 4 years that fall into
this category.

Dave

Tim Bradshaw

Aug 21, 2006, 3:03:54 PM
On 2006-08-21 19:16:35 +0100, Dave Miner <dave....@sun.com> said:

> I would hope that this thread wouldn't have much bearing on forming
> such an opinion; it's really a bug in the SPARC IDE driver that is
> being exposed by the usage patterns of ZFS.

This is true I'm sure, but I think you overestimate people's
rationality - if performance sucks with some new & alien filesystem &
not with UFS on the same platform then they'll blame the FS and ...

> You probably never would see it at all as there aren't that many
> systems using it: I think there are only two platforms that we've sold
> in the last 4 years that fall into this category.

... also perhaps underestimate the number of people who will be using
scrap old machines to play with new OS releases on before moving them
to their expensive production HW.

(By `you' I don't really mean you personally, sorry. I mean `it is
easily possible to make these errors, if errors they are'.)

--tim

Tim Bradshaw

Aug 21, 2006, 3:04:45 PM
On 2006-08-21 19:08:52 +0100, Dave Miner <dave....@sun.com> said:

> If you have a Solaris support contract, then an escalation is your best bet.

Only a subscription, sigh. Is there any easy way for me to track the
CR (on opensolaris.org?) so I can know when a patch comes out?

Thanks

--tim

Dale Ghent

Aug 30, 2006, 11:33:42 PM
to Tim Bradshaw

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6421427

According to a post by George Wilson @ Sun on Aug. 28 on the zfs-discuss
list, "A fix for this should be integrated shortly." Whether he means
Nevada or s10 or both, I don't know.

/dale
