Alignment for Disks with 4KB Sectors

Gordan Bobic

Mar 25, 2011, 11:25:13 AM
to KQStor ZFS Discussion
How does ZFS handle disks with 4KB hardware sectors? I ask because
the GPT partition that ZFS creates starts at sector 1 rather than
sector 0 and, unless I am mistaken, this will throw out the alignment.
Unless the FS is internally offset by 3.5KiB (7 sectors), all the data
blocks will be misaligned with the physical sectors.

The obvious workaround is to align the partition to start at sector 8,
but it would be nice if somebody with a deeper knowledge could explain
what effect this would have on the file system alignment itself.
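
The alignment arithmetic above can be checked with a short sketch. It
assumes 512-byte logical sectors on a disk with 4KiB physical sectors;
the function names are made up for illustration and are not part of any
ZFS tool.

```python
LOGICAL_SECTOR = 512     # bytes per logical sector (assumed)
PHYSICAL_SECTOR = 4096   # bytes per physical sector (assumed)

def misalignment(start_sector, logical=LOGICAL_SECTOR, physical=PHYSICAL_SECTOR):
    """Bytes by which a partition start lies past a physical sector boundary."""
    return (start_sector * logical) % physical

def padding_sectors(start_sector, logical=LOGICAL_SECTOR, physical=PHYSICAL_SECTOR):
    """Logical sectors to skip to reach the next physical sector boundary."""
    off = misalignment(start_sector, logical, physical)
    return 0 if off == 0 else (physical - off) // logical

# Compare a partition starting at sector 1 with one starting at sector 8.
for start in (1, 8):
    print(start, misalignment(start), padding_sectors(start))
```

Under these assumptions, a start at sector 1 sits 512 bytes past a
boundary and needs 7 more sectors (3.5KiB) to line up, while a start at
sector 8 is already aligned.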

Also, are there any special parameters that can be used to tell ZFS
that it is on a disk with 4KiB sectors? For example, to prevent it
from trying to use blocks smaller than 4KiB? The documentation I can
find says that it will use blocks up to 128KiB in size, but nothing
lists the lower limit.

Gordan

Gordan Bobic

Mar 25, 2011, 12:32:33 PM
to KQStor ZFS Discussion
I found references to modified OpenSolaris zpool binaries that have
ashift hard coded to 12 instead of 9 (2^9=512, 2^12=4096):

http://digitaldj.net/2010/11/03/zfs-zpool-v28-openindiana-b147-4k-drives-and-you/
http://web.archiveorange.com/archive/v/Lmwut1dzJz8KUiAVcfdG

Is there a way with the Linux implementation to set ashift at zpool
creation time?
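
For reference, ashift is, as far as I understand it, the base-2
logarithm of the smallest block the pool will allocate on a device,
which is how the 9 and 12 above map to 512 and 4096 bytes. A minimal
sketch of that relation follows; the helper name is hypothetical and
does not come from the ZFS sources.

```python
def ashift_for(sector_size):
    """Return log2(sector_size); reject sizes that are not powers of two."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a power of two")
    return sector_size.bit_length() - 1

# Show the shift for each sector size and confirm it round-trips.
for size in (512, 4096):
    shift = ashift_for(size)
    print(size, shift, 1 << shift)
```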

Gordan

Gordan Bobic

Mar 26, 2011, 11:07:45 AM
to kqstor-zf...@googlegroups.com

More importantly - are there assumptions made in the code about ashift
always being 9? Would it be safe to modify the zpool sources to
hard-code it to 12? What are the chances of this breaking other things?

Gordan

Gordan Bobic

Mar 26, 2011, 3:44:25 PM
to KQStor ZFS Discussion
Much as I hate to reply to my own post, I've been doing some digging,
and it looks like the zfs tools are virtually unchanged from the ones
from OpenSolaris. The patch for those to force ashift to 12 is here:

http://www.solarismen.de/archives/5-Solaris-and-the-new-4K-Sector-Disks-e.g.-WDxxEARS-Part-2.html

Whether that will work or produce a completely corrupted file system -
I don't know. I'll try it and report back.

I'm somewhat curious there isn't more interest in this issue, and more
importantly, that the solution isn't already commonly available. Most
new disks of 1TB or more have 4KB sectors and I find it surprising
that there isn't more noise being made about a potential 50%
performance degradation from unaligned sector access.
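
To illustrate where such a penalty could come from, here is a rough
sketch that counts how many 4KiB physical sectors a write touches, and
how many of those are only partially covered (and so would need a
read-modify-write by the drive). The sector size and the function name
are assumptions for illustration; the actual slowdown depends on the
drive firmware and the workload and is not measured here.

```python
PHYSICAL_SECTOR = 4096  # bytes per physical sector (assumed)

def sectors_touched(offset, length, physical=PHYSICAL_SECTOR):
    """Return (total, partial) physical sectors covered by a write.

    offset and length are in bytes; a sector counts as partial when the
    write covers only some of its bytes.
    """
    if length <= 0:
        return (0, 0)
    end = offset + length
    first = offset // physical
    last = (end - 1) // physical
    total = last - first + 1
    partial = 0
    if offset % physical:
        partial += 1  # write starts in the middle of a sector
    if end % physical and (last != first or offset % physical == 0):
        partial += 1  # write ends in the middle of a (different) sector
    return (total, partial)

# A 4KiB write at an aligned offset versus one shifted by 512 bytes.
print(sectors_touched(0, 4096))
print(sectors_touched(512, 4096))
```

With these numbers the aligned write covers exactly one physical
sector, while the shifted write straddles two and covers neither of
them completely.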

Gordan Bobic

Mar 27, 2011, 10:23:53 AM
to kqstor-zf...@googlegroups.com

Well, so far so good. I modified the zpool sources with the above patch
and rebuilt them, and the new binary sets the ashift correctly:

# zdb -C | grep ashift
ashift: 12

I tested it with a 100GB data set with:
dedup=off,compression=off
and with
dedup=on,compression=on

and compared the results with the source data set using:

find . -type f -exec md5sum '{}' \; > /tmp/zfs.md5

And found no differences in the data sets.
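
The comparison step itself can be sketched as follows. This assumes two
manifests in md5sum's default text-mode output ("<digest>  <path>" per
line, no newlines in file names); the function names and the sample
entries are made up for illustration.

```python
def parse_manifest(text):
    """Map path -> digest from md5sum output ("<digest>  <path>" per line)."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, sep, path = line.partition("  ")
        if not sep:
            raise ValueError(f"unrecognised line: {line!r}")
        entries[path] = digest
    return entries

def diff_manifests(source, copy):
    """Return sorted paths that are missing from, or differ between, the manifests."""
    paths = set(source) | set(copy)
    return sorted(p for p in paths if source.get(p) != copy.get(p))

# Sample entries only; the order of lines does not matter.
src = parse_manifest("d41d8cd98f00b204e9800998ecf8427e  ./a\n"
                     "900150983cd24fb0d6963f7d28e17f72  ./b\n")
dst = parse_manifest("900150983cd24fb0d6963f7d28e17f72  ./b\n"
                     "d41d8cd98f00b204e9800998ecf8427e  ./a\n")
print(diff_manifests(src, dst))  # an empty list means the manifests agree
```

In practice the two strings would be read from the manifest files, for
example /tmp/zfs.md5 and a second manifest generated the same way on
the source data set.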

zpool scrub also found no problems.

So all I can say is that "it works for me".

Having done a bit more digging into the partition alignments: when ZFS
is given a whole disk to manage, the first partition by default seems to
start at one sector past 1MiB, which means it is correctly aligned for
4KiB sectors.

So apart from the ashift, it seems that everything else should be fine
right out of the box.

Gordan

Clemens Fruhwirth

Mar 29, 2011, 4:43:54 PM
to kqstor-zf...@googlegroups.com
On Sat, Mar 26, 2011 at 8:44 PM, Gordan Bobic <gordan...@gmail.com> wrote:

> I'm somewhat curious there isn't more interest in this issue, and more
> importantly, that the solution isn't already commonly available. Most
> new disks of 1TB or more have 4KB sectors and I find it surprising
> that there isn't more noise being made about a potential 50%
> performance degradation from unaligned sector access.

Unfortunately I can't remember all the details here, but FreeBSD
doesn't handle this problem properly; that is, if the partition is
misaligned ZFS won't work at all. I only found that out after migrating
to such a setup, and I have seen read rates as bad as 2 MB/s, bad enough
that I couldn't even properly migrate using zfs send | zfs receive
tricks, as it would have taken ages for 1 TB to be migrated that way.
There are no easy tricks to get out of such a setup with FreeBSD.

In the end I exported all the ZFS partitions over iSCSI and used
zfs-fuse under Linux to copy to an aligned local partition.
Apparently Linux and/or iSCSI is smart when it comes to block-level
access and does a lot of caching here, although I haven't found hard
evidence for that theory. I just found that this setup magically
worked, and I was happy that the problem had gone away. Since then I
have been on FreeBSD with an aligned partition and everything seems OK.
--
Fruhwirth Clemens http://clemens.endorphin.org
