QFS chunk servers segfault when running on top of ZFS file system


Marios Hadjieleftheriou

May 1, 2013, 10:38:16 AM
to qfs-...@googlegroups.com
I am running QFS without any issues on ext4 file systems on a cluster of 15 nodes.

I decided to try ZFS (http://zfsonlinux.org/) as the underlying file system for each chunk server, for extra durability on a per-node basis. However, the chunk servers segfault immediately (on the first write), with no indication in the logs of what happened.

I have been using ZFS as an ext4 replacement to store data for quite some time without any issues whatsoever, so this problem is really vexing.

Has anyone had prior experience with such a setup?

Marios Hadjieleftheriou

May 1, 2013, 12:24:41 PM
to qfs-...@googlegroups.com
I found the problem. QFS uses O_DIRECT to open files, and ZFS does not support the O_DIRECT flag.

I removed O_DIRECT from QFS and recompiled, and now it works great on top of ZFS.
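For anyone who wants to check a file system before pointing a chunk server at it, here is a minimal sketch of a probe. It is not QFS code: the function name `supports_o_direct` is made up for illustration, and it assumes that a file system without direct I/O support rejects the open() call with EINVAL, which is the usual behavior on Linux.

```python
import errno
import os
import sys
import tempfile


def supports_o_direct(directory: str) -> bool:
    """Return True if a file in `directory` can be opened with O_DIRECT.

    Hypothetical helper for illustration only; not part of QFS.
    """
    # os.O_DIRECT is only defined on platforms that have the flag (e.g. Linux).
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        return False

    # Create a scratch file on the file system under test.
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        try:
            fd = os.open(path, os.O_RDWR | o_direct)
        except OSError as exc:
            # Assumption: EINVAL means the file system rejects O_DIRECT.
            if exc.errno == errno.EINVAL:
                return False
            raise  # any other error is unexpected, so surface it
        os.close(fd)
        return True
    finally:
        os.unlink(path)  # always remove the scratch file


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(target, "O_DIRECT supported:", supports_o_direct(target))
```

Run it with the chunk directory as the only argument. A result of False suggests that the chunk server's direct I/O path will not work on that file system.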

Silvius Rus

May 1, 2013, 1:01:28 PM
to qfs-...@googlegroups.com
You do not need to modify the code to disable O_DIRECT. You can toggle it with this chunk server configuration parameter:

chunkServer.bufferedIo (default: 0)
Controls buffered I/O. By default the chunk server will bypass the OS buffer cache and instead use direct I/O when supported. It is conceivable that enabling buffered I/O might help with short reads for "broadcast" / "web server" type loads. For the "typical" large I/O (1MB) request sequential type loads, enabling caching will likely lower cluster performance due to higher system (OS) CPU overhead and memory contention.
See https://github.com/quantcast/qfs/wiki/Configuration-Reference for more.
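As a rough sketch, the line in the chunk server configuration file would look something like the following. This assumes the usual `name = value` properties syntax and that a value of 1 turns buffered I/O on (the default listed above is 0); check the configuration reference for the exact semantics.

```
# Chunk server configuration file (name and location depend on your setup).
# Assumption: 1 enables buffered I/O, so chunk files are opened without O_DIRECT.
chunkServer.bufferedIo = 1
```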

Note that we have only tested it with O_DIRECT at scale.

Silvius