Native ZFS for Linux!


Brian Behlendorf

Jun 3, 2010, 6:33:20 PM
to zfs-fuse
Hello!

As part of a joint effort with Sun/Oracle to augment the Lustre file
system with ZFS support, we've been engaged in porting ZFS natively to
the Linux kernel. So far we have pretty much everything working
except the ZPL - this is because Lustre interfaces directly with the
DMU and the ZPL was not a priority for us. However, we connected with
folks at KQ Infotech who are also interested in a Linux kernel port
and they are working on the ZPL so it is on the way.

Anyway, the fruits of our labor are available here http://github.com/behlendorf/zfs/.
I don't know to what extent it's practical for the zfs-fuse community
and our project to collaborate. But since there is ZFS expertise on
both sides and a lot of common code, I wanted to propose that we at
least consider how we might help each other out.

Any thoughts?

Brian Behlendorf (no, not the Apache one, the other one)

Stan Seibert

Jun 3, 2010, 7:48:26 PM
to zfs-fuse
The obvious question is the license issue:

If Sun/Oracle is involved, does this mean that the ZFS code (or at
least the part you are porting to the Linux kernel) will be licensed
in a way to allow it to be distributed generally? zfs-fuse began as a
way to dodge the license problem and simplify development. Are you
planning to merge your work upstream into the mainline kernel?

On Jun 3, 4:33 pm, Brian Behlendorf <brianbehlendo...@gmail.com>
wrote:
> Hello!
>
> As part of a joint effort with Sun/Oracle to augment the Lustre file
> system with ZFS support, we've been engaged in porting ZFS natively to
> the Linux kernel.  So far we have pretty much everything working
> except the ZPL - this is because Lustre interfaces directly with the
> DMU and the ZPL was not a priority for us.  However, we connected with
> folks at KQ Infotech who are also interested in a Linux kernel port
> and they are working on the ZPL so it is on the way.
>
> Anyway, the fruits of our labor are available here http://github.com/behlendorf/zfs/.

Brian Behlendorf

Jun 3, 2010, 8:04:45 PM
to zfs-fuse
On Jun 3, 4:48 pm, Stan Seibert <vols...@gmail.com> wrote:
> The obvious question is the license issue:

I thought this might be the first question asked. Below is an extract
from the zfs github project wiki FAQ page which tries to address this,
http://wiki.github.com/behlendorf/zfs/faq.


"What about the “licensing” issue?"

In a nutshell, the issue is that the Linux kernel which is licensed
under the GNU General Public License is incompatible with ZFS which is
licensed under the Sun CDDL. While both the GPL and CDDL are open
source licenses their terms are such that it is impossible to
simultaneously satisfy both licenses. This means that a single derived
work of the Linux kernel and ZFS cannot be legally distributed.

One way to resolve this issue is to implement ZFS in user space with
FUSE where it is not considered a derived work of the kernel. This
approach resolves the licensing issues but it has some technical
drawbacks. There is another option though. The CDDL does not restrict
modification and release of the ZFS source code which is publicly
available as part of OpenSolaris. The ZFS code can be modified to
build as a CDDL licensed kernel module which is not distributed as
part of the Linux kernel. This makes a Native ZFS on Linux
implementation possible if you are willing to download and build it
yourself.

Emmanuel Anne

Jun 3, 2010, 8:42:08 PM
to zfs-...@googlegroups.com
Interesting idea. I tried to build it to see how it works, and got stuck on spl (the Solaris porting layer); even a Google search could not find it.
Anyway, maybe it's because it's too late for that now. I'll have another look tomorrow!

2010/6/4 Brian Behlendorf <brianbeh...@gmail.com>

--
To post to this group, send email to zfs-...@googlegroups.com
To visit our Web site, click on http://zfs-fuse.net/



--
zfs-fuse git repository : http://rainemu.swishparty.co.uk/cgi-bin/gitweb.cgi?p=zfs;a=summary

Stan Seibert

Jun 3, 2010, 8:46:52 PM
to zfs-fuse
OK, so same as when zfs-fuse started. I had hoped that perhaps the
licensing situation had changed due to the Sun/Oracle/Lustre
involvement.


On Jun 3, 6:04 pm, Brian Behlendorf <brianbehlendo...@gmail.com>
wrote:
> On Jun 3, 4:48 pm, Stan Seibert <vols...@gmail.com> wrote:
>
> > The obvious question is the license issue:
>
> I thought this might be the first question asked.  Below is an extract
> from the zfs github project wiki FAQ page which tries to address this, http://wiki.github.com/behlendorf/zfs/faq.

devsk

Jun 3, 2010, 10:10:59 PM
to zfs-fuse
Brian,

Is there a howto or simple instructions on how to go about using that
code? I think I have to build the kernel module and the utilities.
Which kernel have you built and tested it with?

I will try it in a VM.

-devsk


On Jun 3, 3:33 pm, Brian Behlendorf <brianbehlendo...@gmail.com>
wrote:
> Hello!
>
> As part of a joint effort with Sun/Oracle to augment the Lustre file
> system with ZFS support, we've been engaged in porting ZFS natively to
> the Linux kernel.  So far we have pretty much everything working
> except the ZPL - this is because Lustre interfaces directly with the
> DMU and the ZPL was not a priority for us.  However, we connected with
> folks at KQ Infotech who are also interested in a Linux kernel port
> and they are working on the ZPL so it is on the way.
>
> Anyway, the fruits of our labor are available here http://github.com/behlendorf/zfs/.

Brian Behlendorf

Jun 4, 2010, 12:15:13 AM
to zfs-fuse
Let me try and answer some of the above questions with a single reply.

> I had hoped that perhaps the licensing situation had changed due to
> the Sun/Oracle/Lustre involvement.

We have been working on this for some time now and have been strongly
urging Sun/Oracle to make a change to the licensing. I'm sorry to say
we have not yet had any luck. However, we did feel it was time to move
our efforts to a public forum.

> Is there a howto or simple instructions on how to go about using that
> code? I think I have to build the kernel module and the utilities.

Yes. I've tried to put this kind of information on the project's wiki
page. There are instructions on how to build both the spl and zfs code
at the following page. If you run into an issue or gotcha please open
an issue so I can either fix it or update the documentation.

http://wiki.github.com/behlendorf/zfs/building-zfs

As for how to use the code I have posted a few examples but more of
this sort of thing is welcome! Here's an example of creating an ext2
file system using a zvol block device and taking a snapshot and
mounting it read-only. The basic idea is it should work pretty much
exactly like it does on OpenSolaris. Load the zfs module stack and
you're good to go with the exact same utilities.

http://wiki.github.com/behlendorf/zfs/example-zvol
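
For readers who cannot reach the wiki page, the workflow described above could be sketched roughly as follows. This is not copied from the example page: it is a dry-run plan in Python that only builds the command lines and never executes them. The pool name, backing device, volume size, snapshot name, mount point, and especially the directory where zvol device nodes appear are all assumptions passed in by the caller.

```python
def zvol_ext2_plan(pool, vdev, volume, size, snapshot, dev_dir, mountpoint):
    """Return the commands (as argv lists) for the zvol walk-through.

    Nothing is executed. dev_dir is the directory under which zvol device
    nodes are assumed to appear; check where your build actually puts them.
    """
    dataset = f"{pool}/{volume}"
    snap = f"{dataset}@{snapshot}"
    return [
        ["zpool", "create", pool, vdev],                         # create the pool
        ["zfs", "create", "-V", size, dataset],                  # create a zvol of the given size
        ["mkfs.ext2", f"{dev_dir}/{dataset}"],                   # put ext2 on the zvol block device
        ["zfs", "snapshot", snap],                               # snapshot the zvol
        ["mount", "-o", "ro", f"{dev_dir}/{snap}", mountpoint],  # mount the snapshot read-only
    ]


# Hypothetical values, for illustration only.
for argv in zvol_ext2_plan("tank", "/dev/sdb", "vol", "1G", "snap1",
                           "/dev/zvol", "/mnt/snap"):
    print(" ".join(argv))
```

Printing the plan first makes it easy to review each step before running anything by hand as root.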

> Which kernel have you built and tested it with?

Quite a few actually. I mainly do my development using the current
Fedora kernels but I've been careful to support all the way back to
the rhel5 2.6.18 based kernels. There is a fairly elaborate autoconf
build system which detects the interfaces provided by your kernel and
tries to do the right thing. I've done testing with the following
kernels but there's a decent chance it'll work with a random 2.6.18 -
2.6.32 based kernel. If it doesn't please open an issue.

http://wiki.github.com/behlendorf/zfs/tested-platforms
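
As a quick pre-flight check before building, the 2.6.18 - 2.6.32 range mentioned above could be tested with something like the sketch below. The function names are made up for illustration, and the range is only the "decent chance it'll work" window from this post, not the list of tested platforms on the wiki.

```python
import re

OLDEST = (2, 6, 18)  # rhel5-era kernel named in the post
NEWEST = (2, 6, 32)  # upper end of the range named in the post


def parse_kernel_release(release):
    """Extract (major, minor, patch) from e.g. '2.6.32.15' or '2.6.18-194.el5'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def in_expected_range(release):
    """True if the release falls inside the 2.6.18 - 2.6.32 window."""
    return OLDEST <= parse_kernel_release(release) <= NEWEST
```

On a Linux machine, `in_expected_range(platform.release())` (after `import platform`) checks the running kernel. Anything outside the window is not necessarily broken, it is just outside what was described here.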

Finally one more thing deserves special mention and that is that it
works best with a 64-bit kernel. There are still issues with the
limited virtual address space in the kernel on 32-bit platforms. Since
our target server platforms for Lustre are all 64-bit I haven't
invested too much effort in dealing with 32-bit issues.

Thanks for the interest and please let me know what you think. I'd
love the feedback!

Thanks,
Brian Behlendorf

Emmanuel Anne

Jun 4, 2010, 5:09:12 AM
to zfs-...@googlegroups.com
Next try then: this time I got stuck with these undefined symbols in the spl module:
first_online_pgdat
next_zone
I tracked these two down in the mm directory of the kernel sources, so I guess it has something to do with the memory model?
Which option exactly enables these symbols in the kernel?

2010/6/4 Brian Behlendorf <brianbeh...@gmail.com>

Emmanuel Anne

Jun 4, 2010, 12:17:55 PM
to zfs-...@googlegroups.com
Well after quite a few tests :
 - kernel must be < 2.6.33 (did it with 2.6.32.15).
 - must use sparse memory model, allow hotplug memory (had to compile my own kernel for that).
 - In the end, everything installs correctly, the test script works, but I can't mount anything (zpool create seems to work, zfs list and zpool list work correctly too, but zfs mount does absolutely nothing and even ignores its arguments after mount, zpool import returns immediately and nothing is mounted).
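
For anyone wanting to check their own kernel before trying, the memory-model requirement above could be tested against a kernel .config with a sketch like this one. Mapping "sparse memory model" to CONFIG_SPARSEMEM and "hotplug memory" to CONFIG_MEMORY_HOTPLUG is my assumption about which options are meant, and the function name is made up for illustration.

```python
# Assumed mapping of the requirements above onto kernel config symbols.
REQUIRED = ("CONFIG_SPARSEMEM", "CONFIG_MEMORY_HOTPLUG")


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in a kernel .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as '# CONFIG_FOO is not set' and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]
```

Feed it the text of the config file for the running kernel (many distributions install one under /boot, but the location varies); an empty list means both options are built in.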

So is it still just highly experimental and totally unusable for now or did I miss something?

Getting back to my good old zfs-fuse for now! ;-)

2010/6/4 Emmanuel Anne <emmanu...@gmail.com>

Brian Behlendorf

Jun 4, 2010, 12:59:34 PM
to zfs-fuse
Emmanuel,

Thanks for taking the time to build the projects and
kick the tires a bit. I'm sorry you had some trouble
getting it to build; it is still a work in progress.

> So is it still just highly experimental and totally unusable for now or did
> I miss something?

Yes, the project is still under very active development.
As I mentioned in my original post and on the wiki, the
ZFS POSIX layer has not yet been implemented. Since
Lustre links directly against the DMU, this isn't critical for our
purposes. That said, there is work being done to get this
working. For the moment the only usable user space
interface is the ZVOL.

My main motivation for posting the code even if it's not
100% complete was to make it available for those who
are interested in pursuing a native Linux port. Either just
to follow along as development progresses or to actively
help develop and test.

Thanks,
Brian Behlendorf

devsk

Jun 4, 2010, 1:03:32 PM
to zfs-fuse
I think he mentioned ZPL (ZFS Posix Layer) is not there yet. So, none
of the FS stuff is going to work. He is looking towards KQ Infotech
folks who haven't done anything since Oct 2009 (just a post on their
site) and emails asking them for details resulted in nothing. Whatever
they have, they are holding it to their chest.

The other option Brian is looking towards is zfs-fuse itself. If we
can utilize zfs-fuse somehow to insert the missing ZPL part, then we
can have a functional ZFS. But I am not sure about this approach
because the biggest advantage of a native Linux port is lost:
performance! Or maybe I don't understand the overall architecture.

At least that's what I understood.

Brian, that zvol example is really cool BTW.

-devsk


On Jun 4, 9:17 am, Emmanuel Anne <emmanuel.a...@gmail.com> wrote:
> Well after quite a few tests :
>  - kernel must be < 2.6.33 (did it with 2.6.32.15).
>  - must use sparse memory model, allow hotplug memory (had to compile my own
> kernel for that).
>  - In the end, everything installs correctly, the test script works, but I
> can't mount anything (zpool create seems to work, zfs list and zpool list
> work correctly too, but zfs mount does absolutely nothing and even ignores
> its arguments after mount, zpool import returns immediately and nothing is
> mounted).
>
> So is it still just highly experimental and totally unusable for now or did
> I miss something?
>
> Geting back to my good old zfs-fuse for now ! ;-)
>
> 2010/6/4 Emmanuel Anne <emmanuel.a...@gmail.com>
>
>
>
> > Next try then, this time I got stuck with these undefined symbols in spl
> > module :
> > first_online_pgdat
> > next_zone
> > I tracked these 2 in the mm directory of the kernel sources, so I guess it
> > has something to do with the memory model ?
> > Which option exactly enables these symbols in the kernel then ?
>
> > 2010/6/4 Brian Behlendorf <brianbehlendo...@gmail.com>

Emmanuel Anne

Jun 4, 2010, 2:16:55 PM
to zfs-...@googlegroups.com
Yes, and I don't even have a clue how this zpl layer is supposed to work in the first place... Maybe the code can still be used anyway. I know we had someone who tried to convert the zfs-fuse code to kernel space and gave up because he got some very unstable results, so maybe it's a better start this time. Anyway, still a lot of work to do apparently...

2010/6/4 devsk <dev...@gmail.com>

Brian Behlendorf

Jun 4, 2010, 6:44:53 PM
to zfs-fuse
On Jun 4, 10:03 am, devsk <dev...@gmail.com> wrote:
> I think he mentioned ZPL (ZFS Posix Layer) is not there yet. So, none
> of the FS stuff is going to work. He is looking towards KQ Infotech

Actually, in the long term I would love to support both a native
in-kernel POSIX layer and a FUSE-based POSIX layer. The way the code is
structured you actually build the same ZFS code once in the kernel as
a set of modules and a second time as a set of shared libraries. The
in-kernel version is used by Lustre, the ZVOL, and will eventually be
used by the native posix layer. Currently the shared libraries are
only used by ztest for regression testing. However, they could be
used to form the basis of a fuse based implementation. The major
missing bit is the glue to tie it to fuse which you guys have already
shown can be done. IMHO the real advantage of this would be a shared
code base.

> so maybe it's a better start this time. Anyway still a lot
> of work to do apparently...

I hope so. I've spent the last year building a stable foundation in
the kernel so when we finally get to the posix layer, which is
probably the hardest part, we have a good chance of success. There's
a lot of work to be done but I think it is all very doable with
enough effort!

Xavier Mehaut

Jun 5, 2010, 12:51:56 AM
to zfs-...@googlegroups.com
Hi Brian,
Could you please summarize your ideas with an architectural picture of
both solutions, i.e. zfs-fuse and the native one?
Could you also tell us what you expect from native ZFS for Linux in
terms of performance and features compared with the zfs-fuse solution,
considering, if I understood correctly, that the bridge between the
kernel and both solutions is still fuse?
Could we, for instance, expect to build a ZFS root for booting the OS
from a mirror?
Regards

Sent from my iPhone

On June 5, 2010, at 00:44, Brian Behlendorf <brianbeh...@gmail.com>
wrote:

Emmanuel Anne

Jun 5, 2010, 2:59:07 AM
to zfs-...@googlegroups.com
2010/6/5 Brian Behlendorf <brianbeh...@gmail.com>

Actually, in the long term I would love to support both a native in-
kernel posix layer and a fuse based posix layer.  The way the code is
structured you actually build the same ZFS code once in the kernel as
a set of modules and a second time as a set of shared libraries.  The
in-kernel version is used by Lustre, the ZVOL, and will eventually be
used by the native posix layer.  Currently the shared libraries are
only used by ztest for regression testing.  However, they could be
used to form the basis of a fuse based implementation.  The major
missing bit is the glue to tie it to fuse which you guys have already
shown can be done.  IMHO the real advantage of this would be a shared
code base.

After a quick look, it's not so easy.
Fuse interfaces with the low-level file functions directly, so for example it needs zfs_readdir,
which is apparently only in the zfs kernel module (module/zfs/zfs_vnops.c). I guess ztest only uses the ioctl calls, which makes it easier to build. Here you would have to completely reinvent zfs-fuse by extracting all the file functions from the kernel layer; there is no magical glue that would make everything work together here...

Have you thought about a third solution: interfacing directly with the kernel filesystem functions without using a compatibility layer? A lot of the function interfaces look very similar. I didn't review everything, so I don't know how hard it would be, but it seems to be worth a try at least!

sgheeren

Jun 5, 2010, 6:27:03 AM
to zfs-...@googlegroups.com
Hi guys

About time to chime in.

Now I have hardly any kernel experience (just forget about that). But it
is pretty obvious that interfacing to the fuse interface is much easier
than programming a ZPL from scratch.

My syllogism would be

* If you want to be in-kernel,

AND

* You wish to share codebase with the fuse implementation

THEN

* You should obviously look at the fuse kernel module

Only in that way can you hope to reuse significant bits of zfs-fuse. In
a thought experiment you can easily show that it should be a well-defined
effort to:

(a) fork the fuse module
(b) adapt the interface so you can have the same functional blocks BUT
don't need to cross over to user space at the fuse layer
(c) patch up where a switch to user space is still desired

So the real /job/ is in (c) only. I have a suspicion that it would be
feasible to have a 1:1 port into kernel space pretty soon (unless there
are technical reasons why, e.g., the libzfs socket could not continue to
be a socket interface).
I have another suspicion that the important bits that now rely on being
in userspace should be relatively few and probably not needed anyway
once you are in the kernel (e.g. the mounting operations).

$0.02

jafo

Jun 6, 2010, 3:56:04 PM
to zfs-fuse
On Jun 3, 6:04 pm, Brian Behlendorf <brianbehlendo...@gmail.com>
wrote:
> implementation possible if you are willing to download and build it
> yourself.

Modules being unavailable in the stock kernel have never really been a
big issue, IMHO. For example, DRBD was only recently merged, BTRFS was
an external module, Ceph only recently got merged, the Nvidia and ATI
binary drivers of course, Xen largely exists outside the kernel...

So, I've never seen what the big deal was about having a kernel module
that wouldn't be released in the kernel.org kernel. It seemed to me
that we had some people in power turn up their noses at ZFS, and that
slowed down adoption of ZFS under Linux by, what, 5+ years?

Anyway, ZFS-FUSE has mostly saved the day (though there are people who
shy away from it because it's FUSE and it's "gotta be slow" or
similar).

Thanks so much to both the FUSE and the Lustre efforts to bring us
Linux users a modern file-system now. :-)

Sean

Stan Seibert

Jun 6, 2010, 4:27:06 PM
to zfs-fuse

On Jun 6, 1:56 pm, jafo <jaf...@gmail.com> wrote:
> On Jun 3, 6:04 pm, Brian Behlendorf <brianbehlendo...@gmail.com>
> wrote:
>
> > implementation possible if you are willing to download and build it
> > yourself.
>
> Modules being unavailable in the stock kernel have never really been a
> big
> issue, IMHO.  For example, DRBD was only recently merged, BTRFS was an
> external module, Ceph only recently got merged, the Nvidia and ATI
> binary
> drivers of course, Xen largely exists outside the kernel...
>
> So, I've never seen what the big deal was about having a kernel module
> that
> wouldn't be released in the kernel.org kernel is.  It seemed to me
> that we
> had some people in power turn up their noses at ZFS, and that slowed
> down
> adoption of ZFS under Linux by, what, 5+ years?

I don't think the issue is inclusion in the kernel.org release so much
as the illegality of shipping a binary package with "kernel ZFS". I
don't mind compiling stuff, but it would be a lot nicer if ZFS could
be included in the Ubuntu/Fedora/SuSE/RHEL repositories some day.
btrfs is already in Ubuntu 10.04, and might be an installation option
in two releases.

Emmanuel Anne

Jun 6, 2010, 4:29:35 PM
to zfs-...@googlegroups.com
Hey, you can worry about that once the code actually works.
For now it's just a base, probably a good base, but it needs very motivated people to make it work...

2010/6/6 Stan Seibert <vol...@gmail.com>

Stan Seibert

Jun 6, 2010, 4:57:47 PM
to zfs-fuse
Sure, I hope this works! I'd use it. :)

Just want to make sure people realize that this can't be bundled with
any Linux distribution without Oracle deciding to change the ZFS
license.


On Jun 6, 2:29 pm, Emmanuel Anne <emmanuel.a...@gmail.com> wrote:
> Hey you'll wonder about that when the code will actually work.
> For now it's just a base, probably a good base, but it needs very motivated
> people to make it work...
>
> 2010/6/6 Stan Seibert <vols...@gmail.com>

David Sanders

Jun 7, 2010, 3:11:16 AM
to zfs-...@googlegroups.com
On 6 June 2010 22:57, Stan Seibert <vol...@gmail.com> wrote:
> Sure, I hope this works!  I'd use it.  :)
>
> Just want to make sure people realize that this can't be bundled with
> any Linux distribution without Oracle deciding to change the ZFS
> license.

It can certainly be bundled with Linux distributions in much the same
way as other kernel-tainting modules. Ubuntu already bundles in a ton
of binary-only drivers (Broadcom WiFi/ NVidia/ fglrx) and this would
be no different. What makes you think it can't be included?

Aneurin Price

Jun 7, 2010, 7:28:11 AM
to zfs-...@googlegroups.com

It's different because it's almost the complete opposite :P. This can
only be legally distributed in source form, to be compiled by the user
(who can't then distribute the resulting binary of course).

I believe what you're saying though is that it has the same legal
status as the shim layer between the binary blob and the kernel - it
just happens that rather than being a thin wrapper that has to be
built by the end user, it's the whole thing. I don't know if
module-assistant (or other distributions' equivalents) could be used
as-is, but I'd expect so. The real question is whether you can
persuade distributions that it is worth going to that effort. They
understandably don't like having to jump through hoops to work around
legal problems unless there is tremendous demand - probably no point
in even trying until the ZPL is usable.

Nye

jafo

Jun 7, 2010, 10:04:29 AM
to zfs-fuse
On Jun 7, 5:28 am, Aneurin Price <aneurin.pr...@gmail.com> wrote:
> It's different because it's almost the complete opposite :P. This can
> only be legally distributed in source form, to be compiled by the user
> (who can't then distribute the resulting binary of course).

What is it that makes it not possible to distribute binary copies? I
mean, OpenSolaris and Nexenta are distributing binary copies of CDDL
sources.

As I understand it, the CDDL is based on the Mozilla Public License,
and Mozilla licensed code is distributed as binaries all over the place.

Sean

David Sanders

Jun 7, 2010, 10:09:24 AM
to zfs-...@googlegroups.com
> It's different because it's almost the complete opposite :P. This can
> only be legally distributed in source form, to be compiled by the user
> (who can't then distribute the resulting binary of course).

Are you sure about this? BSD et al all distribute this as binary -
what evidence do you have that the CDDL denies this distribution? I
thought it was just a standard "incompatible with GPL" problem.

jafo

Jun 7, 2010, 10:12:28 AM
to zfs-fuse
> btrfs is already in Ubuntu 10.04, and might be an installation option
> in two releases.

btrfs is already an installation option in Fedora, but the one test
install I did with it as the root fs on F13 hung. Could have been
hardware, but that same install worked fine with ext4.

I've been running a ported zfsstress against btrfs on a F13 test
machine here. The stock kernel had some issues, but after "yum update"
it has been running fine for 35 days so far.

That said, it still feels like btrfs has a long way to go to catch up
with ZFS. It wasn't that long ago that you couldn't delete btrfs
snapshots, for example. File-systems just always seem to take a lot
longer than you'd like to get to production-ready status.

Sean

Aneurin Price

Jun 7, 2010, 10:37:37 AM
to zfs-...@googlegroups.com

I don't mean that binary distribution is prohibited in general, just
in this case. I'll clarify: to generate the binary module you need to
link against both ZFS (CDDL) and Linux (GPL), hence the binary has no
terms under which it can be legally distributed. That's what the
incompatibility is: in source form it is possible to honour the terms
of both licenses - trivially, because there is no combined work yet so
the kernel's license doesn't apply. As soon as the module is compiled
and linked with both the GPL and CDDL code there are no longer any
terms under which the resultant binary can be distributed.

The userland tools would be redistributable as they aren't considered
derivative works of the kernel.

Nye

David Sanders

Jun 7, 2010, 11:18:19 AM
to zfs-...@googlegroups.com
> I don't mean that binary distribution is prohibited in general, just
> in this case. I'll clarify: to generate the binary module you need to
> link against both ZFS (CDDL) and Linux (GPL), hence the binary has no
> terms under which it can be legally distributed. That's what the
> incompatibility is: in source form it is possible to honour the terms
> of both licenses - trivially, because there is no combined work yet so
> the kernel's license doesn't apply. As soon as the module is compiled
> and linked with both the GPL and CDDL code there are no longer any
> terms under which the resultant binary can be distributed.

Right - of course this was (and still is) the big problem when
OpenSolaris/ZFS was first open-sourced. I have massive respect for
what the FSF/GNU project have achieved but I'm starting to like the
idea of BSD-style licensing more and more!

Nikola M

Jun 7, 2010, 8:08:45 PM
to zfs-...@googlegroups.com
Aneurin Price wrote:
> the kernel's license doesn't apply. As soon as the module is compiled
> and linked with both the GPL and CDDL code there are no longer any
> terms under which the resultant binary can be distributed.
>
Problem is then with GPL.

But if on-site compilation before install is required,
that would be just Ok.

Fajar A. Nugraha

Jun 7, 2010, 9:32:42 PM
to zfs-...@googlegroups.com

AFAIK that's how nvidia module is distributed under Ubuntu: using dkms
which compiles the module after user installs it.

--
Fajar

devsk

Jun 7, 2010, 9:44:07 PM
to zfs-fuse
I don't see what the big deal with license incompatibility is. It's
quite obvious that:

1. The implementation has to be module based. It can be inside the kernel,
but then the whole kernel has to be built on the customer's machine,
which no distro will choose to do. But the source can still live
inside the kernel tree.
2. The implementation has to be built from source on the customer's
system.

I have no problem with that because all the necessary infrastructure
(gcc/binutils/make/initramfs etc.) required to build and use that
module during boot is there already.

I think we are distracted by this licensing issue. So much so that we
don't even realize that we don't have the code there yet!

-devsk


On Jun 7, 6:32 pm, "Fajar A. Nugraha" <fa...@fajar.net> wrote:

Will

Jun 8, 2010, 1:27:57 AM
to zfs-fuse
As someone who presently is running OpenSolaris inside a VirtualBox
instance on Ubuntu as a ZFS fileserver, I can safely say that the day
the only thing I need to do is compile a kernel module for native ZFS
in Linux is the day I'll stop doing that.

Nikola M

Jun 8, 2010, 4:00:20 AM
to zfs-...@googlegroups.com
Will wrote:
> As someone who presently is running OpenSolaris inside a VirtualBox
> instance on Ubuntu as a ZFS fileserver, I can safely say that the day
> the only thing I need to do is compile a kernel module for native ZFS
> in Linux is the day I'll stop doing that.
>
Just out of curiosity, how does it behave running in VBox (ZFS speed etc.)? And
I suppose you give the OpenSolaris virtual machine access to the physical disks,
if I understand correctly?

devsk

Jun 8, 2010, 4:25:10 PM
to zfs-fuse
I run OpenSolaris in VirtualBox as well but use ZFS-FUSE as the
file server/backup for my LAN.

-devsk

Will

Jun 9, 2010, 1:52:09 AM
to zfs-fuse
The array I originally created with ZFS-fuse, but I wanted faster
write-performance. So VirtualBox uses the disks via the raw-disk VMDK
feature. Write-speeds are sitting at about 40 mb/s over my network,
and I could probably tweak that to be faster still (it might actually
be a cabling issue around my house too). On the machine itself
throughput is much faster - scrubs and resilvering run very fast.

It does feel like somewhere between the ZFS kernel module we now have
access to and ZFS fuse it should be possible to get the POSIX layer
working under Linux though, which would be the ideal solution (but I
am not a kernel programmer so \o/).

Jan Ploski

Jun 14, 2010, 3:27:53 PM
to zfs-fuse
I also thought about using a similar setup. However, currently I
manage to hard-freeze the VirtualBox OpenSolaris multiple times during
a single session of a few hours. I'd also be concerned about poor network
performance due to missing support for virtio in the OpenSolaris guest,
about general stability (would caching/flushing IO work
reliably?) due to the wild layers of abstraction, and about the
possibility of future versions of VirtualBox/OpenSolaris/Debian
becoming incompatible and cutting me off from my data. The last nail in
the coffin of this idea was the (apparent?) lack of usable encryption
for ZFS/OpenSolaris. A combination of dm-crypt + zfs-fuse works well
on the other hand.

zfs-fuse just seems slow. Still, in my backup use case performance
doesn't matter so much: I'm incrementally backing up a 170 GB dataset
in 45 minutes using rsync. I'm also using zfs-fuse as a snapshot
factory for development MySQL databases, which in itself is a "killer
application" (but can also be achieved using just LVM).

Will

Jun 14, 2010, 11:25:40 PM
to zfs-fuse
You can configure VirtualBox to respect disk flushing requests, which
is what I did. At the moment I'm being cautious - my zpool version is
compatible with ZFS-fuse 0.6.0 so I can go back and forth between
them quite easily.

If the hard locks you were experiencing were the kernel CPU usage jumping to
100%, then I think I have found an explanation - power management in
OpenSolaris doesn't behave right under the default settings. By
changing the power.conf file "cpupm" setting to "cpupm enable
poll-mode" I seem to have stopped it from locking up completely.

sgheeren

Jun 15, 2010, 2:47:34 AM
to zfs-...@googlegroups.com
On 06/14/2010 09:27 PM, Jan Ploski wrote:
> zfs-fuse just seems slow. Still, in my backup use case performance
> doesn't matter so much: I'm incrementally backing up a 170 GB dataset
> in 45 minutes using rsync. I'm also using zfs-fuse as a snapshot
> factory for development MySQL databases, which in itself is a "killer
> application" (but can also be achieved using just LVM).
>
Yeah, but LVM has _major_ pitfalls and performance drawbacks in using
snapshots:

1. will not scale (can only have 1 snapshot, or need to duplicate
allocation space for each snapshot)
2. will not scale (write performance degrades linearly with each snapshot)
3. must have a snapshot volume of matching size (at least as much as
used blocks in origin), see next
4. no rollback/restore mechanism (if you accidentally think you're smart
and rsync the snapshot back to the original, you will _by definition_
run out of free blocks on the snapshot; this corrupts your snapshot
(unrecoverable, not even by extending it) and your origin will be
half-way an rsync. Don't ask _who_ learned this the hard way).
5. in terms of disk access, it is hard[1] to tune the disk layout so
that writes to origin+snapshot go to different spindles. In write
performance view, you can view lvm snapshot like a limping mirror/funny
mirror: it needs to write both volumes. If you let lvm do it's default
block allocation, chances are that seek times are going through the roof.

The only boon I remember was that lvcreate -s knows how to xfs_freeze
and xfs_unfreeze, which is really a gimmick, but still very nice. I use
that for some of my older cloud backups (where I don't want to use ZFS,
for reasons of variety)


[1] maybe possible?
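
For reference, a short-lived LVM snapshot used only for a backup window might look roughly like the sketch below. The volume group vg0, the logical volume data, the 5G copy-on-write size, the paths and the RUN=1 dry-run switch are all placeholders invented for illustration, not anyone's real configuration.

```shell
#!/bin/sh
# Sketch of a short-lived LVM snapshot backup. All names are hypothetical.
# Dry-run by default: prints each command; set RUN=1 to execute (needs root).
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# The snapshot gets its own copy-on-write space (cow_size). If that space
# fills up the snapshot becomes invalid, so the window should stay short.
snapshot_backup() {
    vg=$1 lv=$2 cow_size=$3 dest=$4
    snap="${lv}_snap"
    mnt="/mnt/${snap}"

    run lvcreate -s -L "$cow_size" -n "$snap" "/dev/$vg/$lv" &&
    run mkdir -p "$mnt" &&
    run mount -o ro "/dev/$vg/$snap" "$mnt" &&
    run rsync -a "$mnt/" "$dest/"
    status=$?

    # Always try to tear the snapshot down again, even if the copy failed.
    run umount "$mnt"
    run lvremove -f "/dev/$vg/$snap"
    return $status
}

snapshot_backup vg0 data 5G /backup/data
```

Note that the copy goes from the snapshot to a separate destination, never back onto the origin, which is the failure described in point 4.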

Mike Hommey

Jun 15, 2010, 3:22:19 AM6/15/10
to zfs-...@googlegroups.com
On Tue, Jun 15, 2010 at 08:47:34AM +0200, sgheeren wrote:
> On 06/14/2010 09:27 PM, Jan Ploski wrote:
> > zfs-fuse just seems slow. Still, in my backup use case performance
> > doesn't matter so much: I'm incrementally backing up a 170 GB dataset
> > in 45 minutes using rsync. I'm also using zfs-fuse as a snapshot
> > factory for development MySQL databases, which in itself is a "killer
> > application" (but can also be achieved using just LVM).
> >
> Yeah, but LVM has _major_ pitfalls and performance drawbacks in using
> snapshots:
>
> 1. will not scale (can only have 1 snapshot, or need to duplicate
> allocation space for each snapshot)

IIRC you can't make a snapshot of a snapshot, but the sad thing is that
the underlying device mapper layer is able to do it.

> 2. will not scale (write performance degrades linearly with each snapshot)

That depends on which device you write to. If you write to the original
device, then yes, performance degrades, as each write there is going to
be translated into a read of the same spot first, then a write to the
snapshot. If you want good write performance, never write to the
original device. Sadly, LVM doesn't do that: when you create a
snapshot, the device in use stays the original one. Basically, LVM is
really bad at showing how powerful the device mapper is.

> 3. must have a snapshot volume of matching size (at least as much as
> used blocks in origin), see next

It's not a must.

> 4. no rollback/restore mechanism (if you accidentally think you're smart
> and rsync the snapshot back to the original, you will _by definition_
> run out of free blocks on the snapshot; this corrupts your snapshot
> (unrecoverable, not even by extending it) and your origin will be
> halfway through an rsync. Don't ask _who_ learned this the hard way).

Actually, there is one, now, at the device mapper level, though I don't
know if LVM uses it. google for snapshot-merge. The main difference with
zfs snapshots is that there is a merge phase with a lot of I/O, here.
Quite like what happens with VMware's VMDKs when consolidating snapshots.
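
For what it's worth, recent LVM2 releases appear to expose snapshot-merge through "lvconvert --merge". Whether a given distribution's LVM2 and kernel support it is an assumption to check first. A dry-run sketch with placeholder names (vg0, data_snap and the RUN=1 switch are invented for illustration):

```shell
#!/bin/sh
# Sketch: roll an origin LV back to one of its snapshots via snapshot-merge.
# Assumes an LVM2 and kernel recent enough to support merging.
# All names are hypothetical. Dry-run by default; set RUN=1 to execute.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# Merging copies the chunks held in the snapshot back into the origin,
# which is the I/O-heavy phase; the snapshot LV goes away once it is done.
rollback_to_snapshot() {
    vg=$1 snap=$2
    run lvconvert --merge "$vg/$snap"
}

rollback_to_snapshot vg0 data_snap
```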

Mike

sgheeren

Jun 15, 2010, 3:36:03 AM6/15/10
to zfs-...@googlegroups.com
On 06/15/2010 09:22 AM, Mike Hommey wrote:
> Actually, there is one, now, at the device mapper level, though I don't
> know if LVM uses it. google for snapshot-merge. The main difference with
> zfs snapshots is that there is a merge phase with a lot of I/O, here.
> Quite like what happens with VMware's VMDKs when consolidating snapshots.
>
Ah. I knew there had been work ongoing in that area for over a year. I
lost interest after solving my issues by not depending on LVM snapshots
for more than the backup window (which is <30 minutes for the systems I
still use them on). That way I won't run into that brick wall of running
out of allocation on the snapshot, which is really quite a destructive
failure mode...

Thanks for another useful tip, I just _might_ use that in the future.

Matt

Jun 3, 2010, 6:43:21 PM6/3/10
to zfs-fuse
Brian,

Very cool indeed! I would most certainly like to test this out and
report any issues that I might encounter. I am wondering: has anyone
tested iSCSI export of zvols?

Matt


Wenqiang Song

Jul 4, 2010, 11:50:21 PM7/4/10
to zfs-...@googlegroups.com
Hi, this is really cool. Thanks for the effort!

I played with it a little on my Xen guest, which runs 32-bit Debian Squeeze, and came up with several questions:
1. I can't create a volume bigger than 2G (error message: "volume size exceeds limit for this system").
2. From the source code, the zpool version is 18. But when I run "zfs get all pool" I get "version 4" in the output. Am I missing something here?
3. I can't find where to set up dedup. Is it not available yet?

Thanks
Wenqiang.

--
To post to this group, send email to zfs-...@googlegroups.com
To visit our Web site, click on http://zfs-fuse.net/



--
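Where there is a will, the deed is done: burn the boats and smash the pots, and the hundred-and-two passes of Qin will fall to Chu in the end.
Heaven does not fail those who persevere: sleep on brushwood and taste gall, and three thousand soldiers of Yue can swallow up Wu.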

Gavin Chappell

Jul 5, 2010, 2:56:34 AM7/5/10
to zfs-...@googlegroups.com
> 2. I found the zpool version is 18 from the source code. But when I run "zfs
> get all pool" I get a "version 4" from the output. Am I missing something
> here?

zpool version and zfs version are different things. "zfs get all" will refer to the filesystem on top of the physical pool, while "zpool get all" will refer to the physical pool.

gavin@XBMCLive:~$ sudo zpool get version seagate
NAME     PROPERTY  VALUE    SOURCE
seagate  version   23       default
gavin@XBMCLive:~$ sudo zfs get version seagate
NAME     PROPERTY  VALUE    SOURCE
seagate  version   4        -
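
To compare what a pool and its filesystem currently use against what the installed tools support, "zpool upgrade -v" and "zfs upgrade -v" list the supported versions. A small dry-run sketch; the pool name is the one from the output above, while the show_versions helper and the RUN=1 switch are invented for illustration:

```shell
#!/bin/sh
# Sketch: show current and supported on-disk versions for a pool.
# Dry-run by default: prints each command; set RUN=1 to execute for real.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

show_versions() {
    pool=$1
    run zpool get version "$pool"   # on-disk version of the pool
    run zfs get version "$pool"     # on-disk version of the top-level filesystem
    run zpool upgrade -v            # pool versions these tools support
    run zfs upgrade -v              # filesystem versions these tools support
}

show_versions seagate
```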

sgheeren

Jul 5, 2010, 3:39:43 AM7/5/10
to zfs-...@googlegroups.com
See also http://zfs-fuse.net/documentation/upgrading/upgrading-pool-versions
which links to the relevant Sun wiki pages.

Wenqiang Song

Jul 5, 2010, 9:19:46 PM7/5/10
to zfs-...@googlegroups.com
Thanks, got it now.
