"All I can say for the moment is... stay tuned." What the heck is this
I hope it's something that gets ZFS into the kernel. It looked to me
like Linus was being bribed with a Guinness.
Patrick Draper | si...@pdrap.org
Austin, Texas | http://www.pdrap.org
Don't Fear The Penguin. Be Microsoft Free - Use Linux.
Father Order runs at a good pace, but old Mother Chaos is winning the race.
The only way that can happen is if ZFS gets relicensed to GPLv2 since
the kernel has too many different authors to be relicensed. Either
that, or http://liquidat.wordpress.com/2006/08/30/new-driver-interface-for-linux-kernel/
is being extended to support filesystems, or the FUSE driver is
getting some serious attention. I hope ZFS is going GPLv2.
Some comments on that blog and others suggest that Linus may be going
to work for Sun, which wouldn't be totally out of the question. I
believe he worked for Transmeta for a while.
> The only way that can happen is if ZFS gets relicensed to GPLv2 since
> the kernel has too many different authors to be relicensed
Not to mention the dead copyright holders (assuming their code was GPLv2
only, and not GPLv2 and later).
Even if it does get relicensed it may be that the interest has already
passed to other projects. Chris Mason's btrfs has a lot of what ZFS had
going for it (including multiple checksums and multiple device support)
even though it's still at 0.15. It's not for production use yet as it's
not got real ENOSPC support and the disk format isn't fixed yet, but it's
looking good and it doesn't break the layering that some of the developers
have said would be a technical barrier to ZFS.
It even beats XFS for some of the Bonnie++ testing I've done with it:
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
To be fair, what does ZFS have that btrfs doesn't?
- built-in scrub
- snapshots accessible from .zfs/
- pretty generic meta attributes storage (for things like share=, etc.)
- NFS4 ACLs
- external log and L2ARC (but see point on ARC at the end of this email)
What do both of those filesystems have?
- device pooling
- replicated metadata and data
- checksums and returning only good data
- snapshot, clones
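The "checksums and returning only good data" point, combined with replication, can be sketched roughly as follows. This is a toy model, not code from either filesystem: the function names are made up for illustration, SHA-256 stands in for whatever checksum the filesystem actually uses, and replicas are plain in-memory byte strings rather than on-disk blocks. The one idea it relies on is that the expected checksum is kept apart from the data it covers.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Stand-in checksum; real filesystems use their own algorithms."""
    return hashlib.sha256(data).digest()


def read_block(replicas: list, expected: bytes) -> bytes:
    """Return a copy whose checksum matches `expected`.

    `replicas` is a list of byte strings, one per mirror (hypothetical
    in-memory model). Bad copies are overwritten with the good one, and
    an error is raised rather than ever returning unverified data.
    """
    good = None
    for data in replicas:
        if checksum(data) == expected:
            good = data
            break
    if good is None:
        raise IOError("all replicas failed checksum verification")

    # Self-heal: rewrite every replica that does not verify.
    for i, data in enumerate(replicas):
        if checksum(data) != expected:
            replicas[i] = good
    return good


if __name__ == "__main__":
    mirrors = [b"hellp", b"hello"]          # first copy is silently corrupted
    print(read_block(mirrors, checksum(b"hello")))
    print(mirrors)
```

A scrub is, roughly speaking, the same verification run over every allocated block instead of waiting for a read to stumble on the bad copy.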
What does btrfs have over ZFS?
- ability to remove devices from pool
ZFS also has a *very* nice set of userspace tools. They are orders of
magnitude more intuitive and easier to use than their btrfs counterparts.
But frankly, it's all userspace and scripts can be written to make
btrfs manageable like ZFS. Again, it's a matter of tools, not core
Also, ZFS implements its own I/O scheduler and cache layer (ARC),
which btrfs doesn't and won't. Linux provides such things as generic
infrastructure for all filesystems, with no need to reimplement them for
each one. So the I/O scheduler and ARC aren't "ZFS features"; they are
needed because of Solaris' weaknesses.
Then it should be pushed into the main kernel so a) it plays nicely
with other filesystems on the same block device and b) so other
filesystems (such as btrfs :) can make use of it as well.
Also, btrfs isn't remotely stable enough to use for anything serious
yet, but it looks very promising IMO.
I suspect licensing problems will prevent doing that, much in the same way
ZFS can't be pushed into the kernel.
> Also, btrfs isn't remotely stable enough to use for anything serious
> yet, but it looks very promising IMO.
I think I'll stick with ZFS for now.
I was intending to refer to the Solaris kernel in this case - ZFS
should just use whatever scheduler and cache the kernel has available,
and if it's not particularly good, well, fix the kernel.
>> Also, btrfs isn't remotely stable enough to use for anything serious
>> yet, but it looks very promising IMO.
> I think I'll stick with ZFS for now.
That's probably a good bet. Give it a few years to mature.
> BTRFS has a pitiful subset of ZFS features.
But it'll be a GPL'd, in-kernel filesystem, and that gives it big
advantages in terms of development speed. It's also still pre-alpha!
Meanwhile ZFS-FUSE development seems pretty much moribund, and I'm still
getting occasional crashes with rsync for backups. :-(
> It is also five or six years behind in implementing its ideas.
That's a bit hard, it's barely a year since the btrfs announcement!
> I have not gotten any crashes at all, and I've been running a 400 GB
> disk with it for the last week or so.
Usually it's just the ZFS-FUSE daemon that dies, but this morning
when I stopped ZFS after the nightly rsync/snapshot run it locked
up one of the cores on the box and I had to reset it via the front
panel (not even alt-sysrq worked).
Jun 21 11:01:17 quad kernel: [95777.184192] PGD 131d59067 PUD 225d42067 PMD 0
Jun 21 11:01:17 quad kernel: [95777.184199] CPU 3
Jun 21 11:01:17 quad kernel: [95777.184201] Modules linked in: zc0301 tun binfmt_misc snd_rtctimer af_packet rfcomm l2cap bluetooth i915 drm ppdev ipv6 acpi_cpufreq cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative video output sbs sbshc container battery iptable_filter ip_tables x_tables ext3 jbd mbcache dm_crypt crypto_blkcipher ac sbp2 lp loop snd_usb_audio snd_pcm_oss snd_mixer_oss snd_hda_intel saa7134_alsa snd_pcm snd_seq_dummy snd_usb_lib snd_hwdep snd_seq_oss saa7134 snd_seq_midi snd_rawmidi compat_ioctl32 snd_seq_midi_event videodev v4l1_compat snd_seq v4l2_common videobuf_dma_sg videobuf_core ir_kbd_i2c ir_common snd_timer iTCO_wdt snd_seq_device tveeprom parport_pc parport pcspkr iTCO_vendor_support snd i2c_core snd_page_alloc soundcore shpchp pci_hotplug button intel_agp evdev xfs sg sr_mod cdrom sd_mod pata_jmicron usbhid hid pata_acpi ahci ata_generic ohci1394 ieee1394 libata scsi_mod dock r8169 ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_snapshot dm_mod thermal processor fan fuse
Jun 21 11:01:17 quad kernel: [95777.184277] Pid: 27558, comm: umount Not tainted 18.104.22.168-cs1 #1
Jun 21 11:01:17 quad kernel: [95777.184280] RIP: 0010:[__slab_free+0x195/0x320] [__slab_free+0x195/0x320]
Jun 21 11:01:17 quad kernel: [95777.184284] RSP: 0018:ffff8101004f9ca8 EFLAGS: 00010046
Jun 21 11:01:17 quad kernel: [95777.184287] RAX: 0000000000200200 RBX: ffff81022ff02a18 RCX: ffffe200017ab318
Jun 21 11:01:17 quad kernel: [95777.184289] RDX: 0000000000100100 RSI: ffffe200017ab2f0 RDI: ffff81022ff02a18
Jun 21 11:01:17 quad kernel: [95777.184291] RBP: ffff8101004f9cd8 R08: 000000000000004e R09: 8000000000000000
Jun 21 11:01:17 quad kernel: [95777.184294] R10: 0000000000000002 R11: 000000000000056a R12: ffffe200017ab2f0
Jun 21 11:01:17 quad kernel: [95777.184296] R13: ffff81022ff02a00 R14: 000000000000004e R15: ffffffff88006ccc
Jun 21 11:01:17 quad kernel: [95777.184300] FS: 00007fa7fd3de6e0(0000) GS:ffff81022fc0e880(0000)
Jun 21 11:01:17 quad kernel: [95777.184303] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 21 11:01:17 quad kernel: [95777.184305] CR2: 0000000000100108 CR3: 000000022eded000 CR4: 00000000000006e0
Jun 21 11:01:17 quad kernel: [95777.184307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 21 11:01:17 quad kernel: [95777.184310] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 21 11:01:17 quad kernel: [95777.184312] Process umount (pid: 27558, threadinfo ffff8101004f8000, task ffff8101004f0000)
Jun 21 11:01:17 quad kernel: [95777.184315] Stack: 0000000000000001 0000000000000287 ffff81006c332000
Jun 21 11:01:17 quad kernel: [95777.184321] 000000000007ab84 ffff8101b8d69200 ffff8101004f9d08 ffffffff8029ea61
Jun 21 11:01:17 quad kernel: [95777.184325] ffff8101004f9cf8 ffff81006c332000 ffff81006c332000 ffff81000b413540
Jun 21 11:01:17 quad kernel: [95777.184330] Call Trace:
Jun 21 11:01:17 quad kernel: [95777.184340] [ext3:kmem_cache_free+0x81/0x4d0] kmem_cache_free+0x81/0xb0
Jun 21 11:01:17 quad kernel: [95777.184354] [fuse:fuse_destroy_inode+0x3c/0x50] :fuse:fuse_destroy_inode+0x3c/0x50
Jun 21 11:01:17 quad kernel: [95777.184361] [destroy_inode+0x36/0x60] destroy_inode+0x36/0x60
Jun 21 11:01:17 quad kernel: [95777.184366] [fuse:generic_delete_inode+0x100/0x440] generic_delete_inode+0x100/0x140
Jun 21 11:01:17 quad kernel: [95777.184373] [ext3:iput+0x78/0x490] iput+0x78/0x90
Jun 21 11:01:17 quad kernel: [95777.184379] [shrink_dcache_for_umount_subtree+0x12f/0x250]
Jun 21 11:01:17 quad kernel: [95777.184385] [__down_read_trylock+0x45/0x60] ? __down_read_trylock+0x45/0x60
Jun 21 11:01:17 quad kernel: [95777.184393] [shrink_dcache_for_umount+0x31/0x60] shrink_dcache_for_umount+0x31/0x60
Jun 21 11:01:17 quad kernel: [95777.184400] [generic_shutdown_super+0x1a/0x110] generic_shutdown_super+0x1a/0x110
Jun 21 11:01:17 quad kernel: [95777.184406] [fuse:kill_anon_super+0x11/0x40] kill_anon_super+0x11/0x40
Jun 21 11:01:17 quad kernel: [95777.184412] [deactivate_super+0x71/0x90] deactivate_super+0x71/0x90
Jun 21 11:01:17 quad kernel: [95777.184418] [fuse:mntput_no_expire+0x55/0x4920] mntput_no_expire+0x55/0x90
Jun 21 11:01:17 quad kernel: [95777.184424] [sys_umount+0x6d/0x3c0] sys_umount+0x6d/0x3c0
Jun 21 11:01:17 quad kernel: [95777.184433] [__up_read+0x46/0xb0] ? __up_read+0x46/0xb0
Jun 21 11:01:17 quad kernel: [95777.184442] [sys_newstat+0x27/0x50] ? sys_newstat+0x27/0x50
Jun 21 11:01:17 quad kernel: [95777.184459] [system_call_after_swapgs+0x7b/0x80] system_call_after_swapgs+0x7b/0x80
Jun 21 11:01:17 quad kernel: [95777.184472]
Jun 21 11:01:17 quad kernel: [95777.184473]
Jun 21 11:01:17 quad kernel: [95777.184485] RSP <ffff8101004f9ca8>
Jun 21 11:01:17 quad kernel: [95777.184485] ---[ end trace 63bd37445af828b0 ]---
Could be a FUSE or kernel issue there - this is 22.214.171.124.
I'm using 2.6.26-rc6, so no problems here. Maybe a heisenbug?
Miklos Szeredi (FUSE) already debugged it (he pretty quickly debugs all
reported issues). It was a SLUB corruption which could have been caused
basically by anything in the kernel, including the recently added SLUB
The only sure thing at the moment is that people are not reporting this
oops with stable kernels, so it could have been hardware, -rc kernel
specific or recently introduced anywhere in the development kernels,
triggered by an unmount.
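For what it's worth, the register dump in the oops above is at least consistent with the corruption theory. RDX and RAX hold 0x100100 and 0x200200, which are the LIST_POISON1 and LIST_POISON2 values that list_del() writes into a removed list entry in kernels of that era, and the faulting address in CR2 is LIST_POISON1 + 8, the offset of `prev` in a 64-bit struct list_head. That pattern usually points to a list entry being deleted twice or touched after deletion; it does not say which code did it. A minimal sketch for spotting it in a pasted oops, assuming the poison constants from include/linux/poison.h in 2.6.2x kernels (the function name and regex are illustrative, not taken from any kernel tool):

```python
import re

# Assumed values from include/linux/poison.h in 2.6.2x kernels; newer
# kernels may add an architecture-specific offset to these.
LIST_POISON1 = 0x00100100  # stored in list_head.next by list_del()
LIST_POISON2 = 0x00200200  # stored in list_head.prev by list_del()

# Matches "RAX: 0000000000200200"-style fields plus CR2 (the fault address).
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}|CR2):\s*([0-9a-fA-F]{16})\b")


def find_list_poison(oops_text, slack=16):
    """Map register name -> (value, poison name, offset from poison).

    A register is reported when its value lies in the `slack` bytes
    starting at one of the list poison constants.
    """
    hits = {}
    for reg, hexval in REG_RE.findall(oops_text):
        value = int(hexval, 16)
        for name, poison in (("LIST_POISON1", LIST_POISON1),
                             ("LIST_POISON2", LIST_POISON2)):
            if 0 <= value - poison < slack:
                hits[reg] = (value, name, value - poison)
    return hits


# Three lines copied from the oops in this thread (syslog prefix dropped).
SAMPLE = """\
RAX: 0000000000200200 RBX: ffff81022ff02a18 RCX: ffffe200017ab318
RDX: 0000000000100100 RSI: ffffe200017ab2f0 RDI: ffff81022ff02a18
CR2: 0000000000100108 CR3: 000000022eded000 CR4: 00000000000006e0
"""

if __name__ == "__main__":
    for reg, (value, name, off) in sorted(find_list_poison(SAMPLE).items()):
        print(f"{reg} = {value:#x} ({name} + {off})")
```

This only flags the symptom. Finding the culprit still needs the slab debugging options mentioned in this thread.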
Chris couldn't reproduce the oops (yet) with newer kernels. With the extra
kernel config options the root cause will be trivially found if it still
exists.
> Chris couldn't reproduce the oops (yet) with newer kernels. With the
> extra kernel config options the root cause will be trivially found if it
> still exists.
It also only happened once with that 126.96.36.199 kernel. That was a stable
kernel too, not an RC.
I'm now running 188.8.131.52 as I was getting too many issues with Intel
graphics with the 2.6.26-rc series (though rc9 might fix that).