You can't really make this thread-safe by putting a "biglock" around every call, because then things like this will happen: VNOP_LOOKUP takes the biglock and calls vnode_put, vnode_put calls VNOP_FSYNC, and fuse4x deadlocks because the thread has already taken the biglock. It may work if this lock is made recursive, which would allow this thread to continue. That may break somewhere else; I didn't spend too much time on it. The "right" thing to do is to take the biglock out completely. Multithreaded FUSE filesystems should work just fine with multithreaded VNOPs, but of course running the single-threaded session loop leads us back to the same problem, regardless of biglock... :-/
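The re-entrancy problem described above can be reproduced in userspace. The following is a minimal Python sketch, not fuse4x code: `lookup` and `fsync` are hypothetical stand-ins for the biglock wrappers, and a timeout replaces the real hang so that the script terminates.

```python
import threading


def fsync(lock, timeout=0.2):
    """Stand-in for the fsync wrapper: tries to take the same big lock again."""
    acquired = lock.acquire(timeout=timeout)
    if acquired:
        lock.release()
    return acquired


def lookup(lock):
    """Stand-in for the lookup wrapper: holds the big lock, then re-enters via fsync."""
    with lock:
        # Mirrors lookup -> vnode_put -> fsync all happening on the same thread.
        return fsync(lock)


if __name__ == "__main__":
    # A plain mutex cannot be re-acquired by its holder: the inner acquire times out.
    print("plain lock re-entered:", lookup(threading.Lock()))
    # A recursive lock lets the owning thread continue.
    print("recursive lock re-entered:", lookup(threading.RLock()))
```

In the kernel there is no timeout, so the plain-lock case simply blocks forever, which is what the backtrace in this message shows.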
Running under Lion/64.
#0 0xffffff80002c0939 in machine_switch_context (old=0xffffff8000691f5a, continuation=0, new=0xffffff8020dbcdc0) at pcb.c:526
#1 0xffffff800022f11d in thread_invoke (self=0xffffff802006eb80, thread=0xffffff8020dbcdc0, reason=1690875392) at sched_prim.c:2146
#2 0xffffff800022f39b in thread_block_reason (continuation=0xffffff8020dbcdc0, parameter=0xffffff802006ebb0, reason=0) at sched_prim.c:2398
#3 0xffffff80002b770f in lck_mtx_lock_wait_x86 (mutex=0xffffff8030b2ae80) at locks_i386.c:2083
#4 0xffffff80002b39fd in lck_mtx_lock () at cpu_data.h:402
#5 0xffffff7f8079c1d3 in fuse_biglock_vnop_fsync (ap=0xffffff8164c8b740) at fuse_biglock_vnops.c:156
#6 0xffffff8000316adf in VNOP_FSYNC (vp=0xffffff8164c8b740, waitfor=-2113298603, ctx=0x20dbcdf000000000) at kpi_vfs.c:4168
#7 0xffffff80002fd1db in vclean (vp=0xffffff80316118b8, flags=4) at vfs_subr.c:2066
#8 0xffffff80002fceae in vgone [inlined] () at /net/sgilardi-dev/SourceCache/xnu/xnu-1699.24.8/bsd/vfs/vfs_subr.c:2255
#9 0xffffff80002fceae in vnode_reclaim_internal (vp=0xffffff80316118b8, locked=1690875888, reuse=1690875888, flags=0) at vfs_subr.c:4137
#10 0xffffff80002fd4ee in vnode_put_locked (vp=0xffffff80316118b8) at vfs_subr.c:3906
#11 0xffffff80002fd54c in vnode_put (vp=Cannot access memory at address 0x0) at vfs_subr.c:3861
#12 0xffffff7f807a7dce in FSNodeGetOrCreateFileVNodeByID (vnPtr=0xffffff8164c8bb38, flags=0, feo=0xffffff803163d600, mp=0xffffff801eb643d0, dvp=0xffffff80315d40f8, context=0xffffff801dd95578, oflags=0x0) at fuse_node.c:218
#13 0xffffff7f807a7ee8 in fuse_vget_i (vpp=0xffffff8164c8bb38, flags=0, feo=0xffffff803163d600, cnp=0xffffff8164c8bee8, dvp=0xffffff80315d40f8, mp=0xffffff801eb643d0, context=0xffffff801dd95578) at fuse_node.c:254
#14 0xffffff7f807b0791 in fuse_vnop_lookup (ap=0xffffff8164c8bc38) at fuse_vnops.c:1466
#15 0xffffff7f8079c85f in fuse_biglock_vnop_lookup (ap=0xffffff8164c8bc38) at fuse_biglock_vnops.c:297
#16 0xffffff80003169a4 in VNOP_LOOKUP (dvp=0xffffff8164c8bc38, vpp=0x64c8bb3800000000, cnp=0xffffff803163d600, ctx=0xffffff8164c8bee8) at kpi_vfs.c:3039
#17 0xffffff80002f494a in lookup (ndp=0xffffff80315d40f8) at vfs_lookup.c:1015
#18 0xffffff80002f37bc in namei (ndp=0xffffff8164c8bd98) at vfs_lookup.c:352
#19 0xffffff80002e3abe in getattrlist (p=0xffffff801fda8100, uap=0xffffff801dc38f54, retval=0x0) at vfs_attrlist.c:2042
#20 0xffffff80005caa9b in unix_syscall64 (state=0xffffff8164c8bfb0) at systemcalls.c:379
Hi
On Mon, Dec 19, 2011 at 4:19 PM, Debabrata Banerjee <dbav...@gmail.com> wrote:
> You can't really make this thread-safe by putting a "biglock" around every call, because then things like this will happen: VNOP_LOOKUP takes the biglock and calls vnode_put, vnode_put calls VNOP_FSYNC, and fuse4x deadlocks because the thread has already taken the biglock. It may work if this lock is made recursive, which would allow this thread to continue. That may break somewhere else; I didn't spend too much time on it. The "right" thing to do is to take the biglock out completely. Multithreaded FUSE filesystems should work just fine with multithreaded VNOPs,
Fully agree! Replacing the "biglock" in the 64-bit kernel with a cleaner solution is one of the long-term goals of the fuse4x project (the current 64-bit code is inherited from the macfuse project). One of the ideas is to replace the biglock with the lock-per-vnode solution that is used in the 32-bit kernel (provided by the kernel if you use the VFS_TBLTHREADSAFE flag).
Hi
Finder drag-and-drop copy of a zip file to a fuse4x volume backed by a fuse_lowlevel implementation, with the x86_64 kernel on 10.7.2. This is not the only time it hangs, although I haven't bothered to debug the other hangs. I'm running the kernel in i386 mode for now so that I can use fuse4x.
From experience converting filesystems to use the VFS_TBLTHREADSAFE flag, the fewer locks the better. Many things that you would think need to be locked actually don't: there will just be a winner and a loser in a race, which is OK. VNOP calls coming from the kernel do have certain guarantees. For example, a vnode will not be freed while the kernel has other VNOPs pending on it, so it's not necessary to lock the vnode.
Let me see if I can describe this correctly -
Vnode locking was removed from the kernel a while ago, because it's no longer necessary for anyone using the exported KPIs to lock any vnodes. There is a lifetime guarantee from the time you allocate a vnode to the time it finally gets a VNOP_RECLAIM, and in particular inside VNOPs. So it's not necessary to do any locking for the benefit of the kernel. This relies on using vnode_get/vnode_put properly: when the reference count on a vnode drops to zero the kernel gives us a VNOP_INACTIVE, and "may" give us a VNOP_RECLAIM. What should be locked is any data structure internal to "your" filesystem. There is an exception for iterating over all the vnodes you "own". It is not correct to do this inside your implementation; instead, you call vnode_iterate, which hands you back one vnode at a time in a callback with the same lifetime guarantees as above. So everything here should be happy.
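The get/put pairing described here can be pictured with a toy model. This is only a Python sketch of the rule as stated in this message, not the xnu implementation (the real kernel tracks more than one count per vnode); `ToyVnode` and `on_inactive` are hypothetical names, with `on_inactive` standing in for VNOP_INACTIVE.

```python
class ToyVnode:
    """Toy model of balanced get/put calls; not how xnu actually stores counts."""

    def __init__(self, on_inactive):
        self.refs = 0
        self.on_inactive = on_inactive  # hypothetical stand-in for VNOP_INACTIVE

    def get(self):
        # Every get must later be matched by exactly one put.
        self.refs += 1

    def put(self):
        if self.refs == 0:
            raise RuntimeError("put without a matching get")
        self.refs -= 1
        if self.refs == 0:
            # Last reference dropped: the filesystem is told the vnode is inactive.
            self.on_inactive(self)


if __name__ == "__main__":
    vn = ToyVnode(lambda v: print("inactive callback fired"))
    vn.get()
    vn.get()
    vn.put()  # one reference still held, nothing happens
    vn.put()  # count reaches zero, the callback runs
```

The point of the model is that an unbalanced put is a bug in the caller, not something a lock would fix.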
So now FUSE: what needs to be figured out is how thread-safe different FUSE implementations are. In addition, what happens if a FUSE filesystem only works because the single-threaded session loop is the only one allowed? I'm not sure of the answer to this. One thing I can think of is to keep the biglock, but make it a recursive lock.
Recursive locks also have the advantage of storing their owner thread, which makes debugging much easier :) This kind of sounds wrong to me, however. Perhaps thread safety should be taken care of in the userspace lib, not in the kernel. Keep in mind again that this is thread safety internal to a FUSE filesystem implementation, not to the kernel. This may already be the case, as FUSE does not go through a funnel under Linux. (The Darwin kernel funnel is the origin of the VFS_TBLTHREADSAFE flag, and Apple has been trying to remove the funnel, which is why this flag must be used in K64: there is no funnel there, at least not around vnodes.)
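To illustrate why a stored owner helps with debugging, here is a small Python sketch. It is a userspace analogy only: `OwnerTrackingLock` and `who_blocks_me` are hypothetical names, not part of fuse4x or of any kernel API.

```python
import threading


class OwnerTrackingLock:
    """Non-recursive lock that remembers which thread holds it (debugging aid)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None  # name of the holding thread, or None when free

    def acquire(self, timeout=-1):
        got = self._lock.acquire(timeout=timeout)
        if got:
            self.owner = threading.current_thread().name
        return got

    def release(self):
        self.owner = None
        self._lock.release()


def who_blocks_me(lock, timeout=0.2):
    """Return the recorded owner if the lock cannot be taken within the timeout."""
    if lock.acquire(timeout=timeout):
        lock.release()
        return None
    return lock.owner


if __name__ == "__main__":
    biglock = OwnerTrackingLock()
    biglock.acquire()
    # The same thread asks again: the report names the thread that already holds it.
    print("blocked by:", who_blocks_me(biglock))
```

With the owner recorded, a hang like the one in the backtrace can be diagnosed as "held by the very thread that is waiting" without walking the whole call stack.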
I just pushed a huge biglock refactoring to github and hopefully it
will resolve all those deadlock issues once and for all.
When you have a chance, please pull the changes from
https://github.com/anatol/kext, test them, and let me know if you see any
issues. I suspect that this refactoring might have regressions that
lead to race conditions in the kext. All those race conditions
have to be identified and fixed.
Here is an update. Things look extremely good. Biglock refactoring is
over and all regressions that I was able to find were fixed.
It is official now - the next version of fuse4x (0.10.0) will contain
the simple-lock feature. It greatly improves fuse4x scalability and
gets rid of the annoying kernel deadlocks.
You can test fuse4x with the simple-lock feature by downloading the binary
package from
http://dl.dropbox.com/u/3842605/Fuse4X-0.10.0-beta.dmg
or by building the sources from github.com/fuse4x/
Thanks a lot to everyone who helped to make this refactoring happen,
and thanks to everyone who helped to test it.
On Tue, Dec 27, 2011 at 11:44 AM, Debabrata Banerjee <dbav...@gmail.com> wrote:
> Well it doesn't deadlock anymore where it used to. Still messing around with
> it.. My guess is that it will probably be just fine. My filesystem is
> designed to be threadsafe/parallel anyway.
>
> I can fix a bunch of stuff here:
> -shouldn't be a problem to get it to unload/install/load the kext without a
> reboot
> -MAX_UPL_TRANSFER is really an old limit. I worked with Apple to get this
> increased in the kernel, MAX_UPL_SIZE is the real value - allows for 32MB
> IO's. It has huge performance benefits. I don't know what MAX_UPL_TRANSFER
> is still doing floating around the kernel... I would make the default IO
> size 32MB and not 128KB also.
The current "default io size" is 64K, which is really tiny by modern
standards. This small iosize becomes a bottleneck while copying large
files over sshfs on a local network. The situation is even worse for
filesystems that work with local disks (ntfs, ext2, ...).
I agree that we should change the default io size from 64K to something
bigger. What about 256K or 512K? You said that the default should be 32MB;
is there any drawback to using such big buffers?
Thanks for the info. Please review this change:
https://github.com/anatol/kext/commit/8e3524f09e2adad2a0cb3992c25b967e3e1c0b68
It sets a 32M/16M io size depending on the SDK version we compile for.
While I am here, I would like to ask whether we need to use the PAGE_SIZE
constant to express the io/userbuffer size. Is PAGE_SIZE going to
stay constant for x86? Or maybe we should use constants like (32 * 1024
* 1024)?
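For reference, the two ways of writing the size agree as long as the page size is 4096 bytes. The sketch below is plain arithmetic in Python; the `PAGE_SIZE` value is an assumption for x86, and the helper names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumed x86 page size in bytes; the kernel headers define the real value


def pages_to_bytes(pages, page_size=PAGE_SIZE):
    """Size in bytes of a buffer expressed as a page count."""
    return pages * page_size


def bytes_to_pages(nbytes, page_size=PAGE_SIZE):
    """Number of whole pages needed to hold nbytes (rounds up)."""
    return -(-nbytes // page_size)


if __name__ == "__main__":
    literal = 32 * 1024 * 1024
    print(bytes_to_pages(literal), "pages for", literal, "bytes")
    print(pages_to_bytes(bytes_to_pages(literal)) == literal)
```

Expressing the size in pages keeps the buffer page-aligned if the page size ever differs, while the literal constant fixes the byte count instead.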
Using PAGE_SIZE is correct, but IMHO its value will never change.
-Deb
Hi
> Thanks for the info. Please review this change
> https://github.com/anatol/kext/commit/8e3524f09e2adad2a0cb3992c25b967e3e1c0b68
> It sets 32M/16M io size depending on sdk version we compile for.
I tested the 32M default io size, and the only thing I do not like is the Finder response time when copying to a slow filesystem (e.g. sshfs over WAN). If you decide to stop copying, you have to wait until the whole block write has finished. Writing 32M over a WAN might take quite a lot of time. The progress bar update is also not very smooth. I think the default value should be lowered to something like 1M/512K, or maybe even to 256K.
That said, decreasing the iosize does not reduce the "Cancel" response time. Hm... Even after I click Cancel in Finder, I see that the kernel still sends WRITE requests to fuse (about a dozen of them). Whatever the iosize parameter value is, the first update that the copy progress bar gets is at 8M. Is there any way to improve the fidelity of the copy progress bar?
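A rough feel for the per-write cost on a slow link can be had from simple division. This Python sketch assumes a hypothetical 1 MiB/s uplink; the function name and the link speed are illustrative, not measured values from this thread.

```python
def seconds_per_write(io_size_bytes, link_bytes_per_sec):
    """Lower bound on how long one write of io_size_bytes occupies the link."""
    return io_size_bytes / link_bytes_per_sec


if __name__ == "__main__":
    link = 1 * 1024 * 1024  # hypothetical 1 MiB/s WAN uplink
    sizes = [("32M", 32 << 20), ("1M", 1 << 20), ("512K", 512 << 10), ("256K", 256 << 10)]
    for label, size in sizes:
        print(f"{label:>4}: {seconds_per_write(size, link):.2f} s per write")
```

This only bounds a single write. As noted above, the kernel keeps sending WRITE requests after Cancel, so the observed delay also depends on how many requests are already queued.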
Hi,
I'm developing a filesystem based on FUSE and have run into a problem on Mac OS.
I need to obtain the PID of the parent of the process that accesses the FUSE volume. I use the sysctl function with the KERN_PROC and KERN_PROC_PID parameters to obtain the PPID. It works fine until an executable file is started from the FUSE volume; after that, the sysctl call causes a deadlock inside FUSE. I've tested it with all Mac FUSE implementations, with the same results. To my surprise, this beta partially resolves the problem. Under OS versions 10.6 and 10.8 sysctl no longer causes a deadlock, but unfortunately under 10.7 it still does. Are there newer binaries, or do I have to build it manually? How can I help to improve the project and resolve my problem?