Custom filesystem: troubleshooting high-level apps and a weird scenario


Cristian Marius

Jun 9, 2024, 9:44:26 PM
to macFUSE
Hello,

I am using hanwen/go-fuse and implemented my own filesystem.

I have noticed that high-level apps call the rename method with the RENAME_SWAP flag on every file edit and save. Because I found no support in Go for the renamex_np/renameat2 syscalls, I implemented my own rename-with-swap logic, which only supports swapping two files.
If the rename with swap is between a directory and a file, or between two directories, I return EINVAL.

This seems to work, and I can freely edit text documents, pdfs, .numbers files.

The problem is the following: after I finish such an edit and completely exit the high-level app, running open myFile.txt gives the following error:

```
The application cannot be opened for an unexpected reason, error=Error Domain=NSOSStatusErrorDomain Code=-43 "fnfErr: File not found" UserInfo={_LSLine=4129, _LSFunction=_LSOpenStuffCallLocal}
```
After this happens, I observed the following:
If I first open TextEdit and then open the file from within TextEdit, it works as expected.
If I do a regular rename of myFile.txt to myFile1.txt, open myFile1.txt works as expected until I edit the file again and exit the app.

I tried to trace the problem with dtruss, but the whole system freezes when I run dtruss against my filesystem in the scenario above, and after a restart the dtruss output file is 0 bytes.

Does anybody have any insight about what could trigger the above error?

Cristian Marius

Jun 9, 2024, 10:02:33 PM
to macFUSE
I am wondering: when I perform my swap rename, do I need to do anything to the .DS_Store file? (I do not have AppleDouble files because I use extended attributes, and my rename with swap also swaps the extended attributes.)

Cristian Marius

Jun 10, 2024, 1:16:14 AM
to macFUSE
I managed to debug a bit and finally trace the open (1) call.

It seems that before the failure it tries to get the attributes of the file, which succeeds, but then it also tries to get the attributes of the directory.sb--6c28314c-zqOUh7/myFile.txt, which returns ENOENT.

From the logs of my app I can see that unlink (rmdir) was called on the .sb directory when I closed TextEdit.

What puzzles me is why it would try to get the attributes of a file that was removed by a system call the app itself triggered on close, and then fail so disastrously on ENOENT.

Cristian Marius

Jun 10, 2024, 8:27:28 PM
to macFUSE
I have cross-posted the issue on the Apple Developer Forums, where it can be followed: https://developer.apple.com/forums/thread/756553

Perry Smith

Jun 11, 2024, 8:38:10 AM
to osxfus...@googlegroups.com
Just a guess… but a file (including a directory) is not actually removed until its use count goes to zero.  The child is still using the directory (by definition).  So the directory of the child must exist.

On Jun 10, 2024, at 12:16 AM, Cristian Marius <cristian.ma...@gmail.com> wrote:

I managed to debug a bit and finally trace the open (1) call.

Cristian Marius

Jun 11, 2024, 10:49:08 PM
to macFUSE
Thank you for your input, it was a good guess. I had a bug in my unlink (rmdir) that allowed the node to be removed even if its children had open file descriptors. I have fixed it and now return syscall.EBUSY in that case. However, nothing changed; I think the unlink is called when the folder is no longer in use (when TextEdit exits). I also got a reply on the Apple forum that I am investigating now.

sergey0

Jun 11, 2024, 11:01:28 PM
to macFUSE
I can't think of any scenario where you'd need to modify .DS_Store files directly. That's not your job as a virtual file system developer. If you handle all callbacks correctly, macOS itself will make sure your file system's .DS_Store files are updated as needed.

Here is something you may want to look into instead: on macOS 13/14 Apple changed the way files are written. Make sure your file system's preallocate and truncate callbacks are doing the right thing. Preallocate didn't matter much on older OSes, but now I'm seeing it being called a lot. As always, use LoopbackFS as a starting point.

Cristian Marius

Jun 11, 2024, 11:05:28 PM
to macFUSE
Thank you for the input. I don't support preallocate; I simply return syscall.ENOTSUP. For truncate I only support reducing the file size and return syscall.EINVAL on an attempt to expand the file.
I will try to experiment with preallocate.

Cristian Marius

Jun 11, 2024, 11:20:28 PM
to macFUSE
I didn't get any calls to preallocate in the above scenario, nor to truncate.

sergey0

Jun 12, 2024, 12:51:04 AM
to macFUSE
If you're setting breakpoints, that might not do it, because debugging itself interferes with what you're trying to look at. I'd try making the file system as simple as possible and add logging to all callbacks. Dumb as it is, it helped me with many FUSE puzzles. 

Cristian Marius

Jun 12, 2024, 1:30:39 AM
to macFUSE
Yes, I don't use breakpoints, just logging in the callbacks and tracing of the high-level program (which in this case sometimes freezes the OS).
I am still investigating. Inspecting my VFT shows that rename with swap behaves as expected. At this point I am confident that I am ignoring some flag in some callback of my implementation, and that this has a catastrophic effect.

sergey0

Jun 12, 2024, 3:40:16 AM
to macFUSE
Perhaps you can narrow it down by trying the same steps on LoopbackFS with logging, and there is also that 'debug' mount option. If you record callbacks and metadata they might hint at what breaks or is missing. 

Another simple thing to try: turn off caching (mount options novncache/nolocalcaches/noubc). Depending on what kind of physical file system is used on your volume, options like noappledouble might help too.

Cristian Marius

Jun 12, 2024, 4:22:50 AM
to macFUSE
Thanks for the suggestion. I have tried novncache/nolocalcaches/noubc/noappledouble, and it seems to make no difference.

Using the loopback example for macFUSE I cannot reproduce the issue. I have also used the loopback example for cgofuse (the Go FUSE library from the WinFsp project) and couldn't reproduce the issue there either.

The underlying filesystem is just a regular folder in the user's home directory. I don't keep file descriptors open to the underlying files; my FileNodes/DirectoryNodes keep track of the underlying path and perform the operations on it. It's a loopback FS with some encryption added on the underlying folder and some file metadata stored in a local database. I am not using the macFUSE demos directly. I use the hanwen/go-fuse library for Linux/macOS and cgofuse for Windows; the underlying logic is the same for all platforms, with only some high-level calls that interpret flags based on the platform, and some conversion on Windows. So far the logic is solid on Linux (tested with different programs and file integrity checkers), and I found no issues on Windows either. Only macOS programs seem to rely on extended attributes and on rename with the swap flag.

From my debugging, this is what happens during a rename swap. Before the swap, my inodes look like this:
ParentDirectoryNode = root;
ParentDirectoryNode.Child = FileNode(myFile.txt); // Target, open in TextEdit
DirectoryNode = myFile.txt-sb.123.. // Folder created by TextEdit
DirectoryNode.Child = FileNode(myFile.txt) // The file with the edits from TextEdit

The RenameSwap is initiated on DirectoryNode with: oldName myFile.txt, targetDir ParentDirectoryNode, newName myFile.txt, flag RENAME_SWAP.

I lock the DirectoryNode, ParentNode and ParentDirectoryNode.Child.
I try to atomically swap the underlying file of DirectoryNode.Child with the underlying file of ParentDirectoryNode.Child. There is no renamex_np call, just rename with retries and rollbacks.
I also swap some metadata in the local database (the swapped metadata is just information on how to encrypt/decrypt the files).
The inodes are modified as follows:
ParentDirectoryNode.Child swaps its size attribute and its encryption metadata with DirectoryNode.Child; I preserve the inode IDs and the pointers.
I release the locks.
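
The in-memory part of these steps can be sketched roughly as follows, assuming each node carries its own mutex and a unique inode number. The `fileNode` type and `swapPayload` are hypothetical simplifications: only the two file nodes are locked here, and `encMeta` stands in for the encryption metadata kept in the local database.

```go
package main

import (
	"fmt"
	"sync"
)

// fileNode is a hypothetical in-memory inode.
type fileNode struct {
	mu      sync.Mutex
	ino     uint64 // identity known to the kernel; never swapped
	size    int64
	encMeta string // stand-in for per-file encryption metadata
}

// swapPayload exchanges size and metadata while leaving ino untouched.
// Locks are taken in ascending ino order (inode numbers are assumed unique)
// so that concurrent swap(a, b) and swap(b, a) calls cannot deadlock.
func swapPayload(a, b *fileNode) {
	if a == b {
		return
	}
	first, second := a, b
	if first.ino > second.ino {
		first, second = second, first
	}
	first.mu.Lock()
	defer first.mu.Unlock()
	second.mu.Lock()
	defer second.mu.Unlock()

	a.size, b.size = b.size, a.size
	a.encMeta, b.encMeta = b.encMeta, a.encMeta
}

func main() {
	target := &fileNode{ino: 10, size: 100, encMeta: "key-old"}
	edited := &fileNode{ino: 11, size: 250, encMeta: "key-new"}

	swapPayload(edited, target)

	fmt.Printf("target: ino=%d size=%d meta=%s\n", target.ino, target.size, target.encMeta)
	fmt.Printf("edited: ino=%d size=%d meta=%s\n", edited.ino, edited.size, edited.encMeta)
}
```

Keeping the inode number fixed means the kernel keeps addressing the same node for root/myFile.txt before and after the save, which matches the behavior described in the steps above.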

On every save in TextEdit the swap happens. I can perform multiple saves, and I can open the file from another program, or just cat ParentDirectoryNode.Child (the original file being edited in TextEdit) and list its contents without a problem.

The problem happens after I completely exit TextEdit. I can still cat the root/myFile.txt or edit it using nano or other programs; but open (1) on it fails.

I managed to trace the open (1) call, and the weird thing to me is this: getattrlist (2) is called on root/myFile.txt and returns no error, which is expected. After that, getattrlist (2) is called 3 times on root/myFile.txt-sb123../myFile.txt, which fails with ENOENT, after which the fnfErr: File not found is thrown. This is the unexpected part.

After all this, if I perform a regular rename on root/myFile.txt (for example mv root/myFile.txt root/a.txt, optionally followed by mv root/a.txt root/myFile.txt), open (1) succeeds. At the end of such a rename the node gets inserted into the "root" directory node of my VFT under the new name, and then the old entry is removed (the inode ID is unchanged).

I find the perspectives here refreshing, as I am afraid I might be stuck in brain rot and missing something obvious.

Cristian Marius

Jun 17, 2024, 4:13:32 AM
to macFUSE
>Just a guess… but a file (including a directory) is not actually removed until its use count goes to zero.  The child is still using the directory (by definition).  So the directory of the child must exist.

The library I am using considers the inode still in use by the kernel, so it is not dropped, but my implementation naively fails to check for that. open (1) calls getattrlist (2) on an inode that is technically still alive from the kernel's point of view, and the call fails on my filesystem because I consider the inode removed.

Thank you all for the help. It clicked for me today after also reading the answer on the Apple forum: I focused my debugging on the inode number that is being passed around and on whether the kernel still considers it alive. I think I need a background task that removes the directory/file once a node is marked for deletion, has no more open references, and is no longer in use by the kernel. I have tried to notify the kernel of the change after deleting the inode, but the OS seems to end up hanging.