Implement own file system / mount type


Soeren Balko

Mar 31, 2014, 8:11:42 AM
to native-cli...@googlegroups.com
Hi,

I would like to implement my own file system, as described in fuse.h, i.e., like so:

struct fuse_operations g_my_fuse_operations = { ... };
...
nacl_io_register_mount_type("myfusefs", &g_my_fuse_operations);
...
mount("", "/mnt/fuse", "myfusefs", 0, NULL);

The idea is to allow my PNaCl module to access JavaScript File and Blob objects, which are passed to the module as a URL created by URL.createObjectURL(file).

When I follow the instructions above from fuse.h, I cannot seem to get access to the "source" and "data" parameters of the mount(...) call. In fact, I would probably have to create a C++ wrapper similar to the built-in file systems (html5fs and friends). As this is somewhat of an overhead for what I would like to accomplish (letting my PNaCl module use POSIX file I/O calls to read from a single JavaScript file), I would like to keep my code as compact as possible (reuse is less of an issue). However, I need to find a way to pass the URL of the file to the fuse operations. Is there any advised method of doing so (apart from creating the aforementioned C++ wrapper)?

Btw, a generic, built-in file system for JS File and Blob objects would be fantastic. Ideally, I could specify an entire directory tree within the "data" argument of the mount(...) call, mapping file names to File/Blob URLs. Alternatively, the existing httpfs could be extended to deal with other URL schemes, namely blob: and data: URLs. I am not sure if it is already supposed to do that - I tried to use it this way, but it would not work.

Soeren


Ben Smith

Mar 31, 2014, 8:19:18 PM
to native-cli...@googlegroups.com
Hi Soeren,

On Monday, March 31, 2014 5:11:42 AM UTC-7, Soeren Balko wrote:
Hi,

I would like to implement my own file system, as described in fuse.h, i.e., like so:

struct fuse_operations g_my_fuse_operations = { ... };
...
nacl_io_register_mount_type("myfusefs", &g_my_fuse_operations);
...
mount("", "/mnt/fuse", "myfusefs", 0, NULL);

The idea is to allow my PNaCl module to access JavaScript File and Blob objects, which are passed to the module as a URL created by URL.createObjectURL(file).

When I follow the instructions above from fuse.h, I cannot seem to get access to the "source" and "data" parameters of the mount(...) call. In fact, I would probably have to create a C++ wrapper similar to the built-in file systems (html5fs and friends). As this is somewhat of an overhead for what I would like to accomplish (letting my PNaCl module use POSIX file I/O calls to read from a single JavaScript file), I would like to keep my code as compact as possible (reuse is less of an issue). However, I need to find a way to pass the URL of the file to the fuse operations. Is there any advised method of doing so (apart from creating the aforementioned C++ wrapper)?

I think this is just an oversight. Feel free to file a bug to expose this, but note that it won't be terribly easy to access this from the FUSE function callbacks; none of them have a "user data" pointer, so you'll likely have to use global data in either case.
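The global-data workaround can be sketched as follows; note that the function names and callback signature below are simplified stand-ins for illustration, not nacl_io's actual fuse.h declarations:

```c
#include <stdio.h>

/* Sketch of the global-data workaround: since the FUSE callbacks carry
 * no user-data pointer, per-mount context (here, the blob URL) lives in
 * a global that is set before mount() is called. */
static const char* g_blob_url = NULL;

static void set_mount_context(const char* url) {
  g_blob_url = url;  /* call this before mount("", "/mnt/fuse", ...) */
}

/* Simplified stand-in for a FUSE open callback. */
static int my_fuse_open(const char* path) {
  if (g_blob_url == NULL)
    return -1;  /* no mount-time context available */
  printf("open %s -> backed by %s\n", path, g_blob_url);
  return 0;
}
```

With multiple mounts this global would have to become a lookup table keyed by path prefix, which is part of why a real user-data pointer in the callbacks would be preferable.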
 

Btw, a generic, built-in file system for JS File and Blob objects would be fantastic. Ideally, I could specify an entire directory tree within the "data" argument of the mount(...) call, mapping file names to File/Blob URLs. Alternatively, the existing httpfs could be extended to deal with other URL schemes, namely blob: and data: URLs. I am not sure if it is already supposed to do that - I tried to use it this way, but it would not work.

Yes, this is an old known issue http://crbug.com/241464. I believe the problem is that the httpfs mount does HEAD requests, which are not supported by blob and data URLs.
 

Soeren


Soeren Balko

Apr 1, 2014, 1:21:37 AM
to native-cli...@googlegroups.com
Hi Ben,

I implemented a quick-and-dirty hack to work around the HEAD request that is performed against a blob: or data: URL. The idea is to simply pass a "cached_filesize=..." parameter in the data argument given to mount. Whenever a cached_filesize >= 0 is found at runtime, the HEAD request to determine the size of the resource is bypassed. Of course, this requires the size of the JavaScript Blob or File to be known at the time the httpfs file system is mounted. As File and Blob objects are read-only, this translates into a simple blob.size property read in JavaScript, which is then (alongside the Blob/File's URL) passed to the PNaCl module.

Of course, a single file size limits the entire file system to a single file. However, if more than one file were needed, one could easily mount multiple httpfs file systems at different mount points. HTTP URLs wouldn't use the "cached_filesize" property and would simply continue using HEAD requests.
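Inside httpfs, the hack boils down to scanning the mount data string for the cached_filesize key before falling back to a HEAD request. A rough sketch (the key name comes from the proposed patch; the helper itself is illustrative):

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: pull the proposed "cached_filesize" key out of the data
 * string passed to mount() (e.g. "cached_filesize=1048576").
 * Returns -1 when the key is absent, meaning "fall back to HEAD". */
static long parse_cached_filesize(const char* data) {
  const char* key;
  if (data == NULL)
    return -1;
  key = strstr(data, "cached_filesize=");
  if (key == NULL)
    return -1;
  return atol(key + strlen("cached_filesize="));
}
```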

Not sure if this is the right place to discuss this, but could you kindly let me know if this is an acceptable solution? If so, I would be happy to contribute it to the Native Client SDK. I have currently implemented this on top of the pepper_33 codeline, but would be happy to forward-port it to a more recent branch (I saw some changes to the directory structure starting with pepper_34). Apart from the fact that I would like to merge this patch upstream for obvious reasons, I believe a number of people would benefit from it. The question of accessing a JS File/Blob object from a PNaCl module WITHOUT first copying it into the html5fs file system seems to come up here recurrently...

Soeren

Ben Smith

Apr 3, 2014, 3:49:46 PM
to native-cli...@googlegroups.com
I think a better solution here is to be more robust to a failing HEAD request. There is already code that attempts to download the entire file if the HEAD succeeds but doesn't have a Content-Length header.

In any case, yes, we definitely appreciate contributions! Please see the docs here: http://dev.chromium.org/developers/contributing-code. The relevant code lives in the native_client_sdk/src/libraries/nacl_io directory. You can browse it online here https://code.google.com/p/chromium/codesearch#chromium/src/native_client_sdk/src/libraries/nacl_io/.

Feel free to send your patch to <binji at chromium.org> to review.

Soeren Balko

Apr 5, 2014, 5:10:55 AM
to native-cli...@googlegroups.com
I'm not sure I would go for an approach where we need to fetch the entire file just to figure out its length. In my case, I need to process very large files, easily several GBs in size. Besides querying the file size in JS and passing it as a mount parameter, one could employ a binary-search approach where we try to seek towards the end of the file, fetching one-byte partial-content answers.
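The binary-search idea can be sketched as follows; probe() stands in for a one-byte ranged request ("Range: bytes=off-off") and reports whether that byte exists, so find_length() needs only O(log n) round trips. The fake in-memory probe exists purely so the sketch is runnable:

```c
/* Simulated resource size; in real code probe() would be a network
 * request and this global would not exist. */
static long long g_fake_length;

static int probe(long long off) {
  return off < g_fake_length;  /* real code: did a byte come back? */
}

static long long find_length(void) {
  long long lo, hi, mid;
  if (!probe(0))
    return 0;                    /* empty resource */
  lo = 1;
  hi = 2;
  while (probe(hi - 1)) {        /* exponential expansion */
    lo = hi;
    hi *= 2;
  }
  /* invariant: byte lo-1 exists, byte hi-1 does not => length in [lo, hi) */
  while (lo + 1 < hi) {
    mid = lo + (hi - lo) / 2;
    if (probe(mid - 1))
      lo = mid;
    else
      hi = mid;
  }
  return lo;
}
```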

Ultimately, I think Chrome (and other browsers, for that matter) should support HEAD requests for blob and data URLs, making them more consistent with http URLs.

Ben Smith

Apr 8, 2014, 1:48:40 PM
to native-cli...@googlegroups.com
I think the entire file will have to be fetched anyway, because IIRC blob URLs don't support partial requests. Where are these very large files coming from? Do they live on the user's machine or on the server? Are they static?

Sören Balko

Apr 8, 2014, 7:27:05 PM
to native-cli...@googlegroups.com
Actually, partial requests do work for blob URLs. You can even watch them in the network tab of the debugging tools. The file comes straight from the user's local file system, i.e. from a file input form element.

I have the patch I proposed running and it works. Will push it for review in the coming days.
--
You received this message because you are subscribed to a topic in the Google Groups "Native-Client-Discuss" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/native-client-discuss/mOD46_QDjoc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to native-client-di...@googlegroups.com.
To post to this group, send email to native-cli...@googlegroups.com.
Visit this group at http://groups.google.com/group/native-client-discuss.
For more options, visit https://groups.google.com/d/optout.

Ben Smith

Apr 8, 2014, 8:32:17 PM
to native-cli...@googlegroups.com
Ah, you're right, it does work! Even better, the response returns Content-Range: xx-xx/xxx, so that can be parsed for the full blob size.

Perhaps the best solution is to check whether the URL scheme is "blob", and if so, do a one-byte partial request instead of a HEAD request, then parse the Content-Range header. I was curious whether this check would have to be extended for data URIs, but they don't work at all. This bug explains a bit about why that is for JavaScript (and Pepper seems to fail for the same reason).
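Parsing the total out of the Content-Range header is straightforward; a sketch (the helper name is illustrative):

```c
#include <stdio.h>

/* Sketch: after a one-byte request ("Range: bytes=0-0") the response
 * carries "Content-Range: bytes 0-0/<total>"; parse out the total.
 * Returns -1 on anything malformed. */
static long long content_range_total(const char* header_value) {
  long long first, last, total;
  if (header_value == NULL)
    return -1;
  if (sscanf(header_value, "bytes %lld-%lld/%lld",
             &first, &last, &total) != 3)
    return -1;
  return total;
}
```

A production version would also handle the "bytes */<total>" and unknown-length "/*" forms that the HTTP range spec allows.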

Thanks for looking into this!


Sören Balko

Apr 8, 2014, 8:36:13 PM
to native-cli...@googlegroups.com
Good catch, I didn’t even check that - definitely a better approach than passing the resource length as a mount parameter. Data URLs do not worry me too much; they can be programmatically de-serialised and put into memfs...
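For completeness, the memfs route for data: URLs amounts to decoding the payload in the module and writing it to a file under a mounted memfs. A minimal base64 decoder for the part after "data:<mime>;base64," might look like this (a sketch; no strict padding validation):

```c
/* Map one base64 character to its 6-bit value, or -1 if invalid. */
static int b64_val(char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

/* Decode a base64 string into out (caller-sized); returns the decoded
 * byte count, or -1 on an invalid character. Stops at '=' padding. */
static int base64_decode(const char* in, unsigned char* out) {
  int n = 0, bits = 0, acc = 0;
  for (; *in && *in != '='; ++in) {
    int v = b64_val(*in);
    if (v < 0) return -1;
    acc = (acc << 6) | v;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[n++] = (unsigned char)((acc >> bits) & 0xFF);
    }
  }
  return n;
}
```

The decoded bytes can then be written with ordinary fopen/fwrite to a path under the memfs mount.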


Soeren Balko

May 1, 2014, 9:43:43 PM
to native-cli...@googlegroups.com
I prepared a patch for pepper_canary which lets nacl_io's httpfs work with blob: URLs: http://pastebin.com/Haaehwrf (git diff output)

@Google: please feel free to incorporate it into your development codeline. I skipped setting up the development codeline locally myself (the patch didn't warrant the effort).

When reading from the blob URL through the POSIX file operations, one must read directly from the mount point. Appending further path/file elements invalidates the blob URL (which can, AFAIK, only represent a single blob).
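In other words, consumers open the mount point itself. A hypothetical guard mirroring that single-blob constraint (the helper name is illustrative, not part of the patch):

```c
#include <string.h>

/* Hypothetical guard: a blob: URL names exactly one resource, so only
 * the mount point itself may be opened; any appended path component
 * would corrupt the URL. */
static int is_blob_path_valid(const char* mount_point, const char* path) {
  return strcmp(mount_point, path) == 0;
}
```

So after mount("blob:...", "/mnt/blob", "httpfs", 0, NULL), the module would open("/mnt/blob", O_RDONLY), not "/mnt/blob/somefile".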

Soeren