Using sandboxed file system from Renderer via C++

Christian Fremerey

Jul 21, 2017, 2:05:06 PM

to stora...@chromium.org

Dear storage devs,

I am investigating a feature request in the context of MediaRecorder, which is
a Web API that allows things like recording video from a webcam into a WebM
blob. Currently, the produced WebM data does not get properly indexed, because
indexing requires post-processing the data after the recording is complete. The
basic idea of the feature request is to buffer the recorded data into temporary
storage and then do indexing before handing it out to the Web API client, see
also this design doc.

Since recordings can easily become quite large, using the system memory as
temporary storage does not seem like a good solution. Using a sandboxed file
system seems like a good option. I was wondering if there is a suitable C++ API
that can be called from a Renderer. Following along the implementation of the
File API exposed to JavaScript clients, I found that it calls into
FileSystemDispatcher, but I am not sure it is suitable for my use
case. The code that produces the WebM data and does the post-processing requires
a read/write stream that allows seeking to arbitrary positions and expects the
API for read/write/seek operations to be synchronous, i.e. blocking.
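
To make the requirement concrete, here is a minimal sketch of the kind of blocking, seekable stream the post-processing code would expect. It is not Chromium code: `TempFileStream` and `RecordAndPatch` are hypothetical names, `std::tmpfile()` merely stands in for whatever temp-file mechanism a renderer would actually be given, and the 4-byte size field is a toy substitute for real WebM indexing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical blocking, seekable stream backed by an anonymous temp file.
// Every call returns only after the underlying stdio operation completes.
class TempFileStream {
 public:
  TempFileStream() : file_(std::tmpfile()) {}
  ~TempFileStream() {
    if (file_) std::fclose(file_);
  }
  TempFileStream(const TempFileStream&) = delete;
  TempFileStream& operator=(const TempFileStream&) = delete;

  bool ok() const { return file_ != nullptr; }

  bool Write(const void* data, std::size_t size) {
    return file_ && std::fwrite(data, 1, size, file_) == size;
  }
  bool Read(void* out, std::size_t size) {
    return file_ && std::fread(out, 1, size, file_) == size;
  }
  // Seeks to an absolute byte offset from the start of the stream.
  bool Seek(long offset) {
    return file_ && std::fseek(file_, offset, SEEK_SET) == 0;
  }

 private:
  std::FILE* file_;
};

// Writes a 4-byte placeholder, appends all chunks, then seeks back and
// patches the placeholder with the total payload size. Returns the value
// read back from offset 0, or -1 on any failure.
int64_t RecordAndPatch(const std::vector<std::vector<uint8_t>>& chunks) {
  TempFileStream stream;
  if (!stream.ok()) return -1;

  const uint32_t placeholder = 0;
  if (!stream.Write(&placeholder, sizeof(placeholder))) return -1;

  uint32_t total = 0;
  for (const auto& chunk : chunks) {
    if (!chunk.empty() && !stream.Write(chunk.data(), chunk.size())) return -1;
    total += static_cast<uint32_t>(chunk.size());
  }

  // Post-processing step: random access back to the header.
  if (!stream.Seek(0) || !stream.Write(&total, sizeof(total))) return -1;

  // Read the patched field back to confirm it landed.
  uint32_t read_back = 0;
  if (!stream.Seek(0) || !stream.Read(&read_back, sizeof(read_back))) return -1;
  return read_back;
}
```

Calling `RecordAndPatch` with two chunks of 3 and 2 bytes should return 5. The point of the sketch is only that the final size (or, for WebM, the index) is not known until recording ends, so the writer has to be able to seek backwards in the stream.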

Please let me know your recommendations, comments, and concerns.

Thank you and best regards,

Christian

Joshua Bell

Jul 24, 2017, 3:57:10 PM

to Christian Fremerey, stora...@chromium.org

On Fri, Jul 21, 2017 at 11:05 AM, Christian Fremerey <chfr...@chromium.org> wrote:

Dear storage devs,

I am investigating a feature request in the context of MediaRecorder, which is
a Web API that allows things like recording video from a webcam into a WebM
blob. Currently, the produced WebM data does not get properly indexed, because
indexing requires post-processing the data after the recording is complete. The
basic idea of the feature request is to buffer the recorded data into temporary
storage and then do indexing before handing it out to the Web API client, see
also this design doc.

Since recordings can easily become quite large, using the system memory as
temporary storage does not seem like a good solution. Using a sandboxed file
system seems like a good option.

If this temporary data is not web-exposed then storing it in the origin-scoped sandboxed file system directly seems like a poor fit. Web sites would be able to observe and modify the data while your code was using it, and you'd potentially run into unexpected quota limitations, or conflicts with existing files. We also do not have the sandboxed filesystem implemented for incognito sessions (which is problematic). Finally, we're hoping to deprecate/remove the sandboxed filesystem API when possible, as it is Chrome-only and other browsers have not indicated any intent to implement it.

To be clear here, "sandboxed" applies to the notion that the data is scoped to an origin (e.g. script running on http://foo.com has access to a different filesystem than script running on http://bar.com), not sandboxing of execution (i.e. renderer process). See https://cs.chromium.org/chromium/src/storage/common/fileapi/file_system_types.h for an explanation.
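
A toy model of that origin scoping, purely for illustration: `OriginScopedStore` is a hypothetical in-memory class, not how the sandboxed filesystem is actually implemented. It only shows that the same path under two origins names two unrelated files.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: each origin owns a separate path -> contents namespace.
class OriginScopedStore {
 public:
  void Write(const std::string& origin, const std::string& path,
             const std::string& data) {
    files_[origin][path] = data;
  }

  // Returns nullptr if this origin has no file at `path`.
  const std::string* Read(const std::string& origin,
                          const std::string& path) const {
    auto origin_it = files_.find(origin);
    if (origin_it == files_.end()) return nullptr;
    auto file_it = origin_it->second.find(path);
    return file_it == origin_it->second.end() ? nullptr : &file_it->second;
  }

 private:
  std::map<std::string, std::map<std::string, std::string>> files_;
};
```

After writing `/rec.webm` for `http://foo.com`, a read of `/rec.webm` for `http://bar.com` finds nothing, while script on `http://foo.com` itself can see and overwrite the file. That second part is why web-visible storage is a poor home for internal temp data.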

Given the design doc and your questions, I'm inferring what you want is a file-system-like API (i.e. read/write to multiple streams of bytes) that is accessible in the renderer process (for security purposes), but is not web-exposed (since this is an "implementation detail" to post-process WebM)?

Using parts of the FileSystem implementation here is perhaps plausible, but not the web-exposed types. I should note that the FS API implementation hasn't had active development for several years.

I was wondering if there is a suitable C++ API
that can be called from a Renderer. Following along the implementation of the
File API exposed to JavaScript clients, I found that it calls into
FileSystemDispatcher, but I am not sure it is suitable for my use
case.

It may be simpler to spin up a Mojo service which will give you Mojo FS objects (components/filesystem/public/interfaces/file.mojom) which give you the low level read/write/seek operations. You can (in theory) transfer those from the browser to renderer since they just wrap file descriptors. (I haven't tried this myself, though.)

(At some point I assume we'll replace FileSystemDispatcher with something that uses Mojo and removes a bunch of the complexity.)

The code that produces the WebM data and does the post-processing requires
a read/write stream that allows seeking to arbitrary positions and expects the
API for read/write/seek operations to be synchronous, i.e. blocking.

Yeah, we don't expose a synchronous FS API to the renderer; the synchronous variations exposed to Worker threads are implemented by having the worker thread post the request to the IO thread and block waiting for the response.
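
The post-and-block pattern can be sketched in standard C++ as follows. This is an illustration only, not the Chromium implementation: `TaskThread` and `BlockingRead` are hypothetical names, and an in-memory string stands in for the file being read.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for an IO thread: runs posted tasks in FIFO order.
class TaskThread {
 public:
  TaskThread() : thread_([this] { Run(); }) {}
  ~TaskThread() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_one();
    thread_.join();
  }

  void PostTask(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push_back(std::move(task));
    }
    cv_.notify_one();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // Only reached once stopping_ is set.
        task = std::move(tasks_.front());
        tasks_.pop_front();
      }
      task();  // Run outside the lock so new tasks can be posted meanwhile.
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> tasks_;
  bool stopping_ = false;
  std::thread thread_;  // Declared last so the members above exist first.
};

// Called on the worker thread: posts the read to the IO thread, then blocks
// until the result arrives. `backing` must outlive the call, which holds
// here because the caller does not return before the task has run.
std::string BlockingRead(TaskThread& io_thread, const std::string& backing,
                         std::size_t offset, std::size_t length) {
  auto promise = std::make_shared<std::promise<std::string>>();
  std::future<std::string> result = promise->get_future();
  io_thread.PostTask([promise, &backing, offset, length] {
    // Runs on the IO thread. Out-of-range offsets yield an empty string.
    promise->set_value(offset <= backing.size()
                           ? backing.substr(offset, length)
                           : std::string());
  });
  return result.get();  // Blocks the calling thread until set_value runs.
}
```

From the caller's point of view `BlockingRead(io_thread, "hello webm", 6, 4)` behaves like a synchronous call returning `"webm"`, but the calling thread is parked for a full cross-thread round trip. That cost is why the choice of which thread gets blocked matters.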

Are you planning to spin up a dedicated thread in the renderer for this? Have you talked with the folks on platform-arc...@chromium.org about your scheduling model?

Christian Fremerey

Jul 24, 2017, 4:36:51 PM

to Joshua Bell, e...@chromium.org, stora...@chromium.org

+ Elliot as FYI for file service

On Mon, Jul 24, 2017 at 12:57 PM, Joshua Bell <jsb...@chromium.org> wrote:

Given the design doc and your questions, I'm inferring what you want is a file-system-like API (i.e. read/write to multiple streams of bytes) that is accessible in the renderer process (for security purposes), but is not web-exposed (since this is an "implementation detail" to post-process WebM)?

Yes, that is exactly right. Thanks for clarifying the differences.

Using parts of the FileSystem implementation here is perhaps plausible, but not the web-exposed types. I should note that the FS API implementation hasn't had active development for several years.

I was wondering if there is a suitable C++ API
that can be called from a Renderer. Following along the implementation of the
File API exposed to JavaScript clients, I found that it calls into
FileSystemDispatcher, but I am not sure it is suitable for my use
case.

It may be simpler to spin up a Mojo service which will give you Mojo FS objects (components/filesystem/public/interfaces/file.mojom) which give you the low level read/write/seek operations. You can (in theory) transfer those from the browser to renderer since they just wrap file descriptors. (I haven't tried this myself, though.)

I already had a brief email exchange with erg@ where I asked about using the FileSystem service for this. Based on the info you provided, it seems that the FileSystem service would be the better fit for my use case, but it is not currently hardened for use from untrusted renderers. Elliot indicated that this hardening could be done, though, and I think I may want to give it a try. Giving Renderers an API for creating temp files may be useful for other future use cases as well.

(At some point I assume we'll replace FileSystemDispatcher with something that uses Mojo and removes a bunch of the complexity.)

The code that produces the WebM data and does the post-processing requires
a read/write stream that allows seeking to arbitrary positions and expects the
API for read/write/seek operations to be synchronous, i.e. blocking.

Yeah, we don't expose a synchronous FS API to the renderer; the synchronous variations exposed to Worker threads are implemented by having the worker thread post the request to the IO thread and block waiting for the response.

Are you planning to spin up a dedicated thread in the renderer for this? Have you talked with the folks on platform-architecture-dev@chromium.org about your scheduling model?

I had not yet decided how to do the threading. In general I would have tried to avoid spinning up another thread, as long as I could find two existing threads suitable for the task (one for sending/receiving the IPC messages, and one that gets blocked waiting for the IPC round trip). I assume these questions have to be resolved regardless of which FS API would be used. I will reach out to platform-architecture-dev@chromium.org when I have a clearer idea of what the API will look like.
