File in the renderer process with Network Service

1 view
Skip to first unread message

Kinuko Yasuda

unread,
Nov 17, 2017, 4:58:41 AM11/17/17
to network-service-dev, storage-dev, Matt Menke, Marijn Kruisselbrink
Hi,

Just wanted to make sure if I'm / we can be on the same page. Currently we use a string file path for representing a File object in Blink, i.e. when a user picks up or drag-and-drop a file for reading out data or uploading the file contents.  The file path in the File object is passed as  storage::DataElement::TYPE_FILE when appending it to a Blob, or uploading it via a Form (e.g. input type=file).

This was handy but given that we'll need to be passing these files to Network Service for uploading (which shouldn't need to know how to deal with file paths) we probably want to pass a file as, say, mojo.common.mojo.File (== base::File), rather than a string path.  (mmenke@'s this patch's adding some code path that handles base::File in upload case)

This seems to mean we eventually want to do some or all of:
  • Start passing mojo.File (rather than a string path) to the renderer, after a necessary security check in the browser process, when a file is picked or dropped.  The file handle must be read-only.
  • Allow appending a file to Blob using mojo.File (rather than a string path).
  • Maybe also expose a file service that returns mojo.File from a file path string (with proper security checks) in case mojo.File can't be passed in some cases.
Does this sound plausible, or did/does anyone have different ideas? It looks it'd be good if we have storage team's consensus there.

Kinuko

Matt Menke

unread,
Nov 17, 2017, 6:48:45 AM11/17/17
to Kinuko Yasuda, network-service-dev, storage-dev, Marijn Kruisselbrink
For file uploads, we expose the file path being uploaded to extensions.  So we'll presumably still need that in addition to the file itself.

Marijn Kruisselbrink

unread,
Nov 17, 2017, 2:50:03 PM11/17/17
to Matt Menke, Kinuko Yasuda, network-service-dev, storage-dev
I do think it makes sense to pass files as something other than a file path. I don't think it is reasonable for that to just be a straight up base::File/mojo.common.mojo.File (or even a filesystem.mojom.File). For one it would lock up file descriptors for files that might or might not actually get read. And it would make creating blobs out of these files more tricky as well (are we going to keep the file open potentially indefinitely utill all blobs that reference the file have gone away? It would certainly complicate the blob system quite a bit if now suddenly a blob could be made up of files-that-are-already-opened. Would it make sense to instead have something that can vend base::File/mojo.File objects, as opposed to directly being it?

On Fri, Nov 17, 2017 at 3:48 AM, Matt Menke <mme...@chromium.org> wrote:
For file uploads, we expose the file path being uploaded to extensions.  So we'll presumably still need that in addition to the file itself.

On Fri, Nov 17, 2017 at 4:58 AM, Kinuko Yasuda <kin...@google.com> wrote:
Hi,

Just wanted to make sure if I'm / we can be on the same page. Currently we use a string file path for representing a File object in Blink, i.e. when a user picks up or drag-and-drop a file for reading out data or uploading the file contents.  The file path in the File object is passed as  storage::DataElement::TYPE_FILE when appending it to a Blob, or uploading it via a Form (e.g. input type=file).
The distinction that you seem to be making between Blink File objects and Blobs isn't really a distinction that exists of course. Any Blob can be a file, and any File object can be an arbitrary blob not backed by a file. We just happen to have some special-casing around File objects that wer initialized to be a blob that happens to be backed by the full contents of a single file, where we know the filename of that file. But ultimately a (dom) File is really nothing more than a Blob that happens to also have a filename and lastModified timestamp (and of course that name doesn't generally have to actually match what the filename of the file backing the blob is, if it happens to be a blob backed by a single file).
 
This was handy but given that we'll need to be passing these files to Network Service for uploading (which shouldn't need to know how to deal with file paths) we probably want to pass a file as, say, mojo.common.mojo.File (== base::File), rather than a string path.  (mmenke@'s this patch's adding some code path that handles base::File in upload case)
So since every File in blink is really a Blob anyway, I wonder if we really need to be special casing a certain subset of Files in the network service? What would happen/break if we just treated them the same way we treat other blobs?

Matt Menke

unread,
Nov 17, 2017, 2:56:01 PM11/17/17
to Marijn Kruisselbrink, Kinuko Yasuda, network-service-dev, storage-dev
The network service should have no knowledge of blobs, can't depend on blink (In fact, blink may depend on the network service), and will be sandboxed so it can't open files itself.  So from a network service standpoint, everything needs to end up as one or more data pipes, files, or strings (Or we need a way to get data pipes, files, or strings from them through a mojo API).

Marijn Kruisselbrink

unread,
Nov 17, 2017, 3:00:47 PM11/17/17
to Matt Menke, Kinuko Yasuda, network-service-dev, storage-dev
On Fri, Nov 17, 2017 at 11:55 AM, Matt Menke <mme...@chromium.org> wrote:
The network service should have no knowledge of blobs, can't depend on blink (In fact, blink may depend on the network service), and will be sandboxed so it can't open files itself.  So from a network service standpoint, everything needs to end up as one or more data pipes, files, or strings (Or we need a way to get data pipes, files, or strings from them through a mojo API).

Sure, but you need to support arbitrary blobs somehow (probably as you say through some API that either is a data-pipe/file/string, or some abstract interface that lets you get data pipes/strings/files, like blobs BytesProvider interface). And if you need to support that anyway, why special case a certain type of blob rather than treating all blobs equally.

Chris Palmer

unread,
Nov 17, 2017, 3:02:07 PM11/17/17
to Marijn Kruisselbrink, Daniel Cheng, tse...@chromium.org, Matt Menke, Kinuko Yasuda, network-service-dev, storage-dev
+dcheng and tsepez explicitly; they might have security thoughts.

--
You received this message because you are subscribed to the Google Groups "network-service-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to network-service-dev+unsub...@chromium.org.
To post to this group, send email to network-service-dev@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/network-service-dev/CA%2BOSsVazybGiL9XoM%2BuaKdFLYjT7Gf_%3DMpSXg_JsBPXRUB87FA%40mail.gmail.com.

Marijn Kruisselbrink

unread,
Nov 17, 2017, 3:05:17 PM11/17/17
to Chris Palmer, Daniel Cheng, tse...@chromium.org, Matt Menke, Kinuko Yasuda, network-service-dev, storage-dev
On Fri, Nov 17, 2017 at 12:01 PM, Chris Palmer <pal...@chromium.org> wrote:
+dcheng and tsepez explicitly; they might have security thoughts.

On Fri, Nov 17, 2017 at 12:00 PM, Marijn Kruisselbrink <m...@chromium.org> wrote:


On Fri, Nov 17, 2017 at 11:55 AM, Matt Menke <mme...@chromium.org> wrote:
The network service should have no knowledge of blobs, can't depend on blink (In fact, blink may depend on the network service), and will be sandboxed so it can't open files itself.  So from a network service standpoint, everything needs to end up as one or more data pipes, files, or strings (Or we need a way to get data pipes, files, or strings from them through a mojo API).

Sure, but you need to support arbitrary blobs somehow (probably as you say through some API that either is a data-pipe/file/string, or some abstract interface that lets you get data pipes/strings/files, like blobs BytesProvider interface). And if you need to support that anyway, why special case a certain type of blob rather than treating all blobs equally.
(and probably the existing BytesProvider interface is not the right match here, as that lets the consumer decide what the best format is to transfer these bytes, while in this case you probably want the producer of the bytes to be able to make the decision. So some kind of method that could return (File or handle<data_pipe_producer> or array<uint8> data) might be a closer match for this use case? Then the blob implementation could decide "hey, this blob is backed by a single file, let me just return that file", or "hmm, let me give a data-pipe and go through my normal code path" for more complicated blobs.



On Fri, Nov 17, 2017 at 2:50 PM, Marijn Kruisselbrink <m...@chromium.org> wrote:
I do think it makes sense to pass files as something other than a file path. I don't think it is reasonable for that to just be a straight up base::File/mojo.common.mojo.File (or even a filesystem.mojom.File). For one it would lock up file descriptors for files that might or might not actually get read. And it would make creating blobs out of these files more tricky as well (are we going to keep the file open potentially indefinitely utill all blobs that reference the file have gone away? It would certainly complicate the blob system quite a bit if now suddenly a blob could be made up of files-that-are-already-opened. Would it make sense to instead have something that can vend base::File/mojo.File objects, as opposed to directly being it?

On Fri, Nov 17, 2017 at 3:48 AM, Matt Menke <mme...@chromium.org> wrote:
For file uploads, we expose the file path being uploaded to extensions.  So we'll presumably still need that in addition to the file itself.

On Fri, Nov 17, 2017 at 4:58 AM, Kinuko Yasuda <kin...@google.com> wrote:
Hi,

Just wanted to make sure if I'm / we can be on the same page. Currently we use a string file path for representing a File object in Blink, i.e. when a user picks up or drag-and-drop a file for reading out data or uploading the file contents.  The file path in the File object is passed as  storage::DataElement::TYPE_FILE when appending it to a Blob, or uploading it via a Form (e.g. input type=file).
The distinction that you seem to be making between Blink File objects and Blobs isn't really a distinction that exists of course. Any Blob can be a file, and any File object can be an arbitrary blob not backed by a file. We just happen to have some special-casing around File objects that wer initialized to be a blob that happens to be backed by the full contents of a single file, where we know the filename of that file. But ultimately a (dom) File is really nothing more than a Blob that happens to also have a filename and lastModified timestamp (and of course that name doesn't generally have to actually match what the filename of the file backing the blob is, if it happens to be a blob backed by a single file).
 
This was handy but given that we'll need to be passing these files to Network Service for uploading (which shouldn't need to know how to deal with file paths) we probably want to pass a file as, say, mojo.common.mojo.File (== base::File), rather than a string path.  (mmenke@'s this patch's adding some code path that handles base::File in upload case)
So since every File in blink is really a Blob anyway, I wonder if we really need to be special casing a certain subset of Files in the network service? What would happen/break if we just treated them the same way we treat other blobs?


This seems to mean we eventually want to do some or all of:
  • Start passing mojo.File (rather than a string path) to the renderer, after a necessary security check in the browser process, when a file is picked or dropped.  The file handle must be read-only.
  • Allow appending a file to Blob using mojo.File (rather than a string path).
  • Maybe also expose a file service that returns mojo.File from a file path string (with proper security checks) in case mojo.File can't be passed in some cases.
Does this sound plausible, or did/does anyone have different ideas? It looks it'd be good if we have storage team's consensus there.

Kinuko





--
You received this message because you are subscribed to the Google Groups "network-service-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to network-service-dev+unsubscribe...@chromium.org.

To post to this group, send email to network-service-dev@chromium.org.

Kinuko Yasuda

unread,
Nov 19, 2017, 11:15:08 PM11/19/17
to Marijn Kruisselbrink, Chris Palmer, Daniel Cheng, tse...@chromium.org, Matt Menke, network-service-dev, storage-dev
Btw there's one aspect I haven't been taking a closer look, that is history serialization; this seems to mean that at least for main resource loading we would need to keep using file path abstraction until at the navigation layer.  Also this made me wonder what we should do for Blobs when, say, users navigate back and that navigation has a post data backed by a Blob.  Legacy code just stores Blob uuid, and Network Service code stores a Data Pipe, the legacy code doesn't work if a backing blob is dropped (we don't seem to guarantee its lifetime), and the new code just drops everything at serialization layer.  If it's fine to drop Blob post data at back navigation it's probably fine but I'm not sure what's the expected behavior.

Getting back to the file one, with these aspects and from mek@'s comments I feel for files that are passed from the browser to the renderer (i.e. on a input form etc) we should probably just keep using file paths.  At some point we should fork the DataElement into two; upper-layer one that is used by the renderer process and by the navigation code, and lower-layer one that is used by, and in communication between, the Network Service.  For files upper-layer one would probably just use file paths, while lower-layer one would be using base::File (or mojo.common.mojo.File).  We might also want to keep using Blobs for the upper-layer one while we use data pipes in the Network Service-related code.

On Sat, Nov 18, 2017 at 5:05 AM, Marijn Kruisselbrink <m...@chromium.org> wrote:
On Fri, Nov 17, 2017 at 12:01 PM, Chris Palmer <pal...@chromium.org> wrote:
+dcheng and tsepez explicitly; they might have security thoughts.

On Fri, Nov 17, 2017 at 12:00 PM, Marijn Kruisselbrink <m...@chromium.org> wrote:
On Fri, Nov 17, 2017 at 11:55 AM, Matt Menke <mme...@chromium.org> wrote:
The network service should have no knowledge of blobs, can't depend on blink (In fact, blink may depend on the network service), and will be sandboxed so it can't open files itself.  So from a network service standpoint, everything needs to end up as one or more data pipes, files, or strings (Or we need a way to get data pipes, files, or strings from them through a mojo API).

Sure, but you need to support arbitrary blobs somehow (probably as you say through some API that either is a data-pipe/file/string, or some abstract interface that lets you get data pipes/strings/files, like blobs BytesProvider interface). And if you need to support that anyway, why special case a certain type of blob rather than treating all blobs equally.
(and probably the existing BytesProvider interface is not the right match here, as that lets the consumer decide what the best format is to transfer these bytes, while in this case you probably want the producer of the bytes to be able to make the decision. So some kind of method that could return (File or handle<data_pipe_producer> or array<uint8> data) might be a closer match for this use case? Then the blob implementation could decide "hey, this blob is backed by a single file, let me just return that file", or "hmm, let me give a data-pipe and go through my normal code path" for more complicated blobs.

 

On Fri, Nov 17, 2017 at 2:50 PM, Marijn Kruisselbrink <m...@chromium.org> wrote:
I do think it makes sense to pass files as something other than a file path. I don't think it is reasonable for that to just be a straight up base::File/mojo.common.mojo.File (or even a filesystem.mojom.File). For one it would lock up file descriptors for files that might or might not actually get read. And it would make creating blobs out of these files more tricky as well (are we going to keep the file open potentially indefinitely utill all blobs that reference the file have gone away? It would certainly complicate the blob system quite a bit if now suddenly a blob could be made up of files-that-are-already-opened. Would it make sense to instead have something that can vend base::File/mojo.File objects, as opposed to directly being it?

On Fri, Nov 17, 2017 at 3:48 AM, Matt Menke <mme...@chromium.org> wrote:
For file uploads, we expose the file path being uploaded to extensions.  So we'll presumably still need that in addition to the file itself.

On Fri, Nov 17, 2017 at 4:58 AM, Kinuko Yasuda <kin...@google.com> wrote:
Hi,

Just wanted to make sure if I'm / we can be on the same page. Currently we use a string file path for representing a File object in Blink, i.e. when a user picks up or drag-and-drop a file for reading out data or uploading the file contents.  The file path in the File object is passed as  storage::DataElement::TYPE_FILE when appending it to a Blob, or uploading it via a Form (e.g. input type=file).
The distinction that you seem to be making between Blink File objects and Blobs isn't really a distinction that exists of course. Any Blob can be a file, and any File object can be an arbitrary blob not backed by a file. We just happen to have some special-casing around File objects that wer initialized to be a blob that happens to be backed by the full contents of a single file, where we know the filename of that file. But ultimately a (dom) File is really nothing more than a Blob that happens to also have a filename and lastModified timestamp (and of course that name doesn't generally have to actually match what the filename of the file backing the blob is, if it happens to be a blob backed by a single file).

Yes, a file is a blob and any blob can be a file, while we have multiple code that checks if a blob is a file and has a real file on disk, and do different thing, and when other process like Network Service wants to the data it could make a bit of perf/memory difference if it's getting data from blob service / browser process or directly reading the file or not.
Reply all
Reply to author
Forward
0 new messages