Chunked upload Cronet API.


Matt Menke

Aug 22, 2014, 12:06:00 PM
to net-dev
For anyone unfamiliar with Cronet, it's a project to give the Chrome network stack an officially supported API for external embedders.  The project is currently focused on Android and iOS wrappers.

Currently, the network stack supports two basic types of uploads:  Those where the size is known in advance, which consist of one or more files and/or memory buffers, and chunked uploads, where the size is not known.  Because of a lack of server support, and because existing web APIs don't need them, chunked uploads are only used for internal Chrome requests, to the best of my knowledge.

Currently, the way the network stack supports chunked uploads (Both within Chrome and outside of Chrome) is for the embedder to just keep on pushing data to the UploadDataStream object, which buffers all the data pushed to it until the request is complete.  The buffering is needed for retries and redirects.  There's no way for an embedder to know when the stream actually needs more data, so they just push data as soon as they get it.  This may be convenient when streaming video or audio as it's being recorded, but it relinquishes all control over the size of the buffer, which grows without bound.

Currently, Cronet on Android has a Java interface for this push-based chunked behavior, which ends up copying all data twice on its way to the network stack.  It also has an interface to allow uploads when the data size is known, which does not support retries or redirects.  The latter uses a blocking call to read data, but does manage to use two fewer copies.

Neither of these APIs is great, and we're looking to replace them with a single non-blocking API.  I've been toying with two different designs:  One works just like the chunked design, except it allows buffering to be disabled, and has a notification when more data is needed and when the data is to be rewound.  The advantage is it's completely compatible with the current chunked behavior, and works quite easily with the "just want to push streaming input" case.  It also gives the embedder control over how much data is sent to the network stack at once.  The downside is it has the same two extra copies.  Simplified API looks something like this:

class UploadStream {
  UploadStream(DataPusher pusher, boolean bufferData);
  // Can be called at any point.  Data is copied into our private buffer when called.
  void AppendData(ByteBuffer buffer);
  // Just fails the request.
  void OnReadError();
};

interface DataPusher {
  // Called when all appended data has been sent.  Can be ignored,
  // replied to synchronously, or replied to asynchronously.
  void OnNeedsData();
  // Returns false if rewind isn't supported.
  boolean RewindData();
  void OnCancel();
};
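To make the push-based shape above concrete, here's a hypothetical sketch of an embedder driving it.  The UploadStream stub just copies appended data into a private buffer and immediately signals for more; the names are Java-cased versions of the sketch's, and none of this is real Cronet API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch of the push-based design. The UploadStream stub
// copies appended data into a private buffer, mirroring the "data is
// copied when called" contract; not real Cronet API.
public class PushUploadDemo {
  interface DataPusher {
    void onNeedsData();    // all appended data has been sent
    boolean rewindData();  // false if rewind isn't supported
    void onCancel();
  }

  static class UploadStream {
    private final DataPusher pusher;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    UploadStream(DataPusher pusher, boolean bufferData) {
      this.pusher = pusher;
    }

    // First copy: embedder's buffer -> the stream's private buffer.
    // (A second copy to the socket's write buffer would happen later.)
    void appendData(ByteBuffer data) {
      byte[] copy = new byte[data.remaining()];
      data.get(copy);
      buffer.write(copy, 0, copy.length);
      // Pretend the data was sent immediately and ask for more.
      pusher.onNeedsData();
    }

    String contents() { return buffer.toString(); }
  }

  // The embedder pushes its next chunk whenever the stream signals.
  static String run() {
    String[] chunks = {"hello ", "chunked ", "upload"};
    int[] next = {0};
    UploadStream[] streamHolder = new UploadStream[1];

    DataPusher pusher = new DataPusher() {
      public void onNeedsData() {
        if (next[0] < chunks.length) {
          streamHolder[0].appendData(
              ByteBuffer.wrap(chunks[next[0]++].getBytes()));
        }
      }
      public boolean rewindData() { return false; }
      public void onCancel() {}
    };

    streamHolder[0] = new UploadStream(pusher, /* bufferData= */ true);
    pusher.onNeedsData();  // kick off the first chunk
    return streamHolder[0].contents();
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Note that nothing stops the embedder from calling appendData well ahead of any onNeedsData signal, which is exactly the unbounded-buffer behavior described earlier.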

The other option is to do something pull-based instead.  The network stack would provide the buffer, and then wait for a callback.  When buffering is not being used, this would require two fewer copies.  We could even just not support buffering, and leave it up to the client, if we so desired.  The network stack has complete control over its buffer size.  Downsides are that it doesn't work with the old chunked upload model, and there are some buffer ownership issues on cancellation that need to be worked through.  API would look something like:

class UploadStream {
  UploadStream(DataProvider provider, boolean bufferData);
  // result > 0 means data was read, result == 0 means we're done, result < 0 means error.
  void OnReadComplete(int result);
  void OnRewindComplete(boolean success);
};

interface DataProvider {
  // The provider writes to the buffer, and calls OnReadComplete when done.
  // It may even be possible to reuse the same buffer for all calls to this
  // without adding an extra copy, but that may be too complicated.
  void GetData(ByteBuffer buffer);
  // Embedder must call OnRewindComplete once done.
  void RewindData();
  // We'll probably need to require the embedder call into UploadStream to acknowledge this,
  // so it can safely free the native buffer - definitely need to think about this case a bit more.
  void OnCancel();
};
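For comparison, here's an equally hypothetical sketch of the pull-based shape, with a stubbed network stack owning the buffer and the provider writing into it directly - that direct write is where the copies are saved.  Again, the names are just the sketch's, not real API.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the pull-based design. The network stack (stubbed
// here) owns the buffer; the provider fills it directly and then calls
// onReadComplete, possibly re-entrantly. Not real Cronet API.
public class PullUploadDemo {
  interface DataProvider {
    void getData(ByteBuffer buffer);  // fill buffer, then call onReadComplete
    void rewindData();
    void onCancel();
  }

  static class UploadStream {
    private final DataProvider provider;
    private final StringBuilder sent = new StringBuilder();
    private final ByteBuffer networkBuffer = ByteBuffer.allocate(8);
    boolean done = false;

    UploadStream(DataProvider provider) { this.provider = provider; }

    void readMore() {
      networkBuffer.clear();
      provider.getData(networkBuffer);
    }

    // result > 0: bytes written, 0: done, < 0: error.
    void onReadComplete(int result) {
      if (result <= 0) { done = true; return; }
      networkBuffer.flip();
      byte[] out = new byte[networkBuffer.remaining()];
      networkBuffer.get(out);
      sent.append(new String(out));  // stands in for the socket write
    }
  }

  static String run() {
    byte[] body = "pull-based upload".getBytes();
    int[] offset = {0};
    UploadStream[] holder = new UploadStream[1];

    DataProvider provider = new DataProvider() {
      public void getData(ByteBuffer buffer) {
        int n = Math.min(buffer.remaining(), body.length - offset[0]);
        buffer.put(body, offset[0], n);  // write straight into the stack's buffer
        offset[0] += n;
        holder[0].onReadComplete(n);     // synchronous (re-entrant) completion
      }
      public void rewindData() {}
      public void onCancel() {}
    };

    holder[0] = new UploadStream(provider);
    while (!holder[0].done) holder[0].readMore();
    return holder[0].sent.toString();
  }

  public static void main(String[] args) { System.out.println(run()); }
}
```

Here the stack decides the buffer size (8 bytes per read in this toy), which is the "complete control over its buffer size" point above.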

So...  Anyone have any thoughts?  Certainly open to other ideas.

Note:  In both cases, I'm ignoring the threading model, but I suspect in the first case we'd just take an Executor, and require the UploadStream be called on that thread.  In the second case, I believe we can pretty easily allow the UploadStream to be called from any thread, though we'd still want an executor to know what thread to call into the DataProvider on.  We could also take an optional length, to allow the same interface to be used for non-chunked uploads as well.

Ryan Sleevi

Aug 22, 2014, 12:17:37 PM
to Matt Menke, net-dev


On Aug 22, 2014 9:06 AM, "'Matt Menke' via net-dev" <net...@chromium.org> wrote:
>
> For anyone unfamiliar with Cronet, it's a project to give the Chrome network stack an officially supported API for external embedders.  The project is currently focused on Android and iOS wrappers.
>
> Currently, the network stack supports two basic types of uploads:  Those where the size is known in advance, which consist of one or more files and/or memory buffers, and chunked uploads, where the size is not known.  Because of lack of server support, and existing web APIs don't need it, chunked uploads are only used for internal Chrome requests, to the extent of my knowledge.

Nope. We use these on the Web in Chromium. Voice search, for example.

>
> Currently, the way the network stack supports chunked uploads (Both within Chrome and outside of Chrome) is for the embedder to just keep on pushing data to the UploadDataStream object, which buffers all the data pushed to it until the request is complete.  The buffering is needed for retries and redirects.  There's no way for an embedder to know when the stream actually needs more data, so they just push data as soon as they get it.

Are you sure about this? I worked with the voice team on refactoring UploadDataStream so that the IO completion of the previous write is the signal to write more data.

> This may be convenient when streaming video or audio as it's being recorded, but it relinquishes all control over the size of the buffer, which grows without bound.

Agreed.

>
> Currently, Cronet on Android has a Java interface for this push-based chunked behavior, which ends up copying all data twice on its way to the network stack.

That's odd, as we explicitly moved away from that sort of interface after it was repeatedly showing perf and correctness issues in Chrome's implementation.

> It also has an interface to allow uploads when data size is known, that does not support retries or redirects.  The latter API uses a blocking API to read data, but does manage to use two fewer copies.

Blocking?

:(

>
> Neither of these APIs is great, and we're looking to replace them with a single non-blocking API.  I've been toying with two different designs:  One works just like the chunked design, except it allows buffering to be disabled, and has a notification when more data is needed and when the data is to be rewound.  Advantage is it's completely compatible with the current chunked behavior, and works quite easily with the just want to push streaming input case.  It also gives the embedder control over how much data is sent to the network stack at once.  Downside is it has the same two extra copies.

Where are the copies?

>  Simplified API looks something like this:
>
> class UploadStream {
>   UploadStream(DataPusher pusher, Boolean bufferData);

Why should the creator specify buffering? Isn't that something internal to the net stack? Isn't the assumption always buffer? Or is cronet exposing some API to explicitly disable redirect handling?

>   // Can be called at any point.  Data is copied into our private buffer when called.
>   void AppendData(ByteBuffer buffer);
>   // Just fails the request.
>   void OnReadError();
> };
>
> interface DataPusher {
>   // Called when all appended data has been sent.  Can be ignored,
>   // replied to synchronously, or replied to asynchronously.
>   void OnNeedsData();

This is effectively an edge-triggered notification, but the majority of Chromium has preferred level triggered (or at least ways to query the state).

This seems harder to reason about if the caller is going to try to handle this asynchronously.

>   // Returns false if rewind isn't supported.
>   Boolean  RewindData();
>   void OnCancel();
> };
>
> The other option is to do something pull-based instead.  The network stack would provide the buffer, and then wait for a callback.  When buffering is not being used, this would require two fewer copies.  We could even just not support buffering, and leave it up to the client, if we so desired.  The network stack has complete control over its buffer size.  Downsides are that it doesn't work with the old chunked upload model, and there are some buffer ownership issues on cancellation that need to be worked through.  API would look something like:
>
> class UploadStream {
>   UploadStream(DataProvider provider, Boolean bufferData);
>   // result > 0 means data was read, result == 0 means we're done, result < 0 means error.
>   void OnReadComplete(int result);
>   void OnRewindComplete(Boolean success);
> };
>
> interface DataProvider {
>   // The provider writes to the buffer, and calls OnReadComplete when done.
>   // It may be possible to even reuse the same buffer for all calls to this, but that may be too complicated,
>   // without adding an extra copy.
>   void GetData(ByteBuffer buffer);
>   // Embedder must call OnRewindComplete once done.
>   void RewindData();
>   // We'll probably need to require the embedder call into UploadStream to acknowledge this,
>   // so it can safely free the native buffer - definitely need to think about this case a bit more.
>   void OnCancel();
> };
>

I'll have to think about this more as well. We have had something similar in our other APIs (like SSL), and it's been really tricky to use.

A meaningful experiment would be to work from some sample code backwards, and let that gauge the complexity of the interfaces. I feel one of these (won't say which) will be substantially easier and more natural to use.

> So...  Anyone have any thoughts?  Certainly open to other ideas.
>
> Note:  In both cases, I'm ignoring the threading model, but I suspect in the first case we'd just take an Executor, and require the UploadStream be called on that thread.  In the second case, I believe we can pretty easily allow the UploadStream to be called from any thread, though we'd still want an executor to know what thread to call into the DataProvider on.  We could also take an optional length, to allow the same interface to be used for non-chunked uploads as well.
>

> --
> You received this message because you are subscribed to the Google Groups "net-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to net-dev+u...@chromium.org.
> To post to this group, send email to net...@chromium.org.
> To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/net-dev/CAEK7mvpRAys-NPiwrjLRurQdA9iiuG15XTGXK_NUv1iNth9u9A%40mail.gmail.com.

Matt Menke

Aug 22, 2014, 1:28:21 PM
to Ryan Sleevi, net-dev
On Fri, Aug 22, 2014 at 12:17 PM, Ryan Sleevi <rsl...@chromium.org> wrote:


On Aug 22, 2014 9:06 AM, "'Matt Menke' via net-dev" <net...@chromium.org> wrote:
>
> For anyone unfamiliar with Cronet, it's a project to give the Chrome network stack an officially supported API for external embedders.  The project is currently focused on Android and iOS wrappers.
>
> Currently, the network stack supports two basic types of uploads:  Those where the size is known in advance, which consist of one or more files and/or memory buffers, and chunked uploads, where the size is not known.  Because of lack of server support, and existing web APIs don't need it, chunked uploads are only used for internal Chrome requests, to the extent of my knowledge.

Nope. We use these on the Web in Chromium. Voice search, for example.

XHRs don't support chunked uploads, do they?

>
> Currently, the way the network stack supports chunked uploads (Both within Chrome and outside of Chrome) is for the embedder to just keep on pushing data to the UploadDataStream object, which buffers all the data pushed to it until the request is complete.  The buffering is needed for retries and redirects.  There's no way for an embedder to know when the stream actually needs more data, so they just push data as soon as they get it.

Are you sure about this? I worked with the voice team on refactoring UploadDataStream so that the IO completion of the previous write is the signal to write more data.

I'm pretty sure.  If we know the size in advance, ElementReaders can provide data on demand.  But if we're doing a chunked upload, this is our only interface.  The HttpStream doesn't have support for another method of chunking, and UploadElementReaders are expected to always know their size after initialization.

Also, if you don't buffer, reusing a network connection is always dangerous - we have to retry on certain errors, even with POSTs (Just imagine the case where a middlebox silently timed out the connection).
 

> This may be convenient when streaming video or audio as it's being recorded, but it relinquishes all control over the size of the buffer, which grows without bound.

Agreed.

>
> Currently, Cronet on Android has a Java interface for this push-based chunked behavior, which ends up copying all data twice on its way to the network stack.

That's odd, as we explicitly moved away from that sort of interface after it was repeatedly showing perf and correctness issues in Chrome's implementation.

> It also has an interface to allow uploads when data size is known, that does not support retries or redirects.  The latter API uses a blocking API to read data, but does manage to use two fewer copies.

Blocking?

:(

>
> Neither of these APIs is great, and we're looking to replace them with a single non-blocking API.  I've been toying with two different designs:  One works just like the chunked design, except it allows buffering to be disabled, and has a notification when more data is needed and when the data is to be rewound.  Advantage is it's completely compatible with the current chunked behavior, and works quite easily with the just want to push streaming input case.  It also gives the embedder control over how much data is sent to the network stack at once.  Downside is it has the same two extra copies.

Where are the copies?

We copy from the embedder-owned buffer to the UploadStream's buffer, and then we copy from the UploadStream's buffer to the buffer used to write to the socket.  With the other API, we can write directly into the buffer used to write to the socket (Unless buffering is enabled).
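To make that copy accounting concrete, here's a toy sketch of the two paths; the buffer names are made up, and the "copies" counted are just the intermediate staging steps described above, not the data generation itself.

```java
import java.nio.ByteBuffer;

// Toy illustration of the copy counts being discussed: the push path
// stages data through an intermediate stream-owned buffer (two copies),
// while the pull path lets the provider fill the socket's write buffer
// directly (no extra copies). Buffer names are invented for the sketch.
public class CopyCountDemo {
  static int pushPathCopies(byte[] embedderData) {
    int copies = 0;
    byte[] streamBuffer = new byte[embedderData.length];
    System.arraycopy(embedderData, 0, streamBuffer, 0, embedderData.length);
    copies++;  // copy 1: embedder buffer -> UploadStream's private buffer
    ByteBuffer socketBuffer = ByteBuffer.allocate(embedderData.length);
    socketBuffer.put(streamBuffer);
    copies++;  // copy 2: UploadStream's buffer -> socket write buffer
    return copies;
  }

  static int pullPathCopies(byte[] embedderData) {
    ByteBuffer socketBuffer = ByteBuffer.allocate(embedderData.length);
    socketBuffer.put(embedderData);  // provider writes the socket buffer directly
    return 0;  // no staging copies beyond producing the data
  }

  public static void main(String[] args) {
    byte[] data = "chunk".getBytes();
    System.out.println(pushPathCopies(data) - pullPathCopies(data));
  }
}
```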
 

>  Simplified API looks something like this:
>
> class UploadStream {
>   UploadStream(DataPusher pusher, Boolean bufferData);

Why should the creator specify buffering? Isn't that something internal to the net stack? Isn't the assumption always buffer? Or is cronet exposing some API to explicitly disable redirect handling?

If we don't buffer, we hard fail on method-preserving redirects and retries, which is something that would matter to the embedder.  If we do buffer, we have to buffer it all, which can use a ton of memory.  I don't think we're in a position to make the decision for the embedder here.  There's a very real tradeoff.

>   // Can be called at any point.  Data is copied into our private buffer when called.
>   void AppendData(ByteBuffer buffer);
>   // Just fails the request.
>   void OnReadError();
> };
>
> interface DataPusher {
>   // Called when all appended data has been sent.  Can be ignored,
>   // replied to synchronously, or replied to asynchronously.
>   void OnNeedsData();

This is effectively an edge-triggered notification, but the majority of Chromium has preferred level triggered (or at least ways to query the state).

This seems harder to reason about if the caller is going to try to handle this asynchronously.

I agree.  I was thinking about this approach when I was focusing on preserving behavior, but sufficiently hated the current behavior that I wanted to also support something I viewed as more reasonable.

>   // Returns false if rewind isn't supported.
>   Boolean  RewindData();
>   void OnCancel();
> };
>
> The other option is to do something pull-based instead.  The network stack would provide the buffer, and then wait for a callback.  When buffering is not being used, this would require two fewer copies.  We could even just not support buffering, and leave it up to the client, if we so desired.  The network stack has complete control over its buffer size.  Downsides are that it doesn't work with the old chunked upload model, and there are some buffer ownership issues on cancellation that need to be worked through.  API would look something like:
>
> class UploadStream {
>   UploadStream(DataProvider provider, Boolean bufferData);
>   // result > 0 means data was read, result == 0 means we're done, result < 0 means error.
>   void OnReadComplete(int result);
>   void OnRewindComplete(Boolean success);
> };
>
> interface DataProvider {
>   // The provider writes to the buffer, and calls OnReadComplete when done.
>   // It may be possible to even reuse the same buffer for all calls to this, but that may be too complicated,
>   // without adding an extra copy.
>   void GetData(ByteBuffer buffer);
>   // Embedder must call OnRewindComplete once done.
>   void RewindData();
>   // We'll probably need to require the embedder call into UploadStream to acknowledge this,
>   // so it can safely free the native buffer - definitely need to think about this case a bit more.
>   void OnCancel();
> };
>

I'll have to think about this more as well. We have had something similar in our other APIs (like SSL), and it's been really tricky to use.

A meaningful experiment would be to work from some sample code backwards, and let that gauge the complexity of the interfaces. I feel one of these (won't say which) will be substantially easier and more natural to use.

I think that sounds reasonable.  While I currently prefer the pull-based approach, I think it does have some serious concerns about cleanup if cancelled, and it breaks the existing model.  Current embedders would need to adapt to the new API, or use a rather ugly wrapper around the old one, though there aren't too many Cronet embedders yet.

Something else to consider:  Both APIs as written above rely completely on (possibly re-entrant) callbacks, but we could modify the pull-based model to be a bit more Chrome-like, with return values indicating synchronous or asynchronous completion.  That would save us a Java->C++ transition.  Not sure how high-cost that is.
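A rough sketch of what that Chrome-like variant might look like: a read returns a byte count when it can complete synchronously, or an ERR_IO_PENDING sentinel when it will invoke the completion callback later.  The constant's value and all names here are assumptions for illustration, not a real interface.

```java
import java.nio.ByteBuffer;

// Sketch of a "return value indicates sync vs. async completion" variant
// of the pull model. ERR_IO_PENDING's value and the names are invented.
public class SyncAsyncDemo {
  static final int ERR_IO_PENDING = -1;

  interface Completion { void onReadComplete(int result); }

  // Answers synchronously while it has data cached, and asynchronously
  // (simulated here by a stored callback) otherwise.
  static class Provider {
    private byte[] cached;
    private Completion pending;

    Provider(byte[] cached) { this.cached = cached; }

    int getData(ByteBuffer buffer, Completion completion) {
      if (cached != null) {
        buffer.put(cached);   // data available: complete synchronously
        int n = cached.length;
        cached = null;
        return n;             // no callback will be invoked
      }
      pending = completion;   // data not ready: complete later
      return ERR_IO_PENDING;
    }

    void dataArrived(ByteBuffer buffer, byte[] data) {
      buffer.put(data);
      Completion c = pending;
      pending = null;
      c.onReadComplete(data.length);  // async completion
    }
  }

  static String run() {
    StringBuilder log = new StringBuilder();
    Provider provider = new Provider("sync".getBytes());
    ByteBuffer buf = ByteBuffer.allocate(16);

    int result = provider.getData(buf, r -> log.append("async:").append(r));
    log.append("first:").append(result).append(' ');

    result = provider.getData(buf, r -> log.append("async:").append(r));
    if (result == ERR_IO_PENDING) log.append("pending ");
    provider.dataArrived(buf, "later".getBytes());
    return log.toString();
  }

  public static void main(String[] args) { System.out.println(run()); }
}
```

The synchronous return path is the one that avoids a callback (and hence, in the real setting, a Java->C++ transition) per read.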

Ryan Sleevi

Aug 22, 2014, 1:36:35 PM
to Matt Menke, net-dev


On Aug 22, 2014 10:28 AM, "Matt Menke" <mme...@google.com> wrote:
>
> On Fri, Aug 22, 2014 at 12:17 PM, Ryan Sleevi <rsl...@chromium.org> wrote:
>>
>>
>> On Aug 22, 2014 9:06 AM, "'Matt Menke' via net-dev" <net...@chromium.org> wrote:
>> >
>> > For anyone unfamiliar with Cronet, it's a project to give the Chrome network stack an officially supported API for external embedders.  The project is currently focused on Android and iOS wrappers.
>> >
>> > Currently, the network stack supports two basic types of uploads:  Those where the size is known in advance, which consist of one or more files and/or memory buffers, and chunked uploads, where the size is not known.  Because of lack of server support, and existing web APIs don't need it, chunked uploads are only used for internal Chrome requests, to the extent of my knowledge.
>>
>> Nope. We use these on the Web in Chromium. Voice search, for example.
>
> XHRs don't support chunked uploads, do they?

Voice search does it as a form post with chunked upload.

I think you may be confusing the two ways we have to do voice search - one within web content, the other within Chromium itself (via URLRequest)

>>
>> >
>> > Currently, the way the network stack supports chunked uploads (Both within Chrome and outside of Chrome) is for the embedder to just keep on pushing data to the UploadDataStream object, which buffers all the data pushed to it until the request is complete.  The buffering is needed for retries and redirects.  There's no way for an embedder to know when the stream actually needs more data, so they just push data as soon as they get it.
>>
>> Are you sure about this? I worked with the voice team on refactoring UploadDataStream so that the IO completion of the previous write is the signal to write more data.
>
> I'm pretty sure.  If we know size in advance, ElementReaders can provide data on demand.  But if we're doing a chunked upload, this is our only interface.  The HttpStream doesn't have support for another method of chunking, and UploadElementReaders are expected to always know their size, after initialization.

Um, UploadDataStream explicitly doesn't use UploadElementReaders when chunking. It's initialized as chunked and uses AppendChunk directly. The only requirement is that each chunk have a discrete size, which is unsurprising.

Perhaps I misunderstood your response?

>
> Also, if you don't buffer, reusing a network connection is always dangerous - we have to retry on certain errors, even with POSTs (Just imagine the case where a middlebox silently timed out the connection).
>  
>>
>> > This may be convenient when streaming video or audio as it's being recorded, but it relinquishes all control over the size of the buffer, which grows without bound.
>>
>> Agreed.
>>
>> >
>> > Currently, Cronet on Android has a Java interface for this push-based chunked behavior, which ends up copying all data twice on its way to the network stack.
>>
>> That's odd, as we explicitly moved away from that sort of interface after it was repeatedly showing perf and correctness issues in Chrome's implementation.
>>
>> > It also has an interface to allow uploads when data size is known, that does not support retries or redirects.  The latter API uses a blocking API to read data, but does manage to use two fewer copies.
>>
>> Blocking?
>>
>> :(
>>
>> >
>> > Neither of these APIs is great, and we're looking to replace them with a single non-blocking API.  I've been toying with two different designs:  One works just like the chunked design, except it allows buffering to be disabled, and has a notification when more data is needed and when the data is to be rewound.  Advantage is it's completely compatible with the current chunked behavior, and works quite easily with the just want to push streaming input case.  It also gives the embedder control over how much data is sent to the network stack at once.  Downside is it has the same two extra copies.
>>
>> Where are the copies?
>
> We copy from the embedder-owned buffer to the UploadStream's buffer, and then we copy from the UploadStream's buffer to the buffer used to write to the socket.  With the other API, we can write directly into the buffer used to write to the socket (Unless buffering is enabled).

Or using SPDY, QUIC, or TLS.

That was a real issue with the old code and something we tried to resolve. I'm surprised to hear you say it's still an issue.

Matt Menke

Aug 22, 2014, 1:43:05 PM
to Ryan Sleevi, net-dev
On Fri, Aug 22, 2014 at 1:36 PM, Ryan Sleevi <rsl...@chromium.org> wrote:


On Aug 22, 2014 10:28 AM, "Matt Menke" <mme...@google.com> wrote:
>
> On Fri, Aug 22, 2014 at 12:17 PM, Ryan Sleevi <rsl...@chromium.org> wrote:
>>
>>
>> On Aug 22, 2014 9:06 AM, "'Matt Menke' via net-dev" <net...@chromium.org> wrote:
>> >
>> > For anyone unfamiliar with Cronet, it's a project to give the Chrome network stack an officially supported API for external embedders.  The project is currently focused on Android and iOS wrappers.
>> >
>> > Currently, the network stack supports two basic types of uploads:  Those where the size is known in advance, which consist of one or more files and/or memory buffers, and chunked uploads, where the size is not known.  Because of lack of server support, and existing web APIs don't need it, chunked uploads are only used for internal Chrome requests, to the extent of my knowledge.
>>
>> Nope. We use these on the Web in Chromium. Voice search, for example.
>
> XHRs don't support chunked uploads, do they?

Voice search does it as a form post with chunked upload.

I think you may be confusing the two ways we have to do voice search - one within web content, the other within Chromium itself (via URLRequest)

>>
>> >
>> > Currently, the way the network stack supports chunked uploads (Both within Chrome and outside of Chrome) is for the embedder to just keep on pushing data to the UploadDataStream object, which buffers all the data pushed to it until the request is complete.  The buffering is needed for retries and redirects.  There's no way for an embedder to know when the stream actually needs more data, so they just push data as soon as they get it.
>>
>> Are you sure about this? I worked with the voice team on refactoring UploadDataStream so that the IO completion of the previous write is the signal to write more data.
>
> I'm pretty sure.  If we know size in advance, ElementReaders can provide data on demand.  But if we're doing a chunked upload, this is our only interface.  The HttpStream doesn't have support for another method of chunking, and UploadElementReaders are expected to always know their size, after initialization.

Um, UploadDataStream explicitly doesn't use UploadElementReaders when chunking. It's initialized as chunked and uses AppendChunk directly. The only requirement is that each chunk have a discrete size, which is unsurprising.

Perhaps I misunderstood your response?

UploadDataStream provides no callbacks to inform anything when it needs more data.  And it's not completely clear from the code that it's impossible to hack up an UploadElementReader to allow it, which is why I was arguing that's also not possible.
 

>
> Also, if you don't buffer, reusing a network connection is always dangerous - we have to retry on certain errors, even with POSTs (Just imagine the case where a middlebox silently timed out the connection).
>  
>>
>> > This may be convenient when streaming video or audio as it's being recorded, but it relinquishes all control over the size of the buffer, which grows without bound.
>>
>> Agreed.
>>
>> >
>> > Currently, Cronet on Android has a Java interface for this push-based chunked behavior, which ends up copying all data twice on its way to the network stack.
>>
>> That's odd, as we explicitly moved away from that sort of interface after it was repeatedly showing perf and correctness issues in Chrome's implementation.
>>
>> > It also has an interface to allow uploads when data size is known, that does not support retries or redirects.  The latter API uses a blocking API to read data, but does manage to use two fewer copies.
>>
>> Blocking?
>>
>> :(
>>
>> >
>> > Neither of these APIs is great, and we're looking to replace them with a single non-blocking API.  I've been toying with two different designs:  One works just like the chunked design, except it allows buffering to be disabled, and has a notification when more data is needed and when the data is to be rewound.  Advantage is it's completely compatible with the current chunked behavior, and works quite easily with the just want to push streaming input case.  It also gives the embedder control over how much data is sent to the network stack at once.  Downside is it has the same two extra copies.
>>
>> Where are the copies?
>
> We copy from the embedder-owned buffer to the UploadStream's buffer, and then we copy from the UploadStream's buffer to the buffer used to write to the socket.  With the other API, we can write directly into the buffer used to write to the socket (Unless buffering is enabled).

Or using SPDY, QUIC, or TLS.

Of course - I just wasn't sure how else to describe the buffer.  "The HttpBasicStream's write buffer" seemed a bit less clear, though perhaps a bit more accurate.

Matt Menke

Aug 22, 2014, 2:02:06 PM
to Ryan Sleevi, net-dev
On Fri, Aug 22, 2014 at 1:43 PM, Matt Menke <mme...@google.com> wrote:
On Fri, Aug 22, 2014 at 1:36 PM, Ryan Sleevi <rsl...@chromium.org> wrote:


On Aug 22, 2014 10:28 AM, "Matt Menke" <mme...@google.com> wrote:
>
> On Fri, Aug 22, 2014 at 12:17 PM, Ryan Sleevi <rsl...@chromium.org> wrote:
>>
>>
>> On Aug 22, 2014 9:06 AM, "'Matt Menke' via net-dev" <net...@chromium.org> wrote:
>> >
>> > For anyone unfamiliar with Cronet, it's a project to give the Chrome network stack an officially supported API for external embedders.  The project is currently focused on Android and iOS wrappers.
>> >
>> > Currently, the network stack supports two basic types of uploads:  Those where the size is known in advance, which consist of one or more files and/or memory buffers, and chunked uploads, where the size is not known.  Because of lack of server support, and existing web APIs don't need it, chunked uploads are only used for internal Chrome requests, to the extent of my knowledge.
>>
>> Nope. We use these on the Web in Chromium. Voice search, for example.
>
> XHRs don't support chunked uploads, do they?

Voice search does it as a form post with chunked upload.

I think you may be confusing the two ways we have to do voice search - one within web content, the other within Chromium itself (via URLRequest)

>>
>> >
>> > Currently, the way the network stack supports chunked uploads (Both within Chrome and outside of Chrome) is for the embedder to just keep on pushing data to the UploadDataStream object, which buffers all the data pushed to it until the request is complete.  The buffering is needed for retries and redirects.  There's no way for an embedder to know when the stream actually needs more data, so they just push data as soon as they get it.
>>
>> Are you sure about this? I worked with the voice team on refactoring UploadDataStream so that the IO completion of the previous write is the signal to write more data.
>
> I'm pretty sure.  If we know size in advance, ElementReaders can provide data on demand.  But if we're doing a chunked upload, this is our only interface.  The HttpStream doesn't have support for another method of chunking, and UploadElementReaders are expected to always know their size, after initialization.

Um, UploadDataStream explicitly doesn't use UploadElementReaders when chunking. It's initialized as chunked and uses AppendChunk directly. The only requirement is that each chunk have a discrete size, which is unsurprising.

Perhaps I misunderstood your response?

UploadDataStream provides no callbacks to inform anything when it needs more data.  And it's not completely clear from code that it's impossible to hack up an UploadElementReader to allow it, which is why I was arguing that's also not possible.

Oh, sorry - it is immediately obvious.  I had been thinking UploadElementReaders could be appended to an UploadDataStream after creation, but before it was passed to the URLRequest, but that's not the case (Other than for chunked uploads).

Ryan Sleevi

Aug 22, 2014, 2:08:19 PM
to Matt Menke, Ryan Sleevi, net-dev

No, the caller can't explicitly supply UploadElementReaders. They can only add chunks.

I'm not sure the best way to describe it, but I think we're talking about related-but-different aspects.

UploadDataStream always expects more data, until a terminal chunk is sent. After all, the request can't complete until it does.

I think you're looking for/describing a way for UploadDataStream to signal back-off to the embedder? Where it's not the terminal chunk, but there's too much data internally?

Matt Menke

Aug 22, 2014, 2:11:09 PM
to Ryan Sleevi, net-dev

I thought that's what you were suggesting existed.  If that's not the case, I have no idea what "I worked with the voice team on refactoring UploadDataStream so that the IO completion of the previous write is the signal to write more data." means.

If you mean it's the signal to write more data from the buffer to the HttpStreamBase's buffer, I don't see how that contradicts what I was saying in the first place.

c...@google.com

Aug 25, 2014, 10:44:04 AM
to net...@chromium.org, mme...@google.com
I vote for pull-based, because it makes it much easier to handle backpressure. On mobile, memory is precious, and having a slow network and a bunch of high-bandwidth data sources (camera, microphone) makes a push API get into bad states. Using a pull-based API makes this explicit, at least.

As for blocking vs. non-blocking, AsynchronousFileChannel and SeekableByteChannel were added in Java 7, but we're a few years away from being able to use Java 7 features that aren't bytecode-compatible on Android. That said, we can still reimplement that same interface and paper over the fact that FileChannel doesn't actually implement it, although the methods are the same. We could maintain our own Executor and perform file operations there. With this design, it's reasonable to still have a fixed thread pool (so no thread-per-request), and we'd have a much simpler migration path for clients once we can drop support for pre-KitKat Android and use AsynchronousFileChannel and the real SeekableByteChannel.

interface SeekableByteChannel extends ReadableByteChannel {
  /** Returns this channel's position. */
  long position();

  /** Sets this channel's position. */
  SeekableByteChannel position(long newPosition);
}

void setUploadChannel(SeekableByteChannel channel);
void setUploadChannel(FileChannel channel);

As I've thought about this more and more, I think it's best for chunked uploads of almost all types to go through a file on disk or through a Pipe with configureBlocking set to false. What I haven't figured out is what the cleanest API to specify chunked upload with a FileChannel is.
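To make the pull model above concrete, here is a minimal sketch of what an embedder-side implementation of the proposed interface might look like for an in-memory body. All names here are hypothetical illustrations, not the actual Cronet API: the network stack would pull data via read(), and on a retry or redirect would rewind via position(0) instead of relying on an unbounded push buffer.

```java
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory implementation of the proposed seekable channel.
// The network stack pulls with read() and, on a retry or redirect, calls
// position(0) to replay the body -- no unbounded buffering on our side.
final class ByteArrayUploadChannel implements ReadableByteChannel {
    private final byte[] data;
    private int position;
    private boolean open = true;

    ByteArrayUploadChannel(byte[] data) { this.data = data; }

    /** Returns this channel's position. */
    long position() { return position; }

    /** Sets this channel's position (e.g. rewinding to 0 for a redirect). */
    ByteArrayUploadChannel position(long newPosition) {
        position = (int) newPosition;
        return this;
    }

    @Override
    public int read(ByteBuffer dst) {
        if (position >= data.length) return -1;  // End of body.
        int n = Math.min(dst.remaining(), data.length - position);
        dst.put(data, position, n);
        position += n;
        return n;
    }

    @Override public boolean isOpen() { return open; }
    @Override public void close() { open = false; }
}

public class UploadChannelDemo {
    public static void main(String[] args) {
        ByteArrayUploadChannel ch =
            new ByteArrayUploadChannel("hello".getBytes(StandardCharsets.UTF_8));
        ByteBuffer buf = ByteBuffer.allocate(3);  // Small, pull-sized reads.
        System.out.println(ch.read(buf));  // 3 ("hel")
        ch.position(0);                    // Redirect: replay from the start.
        buf.clear();
        System.out.println(ch.read(buf));  // 3 again
        ch.close();
        System.out.println(ch.isOpen());   // false
    }
}
```

The embedder controls the buffer size (the ByteBuffer it is handed), which is exactly the backpressure property the push API lacks.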

William Chan (陈智昌)

Aug 25, 2014, 4:42:53 PM
to Matt Menke, net-dev
I checked with Patrick McManus on the FF side, and he said that they control buffering by indicating whether the request body source is seekable or not. If seekable, then the HTTP stack will record the body (buffer it) and replay as needed (e.g. following redirects). Most of their consumers use this option. A non-seekable source can't follow redirects.

I think that designing for a pull-model makes the most sense in general. I see that there are some difficulties for cases like following redirects. My typical recommendation in cases like this is to have a layered API, where simple consumers can wrap calls. But ultimately, it's important to have tight control over the buffering, so we really ought to be exposing a pull-model API.
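The layering described above could look something like the following sketch (purely illustrative; ReplayableChannel and its methods are hypothetical names, not an actual Cronet or Chromium API): the core pull API stays buffer-free, and consumers who want redirect/retry support wrap their non-seekable source in a recorder that can replay it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical sketch of the layering idea: the core API pulls from a
// non-seekable channel and cannot replay; this wrapper records every byte
// it hands out so the body can be replayed after a redirect. Consumers who
// want redirects opt into the buffering cost; everyone else avoids it.
final class ReplayableChannel implements ReadableByteChannel {
    private final ReadableByteChannel source;
    private final ByteArrayOutputStream recorded = new ByteArrayOutputStream();
    private byte[] replay;   // Non-null while replaying recorded bytes.
    private int replayPos;

    ReplayableChannel(ReadableByteChannel source) { this.source = source; }

    /** Restart the body from the beginning (e.g. to follow a redirect). */
    void rewind() {
        replay = recorded.toByteArray();
        replayPos = 0;
    }

    @Override
    public int read(ByteBuffer dst) throws IOException {
        // Serve previously recorded bytes first.
        if (replay != null && replayPos < replay.length) {
            int n = Math.min(dst.remaining(), replay.length - replayPos);
            dst.put(replay, replayPos, n);
            replayPos += n;
            return n;
        }
        // Then fall through to the live source, recording as we go.
        int start = dst.position();
        int n = source.read(dst);
        for (int i = 0; i < Math.max(n, 0); i++) {
            recorded.write(dst.get(start + i));  // Absolute get; no position change.
        }
        return n;
    }

    @Override public boolean isOpen() { return source.isOpen(); }
    @Override public void close() throws IOException { source.close(); }
}
```

A simple consumer would never call rewind() and pays nothing beyond the recording; a redirect-following layer calls rewind() and replays. The memory cost is bounded by how much body has actually been sent, matching the FF behavior described above.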



Matt Menke

Aug 27, 2014, 4:34:02 PM
to William Chan (陈智昌), net-dev
One concern raised about these APIs is that they involve a lot of thread hopping, which is apparently fairly heavyweight under Java.  I don't think we want to be doing blocking IO on the network thread, but it's something to think about.  We could imagine having an extra buffer in flight, though that adds a copy, or making the read size a tunable parameter.

William Chan (陈智昌)

Aug 27, 2014, 4:36:05 PM
to Matt Menke, net-dev
Can you explain the thread hopping comment in more detail? And shouldn't read size be tunable? It's not obvious to me how you would control buffer utilization without that primitive.

Matt Menke

Aug 27, 2014, 4:44:01 PM
to William Chan (陈智昌), net-dev
All callbacks into Java to request upload data will be called on a thread other than the network thread (Which will also be a Java-managed thread), so we'll need to call into Java, and then hop over to another thread.  The current push-based API only does thread hops in C++.

You can control buffer utilization Java-side, though it doesn't work quite as well...I honestly hadn't given much thought to this, but tunable buffer sizes will either need an extra copy, or small code changes in SPDY, QUIC, and HTTP logic.

William Chan (陈智昌)

Aug 27, 2014, 4:56:44 PM
to Matthew Menke, net-dev

On Aug 27, 2014 1:44 PM, "Matt Menke" <mme...@google.com> wrote:
>
> All callbacks into Java to request upload data will be called on a thread other than the network thread (Which will also be a Java-managed thread),

Why? Is this a technical constraint we aren't able to overcome?

> so we'll need to call into Java, and then hop over to another thread.  The current push-based API only does thread hops in C++.
>
> You can control buffer utilization Java-side, though it doesn't work quite as well...I honestly hadn't given much thought to this, but tunable buffer sizes will either need an extra copy, or small code changes in SPDY, QUIC, and HTTP logic.

I'm confused why. We definitely have internal buffer copies in our network stack but URLRequest supports tunable read buffer sizes. Why is our java interface different?

Matt Menke

Aug 27, 2014, 5:03:00 PM
to William Chan (陈智昌), net-dev
On Wed, Aug 27, 2014 at 4:56 PM, William Chan (陈智昌) <will...@chromium.org> wrote:

On Aug 27, 2014 1:44 PM, "Matt Menke" <mme...@google.com> wrote:
>
> All callbacks into Java to request upload data will be called on a thread other than the network thread (Which will also be a Java-managed thread),

Why? Is this a technical constraint we aren't able to overcome?

We're called by a Java object, which gives us a Java Executor (Or some other Java interface that allows thread hops, so the caller manages threads).  I don't think we can post a message from C++ directly to an Executor, or to a thread whose message loop (Or equivalent) Java is managing in some other way.

That having been said, I really don't know anything about Java threading. 

> so we'll need to call into Java, and then hop over to another thread.  The current push-based API only does thread hops in C++.
>
> You can control buffer utilization Java-side, though it doesn't work quite as well...I honestly hadn't given much thought to this, but tunable buffer sizes will either need an extra copy, or small code changes in SPDY, QUIC, and HTTP logic.

I'm confused why. We definitely have internal buffer copies in our network stack but URLRequest supports tunable read buffer sizes. Why is our java interface different?

We don't support tunable upload buffers, actually.  It's always 16k (I had thought it was 32k):  https://code.google.com/p/chromium/codesearch#chromium/src/net/http/http_stream_parser.cc&sq=package:chromium&type=cs&l=28

William Chan (陈智昌)

Aug 27, 2014, 5:12:43 PM
to Matt Menke, net-dev
We're called by a Java object, which gives us a Java Executor (Or some other Java interface which allows threadhops, so the caller manages threads).  I don't think we can post a message from C++ directly to an Executor, or a thread whose message loop (Or equivalent) Java is managing in some other way.

That having been said, I really don't know anything about Java threading. 

My understanding is a Java Executor is basically a TaskRunner. They're abstract interfaces for posting work. So as long as the network stack internally uses the network TaskRunner everywhere (notably, we don't, because we directly access MessageLoop::current() in some places still), then when instantiating the network TaskRunner, it seems conceivable we can have it forward messages to the Java Executor.
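As a rough illustration of the Executor-as-TaskRunner analogy (hypothetical names, plain java.util.concurrent, no JNI involved): a single-threaded Executor serializes posted work the way a SingleThreadTaskRunner does, so callers only ever say "post this work here" and never name a thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: a single-threaded Executor behaves like a
// SingleThreadTaskRunner. Tasks posted from any thread run serialized on
// the runner's one thread, so no locking is needed between them -- which
// is what forwarding the network TaskRunner's posts would rely on.
public class ExecutorAsTaskRunner {
    /** Posts n increments to a single-threaded "network runner" and waits. */
    static int runTasks(int n) throws InterruptedException {
        ExecutorService networkRunner = Executors.newSingleThreadExecutor();
        AtomicInteger counter = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            networkRunner.execute(counter::incrementAndGet);  // "PostTask"
        }
        networkRunner.shutdown();
        networkRunner.awaitTermination(5, TimeUnit.SECONDS);
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTasks(100));  // 100
    }
}
```

The open question in the thread is only how the C++ side would reach such an Executor; the Executor interface itself maps cleanly onto TaskRunner semantics.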
 

I'm confused why. We definitely have internal buffer copies in our network stack but URLRequest supports tunable read buffer sizes. Why is our java interface different?

We don't support tunable upload buffers, actually.  It's always 16k (I had thought it was 32k):  https://code.google.com/p/chromium/codesearch#chromium/src/net/http/http_stream_parser.cc&sq=package:chromium&type=cs&l=28

Sorry, we had a terminology conflict. I assumed when you said read, you meant URLRequest::Read(). I didn't realize read buffer size meant reading from the upload buffers. Fair enough, but I also believe that our current upload API should be fixed as well.

Matt Menke

Aug 27, 2014, 5:21:25 PM
to William Chan (陈智昌), net-dev

My understanding is a Java Executor is basically a TaskRunner. They're abstract interfaces for posting work. So as long as the network stack internally uses the network TaskRunner everywhere (notably, we don't, because we directly access MessageLoop::current() in some places still), then when instantiating the network TaskRunner, it seems conceivable we can have it forward messages to the Java Executor.

I'd be surprised if we could get a native interface to an Executor.  I'm also not really sure what the cost is here - is it starting a Java task on a thread (In which case, we obviously have to do that once anyway, so while it does double the penalty, at least we have an idea how bad it is relative to what we're already doing), passing objects between threads, or some specific Java->Java hop thing...
 
 

Sorry, we had a terminology conflict. I assumed when you said read, you meant URLRequest::Read(). I didn't realize read buffer size meant reading from the upload buffers. Fair enough, but I also believe that our current upload API should be fixed as well.

I absolutely agree here, but thought it was worth mentioning that a nice API may have a real performance cost.

William Chan (陈智昌)

Aug 27, 2014, 5:32:50 PM
to Matt Menke, net-dev
I'd be surprised if we could get a native interface to an Executor.  I'm also not really sure what the cost is here - is it starting a Java task on a thread (In which case, we obviously have to do that once, anyways, so which it does double the penalty, at least we have an idea how bad it is, relative to what we're already doing), passing objects between threads, or some specific Java->Java hop thing...

Sounds like we need to consult a domain expert before being able to make a decision here. Do we have one handy?
 
 
 

We don't support tunable upload buffers, actually.  It's always 16k (I had thought it was 32k):  https://code.google.com/p/chromium/codesearch#chromium/src/net/http/http_stream_parser.cc&sq=package:chromium&type=cs&l=28

Sorry, we had a terminology conflict. I assumed when you said read, you meant URLRequest::Read(). I didn't realize read buffer size meant reading from the upload buffers. Fair enough, but I also believe that our current upload API should be fixed as well.

I absolutely agree here, but thought it was worth mentioning that a nice API may have a real performance cost.

Good APIs are all about tradeoffs, so if the performance is unacceptable, then we need to fix the API.

Ryan Sleevi

Aug 27, 2014, 6:02:14 PM
to William Chan (陈智昌), Matt Menke, net-dev, Philippe Liard, Marcus Bulach
Sounds like we need to consult a domain expert before being able to make a decision here. Do we have one handy?

Summoning two domain experts (bulach@, pliard@)
 

William Chan (陈智昌)

Aug 27, 2014, 6:19:25 PM
to Ryan Sleevi, Matt Menke, net-dev, Philippe Liard, Marcus Bulach
Summoning two domain experts (bulach@, pliard@)

Ryan told me that in Android, Java callbacks can only run on the UI thread. And Java threads are expensive due to memory overhead (8MB?) amongst other things. Scheduling is mostly OK, although Android does have some thread scheduling issues, but I'll ignore those for now. But there's a cost for the JNI marshalling/unmarshalling. In order to limit this cost and reduce copies, we'd probably want to pass file descriptors (so we can do something like sendfile()) if possible. It's unclear if the Android APIs support the necessary low-level platform access needed to pass descriptors rather than having to buffer in Java and marshal/unmarshal across JNI. Marcus? Philippe?

Marcus Bulach

Aug 27, 2014, 6:19:32 PM
to Ryan Sleevi, William Chan (陈智昌), Matt Menke, net-dev, Philippe Liard
Summoning two domain experts (bulach@, pliard@)

You're very kind, Ryan :)

I don't quite fully follow the discussion, but anyways, let me try...
I don't think you can get a native interface to an Executor, but you're right, it's essentially a TaskRunner interface.
Having said that, MessageLoop::current() in the C++ world is roughly equivalent to composing a Handler and a Looper in Java, and then posting a Runnable (aka a Task).

In other words, from the caller thread, you can stash something like "Handler h = new Handler(Looper.myLooper())", then later on, from other threads, do an h.post()...
Caveat: a "pure" Java thread may not have a Looper; that's an Android component.

Long story:
In Chromium code, we tried to make dependencies crossing JNI boundaries a 1:1 relationship, i.e., one Java object uses / is used by one C++ object.
So if you need to do this thread hopping, make sure it's all implemented in Java land.
If you need to PostTask in C++, that's also totally fine...
Try to get hold of Java Executors, or any other such stuff, from C++ and you'll have a bad time :)

It introduces sort of a "diagonal" dependency that you surely don't want to manage, where one C++ class may end up depending on multiple Java classes or vice versa... it gets hairy quite quickly. Wrap everything you need in a single place, provide an API for that, and treat "the other side" (either Java or C++) as a single client.

Matt Menke

Aug 27, 2014, 6:26:22 PM
to William Chan (陈智昌), Ryan Sleevi, net-dev, Philippe Liard, Marcus Bulach
Java has its own way to handle concurrent file operations, so it was strongly suggested to us that getting a native file handle would mess with this and was just not a very good idea.

William Chan (陈智昌)

Aug 27, 2014, 6:31:13 PM
to Matt Menke, Ryan Sleevi, net-dev, Philippe Liard, Marcus Bulach
On Wed, Aug 27, 2014 at 3:26 PM, Matt Menke <mme...@google.com> wrote:
Java has its own way to handle concurrent file operations, so it was strongly suggested to us that getting a native file handle would mess with this and was just not a very good idea.

Being suboptimal with Java concurrent file operations is one thing, but being unable to sendfile()/splice()/etc is another. More info is needed in order to evaluate this tradeoff. Who suggested this to you and can they weigh in on this tradeoff?

Matt Menke

Aug 27, 2014, 6:32:10 PM
to William Chan (陈智昌), Ryan Sleevi, net-dev, Philippe Liard, Marcus Bulach, Charles Munger
[+clm]

Marcus Bulach

Aug 27, 2014, 6:32:33 PM
to William Chan (陈智昌), Ryan Sleevi, Matt Menke, net-dev, Philippe Liard, rt...@chromium.org
On Wed, Aug 27, 2014 at 6:19 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
On Wed, Aug 27, 2014 at 3:02 PM, Ryan Sleevi <rsl...@chromium.org> wrote:



On Wed, Aug 27, 2014 at 2:32 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
On Wed, Aug 27, 2014 at 2:21 PM, Matt Menke <mme...@google.com> wrote:
On Wed, Aug 27, 2014 at 5:12 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
On Wed, Aug 27, 2014 at 2:02 PM, Matt Menke <mme...@google.com> wrote:
On Wed, Aug 27, 2014 at 4:56 PM, William Chan (陈智昌) <will...@chromium.org> wrote:

On Aug 27, 2014 1:44 PM, "Matt Menke" <mme...@google.com> wrote:
>
> All callbacks into Java to request upload data will be called on a thread other than the network thread (Which will also be a Java-managed thread),

Why? Is this a technical constraint we aren't able to overcome?

We're called by a Java object, which gives us a Java Executor (Or some other Java interface which allows thread hops, so the caller manages threads).  I don't think we can post a message from C++ directly to an Executor, or a thread whose message loop (Or equivalent) Java is managing in some other way.

That having been said, I really don't know anything about Java threading. 

My understanding is a Java Executor is basically a TaskRunner. They're abstract interfaces for posting work. So as long as the network stack internally uses the network TaskRunner everywhere (notably, we don't, because we directly access MessageLoop::current() in some places still), then when instantiating the network TaskRunner, it seems conceivable we can have it forward messages to the Java Executor.

I'd be surprised if we could get a native interface to an Executor.  I'm also not really sure what the cost is here - is it starting a Java task on a thread (In which case, we obviously have to do that once, anyways, so while it does double the penalty, at least we have an idea how bad it is, relative to what we're already doing), passing objects between threads, or some specific Java->Java hop thing...

Sounds like we need to consult a domain expert before being able to make a decision here. Do we have one handy?

Summoning two domain experts (bulach@, pliard@)

Ryan told me that in Android, Java callbacks can only run on the UI thread.

Hmm, "java callbacks" is fairly broad :)
Like C++, I think callbacks normally happen on the caller's thread, and apps normally do this on their main/UI thread.
System callbacks are even more likely to be done in the UI thread, but there are APIs where you pass a Handler (aka MessageLoop):

 
And Java threads are expensive due to memory overhead (8MB?) amongst other things. Scheduling is mostly OK, although Android does have some thread scheduling issues, but I'll ignore those for now. But there's a cost for the JNI marshalling/unmarshalling. In order to limit this cost and reduce copies, we'd probably want to pass file descriptors (so we can do something like sendfile()) if possible. It's unclear if the Android APIs support the necessary low level platform access in order to pass descriptors rather than having to buffer in Java and marshall/unmarshall across JNI. Marcus? Philippe?

passing fd to java is very hard, there's no API to read from it..
iirc, rtoy@ played around with it for WebAudio...
another option maybe a local socket, but that may have even larger overheads and context switches than pure marshalling across JNI..
There's also java.nio that can potentially avoid that by making the buffer being managed by C++...
I never really used that though..
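
The java.nio option Marcus mentions can be sketched in plain Java: a direct ByteBuffer's backing memory lives outside the Java heap, so native code can get at the same bytes (via JNIEnv::GetDirectBufferAddress, or wrap an existing native allocation with JNIEnv::NewDirectByteBuffer) without a copy across the JNI boundary. The class and method names below are purely illustrative; the JNI side is indicated in comments only.

```java
import java.nio.ByteBuffer;

public class DirectBufferSketch {
    // A direct buffer's memory is off-heap; C++ code could obtain the raw
    // pointer with JNIEnv::GetDirectBufferAddress() and read it in place.
    public static ByteBuffer newUploadBuffer(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    public static void main(String[] args) {
        ByteBuffer buf = newUploadBuffer(16);
        buf.put("chunk".getBytes());   // Java side fills the buffer...
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);                  // ...native side would read the same memory.
        System.out.println(new String(out));
    }
}
```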

Philippe Liard

unread,
Aug 27, 2014, 6:35:41 PM8/27/14
to Marcus Bulach, Ryan Sleevi, William Chan (陈智昌), Matt Menke, net-dev
To confirm what Marcus said, a "regular" Java thread won't have a message loop. Is the Java thread that you are referring to an internal Java thread that you create/control (in that case you could setup a Looper if running on Android or an Executor implementation otherwise) or is it the Java thread (e.g. UI thread) that Cronet is running on (that you wouldn't control)?

Matt Menke

unread,
Aug 27, 2014, 6:48:00 PM8/27/14
to Philippe Liard, Marcus Bulach, Ryan Sleevi, William Chan (陈智昌), net-dev
We want to call into an embedder-controlled thread, with an Executor (Unless there's a reason to prefer a Looper, using the non-Android-specific class seems to make more sense), from the network thread, which is a C++ managed thread with a MessageLoop and all that fun stuff.

Marcus Bulach

unread,
Aug 27, 2014, 7:10:29 PM8/27/14
to Matt Menke, Philippe Liard, Ryan Sleevi, William Chan (陈智昌), net-dev
I suppose the embedder will be providing the Executor?

if so, then you could just hold on to it on your own java class, and expose something through your own JNI that can be called from the C++ network thread...

if you're creating your own Executor, how is it attaching to the embedder-controlled thread?

Matt Menke

unread,
Aug 27, 2014, 7:13:23 PM8/27/14
to Marcus Bulach, Philippe Liard, Ryan Sleevi, William Chan (陈智昌), net-dev, Charles Munger
Yea, it's passed in by the embedder.  The concern mentioned by one of our embedders was that Java thread hops are very costly.  In the current API for uploads, for instance, the embedder can just keep pushing data to the network stack, which buffers it internally on its own thread.  Our new (proposed) API has the network stack calling into the embedder as it needs data, using an embedder-provided executor, and they were very concerned about performance of the extra thread hops to request data.

Philippe Liard

unread,
Aug 27, 2014, 7:52:29 PM8/27/14
to Matt Menke, Marcus Bulach, Ryan Sleevi, William Chan (陈智昌), net-dev, Charles Munger
Are they more concerned about the extra latency that would be added by thread-hopping, or more by e.g. the waste of CPU cycles? I wouldn't worry too much about the latter given that this will be I/O bound. The former may make the transfer of individual data chunks be delayed by several milliseconds on Android though, especially in case of CPU contention. You could probably mitigate that by increasing the size of the chunks, as was previously suggested I believe, to avoid doing too many hops.

Passing tasks from a native thread to a Java thread sounds possible to me at least since in your design the embedder is required to provide the Executor. You would just need to do what Marcus suggested above.

Matt Menke

unread,
Aug 27, 2014, 8:05:55 PM8/27/14
to Philippe Liard, Marcus Bulach, Ryan Sleevi, William Chan (陈智昌), net-dev, Charles Munger
It sounded to me like the concern was CPU overhead, not latency.  If it's latency, we could easily just have two buffers, and rotate between them, writing from one to the socket, and getting data from the embedder on the other.
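
The two-buffer rotation Matt describes can be sketched like so: the network stack drains one buffer into the socket while the embedder fills the other, and the two are swapped when both sides are done. A minimal single-threaded sketch (names illustrative, no locking shown):

```java
import java.nio.ByteBuffer;

public class DoubleBuffer {
    private ByteBuffer filling;   // embedder writes here
    private ByteBuffer draining;  // network stack reads from here

    public DoubleBuffer(int capacity) {
        filling = ByteBuffer.allocateDirect(capacity);
        draining = ByteBuffer.allocateDirect(capacity);
        draining.flip(); // drain side starts empty
    }

    public ByteBuffer bufferToFill() { return filling; }
    public ByteBuffer bufferToDrain() { return draining; }

    // Called once the drain side is empty and the fill side has data.
    public void rotate() {
        ByteBuffer tmp = draining;
        draining = filling;
        draining.flip();   // prepare freshly filled data for reading
        tmp.clear();       // recycle the drained buffer for writing
        filling = tmp;
    }
}
```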

Charles Munger

unread,
Aug 27, 2014, 8:40:50 PM8/27/14
to Matt Menke, William Chan (陈智昌), net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard

Callbacks on android can happen on any thread, not only on the ui thread. The idea that thread contention could cause multi-millisecond delays is (I think) unsubstantiated unless there's a GC. There should not be much contention here. As for marshaling/unmarshalling data, since we're using direct ByteBuffers we don't actually have to do much work - it's basically handing off a pointer.

I will say that doing a disk read or write of the size typical for the stack in a typical pattern (i.e., sequential) is less than a millisecond on modern devices - flash storage, even crappy flash storage, handles this much better than hard drives do. Not all devices are modern however, so we should measure on an Asus transformer or similar device with notoriously bad storage speeds.

As for a CPU/memory tradeoff, CPU off the UI thread tends to be plentiful even on low end devices - it's garbage generation and UI thread work that causes dropped frames.

Passing file descriptors around isn't ideal, because in java files are guaranteed to be consistent per VM, and I think accessing them from native code breaks that guarantee.

Also worth noting that the interface is the same for the consumer regardless of whether we do the writing/reading on the network thread or not, which is nice. Same thing with two buffers vs one buffer. If we did do multiple buffers, we'd probably just have one per executor thread.

Raymond Toy

unread,
Aug 28, 2014, 1:49:38 PM8/28/14
to Marcus Bulach, William Chan (陈智昌), Ryan Sleevi, Matt Menke, net-dev, Philippe Liard
Yes, for WebAudio, a fd is passed to Java. The fd points to a file[1] which Java reads in as an encoded audio file and sends back the decoded data to C++ via a pipe (and hence another fd.)


[1] It used to be a chunk of shared memory (ashmem?) but on some devices that tickled some assert within Android. We switched to writing the data out to a file, creating an fd for that to hand to Java.

William Chan (陈智昌)

unread,
Aug 28, 2014, 2:02:35 PM8/28/14
to Matt Menke, Philippe Liard, Marcus Bulach, Ryan Sleevi, net-dev
Sorry for all the dumb questions. Why does the embedder-controlled Java thread need to be separate from the network thread? Is it impossible for the embedder to provide a java Executor which Cronet could wrap within a SequencedTaskRunner, and then as long as all //net code used this SequencedTaskRunner, we'd be good?
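
The "wrap an Executor as a SequencedTaskRunner" idea can be sketched in pure Java: queue tasks and hand them to the underlying Executor one at a time, so they run in order and never concurrently even if the Executor is backed by a pool. This is an illustrative pattern, not Cronet's actual implementation (and, as discussed below, it does not solve the event-pumping half of the problem):

```java
import java.util.ArrayDeque;
import java.util.concurrent.Executor;

public class SerializingExecutor implements Executor {
    private final Executor delegate;
    private final ArrayDeque<Runnable> queue = new ArrayDeque<>();
    private boolean running;

    public SerializingExecutor(Executor delegate) { this.delegate = delegate; }

    @Override
    public synchronized void execute(Runnable task) {
        queue.add(task);
        if (!running) {
            running = true;
            scheduleNext();
        }
    }

    // Run the next queued task on the delegate, then reschedule ourselves,
    // so at most one task is ever in flight.
    private void scheduleNext() {
        Runnable next;
        synchronized (this) {
            next = queue.poll();
            if (next == null) { running = false; return; }
        }
        final Runnable task = next;
        delegate.execute(() -> { task.run(); scheduleNext(); });
    }
}
```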

Matt Menke

unread,
Aug 28, 2014, 2:19:13 PM8/28/14
to William Chan (陈智昌), Philippe Liard, Marcus Bulach, Ryan Sleevi, net-dev
Something needs to block the thread while waiting for an event.  MessageLoop expects to be the one doing it in Chrome code, and Java expects to be the one doing it for a Java-managed thread.  I'd be surprised if Java provided an API to let its threads be controlled from C++.

William Chan (陈智昌)

unread,
Aug 28, 2014, 2:26:05 PM8/28/14
to Matt Menke, Philippe Liard, Marcus Bulach, Ryan Sleevi, net-dev
Let's distinguish event pumping from task running. When the Chromium network stack internally generates tasks for running in the future, it posts it to a C++ TaskRunner, which is an abstract interface (basically a Java Executor) for running/executing a task.

Now, for event pumping, you need a native event pump. Chromium provides concrete pumps, like MessagePumpLibevent (for POSIX systems that libevent supports). Java will have its own event pumping. So we'd either need a way to JNI wrap the C++ event pump so we could have a Java Executor implementation pump both the Java and JNI wrapper at the same time, or we'd have to separate them out into different threads so each native pump can completely own the thread.

At least, that's my understanding. Does this sound right to you?

Matt Menke

unread,
Aug 28, 2014, 2:43:35 PM8/28/14
to William Chan (陈智昌), Philippe Liard, Marcus Bulach, Ryan Sleevi, net-dev
I'm no expert, but yes, it sounds right.

Marcus Bulach

unread,
Aug 28, 2014, 4:56:07 PM8/28/14
to Matt Menke, William Chan (陈智昌), Philippe Liard, Ryan Sleevi, net-dev
I have no idea about the APIs you're exposing, but let me expand my previous question about the "Executor"...

Client code in java can do something (simplified for brevity)
Runnable r = new Runnable() {
    @Override
    public void run() { doFoo(); callIntoCronet(); doBar(); }
};
Thread t = new Thread(r);
t.start();

Note that the thread |t| has no Executor, no message pumps, no Looper, no task runners... :)

The android way to get callbacks on that caller thread is by taking a Looper in your API, like:
or alternatively, to use Looper.myLooper() at method entry (which would return null for the case above, so it needs to be clarified in the API)...

callIntoCronet() is then free to create its own threads, hop around, and later use the Looper to post a message to the caller (usual multi-threading caveats apply).

Hope that helps!!

Thanks,
Marcus

Charles Munger

unread,
Aug 28, 2014, 6:11:03 PM8/28/14
to Marcus Bulach, Matt Menke, William Chan (陈智昌), Philippe Liard, Ryan Sleevi, net-dev
I vote for using Executors, not Loopers. 



William Chan (陈智昌)

unread,
Aug 29, 2014, 4:26:18 PM8/29/14
to Charles Munger, Matt Menke, net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard
On Wed, Aug 27, 2014 at 5:40 PM, Charles Munger <c...@google.com> wrote:

Callbacks on android can happen on any thread, not only on the ui thread. The idea that thread contention could cause multi-millisecond delays is (I think) unsubstantiated unless there's a GC. There should not be much contention here. As for marshaling/unmarshalling data, since we're using direct ByteBuffers we don't actually have to do much work - it's basically handing off a pointer.

I will say that doing a disk read or write of the size typical for the stack in a typical pattern (i.e., sequential) is less than a millisecond on modern devices - flash storage, even crappy flash storage, handles this much better than hard drives do. Not all devices are modern however, so we should measure on an Asus transformer or similar device with notoriously bad storage speeds.

As for a CPU/memory tradeoff, CPU off the UI thread tends to be plentiful even on low end devices - it's garbage generation and UI thread work that causes dropped frames.

Passing file descriptors around isn't ideal, because in java files are guaranteed to be consistent per VM, and I think accessing them from native code breaks that guarantee.

What does it mean for files to be consistent per VM? What kind of consistency do we need? If a Cronet application wants to issue a HTTP POST request uploading /foo/bar from the file system, can't it just hand off the handle to Cronet and let it zero copy the file to the network via sendfile()? What kind of file consistency would that violate?

Marcus Bulach

unread,
Aug 29, 2014, 4:53:31 PM8/29/14
to William Chan (陈智昌), Charles Munger, Matt Menke, net-dev, Ryan Sleevi, Philippe Liard
On Fri, Aug 29, 2014 at 4:26 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
On Wed, Aug 27, 2014 at 5:40 PM, Charles Munger <c...@google.com> wrote:

Callbacks on android can happen on any thread, not only on the ui thread. The idea that thread contention could cause multi-millisecond delays is (I think) unsubstantiated unless there's a GC. There should not be much contention here. As for marshaling/unmarshalling data, since we're using direct ByteBuffers we don't actually have to do much work - it's basically handing off a pointer.

I will say that doing a disk read or write of the size typical for the stack in a typical pattern (i.e., sequential) is less than a millisecond on modern devices - flash storage, even crappy flash storage, handles this much better than hard drives do. Not all devices are modern however, so we should measure on an Asus transformer or similar device with notoriously bad storage speeds.


sorry, I only saw this now... :)
gavinp and the people in Paris did a very deep analysis about flash storage on android for the http cache not too long ago:
I think they have added plenty of histograms since, so we may have this data already!

Charles Munger

unread,
Aug 29, 2014, 4:54:35 PM8/29/14
to William Chan (陈智昌), net-dev, Philippe Liard, Marcus Bulach, Ryan Sleevi, Matt Menke

There are lots of cases I can imagine where an app would want to be writing to a file while uploading it (a live streaming video app, for example).

William Chan (陈智昌)

unread,
Aug 29, 2014, 4:55:50 PM8/29/14
to Charles Munger, net-dev, Philippe Liard, Marcus Bulach, Ryan Sleevi, Matt Menke
On Fri, Aug 29, 2014 at 1:54 PM, Charles Munger <c...@google.com> wrote:

There are lots of cases I can imagine where an app would want to be writing to a file while uploading it (a live streaming video app, for example).

That's fair. But does that mean you don't believe we should provide a zero copy API at all?

Charles Munger

unread,
Aug 29, 2014, 5:07:51 PM8/29/14
to William Chan (陈智昌), net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard, Matt Menke

I'm not sure how the stack works internally, but won't it not be zero copy because of gzip?

I think a one-copy is not the end of the world in exchange for the safety of channels, but we only need that for chunked uploads. If you know the length of the file, you shouldn't be writing to it, and we don't need to worry about consistency. It shouldn't be hard to special case FileChannel for a zero copy implementation.

William Chan (陈智昌)

unread,
Aug 29, 2014, 6:05:56 PM8/29/14
to Charles Munger, net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard, Matt Menke
On Fri, Aug 29, 2014 at 2:07 PM, Charles Munger <c...@google.com> wrote:

I'm not sure how the stack works internally, but won't it not be zero copy because of gzip?


We don't gzip on upload, although I guess we could eventually support that. If you're talking about download, there's sadly plenty of non-gzip content.
 

I think a one-copy is not the end of the world in exchange for the safety of channels, but we only need that for chunked uploads. If you know the length of the file, you shouldn't be writing to it, and we don't need to worry about consistency. It shouldn't be hard to special case FileChannel for a zero copy implementation.

Can you explain why it would only be one copy? From what I can tell, if you don't zero-copy in the kernel, you get the following copies:
(1) Read from kernel file descriptor into java
(2) Marshal from java across JNI to C++, I think this is a copy unless we use shared memory or something.
(3) Write from C++ buffer into network socket descriptor

Also, going back to the chunked upload discussion, can we use a channel based approach with specializations for zero copies? And preferably, can we hand a channel into the upload API so Cronet can pull as needed, thereby controlling buffering?

Charles Munger

unread,
Aug 29, 2014, 7:46:57 PM8/29/14
to William Chan (陈智昌), net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard, Matt Menke
You get 1 and 3 but not 2. The way direct bytebuffers in java work is that they just wrap a void pointer, so I just use them to wrap an IOBuffer's data. This is why I use channels over input streams.

In the non-chunked case, we absolutely can use the zero copy API, since people shouldn't be writing to files they already know the length of, and we can make this clear in the doc. However, upon further investigation there's no easy way to get the FD out of a FileChannel, so we'll have to provide another method for it.

That's actually quite desirable in general though, because http://developer.android.com/reference/android/os/ParcelFileDescriptor.html is the standard way of getting data from contentproviders, and they don't natively support channels without wrapping an inputstream. I don't know much about low-level kernel things like this, so I'm not 100% sure they're the same file descriptor, but it should work. This seems like it would only work for non gzip content, and at least for google servers gzip is widely supported, so I'd like it. Why don't we gzip on upload?

Currently the way the API works is that you give cronet a channel, and it pulls from it as it pleases. 
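
The pull model Charles describes, and the reason to prefer channels over input streams, can be sketched in a few lines: ReadableByteChannel.read(ByteBuffer) writes straight into the supplied buffer, so when that buffer is a direct ByteBuffer wrapping native memory, no intermediate byte[] is needed on the Java side. The names below are illustrative, not Cronet's actual API:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class ChannelPull {
    // Cronet-side sketch: pull up to buf.remaining() bytes from the
    // embedder's channel directly into the (direct) buffer.
    public static int pull(ReadableByteChannel source, ByteBuffer buf)
            throws IOException {
        return source.read(buf); // -1 signals end of the upload body
    }

    public static void main(String[] args) throws IOException {
        ReadableByteChannel src =
                Channels.newChannel(new ByteArrayInputStream("hello".getBytes()));
        ByteBuffer buf = ByteBuffer.allocateDirect(64); // stands in for an IOBuffer
        int n = pull(src, buf);
        System.out.println(n);
    }
}
```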

William Chan (陈智昌)

unread,
Aug 31, 2014, 9:52:07 PM8/31/14
to Charles Munger, net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard, Matt Menke
On Fri, Aug 29, 2014 at 4:46 PM, Charles Munger <c...@google.com> wrote:
You get 1 and 3 but not 2. The way direct bytebuffers in java work is that they just wrap a void pointer, so I just use them to wrap an IOBuffer's data. This is why I use channels over input streams.

Ah, great. Anyway, saving 2 copies and buffer memory is great too :)
 

In the non-chunked case, we absolutely can use the zero copy API, since people shouldn't be writing to files they already know the length of, and we can make this clear in the doc. However, upon further investigation there's no easy way to get the FD out of a FileChannel, so we'll have to provide another method for it.

It's not just sendfile() with a known file length. There's also splice() where you're proxying. For example, reading data from Google Drive on one socket and outputting it to a destination socket. My specific example is of course broken because we'd have to pull into user space to decrypt, but you get the idea.
 

That's actually quite desirable in general though, because http://developer.android.com/reference/android/os/ParcelFileDescriptor.html is the standard way of getting data from contentproviders, and they don't natively support channels without wrapping an inputstream. I don't know much about low-level kernel things like this, so I'm not 100% sure they're the same file descriptor, but it should work. This seems like it would only work for non gzip content, and at least for google servers gzip is widely supported, so I'd like it. Why don't we gzip on upload?

We don't gzip on upload because the Chromium network stack's primary consumer is the open web. There are two problems with gzip on upload in the _general_ case: (1) intermediaries (2) origin servers. HTTPS mostly eliminates issues with (1), although I can't guarantee that a MITM proxy (when you have an administratively installed root cert, such as the enterprise case) won't barf on it. YMMV. (2) is simply the case because not all frontends expect to have to support gzip for request bodies. Unlike response bodies where you already have burned a roundtrip on negotiation (and thereby know it's safe), with requests you can't negotiate. Therefore, you need prior knowledge in order to gzip on upload via HTTP content coding. Of course, with Cronet, it's entirely reasonable to presume prior knowledge (since it's likely the client application and app server are owned by the same entities), so it's likely that we will eventually want to provide an option for this. Although we might be lazy about implementing this, since it's also entirely reasonable to perform the compression/decompression above the HTTP stack, rather than force Cronet to do this. This approach is also more resistant to interference from HTTP intermediaries.

Charles Munger

unread,
Sep 1, 2014, 2:18:09 AM9/1/14
to William Chan (陈智昌), net-dev, Philippe Liard, Marcus Bulach, Ryan Sleevi, Matt Menke


On Aug 31, 2014 6:52 PM, "William Chan (陈智昌)" <will...@chromium.org> wrote:
>
> On Fri, Aug 29, 2014 at 4:46 PM, Charles Munger <c...@google.com> wrote:
>>
>> You get 1 and 3 but not 2. The way direct bytebuffers in java work is that they just wrap a void pointer, so I just use them to wrap an IOBuffer's data. This is why I use channels over input streams.
>
>
> Ah, great. Anyway, saving 2 copies and buffer memory is great too :)

To clarify, is there any savings for encrypted connections?

>  
>>
>>
>> In the non-chunked case, we absolutely can use the zero copy API, since people shouldn't be writing to files they already know the length of, and we can make this clear in the doc. However, upon further investigation there's no easy way to get the FD out of a FileChannel, so we'll have to provide another method for it.
>
>
> It's not just sendfile() with a known file length. There's also splice() where you're proxying. For example, reading data from Google Drive on one socket and outputting it to a destination socket. My specific example is of course broken because we'd have to pull into user space to decrypt, but you get the idea.

I knew there was something I was missing. I'm of the opinion that we should not complicate the API in order to obtain an optimization that only has a benefit for unencrypted connections.

>  
>>
>>
>> That's actually quite desirable in general though, because http://developer.android.com/reference/android/os/ParcelFileDescriptor.html is the standard way of getting data from contentproviders, and they don't natively support channels without wrapping an inputstream. I don't know much about low-level kernel things like this, so I'm not 100% sure they're the same file descriptor, but it should work. This seems like it would only work for non gzip content, and at least for google servers gzip is widely supported, so I'd like it. Why don't we gzip on upload?
>
>
> We don't gzip on upload because the Chromium network stack's primary consumer is the open web. There are two problems with gzip on upload in the _general_ case: (1) intermediaries (2) origin servers. HTTPS mostly eliminates issues with (1), although I can't guarantee that a MITM proxy (when you have an administratively installed root cert, such as the enterprise case) won't barf on it. YMMV. (2) is simply the case because not all frontends expect to have to support gzip for request bodies. Unlike response bodies where you already have burned a roundtrip on negotiation (and thereby know it's safe), with requests you can't negotiate. Therefore, you need prior knowledge in order to gzip on upload via HTTP content coding. Of course, with Cronet, it's entirely reasonable to presume prior knowledge (since it's likely the client application and app server are owned by the same entities), so it's likely that we will eventually want to provide an option for this. Although we might be lazy about implementing this, since it's also entirely reasonable to perform the compression/decompression above the HTTP stack, rather than force Cronet to do this. This approach is also more resistant to interference from HTTP intermediaries.

Interesting. Can we not get this info after the SPDY handshake?

William Chan (陈智昌)

unread,
Sep 1, 2014, 5:24:36 PM9/1/14
to Charles Munger, net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard, Matt Menke

On Aug 31, 2014 11:18 PM, "Charles Munger" <c...@google.com> wrote:
>
>
> On Aug 31, 2014 6:52 PM, "William Chan (陈智昌)" <will...@chromium.org> wrote:
> >
> > On Fri, Aug 29, 2014 at 4:46 PM, Charles Munger <c...@google.com> wrote:
> >>
> >> You get 1 and 3 but not 2. The way direct bytebuffers in java work is that they just wrap a void pointer, so I just use them to wrap an IOBuffer's data. This is why I use channels over input streams.
> >
> >
> > Ah, great. Anyway, saving 2 copies and buffer memory is great too :)
>
> To clarify, is there any savings for encrypted connections?

Depends if you are terminating the encrypted connection. If you are a passthrough proxy that doesn't inspect the encrypted traffic, then there are some definite savings.

>
> >  
> >>
> >>
>
> >> In the non-chunked case, we absolutely can use the zero copy API, since people shouldn't be writing to files they already know the length of, and we can make this clear in the doc. However, upon further investigation there's no easy way to get the FD out of a FileChannel, so we'll have to provide another method for it.
> >
> >
> > It's not just sendfile() with a known file length. There's also splice() where you're proxying. For example, reading data from Google Drive on one socket and outputting it to a destination socket. My specific example is of course broken because we'd have to pull into user space to decrypt, but you get the idea.
>
> I knew there was something I was missing. I'm of the opinion that we should not complicate the API in order to obtain an optimization that only has a benefit for unencrypted connections.

Haha! Well played sir, you are targeting my weakness for encouraging encryption. I mostly agree with that statement, but it also so happens that I believe (in general, although I'm still trying to understand the Android specific conditions here) that a pull based channel/stream approach is best. So I need to spend a bit more time to grok the ByteBuffer suggestions here.

>

> >> That's actually quite desirable in general though, because http://developer.android.com/reference/android/os/ParcelFileDescriptor.html is the standard way of getting data from contentproviders, and they don't natively support channels without wrapping an inputstream. I don't know much about low-level kernel things like this, so I'm not 100% sure they're the same file descriptor, but it should work. This seems like it would only work for non gzip content, and at least for google servers gzip is widely supported, so I'd like it. Why don't we gzip on upload?
> >
> >
> > We don't gzip on upload because the Chromium network stack's primary consumer is the open web. There are two problems with gzip on upload in the _general_ case: (1) intermediaries (2) origin servers. HTTPS mostly eliminates issues with (1), although I can't guarantee that a MITM proxy (when you have an administratively installed root cert, such as the enterprise case) won't barf on it. YMMV. (2) is simply the case because not all frontends expect to have to support gzip for request bodies. Unlike response bodies where you already have burned a roundtrip on negotiation (and thereby know it's safe), with requests you can't negotiate. Therefore, you need prior knowledge in order to gzip on upload via HTTP content coding. Of course, with Cronet, it's entirely reasonable to presume prior knowledge (since it's likely the client application and app server are owned by the same entities), so it's likely that we will eventually want to provide an option for this. Although we might be lazy about implementing this, since it's also entirely reasonable to perform the compression/decompression above the HTTP stack, rather than force Cronet to do this. This approach is also more resistant to interference from HTTP intermediaries.
>
> Interesting. Can we not get this info after the SPDY handshake?

No. This was a big subject of debate in the httpbis working group and it was decided that we couldn't support this in the initial handshake. It's complicated. I can send you a link to the thread if you want, but the short answer is no we can't do this in a standard way. It's possible if you control both the client and server to do something that's non-standard.

Also, it's still safer for the application to do the content encoding itself. It's risky to have the library do it, because you want to make sure the application is thinking about whether or not secrets are getting compressed with attacker controlled data. They should use a separate compression context so that attackers can't probe for the secret a la CRIME. Cronet doesn't have enough information about the content to be able to ensure this, so it's up to the embedder to do it. As long as you're not sending any secrets in the data stream, then it's safe for Cronet to do the compression itself, but then someone screws up and we have a security bug.
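
Doing the content encoding above the HTTP stack, as suggested here, is straightforward with the JDK's own gzip support: the application compresses the body itself (keeping secrets and attacker-controlled data in separate compression contexts is then its responsibility) and uploads the result with Content-Encoding: gzip set by prior agreement with the server. A minimal sketch; the helper is hypothetical, not a Cronet API:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class AppLevelGzip {
    // Compress an upload body in a fresh compression context; the app must
    // not mix secrets with attacker-controlled data in one context (CRIME).
    public static byte[] gzip(byte[] body) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(body);
        }
        return bytes.toByteArray();
    }
}
```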

Matt Menke

unread,
Sep 1, 2014, 5:58:11 PM9/1/14
to Charles Munger, William Chan (陈智昌), net-dev, Philippe Liard, Marcus Bulach, Ryan Sleevi
On Mon, Sep 1, 2014 at 2:18 AM, Charles Munger <c...@google.com> wrote:


On Aug 31, 2014 6:52 PM, "William Chan (陈智昌)" <will...@chromium.org> wrote:
>
> On Fri, Aug 29, 2014 at 4:46 PM, Charles Munger <c...@google.com> wrote:
>>
>> You get 1 and 3 but not 2. The way direct bytebuffers in java work is that they just wrap a void pointer, so I just use them to wrap an IOBuffer's data. This is why I use channels over input streams.
>
>
> Ah, great. Anyway, saving 2 copies and buffer memory is great too :)

To clarify, is there any savings for encrypted connections?


We'd save the same number of copies for encrypted and unencrypted connections, though encrypted ones have extra copies, of course, and a lot of other work besides, so saving 2 copies may make a negligible difference in that case, but we'd still be saving the copies.

Charles Munger

unread,
Sep 1, 2014, 6:51:58 PM9/1/14
to William Chan (陈智昌), net-dev, Philippe Liard, Marcus Bulach, Ryan Sleevi, Matt Menke


On Sep 1, 2014 2:24 PM, "William Chan (陈智昌)" <will...@chromium.org> wrote:
>
> On Aug 31, 2014 11:18 PM, "Charles Munger" <c...@google.com> wrote:
> >
> >
> > On Aug 31, 2014 6:52 PM, "William Chan (陈智昌)" <will...@chromium.org> wrote:
> > >
> > > On Fri, Aug 29, 2014 at 4:46 PM, Charles Munger <c...@google.com> wrote:
> > >>
> > >> You get 1 and 3 but not 2. The way direct bytebuffers in java work is that they just wrap a void pointer, so I just use them to wrap an IOBuffer's data. This is why I use channels over input streams.
> > >
> > >
> > > Ah, great. Anyway, saving 2 copies and buffer memory is great too :)
> >
> > To clarify, is there any savings for encrypted connections?
>
> Depends if you are terminating the encrypted connection. If you are a passthrough proxy that doesn't inspect the encrypted traffic, then there are some definite savings.

I don't think there are going to be a lot of consumers using this to run proxy servers on their phones.

>
> >
> > >  
> > >>
> > >>
> >
> > >> In the non-chunked case, we absolutely can use the zero copy API, since people shouldn't be writing to files they already know the length of, and we can make this clear in the doc. However, upon further investigation there's no easy way to get the FD out of a FileChannel, so we'll have to provide another method for it.
> > >
> > >
> > > It's not just sendfile() with a known file length. There's also splice() where you're proxying. For example, reading data from Google Drive on one socket and outputting it to a destination socket. My specific example is of course broken because we'd have to pull into user space to decrypt, but you get the idea.
> >
> > I knew there was something I was missing. I'm of the opinion that we should not complicate the API in order to obtain an optimization that only has a benefit for unencrypted connections.
>
> Haha! Well played sir, you are targeting my weakness for encouraging encryption. I mostly agree with that statement, but it also so happens that I believe (in general, although I'm still trying to understand the Android specific conditions here) that a pull based channel/stream approach is best. So I need to spend a bit more time to grok the ByteBuffer suggestions here.

OK. We're in agreement that a pull based approach is best, that's how it's currently implemented. I have a pretty good understanding of the java and android requirements and options, if you want we could meet tomorrow and post a summary to the thread.

Charles Munger

Sep 1, 2014, 7:00:17 PM
to Matt Menke, William Chan (陈智昌), net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard


On Sep 1, 2014 2:58 PM, "Matt Menke" <mme...@google.com> wrote:
>
> On Mon, Sep 1, 2014 at 2:18 AM, Charles Munger <c...@google.com> wrote:
>>
>>
>> On Aug 31, 2014 6:52 PM, "William Chan (陈智昌)" <will...@chromium.org> wrote:
>> >
>> > On Fri, Aug 29, 2014 at 4:46 PM, Charles Munger <c...@google.com> wrote:
>> >>
>> >> You get 1 and 3 but not 2. The way direct bytebuffers in java work is that they just wrap a void pointer, so I just use them to wrap an IOBuffer's data. This is why I use channels over input streams.
>> >
>> >
>> > Ah, great. Anyway, saving 2 copies and buffer memory is great too :)
>>
>> To clarify, is there any savings for encrypted connections?
>
>
> We'd save the same number of copies for encrypted and unencrypted connections, though encrypted ones have extra copies, of course, and a lot of other work besides, so saving 2 copies may make a negligible difference in that case, but we'd still be saving the copies.

Are we? I thought the cost saved by sendfile was bringing the bytes into userspace. If we're doing that already for encryption then what are the copies we're saving?

Matt Menke

Sep 1, 2014, 7:04:36 PM
to Charles Munger, William Chan (陈智昌), net-dev, Ryan Sleevi, Marcus Bulach, Philippe Liard
Hrm...I had thought we were still talking about a pull API vs a push API...Reading over the log, I'm not sure about that any more.

Misha Efimov

Oct 21, 2014, 11:49:40 AM
to Charles Munger, Marcus Bulach, Matt Menke, William Chan (陈智昌), Philippe Liard, Ryan Sleevi, net-dev
Sorry to resurrect an old thread, but I'm working on the initial implementation of the Cronet async API, and we've circled back to the Executor vs. Looper choice.

The Executor interface specifically allows thread pools and provides no guarantee that messages are processed on a particular thread, or even one at a time, while a Looper is associated with a particular thread and allows explicit checks that an object is not accessed from other threads.

It seems that a Looper is a better way to ensure correctness, giving a strong guarantee that we won't call into the embedder after cancellation, without adding the potential for deadlock.

Any suggestions are greatly appreciated.


Charles Munger

Oct 21, 2014, 12:04:21 PM
to Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach, Matt Menke

I'd like to voice my dissent against loopers.

I think the API guarantee you should provide is:

1. If a request is canceled during a listener callback, the listener will not get further calls other than the error call.
2. If the request is canceled from somewhere other than inside a listener callback, regardless of thread, then the listener will receive at most one further callback before the error callback (and may receive none).

This is the same as saying: "If a cancelation event happens-before the callback, the callback will not take place", using the definition of happens-before from the java memory model.

You can absolutely provide the guarantees you want with executors.
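As a minimal sketch of that guarantee (the class and method names here are illustrative, not Cronet's actual API): the network stack posts callbacks to the embedder's Executor, and a flag set by cancel() is re-checked on the executor just before the listener runs, so a cancel() that happens-before the post suppresses the callback.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class Main {
    // Hypothetical request object demonstrating the happens-before guarantee.
    static class Request {
        private final ExecutorService executor;
        private final Runnable listener;
        private final AtomicBoolean canceled = new AtomicBoolean();

        Request(ExecutorService executor, Runnable listener) {
            this.executor = executor;
            this.listener = listener;
        }

        void cancel() {
            canceled.set(true);  // happens-before any later postCallback()
        }

        // Called by the network stack when an event is ready for the listener.
        void postCallback() {
            executor.execute(() -> {
                // A cancel() that happens-before this post suppresses it.
                if (!canceled.get()) {
                    listener.run();
                }
            });
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger calls = new AtomicInteger();
        Request req = new Request(pool, calls::incrementAndGet);

        req.postCallback();  // posted before cancel: may or may not run
        req.cancel();
        req.postCallback();  // posted after cancel: guaranteed suppressed

        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("calls<=1: " + (calls.get() <= 1));
    }
}
```

The callback posted before cancel() races and may or may not run; the one posted after is always suppressed, so the listener sees at most one further call, exactly as in guarantee 2.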

Matt Menke

Oct 21, 2014, 12:12:25 PM
to Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
Using an Executor also requires locks, which, in my experience, tend to make for less robust code than the message-passing model we get from an effectively single-threaded execution queue.  Hrm...I suppose we could probably get away with an AtomicBoolean instead.

I'm also thinking we should get rid of the final callback on cancellation - it's weird to say "Hey, by the way, you cancelled us!", and calling into objects that the embedder considers dead (but which really aren't, thanks to Java's memory model) seems like a potential source of bugs.  For the C++ API, we'd like to eventually get rid of that behavior as well.

Matt Menke

Oct 21, 2014, 12:24:54 PM
to Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
The more I think about it, the more I dislike using an Executor - forcing every embedder that ever cancels requests asynchronously to handle getting a message after cancellation (or use a single-threaded Executor, which is something we can't check for or enforce) just seems like a recipe for bugs.

Matt Menke

Oct 21, 2014, 12:27:47 PM
to Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
Oh, and one question - why do you strongly oppose Loopers?  (And sorry for all the spam).

Charles Munger

Oct 21, 2014, 12:57:02 PM
to Matt Menke, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
Loopers are less flexible, they don't provide synchronization guarantees, and they require allocating a whole thread to just the net stack. On top of that, I think they don't actually offer any of the benefits you think they do. 

Using a looper will not save you from the extra callback; it's inherently racy to call cancel() while a callback is running. Using a volatile field will provide exactly the same safety that a looper would.

As for calling back when cancelled, I'm envisioning a case where we cancel a file download - if we don't provide a callback, how will we delete the partial file?

Matt Menke

Oct 21, 2014, 1:09:43 PM
to Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
If we require cancel to be called on the Looper, too, problem solved - we can have a strong guarantee.

We don't need exclusive access to the thread - we don't do any blocking IO on it, so could even use the main thread.  If you do blocking IO on it, you need another thread, regardless.

As for cancellation...  Cancel the request, wait for your pending write to complete, if there is one, then delete the file.  I don't see how the cancel callback affects anything.

Charles Munger

Oct 21, 2014, 1:22:19 PM
to Matt Menke, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
On Tue, Oct 21, 2014 at 10:09 AM, Matt Menke <mme...@google.com> wrote:
If we require cancel to be called on the Looper, too, problem solved - we can have a strong guarantee.
That's the exact same guarantee you have with a volatile field. 

We don't need exclusive access to the thread - we don't do any blocking IO on it, so could even use the main thread.  If you do blocking IO on it, you need another thread, regardless.
Not necessarily. If you're using a thread pool, which everyone should be, then you can share that thread with other tasks. Also, in what world would we not do blocking I/O? I thought the whole point of making the callbacks happen off the network thread was to allow blocking I/O. 

As for cancellation...  Cancel the request, wait for your pending write to complete, if there is one, then delete the file.  I don't see how the cancel callback affects anything.
Right, but now you have to handle that in the same place that you cancel, as opposed to encapsulating it all in the listener.

Milo Sredkov

Oct 21, 2014, 1:23:38 PM
to Matt Menke, Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
I'd also like to express my preference for the Executor approach. One could easily provide an executor based on a Looper; making a Looper from an Executor is not always possible. Therefore, for apps that internally manage their multitasking with executors (which I believe applies to all large-enough apps), it would be much harder to use the new API.



Matt Menke

Oct 21, 2014, 1:35:38 PM
to Milo Sredkov, Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
On Tue, Oct 21, 2014 at 1:23 PM, Milo Sredkov <milo...@google.com> wrote:
I'd also like to express my preference towards the Executor approach. One could easily provide an executor based on a Looper, making a Looper from an Executor is not always possible. Therefore, for apps that internally manage their multitasking with executors (which I believe applies to all large-enough apps) it will be much harder to use the new API. 

This is certainly a valid point - my main concern is that the API in the cancellation/pause case becomes much harder for every embedder to use correctly if we go this route.

We could make a SequencedExecutorWrapper that executes tasks on an executor in the order they're posted, but that seems like complete overkill.
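For what it's worth, such a wrapper is small. A hypothetical sketch (same idea Guava exposes as MoreExecutors.newSequentialExecutor): tasks are queued and drained one at a time on the delegate executor, so they never overlap and run in submission order even over a thread pool.

```java
import java.util.ArrayDeque;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Main {
    // Runs submitted tasks one at a time, in order, on a delegate executor.
    static class SequencedExecutorWrapper implements Executor {
        private final Executor delegate;
        private final ArrayDeque<Runnable> queue = new ArrayDeque<>();
        private boolean running = false;  // guarded by "this"

        SequencedExecutorWrapper(Executor delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized void execute(Runnable task) {
            queue.add(task);
            if (!running) {
                running = true;
                delegate.execute(this::drainAll);  // one drainer at a time
            }
        }

        private void drainAll() {
            while (true) {
                Runnable task;
                synchronized (this) {
                    task = queue.poll();
                    if (task == null) {
                        running = false;  // next execute() schedules a drainer
                        return;
                    }
                }
                task.run();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        SequencedExecutorWrapper serial = new SequencedExecutorWrapper(pool);
        StringBuilder order = new StringBuilder();  // safe: tasks never overlap
        for (int i = 0; i < 5; i++) {
            final int n = i;
            serial.execute(() -> order.append(n));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("order=" + order);
    }
}
```

Despite the four-thread pool underneath, the tasks print in submission order, which is the guarantee a simple embedder would want; whether that belongs in the library or in sample code is the open question above.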

Milo Sredkov

Oct 21, 2014, 2:01:04 PM
to Matt Menke, Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
Would your proposal to get rid of the final callback on cancellation make things simpler? I think after someone has canceled the request, they wouldn't want to know anything more about it anyway (provided the cancel() call safely releases the associated native resources).

Matt Menke

Oct 21, 2014, 2:27:49 PM
to Milo Sredkov, Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
Getting rid of the callback on cancel is mostly to avoid calling back with a result when the embedder doesn't expect it - we want to make the same change, for the same reason, C++ side as well, and should keep the behavior consistent.

I spent some time talking to Misha about the Looper vs. Executor issue (really one thread vs. a thread pool).  How about taking both of them: strong guarantees, enforced by runtime checks, in the Looper case (including forcing cancel to be called on the looper), and much weaker guarantees in the Executor case?  Allowing both sets of guarantees does not require much extra code, and this lets simple embedders have a simple non-blocking API that they can even use on the main looper, if they really want to be (basically) a single-threaded app, while allowing apps with more demanding needs the flexibility and performance of Executors with thread pools.

Milo Sredkov

Oct 22, 2014, 3:56:52 AM
to Matt Menke, Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
That sounds good to me. Would a client using a single-thread executor have the same guarantees as one using a Looper?

Charles Munger

Oct 22, 2014, 5:21:52 AM
to Milo Sredkov, William Chan (陈智昌), Misha Efimov, net-dev, Marcus Bulach, Philippe Liard, Ryan Sleevi, Matt Menke

Using loopers does not reduce complexity. Given that the callbacks for a given request are never concurrent, it doesn't matter whether they happen on one thread or ten. Forcing cancel to be called on the looper does not add any safety or even different behavior - if a task is running, cancel will not interrupt it, and if it isn't, the behavior is the same as with the executor.

Loopers are not better, they are strictly worse than using executors, and the behavior is exactly the same.

Misha Efimov

Oct 22, 2014, 8:52:10 AM
to Charles Munger, Milo Sredkov, William Chan (陈智昌), net-dev, Marcus Bulach, Philippe Liard, Ryan Sleevi, Matt Menke
If cancel is called on the looper, we can guarantee that it doesn't happen in the middle of a callback to the listener, and with a looper we can check that cancel is indeed called on that looper.

Matt Menke

Oct 22, 2014, 8:55:39 AM
to Charles Munger, Milo Sredkov, William Chan (陈智昌), Misha Efimov, net-dev, Marcus Bulach, Philippe Liard, Ryan Sleevi
Loopers are single-threaded, with the ability to enforce single-threaded usage patterns.  Executors may be single-threaded or multi-threaded, and have no way to enforce single-threaded usage patterns.

Things that work with a single thread are not guaranteed to work if you hand them a thread pool instead.

Matt Menke

Oct 22, 2014, 8:57:48 AM
to Milo Sredkov, Charles Munger, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
Kind of - the Looper allows us to check at runtime where cancel/pause are being called; the Executor does not.  So as long as you use a single-threaded executor (including one wrapping a Looper), and never call cancel/pause on another thread, you'll get the same guarantees - we just can't detect and enforce that usage pattern.

Charles Munger

Oct 22, 2014, 7:24:55 PM
to Matt Menke, Milo Sredkov, Misha Efimov, William Chan (陈智昌), net-dev, Ryan Sleevi, Philippe Liard, Marcus Bulach
The case where you're cancelling in a callback is the uninteresting one. That's much rarer than wanting to cancel on a different thread - for example, if the user is navigating to a different activity, or if they cancel an operation. If you're calling cancel on a different thread, there's no way you can guarantee anything about when cancel is being called on a looper that you can't guarantee with an executor.