Chrome Extension APIs and STARTTLS


Lally Singh

Sep 11, 2014, 12:56:07 PM
to net-dev, Ryan Sleevi, Lucas Dixon
Hi.

  We find ourselves surprisingly in need of supporting a STARTTLS flow on a socket for Chrome applications.  In the process, we found some undesirable behavior in the sockets' setPaused() API (crbug.com/403076; comments #4 and #5 describe what's going on).  We have a fix for this (https://codereview.chromium.org/494573002/), but it ain't pretty -- it adds a buffer. 

  So, it's about time we ask net-dev@ for guidance.  STARTTLS is pretty unavoidable for us, and likely quite useful for other Chrome app developers.  But a way to avoid buffering would be wonderful.

  Any thoughts?  Suggestions?  Pointers?

Cheers,
 - Lally

Ryan Sleevi

Sep 11, 2014, 2:16:01 PM
to Lally Singh, net-dev, Ryan Sleevi, Lucas Dixon
Lest people think this is tied to STARTTLS, this is more generally about how the chrome.sockets.tcp API is implemented. STARTTLS is just a side effect of this.

Intrepid network hackers - It'd be great if other eyes looked at the existing Socket implementation and offered concrete suggestions to the Chrome Apps/Extensions teams on refactoring or improvements. The buffering is undesirable, but we've got several other weird things at play, and it'd be great to assemble a list of concrete suggestions and an actionable roadmap so that we can make the Sockets API every bit as robust and powerful as users need.

Ryan Sleevi

Sep 11, 2014, 9:29:18 PM
to Ryan Sleevi, Lally Singh, net-dev, Lucas Dixon, David Benjamin, Renaud Paquay
David's volunteered to take a stab at commenting, as there were several issues that some recent interns ran into with this API.

I don't mean to pick on the extensions team, or on the great work Lally and Lucas have been doing to bring TLS sockets to the extension API as a first-class feature. I merely want to make sure that our extension API design is as opinionated as our //net API design, and that we correctly steer users to solutions that scale across platforms and across the various real-world networks that exist.

For those not familiar, the API for Chrome Apps using sockets is at https://developer.chrome.com/apps/app_network . Note that there's an older version, deprecated-but-still-supported, at https://developer.chrome.com/apps/socket

In the deprecated interface, one would read from a socket using https://developer.chrome.com/apps/socket#method-read . The benefit is that this is rather similar to a BSD Sockets-like interface, in which you don't necessarily begin reading data until read() is requested.

In this legacy model, one might write something like the following code for establishing a secure socket:
chrome.socket.create("tcp", { "i_have_no_idea": "what_goes_here" }, (createInfo) => {
  chrome.socket.connect(createInfo.socketId, "a-hostname-unfortunately.example", 443, (result) => {
    if (result != i_dont_know) {
      throw somethingOutTheWindow(result);
    }
    chrome.socket.write(createInfo.socketId, MagicalArrayBufferFromASCIIString("STARTTLS"), (writeInfo) => {
      // error handling
      chrome.socket.read(createInfo.socketId, "STARTTLS".length, (readInfo) => {
        // error handling
        chrome.socket.secure(createInfo.socketId, { "tlsVersion": { "min": "tls1.1" } }, (result) => {
        // secure
        });
      });
    });
  });
});

Ignoring the fact that I'm doing a ton of bad things and that we'd really want to fake this all out with promises, this old API more or less matches the capability of creating a state-machine. That is, in the case of STARTTLS, you read just enough bytes for whatever positive acknowledgement you need that the peer supports STARTTLS, then you kick off the TLS handshake.
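To make the promise idea above concrete, here is a rough sketch of what the same legacy flow might look like with async/await. Everything here is illustrative: `socketApi` is a toy mock standing in for chrome.socket (the real signatures and option objects differ), and `startTls` and `promisify` are invented helper names, not part of any real API.

```javascript
// Hypothetical promise wrapper around a chrome.socket-style callback API.
const promisify = (fn) => (...args) =>
  new Promise((resolve) => fn(...args, resolve));

// Minimal mock of the legacy API, just enough to exercise the flow.
// The real chrome.socket uses ArrayBuffers and richer option objects.
const socketApi = {
  create: (type, opts, cb) => cb({ socketId: 1 }),
  connect: (id, host, port, cb) => cb(0),
  write: (id, data, cb) => cb({ bytesWritten: data.length }),
  read: (id, n, cb) => cb({ resultCode: n, data: "STARTTLS".slice(0, n) }),
  secure: (id, opts, cb) => cb(0),
};

async function startTls(host, port) {
  const create = promisify(socketApi.create);
  const connect = promisify(socketApi.connect);
  const write = promisify(socketApi.write);
  const read = promisify(socketApi.read);
  const secure = promisify(socketApi.secure);

  const { socketId } = await create("tcp", {});
  if ((await connect(socketId, host, port)) !== 0)
    throw new Error("connect failed");
  await write(socketId, "STARTTLS");
  // Read exactly the acknowledgement, leaving the TLS handshake
  // bytes on the wire for secure() to consume.
  const ack = await read(socketId, "STARTTLS".length);
  if (ack.data !== "STARTTLS") throw new Error("no STARTTLS ack");
  if ((await secure(socketId, { tlsVersion: { min: "tls1.1" } })) !== 0)
    throw new Error("secure failed");
  return socketId;
}
```

The point of the pull model survives the sugar: the caller decides exactly how many bytes to read before handing the socket to the TLS layer.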

The "new" socket APIs, however, lose all of that, because they moved "read" from being an explicit, caller-driven state machine (calling read(), getting a callback) into an event-driven state machine, where onReceive() is just constantly fired at the JavaScript.

There's no inherent way of signalling backpressure, except through setPaused - https://developer.chrome.com/apps/sockets_tcp#method-setPaused - which forces the host to start buffering data for some duration rather than signalling the extension client via onReceive.

Put succinctly, I think the new API is bad, and we may need a New "new" API.

Lally, Lucas, Renaud, please correct me if I'm wrong: Is there ANY way for an extension using the 'new' API to signal that they only consumed a portion of bytes?

As far as I can tell, the SocketProperties governs the buffer size used for the persistent background Read().

That is, if a server/peer wrote "HELLO"+TLSHandshake, we'd return to the upper layer (the JS caller) a 4K array buffer that contains the string "HELLO" and then the TLS handshake data. At that point, any call to chrome.sockets.tcp.secure() will fail, because the TLS handshake was already consumed off the wire.

Does that sound right? If so, then I can't see any way, without a JS API change, for this to ever have a chance at working correctly/reliably.
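To make that failure mode concrete, here is a toy model (not the real implementation) of the event-driven read loop. The `makeEventSocket` name and its shape are invented; the point is only that the background read hands JS everything that has arrived, merging the STARTTLS ack with the handshake bytes.

```javascript
// Toy model of the "new" API's background read loop: the browser keeps a
// persistent Read() outstanding and fires onReceive with whatever arrived,
// up to the configured buffer size.
function makeEventSocket(wireBytes, bufferSize = 4096) {
  const listeners = [];
  return {
    onReceive: (fn) => listeners.push(fn),
    // Simulates one background read completing: everything available
    // (up to bufferSize) is pulled off the "wire" and handed to JS.
    pump() {
      const chunk = wireBytes.splice(0, bufferSize);
      listeners.forEach((fn) => fn(chunk));
    },
    remainingOnWire: () => wireBytes.length,
  };
}

// Server sends "HELLO" followed by TLS handshake bytes in one flight.
const wire = [..."HELLO", 0x16, 0x03, 0x01]; // 0x16 = TLS handshake record
const sock = makeEventSocket(wire);
let received;
sock.onReceive((chunk) => (received = chunk));
sock.pump();
// The JS caller got the ack AND the handshake bytes in one event; nothing
// is left on the wire for a later secure() call to consume.
```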

Lally Singh

Sep 11, 2014, 11:49:42 PM
to rsl...@chromium.org, Lally Singh, net-dev, Lucas Dixon, David Benjamin, Renaud Paquay
We can probably save the existing new API if we add a way to go back to the old behavior as an exceptional case.  Say opening up the socket in the paused state, and then having a primitive that just acts like the old read(), but really hints that it isn't the normal way.  Possible names:
  • fixedRead()
  • manualRead()
  • pausedRead()
  • etc.
Most of the time, users will probably be happy with an event-based API for incoming data.  If we provide an "escape hatch" back to the old behavior for when it's really needed, we can keep the event API from hurting the secure() case.  But, I'm really interested to hear what David has to say about experiences with the API.

Also, are there any cases other than secure() where this matters?   If that's the only time we care about separating out ranges of bytes in a read(), then perhaps we should consider any API/functionality changes within the more limited scope of semantics related to secure().

Ryan: I don't know of any way to indicate what's consumed.  I think that approach is probably rife with errors as well -- it'll be easy for users to mis-count the amount of data that they want.  So far, what I've gotten to work is the case where the peer's waiting for us to initiate the TLS handshake, so we've already consumed everything that we wanted before invoking secure().  In your HELLO-prefixed version, that wouldn't work.

There's also the matter of the setPaused API fix -- right now setPaused()'s callback returns immediately, even if another data event is going to come.  But that can be fixed in other ways, independently of this (probably a buffer to save data received when paused, and it may have interactions with whatever we do for read(), but it's still separate).  Right now setPaused() sets a flag to tell the runtime to stop issuing Read() calls to the actual socket.  setPaused() returns when the flag's been flipped, not when the last-issued Read() returns.  

So, there isn't buffering per se, but the app right now would have to somehow handle onReceive events occurring after setPaused().  Probably by buffering.  Once it's pulled into userland, someone's gotta buffer it.
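A toy reproduction of that race, with invented names throughout: setPaused() only flips a flag, so a Read() already in flight still produces one more onReceive, which the app has to buffer itself.

```javascript
// Sketch of the setPaused() race described above. setPaused() returns when
// the flag is flipped, not when the last-issued Read() completes, so an
// in-flight read still fires one more event after pausing.
function makePausableSocket() {
  let paused = false;
  let inFlight = null; // data for a Read() already issued to the transport
  const listeners = [];
  return {
    onReceive: (fn) => listeners.push(fn),
    deliver(data) { inFlight = data; },   // network data arrives
    setPaused(flag) { paused = flag; },   // "returns immediately"
    completeRead() {                      // the in-flight Read() finishes
      if (inFlight !== null) {
        const d = inFlight;
        inFlight = null;
        listeners.forEach((fn) => fn(d)); // fires even though paused!
      }
    },
    isPaused: () => paused,
  };
}

const sock = makePausableSocket();
const appBuffer = [];                     // the userland buffering Lally means
sock.onReceive((d) => appBuffer.push(d));
sock.deliver("late data");
sock.setPaused(true);                     // flag flipped...
sock.completeRead();                      // ...but the event still arrives
```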



Cheers,
$ ls

Lally Singh | Google Ideas | Engineering | la...@google.com
 

Ryan Sleevi

Sep 12, 2014, 4:30:27 AM
to Lally Singh, Lucas Dixon, Renaud Paquay, net-dev, Lally Singh, David Benjamin


On Sep 11, 2014 8:49 PM, "Lally Singh" <la...@google.com> wrote:
>
> We can probably save the existing new API if we add a way to go back to the old behavior as an exceptional case.  Say opening up the socket in the paused state, and then having a primitive that just acts like the old read(), but really hints that it isn't the normal way.  Possible names:
> fixedRead()
> manualRead()
> pausedRead()
> etc.
> Most of the time, users will probably be happy with an event-based API for incoming data.  If we provide an "escape hatch" back to the old behavior for when it's really needed, we can keep the event API from hurting the secure() case.  But, I'm really interested to hear what David has to say about experiences with the API.
>

That doesn't really cover starting the TLS connection AFTER reading data, which is totally part of the STARTTLS flow for some protocols.

After looking at the API more, I'm surprised more people haven't run into issues. I'd strongly recommend the deprecated API, FWIW.

> Also, are there any other cases other than secure() when this matters?   If that's the only time we care about separating out ranges of bytes in a read(), then perhaps we should consider any API/functionality changes within the more limited scope of semantics related to secure().
>

Any form of socket composition. This API forces you to consume all data immediately or to stick it in some bound state for async processing later, which is ugly. It also has this notion of persistent sockets that survive an extension termination, except, because of how the API is structured, there's no clean way to persist your unused data (except maybe serializing to a string and then storing in extension/local storage, which is ugly as all get out)

> Ryan: I don't know of any way to indicate what's consumed.  I think that way's probably rife for errors as well -- it'll be easy for users to mis-count the amount of data that they want.  So far, what I've gotten to work is when the peer's waiting for us to initiate the TLS handshake, so we've already consumed everything that we wanted before invoking secure().  In your HELLO-prefixed version, that wouldn't work.
>
> There's also the matter of the setPaused API fix -- right now setPaused()'s callback returns immediately, even if another data event is going to come. 

Yeah, I have zero faith in setPaused. And from chatting with David, it does seem that it does not work as advertised. That is, you can still have events fired after you pause, which is an invariant we'd never do in //net.

I realize I'm using fairly forceful language, and I'm not trying to trash those involved. Just that there's a ton of sharp edges, and it sounds like they're cutting people.

> But that can be fixed in other ways, independently of this (probably a buffer to save data received when paused, and it may have interactions with whatever we do for read(), but it's still separate).  Right now setPaused() sets a flag to tell the runtime to stop issuing Read() calls to the actual socket.  setPaused() returns when the flag's been flipped, not when the last-issued Read() returns.  
>
> So, there isn't buffering per-se, but the app right now would have to somehow handle onReceive events occurring after setPaused().  Probably by buffering.   Once it's pulled into userland, someone's gotta buffer it.
>

Yeah, there would be buffering, but it wouldn't be additional. The buffer is already necessary to interact with TCPClientSocket, you just wouldn't dispatch the IPC when paused, until unpaused.

I'm less concerned with duct-tape fixes, as I have real concerns about the viability here. I get why some of these tradeoffs were probably made, but it's trying to provide a Streams-like API while missing all of the edge cases being discussed in WebApps about how streams can go wrong.

Lally Singh

Sep 12, 2014, 9:08:34 AM
to rsl...@chromium.org, Lucas Dixon, Lally Singh, net-dev, Renaud Paquay, David Benjamin


On Sep 12, 2014 4:30 AM, "Ryan Sleevi" <rsl...@chromium.org> wrote:
>
>
> On Sep 11, 2014 8:49 PM, "Lally Singh" <la...@google.com> wrote:
> >
> > We can probably save the existing new API if we add a way to go back to the old behavior as an exceptional case.  Say opening up the socket in the paused state, and then having a primitive that just acts like the old read(), but really hints that it isn't the normal way.  Possible names:
> > fixedRead()
> > manualRead()
> > pausedRead()
> > etc.
> > Most of the time, users will probably be happy with an event-based API for incoming data.  If we provide an "escape hatch" back to the old behavior for when it's really needed, we can keep the event API from hurting the secure() case.  But, I'm really interested to hear what David has to say about experiences with the API.
> >
>
> That doesn't really cover starting the TLS connection AFTER reading data, which is totally part of the STARTTLS flow for some protocols.

I should have been clearer.  I meant that one could pause the socket before connecting, then use the old-style read/write sequence to take care of STARTTLS.  After TLS is up, stay with manual read(), or optionally go back to event-based reading by unpausing the socket.  That, with a bug-fixed setPaused(), would be a minimal-change fix.  The question of the right thing to do is separate and remains unsettled.

Ryan Sleevi

Sep 12, 2014, 12:10:18 PM
to Lally Singh, Lucas Dixon, Renaud Paquay, net-dev, Lally Singh, David Benjamin


On Sep 12, 2014 6:08 AM, "Lally Singh" <la...@google.com> wrote:
>
> I should have been clearer.  I meant that one could pause the socket before connecting, then use the old-style read/write sequence to take care of STARTTLS.  After TLS is up, stay with manual read(), or optionally go back to event-based reading by unpausing the socket.  That, with a bugfixed setPaused (), would be a minimal-change fix.  The question of the right thing to do is separate and remains unsettled.
>

That seems to be mixing the old and new APIs, which, even if it worked, I would think would be seen as a big no-no.

Renaud Paquay

Sep 12, 2014, 4:51:35 PM
to rsl...@chromium.org, Lally Singh, Lucas Dixon, Renaud Paquay, net-dev, Lally Singh, David Benjamin
To recap, it sounds like we have 3 options:
  1. Keep using the deprecated API
  2. Figure out a way to patch the "new" API
  3. Completely redesign the socket API.

Option 1 has the advantage of requiring no work. Is that correct, Lally? The main disadvantage is that the API is deprecated, and that everybody we reached out to when we published the new API has moved to the new API.
Option 2 has the advantage of addressing a scenario that is not yet supported with the "new" API. However, I am not sure how common the scenario is, and it sounds like Ryan is saying we won't be able to cover 100% of the scenarios anyways.
Option 3 seems like the "best" option, but it requires more long lead work (more on that at the end of this mail (*))

In summary,
Option 1 -> no work needed
Option 2 -> need to design a "patch" solution to address this specific scenario (Lally has multiple proposals and one prototype implementation)
Option 3 -> long lead work needed to start an Open Web effort

Is this a proper summary of the current state of affairs?

===============================================================================================

On a somewhat unrelated note, I was directly involved in the design of the "new" API, so maybe I can give some additional context on why/how we ended up with the current "new" API design (at least IIRC :).

The "old" API had a few issues
  1. The "read+callback" pattern was a "bad" pattern for Chrome Apps, because having "pending" callbacks prevents Chrome Apps from being "suspended" (http://developer.chrome.com/apps/app_lifecycle.html). We wanted to provide APIs that were "onSuspend" friendly. Chrome Apps of course use callbacks for all the APIs, but this was the first time we had the semantics of "operation takes indefinite time before the callback is invoked". In all other cases, the callback is merely an artifact of the round trip between two processes, which has to be asynchronous. In summary, having "read" and "accept" methods that use callbacks as a completion mechanism is an "anti-pattern" for supporting proper "onSuspend" behavior -- in the current incarnation of the Chrome Apps/Extensions platform.
  2. The "read+callback" pattern was inefficient because, in general, API calls for Chrome Apps/Extensions have a somewhat high latency (marshaling, security checks). In the UDP case, for example, it was pretty much impossible to write an app that would *not* miss packets sent at short intervals.
  3. We got feedback from developers that they would prefer to have an event-based API. The "old" API was forcing them to "poll", and that was tricky/annoying to get right.
  4. We got feedback from developers that having TCP Client/UDP/TCP server in a single namespace was confusing.
Given that we wanted to address "1" for sure, and the other points if possible too, we looked around for other JavaScript-based APIs for raw sockets, and we found two in particular: 1) net for node.js, and 2) raw-sockets for SysApps. It turned out that both APIs seemed to address the 4 points above, and one of them (net for node.js) was already widely used and validated, so it gave us some confidence about the design. We also had a Chrome Apps API precedent for solving issue #1 (https://developer.chrome.com/apps/alarms)

All in all, we eventually decided to go with a design *similar* to net for node.js (i.e. separate namespaces, event based) but *not exactly* the same, merely for mechanical reasons: Chrome Apps use raw numbers for resource identifiers, not plain JavaScript objects.

During the implementation, we validated the work by porting the chrome.socket samples to the new API, and we also ported CIRC (the most popular Chrome App using raw sockets).

As an aside, we also at that time made the tradeoff that the "setPaused" method would be slightly less strict than promised in the doc. It promises that *no* "onReceive" event will be fired after calling "setPaused", whereas the implementation effectively guarantees that *at most 1* "onReceive" event will be fired. We decided to wait for feedback from developers to see if this was a problem. We never got feedback that it was (until recently, with the issue at hand), so we never "fixed" that behavior. Remember, at the time, the scenario for the "setPaused" function was seen as a way to throttle downloads in case the Chrome App was not able to process data coming from the peer fast enough, so an extra "onReceive" event after calling "setPaused" was not seen as a major issue.

Note: Fixing "setPaused" to offer the stronger guarantee would have been "trivial": in TCPSocketEventDispatcher, instead of dispatching a packet when the socket is paused, buffer the packet until "setPaused(false)" is called. That would have required at *most* 1 packet buffered, because TCPSocketEventDispatcher is also designed to not invoke a low-level receive when the socket is "paused".
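The fix described in that note can be sketched as follows; `makeDispatcher` is an illustrative stand-in, not the actual TCPSocketEventDispatcher code.

```javascript
// Sketch of the "trivial" fix: instead of dispatching while paused, the
// dispatcher holds the (at most one) in-flight packet until unpause.
function makeDispatcher(fireEvent) {
  let paused = false;
  let held = null; // at most one packet: no new Read() is issued while paused
  return {
    setPaused(flag) {
      paused = flag;
      if (!paused && held !== null) {
        const p = held;
        held = null;
        fireEvent(p); // flush the held packet on unpause
      }
    },
    onReadComplete(packet) {
      if (paused) held = packet; // buffer instead of dispatching to JS
      else fireEvent(packet);
    },
  };
}

const events = [];
const d = makeDispatcher((p) => events.push(p));
d.onReadComplete("a");   // unpaused: delivered immediately
d.setPaused(true);
d.onReadComplete("b");   // paused: held, not delivered
// events is still ["a"] at this point
d.setPaused(false);      // "b" flushed on unpause
```

This keeps the doc's promise (no onReceive after setPaused) at the cost of one buffered packet in the browser, not in userland.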

Note: We ended up using the "new" API pattern with other Chrome Apps API (https://developer.chrome.com/apps/serial, https://developer.chrome.com/apps/bluetoothSocket), and the pattern seemed to generalize nicely.

A few months later, Lally, after much patience and persistence (https://codereview.chromium.org/76403004/), added TLS support with the "secure" method. This was great, because we had developers asking for that feature. CIRC, for example, had had to resort to a JavaScript-based implementation to support TLS: https://github.com/flackr/circ/blob/master/package/bin/net/ssl_socket.js. At the time, the scenario we validated was "1) create socket (in paused mode) 2) call secure 3) un-pause 4) send/receive data over TLS".

What has happened recently is that "secure" needs to be supported in the STARTTLS scenario, and in the process of trying to support that scenario, the "setPaused" behavior bit us, which leads us to the current state of affairs.


One last addition to the story: the intent behind the Chrome Apps APIs was always to eventually become (in some form or other) APIs for the Open Web. For example, we recently started an effort to port chrome.bluetooth.lowEnergy to the Open Web (https://groups.google.com/a/chromium.org/forum/?hl=en#!searchin/blink-dev/bluetooth/blink-dev/pfNeo5cFXCk/uwGt_U5ZQLMJ). What this means is that, in general, Chrome Apps APIs were/are designed with an "address 90% of the scenarios and all currently known needs" bias, which implies less common scenarios are often addressed as they arise, which sometimes leads to breaking changes -- which seems to be the case here.


(*) This leads me to the beginning of this email: regarding "option 3 - design a new "new" API", I think this would only make sense in the context of designing an Open Web API. It looks like WebApps streams (https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm) would be a good start, although I have very little context at this point.


Ryan Sleevi

Sep 12, 2014, 5:08:49 PM
to Renaud Paquay, Lucas Dixon, Lally Singh, net-dev, Lally Singh, David Benjamin

Thanks for the history, Renaud, and again, not trying to pick on the API, even as I'm making strong statements.

I think your conclusion on 3 is right - the event-based mechanism is inherently trying to be Streams-like, and going forward with the platform, that seems where the most energy should be invested.

I do have serious reservations about exposing this to the open web - it violates the SOP in all sorts of ways, and we have already invested heavily in two different(!!!) ways to do this that respect the SOP (WebSockets and DataChannel), but I don't want to bog the discussion down on that front.

One of the things we've seen from our //net experience is that the push-like APIs have had real trouble. The only one in //net that really comes close to following a similar model here is SPDY / HTTP/2.0, which is constantly reading and deframing packets. The challenges - and bugs - here have led to multiple protocol revisions, as we effectively try to get things right. Meanwhile, we've also faced a variety of perf bugs and issues along the way.

My biggest concern with the 90% model is that, and no disrespect meant, we haven't taken an opinionated approach on the design. The API is usable for a number of use cases, but with patterns that have real performance or usability issues. I think there's a gap between being fit for purpose and being usable, and just looking at how one might implement a variety of protocols here using this API, I feel the current situation leans more on the former than the latter.

You are absolutely correct, however, that we can (and, arguably, should) provide the stronger guarantee in setPaused, as that really is the 'least surprising' (the //net devs I talked to assumed, like I did, that it already worked that way).

For a UDP API, you're absolutely correct that an event-based system is likely strongly desirable, and it works because datagrams are discrete known quanta. Where it fails, however, is for the stream case, which inherently lacks this demarcation of defined messages.

Similarly, for Lally/Lucas's specific case, we could consider adding an API that allows the caller, during the handling of an onReceive callback, to indicate how much they consumed. Anything not consumed would be dispatched in a subsequent onReceive - caveat the setPaused discussion above. If a caller fails to call that function during the callback (e.g. all existing code), the implementation can safely assume it was all consumed and dispatch another transport read (aka the current behavior).

This would allow the onReceive consumption of the "STARTTLS" magic, while still keeping the TLS handshake data in place. The overhead of buffering would immediately go away once the remainder was read - at that point, the buffering socket would just dispatch into the underlying transport.

This is similar, but not identical, to the existing approach, as it wholly eliminates any extra buffering while also better supporting stream composability. This doesn't just benefit the TLS case: it also makes things easier for the socket suspension case (since the browser can hold the unconsumed data and report it once the extension unpauses), and makes it easier for JS authors to compose sockets in userland.
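A sketch of how such a partial-consume API might behave. `markConsumed` is an invented name, and the redelivery loop is only a model of the proposed semantics, not anything implemented.

```javascript
// Model of the proposal: during onReceive the callback may report how many
// bytes it used; the remainder is redelivered in the next event. If the
// callback says nothing, everything is treated as consumed (current behavior).
function makeConsumeSocket() {
  const listeners = [];
  let pending = "";
  function dispatch() {
    while (pending.length > 0) {
      let consumed = null;
      const chunk = pending;
      listeners.forEach((fn) =>
        fn(chunk, (n) => { consumed = n; })       // markConsumed(n)
      );
      if (consumed === null) consumed = chunk.length; // default: all consumed
      if (consumed === 0) break;                  // nothing consumed: stop
      pending = pending.slice(consumed);
    }
  }
  return {
    onReceive: (fn) => listeners.push(fn),
    deliver(data) { pending += data; dispatch(); },
    unread: () => pending,                        // bytes left for secure()
  };
}

// STARTTLS-style use: consume only the ack, leave handshake bytes unread.
const sock = makeConsumeSocket();
let sawAck = false;
sock.onReceive((chunk, markConsumed) => {
  if (!sawAck && chunk.startsWith("STARTTLS")) {
    sawAck = true;
    markConsumed("STARTTLS".length); // handshake bytes stay for secure()
  } else {
    markConsumed(0);                 // stop; let secure() take over
  }
});
sock.deliver("STARTTLS\x16\x03\x01");
```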

Lally Singh

Sep 13, 2014, 2:18:47 AM
to rsl...@chromium.org, Renaud Paquay, Lucas Dixon, Lally Singh, net-dev, David Benjamin
This sounds pretty reasonable to me.  Where 'this' (as I understand it) is:
 (a) don't let onReceive fire after setPaused(true)
 (b) provide an API to say 'I consumed X bytes' in an onReceive callback, where the remaining bytes delivered in that callback are then "un-read", in a sense similar to ungetc(3).  The remaining bytes get treated as if they arrived immediately after the onReceive callback finishes.  

I'm a little confused about how this eliminates buffering.  But I'm not sure where we're drawing the line for what's a buffer here.  There are the memio_* buffers used in libNSS, the kernel buffers in the network stack, and whatever buffer stores the data that's passed to the onReceive callback.  We'll have to save the data that wasn't consumed (in (b) above) somewhere.




Cheers,
$ ls

Lally Singh | Google Ideas | Engineering | la...@google.com
 


William Chan (陈智昌)

Sep 15, 2014, 1:51:45 PM
to Renaud Paquay, Ryan Sleevi, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
On Fri, Sep 12, 2014 at 1:51 PM, Renaud Paquay <rpa...@chromium.org> wrote:
>
> To recap, it sounds like we have 3 options:
>
> Keep using the deprecated API
> Figure out a way to patch the "new" API
> Completely redesign the socket API.
>
>
> Option 1 has the advantage of requiring no work. Is that correct, Lally? The main disadvantage is that the API is deprecated and that everybody we reached out to when we publish the new API has moved to the new API.
> Option 2 has the advantage of addressing a scenario that is not yet supported with the "new" API. However, I am not sure how common the scenario is, and it sounds like Ryan is saying we won't be able to cover 100% of the scenarios anyways.
> Option 3 seems like the "best" option, but it requires more long lead work (more on that at the end of this mail (*))
>
> In summary,
> Option 1 -> no work needed
> Option 2 -> need to design a "patch" solution to address this specific scenario (Lally has multiple proposals and one prototype implementation)
> Option 3 -> long lead work needed to start an Open Web effort
>
> Is this a proper summary of the current state of affairs?
>
> ===============================================================================================
>
> On a somewhat unrelated note, I was directly involved in the design in the "new" API, so maybe I can give some additional context of why/how we ended up with the current "new" API design (at least iirc :).
>
> The "old" API had a few issues
>
> The "read+callback" pattern was a "bad" pattern for Chrome Apps, because having "pending" callbacks prevents Chrome Apps from being "suspended" (http://developer.chrome.com/apps/app_lifecycle.html). We wanted to provide APIs that were "onSuspend" friendly. Chrome Apps of course use callbacks for all the APIs, but this was the first time we had the semantics of "operation takes indefinite time before the callback is invoked". In all other cases, the callback is merely an artifact of the round trip between two processes, which has to be asynchronous. In summary, having "read" and "accept" methods that use callbacks as a completion mechanism is an "anti-pattern" for supporting proper "onSuspend" behavior -- in the current incarnation of the Chrome Apps/Extensions platform.

Can you further explain this antipattern? We've dealt with callbacks
and suspends on all supported Chromium platforms. What is it about the
Chrome Apps/Extensions architecture that makes it an antipattern?

> The "read+callback" pattern was inefficient because, in general, API calls for Chrome Apps/Extensions have a somewhat high latency (marshaling, security checks). In the UDP case, for example, it was pretty much impossible to write an app that would *not* miss packets sent at short intervals.

I need to think a bit more about the UDP case, but for the TCP case, I
don't understand the marshaling & security checks costs. Why are you
marshaling instead of using shared memory (like what we do with
ResourceDispatcherHost and Mojo DataPipes)? And why wouldn't the
security check only happen once, when the pipe is established?

> We got feedback from developers that they would prefer have an event based API. The "old" API was forcing them to "poll", and that was tricky/annoying to get right.

Can you point out discussion threads on this? That said, I can fully
believe that for the 90% case, an event based API is preferable for
developers.

> We got feedback from developers that having TCP Client/UDP/TCP server in a single namespace was confusing.
>
> Given that we wanted to address "1" for sure, and other points if possible too, we looked around for other JavaScript based APIs for raw sockets, and we found 2 in particular 1) net for node.js and 2) raw-sockets for Sysapps. It turned out that both APIs seemed to address the 4 points above, and one of them (net for node.js) was already widely used and validated, so it gave us some confidence about the design. We also had a Chrome Apps API precedent for solving issue #1 (https://developer.chrome.com/apps/alarms)
>
> All in all, we eventually decided to go with a design *similar* to net for node.js (i.e. separate namespaces, event based) but *not exactly* the same, merely for mechanical reasons: ChromeApps use raw numbers for resource identifiers, not plain JavaScript objects.
>
> During the implementation, we validated the work by porting the chrome.socket samples to the new API, and we also ported CIRC (the most popular Chrome App using raw sockets).
>
> As an aside, we also at that time made the tradeoff that the "setPaused" method would be slightly less strict that promised in the doc. It promises that *no* "onReceive" event will be fired after calling "setPaused", whereas the implementation effectively guarantees that "*at most 1* onReceive" event will be fired. We decided to wait for feedback from developers to see if this was a problem. We never got feedback this was a problem (until recently with the issue at hand), so we never "fixed" that behavior. Remember at the time, the scenario for the "setPaused" function was seen as a way to throttle downloads, in case the Chrome App was not able to process data coming from the peer fast enough, so an extra "onReceive" event after calling "setPaused" was not seen as a major issue.

setPaused() is important for backpressure. But in order to implement
it correctly, you need to fulfill the promise of not invoking an
onReceive event. As demonstrated, an API that does not conform to this
results in bugs.

>
> Note: Fixing "setPaused" to offer the stronger guarantee would have been "trivial": in TCPSocketEventDispatcher, instead of dispatching a packet when the socket is paused, buffer the packet until "setPaused(false)" is called. That would have required at *most* 1 packet buffered, because TCPSocketEventDispatcher is also designed to not invoke a low-level receive when the socket is "paused".
>
> Note: We ended up using the "new" API pattern with other Chrome Apps API (https://developer.chrome.com/apps/serial, https://developer.chrome.com/apps/bluetoothSocket), and the pattern seemed to generalize nicely.
>
> A few months later, Lally, after much patience and persistence (https://codereview.chromium.org/76403004/), added TLS support with the "secure" method . This was great because we had developers asking for that feature. CIRC, for example, had to resort to use a JavaScript based API to support TLS: https://github.com/flackr/circ/blob/master/package/bin/net/ssl_socket.js. At the time, the scenario we validated was "1) create socket (in paused mode) 2) call secure 3) un-pause 4) send/receive data over TLS".
>
> What has happened recently is that "secure" needs to be supported in the STARTTLS scenario, and in the process of trying to support that scenario, the "setPaused" behavior bit us, which leads us to the current state of affairs.
>
>
> One last addition to the story: the intent behind Chrome Apps API was always to eventually become (in some form or other) APIs for the Open Web. For example, we recently started an effort to port chrome.bluetooth.lowEnergy to the Open Web (https://groups.google.com/a/chromium.org/forum/?hl=en#!searchin/blink-dev/bluetooth/blink-dev/pfNeo5cFXCk/uwGt_U5ZQLMJ). What this means is that, in general, Chrome Apps APIs were/are designed with an "address 90% of the scenarios and all currently known needs" bias, which implies less common scenarios are often addressed as they arise, which sometimes leads to breaking changes -- which seems to be the case here.

I'd strongly object to exposing the current Chrome Apps socket API to
the Open Web. I don't mean to be overly harsh, but to me, it's the
wrong way to approach API design. I respect the desire to optimize for
the 90% case, but what would be better is to create a layered API:
expose a lower-layer API that is something very close to the platform
primitives (the BSD socket API), and if you want to create a
higher-layer API that can be polyfilled using the lower-layer API,
with the option of explicit platform support to optimize it, then
that's fine too. A library layer that implements a higher-level API
and solves the event-API vs. callback/polling-API issues would help
flush out these higher-layer API bugs and build more confidence that
baking it into the platform would work.
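As a sketch of that layering (all names hypothetical; the read(cb) primitive stands in for a BSD-like lower layer), an event-based socket can be built as a library on top of a single read primitive, and it can even honor the strict setPaused guarantee by never starting a read while paused:

```javascript
// Hypothetical library layer: wraps a low-level { read(cb) } primitive
// in an event-style API. Because reads are only issued while unpaused,
// at most one packet (a read that raced a pause) ever needs buffering.
function makeEventSocket(lowLevel) {
  const listeners = [];
  const pending = [];   // at most one packet: we never read while paused
  let paused = false;
  let reading = false;

  function emit(data) { for (const fn of listeners) fn(data); }

  function pump() {
    if (paused || reading) return;
    reading = true;
    lowLevel.read((data) => {
      reading = false;
      if (data === null) return;                   // nothing available / closed
      if (paused) { pending.push(data); return; }  // a pause raced this read
      emit(data);
      pump();                                      // issue the next read
    });
  }

  return {
    onReceive(fn) { listeners.push(fn); },
    setPaused(p) {
      paused = p;
      if (!p) {
        while (pending.length > 0) emit(pending.shift());
        pump();
      }
    },
    start() { pump(); },
  };
}
```

If the platform later wants to optimize this, it can ship a native implementation with the same surface; the library version is what flushes out the semantic bugs first.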

>
>
> (*) This leads me to the beginning of this email: regarding "option 3 - design a new "new" API", I think this would only make sense in the context of designing an Open Web API. It looks like WebApps streams (https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm) would be a good start, although I have very little context at this point.

That's deprecated. Don't use that. The active work is happening in
WHATWG at https://github.com/whatwg/streams.

>
>
>
> On Fri, Sep 12, 2014 at 9:10 AM, Ryan Sleevi <rsl...@chromium.org> wrote:
>>
>>
>> On Sep 12, 2014 6:08 AM, "Lally Singh" <la...@google.com> wrote:
>> >
>> > I should have been clearer. I meant that one could pause the socket before connecting, then use the old-style read/write sequence to take care of STARTTLS. After TLS is up, stay with manual read(), or optionally go back to event-based reading by unpausing the socket. That, with a bugfixed setPaused(), would be a minimal-change fix. The question of the right thing to do is separate and remains unsettled.
>> >
>>
>> That seems to be mixing the old and new APIs, which, even if it worked, I would think would be seen as a big no-no.
>
>

Renaud Paquay

Sep 15, 2014, 10:06:29 PM9/15/14
to William Chan (陈智昌), Renaud Paquay, Ryan Sleevi, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
On Mon, Sep 15, 2014 at 10:51 AM, William Chan (陈智昌) <will...@chromium.org> wrote:
On Fri, Sep 12, 2014 at 1:51 PM, Renaud Paquay <rpa...@chromium.org> wrote:
>
> To recap, it sounds like we have 3 options:
>
> 1. Keep using the deprecated API
> 2. Figure out a way to patch the "new" API
> 3. Completely redesign the socket API.
>
>
> Option 1 has the advantage of requiring no work. Is that correct, Lally? The main disadvantage is that the API is deprecated and that everybody we reached out to when we published the new API has moved to the new API.
> Option 2 has the advantage of addressing a scenario that is not yet supported with the "new" API. However, I am not sure how common the scenario is, and it sounds like Ryan is saying we won't be able to cover 100% of the scenarios anyways.
> Option 3 seems like the "best" option, but it requires more long lead work (more on that at the end of this mail (*))
>
> In summary,
> Option 1 -> no work needed
> Option 2 -> need to design a "patch" solution to address this specific scenario (Lally has multiple proposals and one prototype implementation)
> Option 3 -> long lead work needed to start an Open Web effort
>
> Is this a proper summary of the current state of affairs?
>
> ===============================================================================================
>
> On a somewhat unrelated note, I was directly involved in the design of the "new" API, so maybe I can give some additional context on why/how we ended up with the current "new" API design (at least iirc :).
>
> The "old" API had a few issues
>
> The "read+callback" pattern was a "bad" pattern for Chrome Apps, because having "pending" callbacks prevents Chrome Apps from being "suspended" (http://developer.chrome.com/apps/app_lifecycle.html). We wanted to provide APIs that were "onSuspend" friendly. In general, Chrome Apps of course use callbacks for all the APIs, but this was the first time we had the semantics of "operation takes an indefinite time before the callback is invoked". In all other cases, the callback is merely an artifact of there being a round trip between two processes, which should be asynchronous. In summary, having "read" and "accept" methods that use callbacks as a completion mechanism is an "anti-pattern" for supporting proper "onSuspend" behavior -- in the current incarnation of the Chrome Apps/Extensions platform.

Can you further explain this antipattern? We've dealt with callbacks
and suspends on all supported Chromium platforms. What is it about the
Chrome Apps/Extensions architecture that makes it an antipattern?

I am not sure we are talking about the same "suspend" concept. I am referring to the "onSuspend" event in the diagram at: http://developer.chrome.com/apps/app_lifecycle.html. Basically, a ChromeApp is "unloaded" (i.e. its renderer process is shut down) under a certain set of conditions, including "no pending callback" -- the diagram mentions only "no window open" but that is not sufficient with the current implementation.

So, currently, any active pending callback is sufficient to prevent a ChromeApp from being "unloaded". In the case of the "old" chrome.socket API, we introduced functions (chrome.socket.read and chrome.socket.accept) that are "blocking", in the sense that the callback will only be invoked if some external event occurs after some indefinite amount of time (e.g. a client is creating a connection or sending data). This was preventing the API from enabling scenarios such as "windowless" servers, i.e. a TCP server app that can be unloaded when inactive and automatically re-loaded when a connection is attempted by a client.
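A toy model of that lifecycle rule (hypothetical names; nothing here is the real runtime): a parked completion callback keeps the app "busy" indefinitely, whereas an event-listener registration is just a routing entry the browser can remember across an unload:

```javascript
// Toy runtime modeling the app-lifecycle rule described above: an app
// may be suspended only when it has no pending callbacks. All names
// are hypothetical illustrations.
function makeAppRuntime() {
  let pendingCallbacks = 0;
  const listeners = new Map();  // event name -> handlers

  return {
    // Old style: read() parks a callback until data shows up, so the
    // app counts as busy for an indefinite amount of time.
    startRead(cb) {
      pendingCallbacks++;
      return (data) => { pendingCallbacks--; cb(data); };  // completion fn
    },
    // New style: a listener is a registration the browser can keep even
    // after unloading the app, re-loading it to deliver the event.
    addListener(event, fn) {
      if (!listeners.has(event)) listeners.set(event, []);
      listeners.get(event).push(fn);
    },
    canSuspend() { return pendingCallbacks === 0; },
  };
}
```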
 

> The "read+callback" pattern was inefficient because, in general, API calls for Chrome Apps/Extensions have a somewhat high latency (marshaling, security checks). In the UDP case, for example, it was pretty much impossible to write an app that would *not* miss packets sent at short intervals.

I need to think a bit more about the UDP case, but for the TCP case, I
don't understand the marshaling & security checks costs. Why are you
marshaling instead of using shared memory (like what we do with
ResourceDispatcherHost and Mojo DataPipes)? And why wouldn't the
security check only happen once, when the pipe is established?

I don't think there is inherently anything preventing what you are suggesting; I was merely describing the state of the ChromeApp/Extension API binding code as it was implemented at the time (it is still mostly unchanged at this point, although I believe there are early prototypes to port some extensions components to Mojo).
 

> We got feedback from developers that they would prefer to have an event based API. The "old" API was forcing them to "poll", and that was tricky/annoying to get right.

Can you point out discussion threads on this? That said, I can fully
believe that for the 90% case, an event based API is preferable for
developers.

Here are 2 examples of issues due to incorrect polling:

https://github.com/GoogleChrome/chrome-app-samples/issues/74
https://github.com/GoogleChrome/chrome-app-samples/blob/master/samples/webserver/index.js#L52

There were also in-person design meetings + comments in various Chrome bugs -- I have not kept track of everything.


> We got feedback from developers that having TCP Client/UDP/TCP server in a single namespace was confusing.
>
> Given that we wanted to address "1" for sure, and other points if possible too, we looked around for other JavaScript-based APIs for raw sockets, and we found two in particular: 1) net for node.js and 2) raw-sockets for SysApps. It turned out that both APIs seemed to address the 4 points above, and one of them (net for node.js) was already widely used and validated, so it gave us some confidence about the design. We also had a Chrome Apps API precedent for solving issue #1 (https://developer.chrome.com/apps/alarms)
>
> All in all, we eventually decided to go with a design *similar* to net for node.js (i.e. separate namespaces, event based) but *not exactly* the same, merely for mechanical reasons: ChromeApps use raw numbers for resource identifiers, not plain JavaScript objects.
>
> During the implementation, we validated the work by porting the chrome.socket samples to the new API, and we also ported CIRC (the most popular Chrome App using raw sockets).
>
> As an aside, we also at that time made the tradeoff that the "setPaused" method would be slightly less strict than promised in the doc. It promises that *no* "onReceive" event will be fired after calling "setPaused", whereas the implementation effectively guarantees that "*at most 1* onReceive" event will be fired. We decided to wait for feedback from developers to see if this was a problem. We never got feedback this was a problem (until recently with the issue at hand), so we never "fixed" that behavior. Remember, at the time the "setPaused" function was seen as a way to throttle downloads, in case the Chrome App was not able to process data coming from the peer fast enough, so an extra "onReceive" event after calling "setPaused" was not seen as a major issue.

setPaused() is important for backpressure. But in order to implement
it correctly, you need to fulfill the promise of not invoking an
onReceive event. As demonstrated, an API that does not conform to this
results in bugs.

I think we all agree "setPaused()" should be fixed -- although it is not sufficient to fully address the issue raised originally in this thread. My main goal in giving the history data points was to illustrate that "Hindsight is 20/20" :)


>
> Note: Fixing "setPaused" to offer the stronger guarantee would have been "trivial": in TCPSocketEventDispatcher, instead of dispatching a packet while the socket is paused, buffer the packet until "setPaused(false)" is called. That would have required at *most* 1 packet buffered, because TCPSocketEventDispatcher is also designed to not invoke a low-level receive while the socket is "paused".
>
> Note: We ended up using the "new" API pattern with other Chrome Apps API (https://developer.chrome.com/apps/serial, https://developer.chrome.com/apps/bluetoothSocket), and the pattern seemed to generalize nicely.
>
> A few months later, Lally, after much patience and persistence (https://codereview.chromium.org/76403004/), added TLS support with the "secure" method. This was great because we had developers asking for that feature. CIRC, for example, had to resort to using a JavaScript-based API to support TLS: https://github.com/flackr/circ/blob/master/package/bin/net/ssl_socket.js. At the time, the scenario we validated was "1) create socket (in paused mode) 2) call secure 3) un-pause 4) send/receive data over TLS".
>
> What has happened recently is that "secure" needs to be supported in the STARTTLS scenario, and in the process of trying to support that scenario, the "setPaused" behavior bit us, which leads us to the current state of affairs.
>
>
> One last addition to the story: the intent behind Chrome Apps API was always to eventually become (in some form or other) APIs for the Open Web. For example, we recently started an effort to port chrome.bluetooth.lowEnergy to the Open Web (https://groups.google.com/a/chromium.org/forum/?hl=en#!searchin/blink-dev/bluetooth/blink-dev/pfNeo5cFXCk/uwGt_U5ZQLMJ). What this means is that, in general, Chrome Apps APIs were/are designed with an "address 90% of the scenarios and all currently known needs" bias, which implies less common scenarios are often addressed as they arise, which sometimes leads to breaking changes -- which seems to be the case here.

I'd strongly object to exposing the current Chrome Apps socket API to
the Open Web.

Sorry I was not clear: I did *not* mean to imply that the current chrome.sockets API is (or should be) headed to the Open Web "as-is".

I was merely trying to say that *If* we decide to design a v3 of "a" raw socket API for Chrome Apps (see option 3 in my original email), my opinion is that it should be in the context of an Open Web API, i.e. I do not think it would make sense at this point to design another socket api *specific to* ChromeApps.

Bridging to a previous point I was trying to make: In general, I think it is fair to say that, from the ChromeApps effort perspective, it is a great -- probably the best possible -- outcome for any "chrome.Xxx" API to be deprecated in favor of an equivalent Open Web API -- in terms of scenarios covered, *not* shape or form. So, bringing raw-sockets support to the Open Web, in whatever form, and deprecating both the "old" and "new" chrome.socket APIs would be "as intended".
 
I don't mean to be overly harsh, but to me, it's the
wrong way to approach API design. I respect the desire to optimize for
the 90% case, but what would be better is to create a layered API:
expose a lower-layer API that is something very close to the platform
primitives (the BSD socket API), and if you want to create a
higher-layer API that can be polyfilled using the lower-layer API,
with the option of explicit platform support to optimize it, then
that's fine too. A library layer that implements a higher-level API
and solves the event-API vs. callback/polling-API issues would help
flush out these higher-layer API bugs and build more confidence that
baking it into the platform would work.

This sounds like a reasonable implementation strategy :), assuming there is agreement an Open Web API for raw sockets is indeed needed/desirable at all, of course.
  • Is there enough agreement in this group that it would be a worthwhile effort to start at this point? The reason I am asking is that, unless I am mistaken, Ryan sounded opposed in principle.
  • If yes, is a new discussion topic in this group a good starting point?
  • Also, should this group own/lead the effort?


>
>
> (*) This leads me to the beginning of this email: regarding "option 3 - design a new "new" API", I think this would only make sense in the context of designing an Open Web API. It looks like WebApps streams (https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm) would be a good start, although I have very little context at this point.

That's deprecated. Don't use that. The active work is happening in
WHATWG at https://github.com/whatwg/streams.

Ack, thanks.

Ryan Sleevi

Sep 15, 2014, 10:19:23 PM9/15/14
to Renaud Paquay, William Chan (陈智昌), Ryan Sleevi, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
On Mon, Sep 15, 2014 at 7:06 PM, Renaud Paquay <rpa...@chromium.org> wrote:


On Mon, Sep 15, 2014 at 10:51 AM, William Chan (陈智昌) <will...@chromium.org> wrote:
On Fri, Sep 12, 2014 at 1:51 PM, Renaud Paquay <rpa...@chromium.org> wrote:
>
> To recap, it sounds like we have 3 options:
>
> 1. Keep using the deprecated API
> 2. Figure out a way to patch the "new" API
> 3. Completely redesign the socket API.
>
>
> Option 1 has the advantage of requiring no work. Is that correct, Lally? The main disadvantage is that the API is deprecated and that everybody we reached out to when we published the new API has moved to the new API.
> Option 2 has the advantage of addressing a scenario that is not yet supported with the "new" API. However, I am not sure how common the scenario is, and it sounds like Ryan is saying we won't be able to cover 100% of the scenarios anyways.
> Option 3 seems like the "best" option, but it requires more long lead work (more on that at the end of this mail (*))
>
> In summary,
> Option 1 -> no work needed
> Option 2 -> need to design a "patch" solution to address this specific scenario (Lally has multiple proposals and one prototype implementation)
> Option 3 -> long lead work needed to start an Open Web effort
>
> Is this a proper summary of the current state of affairs?
>
> ===============================================================================================
>
> On a somewhat unrelated note, I was directly involved in the design of the "new" API, so maybe I can give some additional context on why/how we ended up with the current "new" API design (at least iirc :).
>
> The "old" API had a few issues
>
> The "read+callback" pattern was a "bad" pattern for Chrome Apps, because having "pending" callbacks prevents Chrome Apps from being "suspended" (http://developer.chrome.com/apps/app_lifecycle.html). We wanted to provide APIs that were "onSuspend" friendly. In general, Chrome Apps of course use callbacks for all the APIs, but this was the first time we had the semantics of "operation takes an indefinite time before the callback is invoked". In all other cases, the callback is merely an artifact of there being a round trip between two processes, which should be asynchronous. In summary, having "read" and "accept" methods that use callbacks as a completion mechanism is an "anti-pattern" for supporting proper "onSuspend" behavior -- in the current incarnation of the Chrome Apps/Extensions platform.

Can you further explain this antipattern? We've dealt with callbacks
and suspends on all supported Chromium platforms. What is it about the
Chrome Apps/Extensions architecture that makes it an antipattern?

I am not sure we are talking about the same "suspend" concept. I am referring to the "onSuspend" event in the diagram at: http://developer.chrome.com/apps/app_lifecycle.html. Basically, a ChromeApp is "unloaded" (i.e. its renderer process is shut down) under a certain set of conditions, including "no pending callback" -- the diagram mentions only "no window open" but that is not sufficient with the current implementation.

So, currently, any active pending callback is sufficient to prevent a ChromeApp from being "unloaded". In the case of the "old" chrome.socket API, we introduced functions (chrome.socket.read and chrome.socket.accept) that are "blocking", in the sense that the callback will only be invoked if some external event occurs after some indefinite amount of time (e.g. a client is creating a connection or sending data). This was preventing the API from enabling scenarios such as "windowless" servers, i.e. a TCP server app that can be unloaded when inactive and automatically re-loaded when a connection is attempted by a client.

In what I suspect is a divergence from this topic, I'd actually be really curious to know how the new API avoids the old API's issues. That is, I'm assuming the issue was w/r/t how events were bound and inherent state of the JS engine, but event listeners have all those same binding scopes, I thought.

But that's probably a separate topic, and more for my own education from my own ignorance than directly related to the topic at hand.
I would disagree, but that gets to my point later.
 

Bridging to a previous point I was trying to make: In general, I think it is fair to say that, from the ChromeApps effort perspective, it is a great -- probably the best possible -- outcome for any "chrome.Xxx" API to be deprecated in favor of an equivalent Open Web API -- in terms of scenarios covered, *not* shape or form. So, bringing raw-sockets support to the Open Web, in whatever form, and deprecating both the "old" and "new" chrome.socket APIs would be "as intended".

I disagree on this philosophy pretty strongly, although I do agree it's a 'perfect end goal'. Then again, a perfect end goal is also to end suffering and bring about world peace :)
 
 
I don't mean to be overly harsh, but to me, it's the
wrong way to approach API design. I respect the desire to optimize for
the 90% case, but what would be better is to create a layered API:
expose a lower-layer API that is something very close to the platform
primitives (the BSD socket API), and if you want to create a
higher-layer API that can be polyfilled using the lower-layer API,
with the option of explicit platform support to optimize it, then
that's fine too. A library layer that implements a higher-level API
and solves the event-API vs. callback/polling-API issues would help
flush out these higher-layer API bugs and build more confidence that
baking it into the platform would work.

This sounds like a reasonable implementation strategy :), assuming there is agreement an Open Web API for raw sockets is indeed needed/desirable at all, of course.
  • Is there enough agreement in this group that it would be a worthwhile effort to start at this point? The reason I am asking is that, unless I am mistaken, Ryan sounded opposed in principle.
Indeed. This will break the Same Origin Policy very strongly. Even in a world of ephemeral apps and permissions, that's a huge thing with lots of implications. Even plugins like Flash and Silverlight have historically botched this in very severe ways.
  • If yes, is a new discussion topic in this group a good starting point?
Even if no, I think a new topic for that, for sure :)
  • Also, should this group own/lead the effort?
Owners? We're a meritocracy! ;)

More aptly, I think this group can very much inform and share about real world experience for a variety of APIs. I think Will's point was something to the effect of, in an ideal world, could you do everything Chrome does today (in native code) using JS? A second concern, equally as pressing as that primary concern, is whether or not we can do so in an opinionated and idiomatic way, which I think is something that ChromeApps has a lot of insight into.

For example, Apps/Extensions don't support Promises, and it doesn't seem they will anytime Soon (tm). What are the implications for these efforts? Don't know.

More to the general topic at hand, which is Lally/Lucas' problem, how do you feel about the 'backwards compatible' hack proposed?

William Chan (陈智昌)

Sep 15, 2014, 11:22:22 PM9/15/14
to Ryan Sleevi, Renaud Paquay, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
I don't fully understand Ryan's question, but I think that's because I
don't fully understand what the Chrome App/Extension architecture is
doing here. I think I may share the same confusion. Thanks for
pointing out the app lifecycle to me (which I missed the first time,
sorry!). It's unclear to me why Chrome apps would choose not to unload
the app if a socket read callback was pending. Can you explain why
this is the case? It's also unclear why it's OK to unload the app if we
structure it as an event instead of a callback.

>
>>
>>
>>>
>>>
>>> > The "read+callback" pattern was inefficient because, in general, API calls for Chrome Apps/Extensions have a somewhat high latency (marshaling, security checks). In the UDP case, for example, it was pretty much impossible to write an app that would *not* miss packets sent at short intervals.
>>>
>>> I need to think a bit more about the UDP case, but for the TCP case, I
>>> don't understand the marshaling & security checks costs. Why are you
>>> marshaling instead of using shared memory (like what we do with
>>> ResourceDispatcherHost and Mojo DataPipes)? And why wouldn't the
>>> security check only happen once, when the pipe is established?
>>
>>
>> I don't think there is inherently anything preventing what you are suggesting, I was merely describing the state of the ChromeApp/Extension API binding code as it was implemented at the time (It is still mostly unchanged at this point, although I believe there are early prototypes to port some extensions components to Mojo).

Great, thanks.

>>
>>>
>>>
>>> > We got feedback from developers that they would prefer to have an event based API. The "old" API was forcing them to "poll", and that was tricky/annoying to get right.
>>>
>>> Can you point out discussion threads on this? That said, I can fully
>>> believe that for the 90% case, an event based API is preferable for
>>> developers.
>>
>>
>> Here are 2 examples of issues due to incorrect polling:
>>
>> https://github.com/GoogleChrome/chrome-app-samples/issues/74
>> https://github.com/GoogleChrome/chrome-app-samples/blob/master/samples/webserver/index.js#L52
>>
>> There were also in person design meetings + comments in various Chrome bugs -- I have not kept track of everything.
>>
>>>
>>> > We got feedback from developers that having TCP Client/UDP/TCP server in a single namespace was confusing.
>>> >
>>> > Given that we wanted to address "1" for sure, and other points if possible too, we looked around for other JavaScript based APIs for raw sockets, and we found 2 in particular 1) net for node.js and 2) raw-sockets for Sysapps. It turned out that both APIs seemed to address the 4 points above, and one of them (net for node.js) was already widely used and validated, so it gave us some confidence about the design. We also had a Chrome Apps API precedent for solving issue #1 (https://developer.chrome.com/apps/alarms)
>>> >
>>> > All in all, we eventually decided to go with a design *similar* to net for node.js (i.e. separate namespaces, event based) but *not exactly* the same, merely for mechanical reasons: ChromeApps use raw numbers for resource identifiers, not plain JavaScript objects.
>>> >
>>> > During the implementation, we validated the work by porting the chrome.socket samples to the new API, and we also ported CIRC (the most popular Chrome App using raw sockets).
>>> >
>>> > As an aside, we also at that time made the tradeoff that the "setPaused" method would be slightly less strict than promised in the doc. It promises that *no* "onReceive" event will be fired after calling "setPaused", whereas the implementation effectively guarantees that "*at most 1* onReceive" event will be fired. We decided to wait for feedback from developers to see if this was a problem. We never got feedback this was a problem (until recently with the issue at hand), so we never "fixed" that behavior. Remember, at the time the "setPaused" function was seen as a way to throttle downloads, in case the Chrome App was not able to process data coming from the peer fast enough, so an extra "onReceive" event after calling "setPaused" was not seen as a major issue.
>>>
>>> setPaused() is important for backpressure. But in order to implement
>>> it correctly, you need to fulfill the promise of not invoking an
>>> onReceive event. As demonstrated, an API that does not conform to this
>>> results in bugs.
>>
>>
>> I think we all agree "setPaused()" should be fixed -- although it is not sufficient to fully address the issue raised originally in this thread. My main goal about giving history data points was to illustrate that "Hindsight is 20/20" :)

Did you consult net-dev@ about this API design point before? I'd be
surprised if we missed that. This is a common theme in our network
programming, so we see it a lot.

Furthermore, socket events & pausing are an insufficient API for
writing high-performance servers. You simply have too coarse-grained
control over buffers. The event model is only sufficient for simple
apps. For performant apps, you need a BYOB (Bring Your Own Buffer)
model.
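Roughly, the BYOB shape being described here (hypothetical API, not anything shipped): the caller allocates and hands in the buffer, so it decides exactly how many bytes it is willing to accept per read:

```javascript
// BYOB-style read: fill the caller-supplied view with at most
// view.length bytes and report how many were actually written.
// `source` is a toy stand-in for the kernel-side receive queue.
function byobRead(source, view) {
  const n = Math.min(view.length, source.available());
  for (let i = 0; i < n; i++) view[i] = source.take();
  return n;
}

// Toy byte source backing the sketch above.
function makeSource(bytes) {
  let i = 0;
  return {
    available: () => bytes.length - i,
    take: () => bytes[i++],
  };
}
```

Because the app owns the buffer and the byte count, short reads like "give me exactly the bytes up to the STARTTLS response" fall out naturally, with no extra copy held by the platform.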
Yes, new topic please. I see Ryan's concerns, although I'm not
immediately convinced. I'm a big believer in letting web apps (with
the appropriate permissions granted somehow) achieve feature parity
with native apps. But breaking the SOP definitely gives me significant
pause, and I'd have to hear more about this to reconcile my
conflicting feelings here.

>>
>> Also, should this group own/lead the effort?
>
> Owners? We're a meritocracy! ;)
>
> More aptly, I think this group can very much inform and share about real world experience for a variety of APIs. I think Will's point was something to the effect of, in an ideal world, could you do everything Chrome does today (in native code) using JS?

Yes.

> A second concern, but as equally pressing as that primary concern, is whether or not we can do so in an opinionated and idiomatic way, which I think is something that ChromeApps has a lot of insight into.
>
> For example, Apps/Extensions don't support Promises, and it doesn't seem they will anytime Soon (tm). What are the implications for these efforts? Don't know.
>
> More to the general topic at hand, which is Lally/Lucas' problem, how do you feel about the 'backwards compatible' hack proposed?

Is this where we fix setPaused() to do as documented (don't fire
events)? Sure, I think we should do that. I don't think it's a hack. I
think it's fixing a clear API functionality bug. If we can tolerate
the potential compat fallout, then let's do it. I don't worry about
the extra buffer. The API is clearly not designed with performance in
mind in the first place, so who cares about the extra buffer.

Ryan Sleevi

Sep 15, 2014, 11:31:47 PM9/15/14
to William Chan (陈智昌), Ryan Sleevi, Renaud Paquay, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
1) Fix setPaused
2) Add a way for the recipient of onReceived() to indicate how much data they consumed, with any unconsumed data being delivered via another onReceived event (or, in the event you STARTTLS to a TLS server, delivered as part of the data to the TLS handshake) 

William Chan (陈智昌)

Sep 15, 2014, 11:39:33 PM9/15/14
to Ryan Sleevi, Renaud Paquay, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
Wait, what? I totally missed this, and it sounds terrible. I don't
understand why it's needed. Can you link to the earlier discussion
where this is explained, or explain it for me?

William Chan (陈智昌)

Sep 15, 2014, 11:42:15 PM9/15/14
to Ryan Sleevi, Renaud Paquay, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
On Mon, Sep 15, 2014 at 8:39 PM, William Chan (陈智昌)
And if this functionality is required, I'd totally prefer to switch
back to the old deprecated socket API (which I think should not have
been deprecated).

Ryan Sleevi

Sep 15, 2014, 11:45:36 PM9/15/14
to William Chan (陈智昌), Ryan Sleevi, Renaud Paquay, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
Client -> Server
"Oh hai, I'd like to start TLS" + Client Hello
Server -> Client
"Okay! Let's start TLS" + Server Hello

In the current form, onReceive uses a 4K buffer (sure, yes, you can customize it, but there's plenty of ways to mess with that). As a result, the onReceive of the socket has a single message,
[ "Okay! Let's start TLS" jibba jabba bytes for TLS ]

There's no way to signal that the extension JUST read the "Okay! Let's start TLS" part, and let the "jibba jabba bytes for TLS" be pushed to the TLS socket within Chrome.

Similarly, if the _server_ were an extension (and Lally and Lucas are certainly interested in that case, as they always are :P), it would have the same issue with reading JUST the "Oh hai, I'd like to start TLS" part, and leaving the TLS ClientHello for Chrome to handle (e.g. after saying 'I read 10 bytes, now chrome.socket.secure me')

This can be done in a backwards-compatible way with the 'new' API (by just having a new method to indicate consumption during onReceive, with the default non-call being "all consumed"). This doesn't introduce _any_ new buffering overhead, because the Extensions layer (the EventDispatcher) _already_ has the buffer (the original 4K) and can just hold on to it until its next loop iteration.

In the "legacy" API, you'd just read as many bytes up to your "Oh hai, I'd like to start TLS" - and nothing more.
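The split the extension needs to express might look like the sketch below: locating the end of the cleartext reply so everything after it can stay inside Chrome for the TLS handshake. (Hedged: `splitAtCrlf` is a hypothetical helper for illustration, not part of the chrome.sockets API.)

```javascript
// Hypothetical helper: split a single onReceive buffer into the cleartext
// "Okay! Let's start TLS" reply and the trailing TLS bytes that Chrome
// would need to consume during secure(). Not part of the shipped API.
function splitAtCrlf(buf) {
  // buf is a Uint8Array, as onReceive delivers (info.data).
  for (let i = 0; i + 1 < buf.length; i++) {
    if (buf[i] === 0x0d && buf[i + 1] === 0x0a) {
      return {
        line: buf.subarray(0, i),   // cleartext reply: the app consumes this
        rest: buf.subarray(i + 2),  // TLS record bytes: left for secure()
      };
    }
  }
  return { line: buf, rest: new Uint8Array(0) };  // no CRLF seen yet
}
```

The sticking point in the current API is precisely that there is no way to hand `rest` back to Chrome.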

William Chan (陈智昌)

unread,
Sep 16, 2014, 12:01:26 AM9/16/14
to Ryan Sleevi, Renaud Paquay, Lally Singh, Lucas Dixon, net-dev, Lally Singh, David Benjamin
This does not match my recollection of STARTTLS, but you're clearly
the expert here. But when I look at
http://tools.ietf.org/html/rfc3207, I see a flow like:

Client -> Server
"Oh hai, I'd like to start TLS"
Server -> Client
"Okay! Let's start TLS"
Client -> Server
Client Hello
Server -> Client
Server Hello
... <rest of TLS handshake>

This is why I'm confused, since I didn't think a flight of data would
include both an application layer message and a TLS layer message. But
if I'm wrong about that, then I see why you'd have to indicate amount
consumed. And I consider such an API unacceptable.

Ryan Sleevi

unread,
Sep 16, 2014, 12:32:52 AM9/16/14
to William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, Renaud Paquay, Lally Singh, David Benjamin

Two words: False Start :)

More generally speaking, you can run into this situation with Upgrade or with a variety of vendor-specific protocols (and I've both seen large deployments of and been responsible for deploying such protocols in the past)

I want to avoid piecemealing the API, and want to make sure it's internally consistent.

It does sound like we are in violent agreement that the legacy API, from the POV of a network consumer, is actually the desirable API.

Renaud Paquay

unread,
Sep 16, 2014, 11:18:22 AM9/16/14
to Ryan Sleevi, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, Renaud Paquay, Lally Singh, David Benjamin
To answer Ryan's question: Yes, I think that 1) fixing setPaused and 2) exposing a way to "unread" a certain number of bytes would address Lally's issue and be self-consistent with the API. Thanks!

William Chan (陈智昌)

unread,
Sep 16, 2014, 2:23:37 PM9/16/14
to Renaud Paquay, Ryan Sleevi, Lucas Dixon, Lally Singh, net-dev, Lally Singh, David Benjamin
On Tue, Sep 16, 2014 at 8:18 AM, Renaud Paquay <rpa...@chromium.org> wrote:
> To answer Ryan question: Yes, I think that 1) fixing setPaused and 2)
> exposing a way to "unread" a certain amount of bytes would address the
> Lally's issue and be self consistent with the API. Thanks!

I don't understand how it fixes it, and I think it's a terrible API.
You guys own the API, but I strongly recommend against it, and am
unclear how it could work.
Sorry, I'm dense. Why? IIRC, False Start applies to the data written
after the TLS handshake begins, whereas the "Oh hai, I'd like to start
TLS" happens before, right?

>>
>> More generally speaking, you can run into this situation with Upgrade or
>> with a variety of vendor-specific protocols (and I've both seen large
>> deployments of and been responsible for deploying such protocols in the
>> past)
>>
>> I want to avoid having to piecemeal the API, and want to make sure its
>> internally consistent.

I don't really understand still :( Doesn't the request to start TLS
happen in cleartext *before* the TLS handshake happens? Like, you'd
have the socket write call and the read event.
I also don't know how the "unconsume" API works if it needs to shove
data to be read as part of the TLS handshake. Since we can have
multiple layers of sockets wrapping one another, it's unclear which
level you'd have to "unconsume" data back into. Our network stack
doesn't support anything like this, and there's no way to emulate it
in the apps layer above the network stack. This can only work if we
are treating the data as coming from the raw transport (TCP/UDP)
socket, but then if we wanted to support proxies or what not (via our
socket layers), then we'd be totally hosed.

>>
>> It does sound like we are in violent agreement that the legacy API, from
>> the POV of a network consumer, is actually the desirable API.

Most certainly.

Ryan Sleevi

unread,
Sep 16, 2014, 10:35:18 PM9/16/14
to William Chan (陈智昌), Renaud Paquay, Ryan Sleevi, Lucas Dixon, Lally Singh, net-dev, Lally Singh, David Benjamin
False Start violates the TLS spec by concatenating the Client Finished with the App Data in the same packet, rather than waiting for the server to send the Server Finished.

The "O hai" case may likewise see application data and TLS handshake bytes arrive in the same flight (e.g., a client sending its TLS ClientHello without waiting for the peer's "O hai" confirmation), thus creating the same framing issues.

>>
>> More generally speaking, you can run into this situation with Upgrade or
>> with a variety of vendor-specific protocols (and I've both seen large
>> deployments of and been responsible for deploying such protocols in the
>> past)
>>
>> I want to avoid having to piecemeal the API, and want to make sure its
>> internally consistent.

I don't really understand still :( Doesn't the request to start TLS
happen in cleartext *before* the TLS handshake happens? Like, you'd
have the socket write call and the read event.

Yup
 
I also don't know how the "unconsume" API works if it needs to shove
data to be read as part of the TLS handshake. Since we can have
multiple layers of sockets wrapping one another, it's unclear which
level you'd have to "unconsume" data back into. Our network stack
doesn't support anything like this, and there's no way to emulate it
in the apps layer above the network stack.

Sure there is. You just create your own Socket. Our own unittests do this to have determinism on the SSL layer.

The Good/Bad News is that Apps/Extensions uses very little of what we consider the "high level" Chrome network stack, just using the basic abstractions (TCPClientSocket, ClientSocket, etc). So ChrApps can just create their own net::ClientSocket in //chrome to do the buffering (see Lally's CL as an example that does just this)
 
This can only work if we
are treating the data as coming from the raw transport (TCP/UDP)
socket, but then if we wanted to support proxies or what not (via our
socket layers), then we'd be totally hosed.

Not really. We'll have already handed off the real (TCP || TLS) socket from the proxy layer at that time, and so it's just a matter of composition.

Again, we have existing code that does just this (in Chromoting and in ChrApps)

Lally Singh

unread,
Jan 9, 2015, 5:01:37 PM1/9/15
to rsl...@chromium.org, William Chan (陈智昌), Renaud Paquay, Lucas Dixon, Lally Singh, net-dev, David Benjamin
Hello again!  Apologies for the long delay.  

As a reminder, the new sockets API has a background loop running that continuously calls read() on the socket and issues 'onReceive' events to the JS context that owns the socket.  There can be some issues when that socket goes from a cleartext to a TLS connection via secure().  If the socket is for TLS-traffic only, it can be paused before connecting.  Pausing a socket, as a refresher, turns off the loop that calls read() and dispatches onReceive() events.   The JS can then call secure() as soon as connect() completes, and all is well.   However, if there is some cleartext conversation to happen first, the loop calling read() can send over a buffer in an 'onReceive' event that contains both cleartext and the start of a TLS handshake.  So, instead of trying to tell secure() "hey, here are the first K bytes of the server-side handshake", we're investigating a less-messy solution, and asking for help from the wise, kind, generous, and quite attractive members of net-dev@.  

(The APIs don't provide a way to drop back down from TLS to cleartext on a socket, so we're primarily concerned with the cleartext -> TLS transition.)

I researched the implementation for an "unread/mark-consumed" api, where the application indicates to chrome that it only read K < N bytes of the N-byte buffer of data that it got in an onReceive() event, and that the remaining (N-K) bytes are to be saved for a future onReceive event (or consumed by an upcoming TLS handshake).  Unfortunately, it'd require a synchronization between the thread that issues read() on that socket, and the thread that processes onReceive() events.  Some light hackery would be involved in working around this (specifically, the event to be delivered to JS would have to be inspected, to see if it's an onReceive(), and a response sent back when the event is processed, saying "all N bytes consumed" or "got an API call, processed here, saying only K bytes consumed").

Which is doable, but a bit messy.  Worse, there's that nasty event queue in the middle.  So an app can start receiving data much faster than it can process it, and that data would just queue up locally, instead of throttling read() and letting TCP flow control throttle the incoming transmit rate.  I dunno if we care, but it could OOM-out the app's renderer thread, unless there's some other mechanism that kicks in that I haven't noticed... 

 I suppose we could fix that issue (which AFAICT is an issue with the API today) by waiting for the "%d bytes consumed" message before issuing the next read().   This all requires a new synchronization between the two threads, which does protect against receive-side OOMing, but will probably also hurt throughput for apps that do a lot of I/O.

So, we're back to considering a lower-level API (e.g., read() or something with a different name) that's enabled when the socket is paused.   Nicely, this lets apps that risk receiving OOM-level amounts of data manually control their rate of data digestion, without hurting the throughput of apps that don't feel that it's a risk.

So, if the app is talking to a server, and the server sends some "Let's start TLS, here's my ServerHello" message, the app could read the "Let's start TLS" part, and then call secure(), which consumes the ServerHello.
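Under that proposal, the app-side flow might be sketched like this. (Hedged: `readUntilCrlf` and the `read` callback are hypothetical stand-ins for the lower-level API being discussed here, not shipped chrome.sockets.tcp methods.)

```javascript
// Sketch of the paused-socket flow, assuming a hypothetical low-level
// read(maxBytes) that resolves to a Uint8Array. Reading one byte at a time
// guarantees nothing past the CRLF is pulled out of Chrome's buffer -- the
// TLS ServerHello stays inside for secure() to consume.
async function readUntilCrlf(read) {
  const bytes = [];
  for (;;) {
    const chunk = await read(1);
    bytes.push(chunk[0]);
    const n = bytes.length;
    if (n >= 2 && bytes[n - 2] === 0x0d && bytes[n - 1] === 0x0a) {
      return new TextDecoder().decode(Uint8Array.from(bytes.slice(0, n - 2)));
    }
  }
}

// Intended usage against the hypothetical API (illustrative only):
//   const reply = await readUntilCrlf((n) => sockets.read(socketId, n));
//   if (reply.startsWith("Okay! Let's start TLS")) {
//     await sockets.secure(socketId);  // consumes the buffered ServerHello
//   }
```

A real implementation would read larger chunks and buffer, but the one-byte version makes the "stop exactly at the boundary" property obvious.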

Does that sound reasonable?



Cheers,
$ ls

Lally Singh | Google Ideas | Engineering | la...@google.com
 


Ryan Sleevi

unread,
Jan 9, 2015, 7:18:51 PM1/9/15
to Lally Singh, Ryan Sleevi, William Chan (陈智昌), Renaud Paquay, Lucas Dixon, Lally Singh, net-dev, David Benjamin
On Fri, Jan 9, 2015 at 2:00 PM, 'Lally Singh' via net-dev <net...@chromium.org> wrote:
Hello again!  Apologies for the long delay.  

As a reminder, the new sockets API has a background loop running, that continuously calls read() on the socket and issues 'onReceive' events to the JS context that owns the socket.  There can be some issues when that socket goes from a cleartext to a TLS connection via secure().  If the socket is for TLS-traffic only, it can be paused before connecting.  Pausing a socket, as a refresher, turns off the loop that calls read() and dispatches onReceive() events.   The JS can then call secure() as soon as connect() completes, and all is well.   However, if there is some cleartext conversation to happen first, the loop calling read() can send over a buffer in a 'onReceive' event that contains both cleartext and the start of a TLS handshake.  SO, instead of trying to tell secure() "hey, here are the first K bytes of the server-side handshake", we're investigating a less-messy solution, and asking for help from the wise, kind, generous, and quite attractive members of net-dev@.  

(The APIs don't provide a way to drop back down from TLS to cleartext on a socket, so we're primarily concerned with the cleartext -> TLS transition.)

I researched the implementation for an "unread/mark-consumed" api, where the application indicates to chrome that it only read K < N bytes of the N-byte buffer of data that it got in an onReceive() event, and that the remaining (N-K) bytes are to be saved for a future onReceive event (or consumed by a an upcoming TLS handshake).  Unfortunately, it'd require a synchronization between the thread that issues read() on that socket, and the thread that processes onReceive() events.  Some light hackery would be involved in working around this (specifically, that the event to be delivered to JS would have to be inspected, to see if it's an onReceive(), and a response sent back when the event is processed, saying "all N bytes consumed" or "got an API call, processed here, saying only K bytes consumed".

Which is doable, but a bit messy.  Worse, there's that nasty event queue in the middle.  So an app can start receiving data much faster that it can process, and that data would just queue up locally, instead of throttling read() and letting TCP flow control throttle the incoming transmit rate.  I donno if we care, but it could OOM-out the app's renderer thread, Unless there's some other mechanism that kicks in that I haven't noticed... 

 I suppose we could fix that issue (which AFAICT is an issue with the API today) with waiting for the "%d bytes consumed" message before issuing the next read().   This all requires a new synchronization between the two threads, which does protect against receive-side OOMing, but will probably also hurt throughput for apps that do a lot of I/O.

You present this as two extremes - either unbounded reads or synchronizing reads - but is there not a middle ground, the same as there is in the kernel for most modern TCP stacks?

That is, there's an internal buffer but it isn't allowed to exceed some size. Yes, this contributes to buffer bloat, but it also avoids either extreme you're talking about.
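A minimal sketch of that middle ground, with illustrative names (an assumption of how it could look, not actual Chrome internals): keep a low-level read pending until the un-ACKed bytes would exceed a cap, then stop reading until the app acknowledges consumption.

```javascript
// Hypothetical bounded receive buffer: the read loop keeps going until
// 'cap' bytes are outstanding (delivered but un-ACKed), then pauses so TCP
// flow control can push back on the sender. Illustrative names only.
class BoundedReceiveBuffer {
  constructor(capBytes) {
    this.cap = capBytes;
    this.unacked = 0;
    this.paused = false;
  }
  // Called on the I/O thread when a socket read completes.
  // Returns whether the loop may issue the next read().
  onSocketRead(nBytes) {
    this.unacked += nBytes;
    if (this.unacked >= this.cap) this.paused = true;
    return !this.paused;
  }
  // Called when the app ACKs consumption of delivered bytes.
  // Returns whether reads should resume.
  onAck(nBytes) {
    this.unacked -= nBytes;
    if (this.unacked < this.cap) this.paused = false;
    return !this.paused;
  }
}
```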
 

So, we're back to considering a lower-level API (e.g., read() or something with a different name) that's enabled when the socket is paused.   Nicely, this lets apps that risk receiving OOM-level amounts of data manually control their rate of data digestion, without hurting the throughput of apps that don't feel that it's a risk.

New APIs to work around this have an immediate feel of "gross" to me :/
 

--
You received this message because you are subscribed to the Google Groups "net-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to net-dev+u...@chromium.org.
To post to this group, send email to net...@chromium.org.

Lally Singh

unread,
Jan 12, 2015, 1:06:37 PM1/12/15
to rsl...@chromium.org, William Chan (陈智昌), Renaud Paquay, Lucas Dixon, Lally Singh, net-dev, David Benjamin
On Fri, Jan 9, 2015 at 7:18 PM, Ryan Sleevi <rsl...@chromium.org> wrote:


On Fri, Jan 9, 2015 at 2:00 PM, 'Lally Singh' via net-dev <net...@chromium.org> wrote:
Hello again!  Apologies for the long delay.  

As a reminder, the new sockets API has a background loop running, that continuously calls read() on the socket and issues 'onReceive' events to the JS context that owns the socket.  There can be some issues when that socket goes from a cleartext to a TLS connection via secure().  If the socket is for TLS-traffic only, it can be paused before connecting.  Pausing a socket, as a refresher, turns off the loop that calls read() and dispatches onReceive() events.   The JS can then call secure() as soon as connect() completes, and all is well.   However, if there is some cleartext conversation to happen first, the loop calling read() can send over a buffer in a 'onReceive' event that contains both cleartext and the start of a TLS handshake.  SO, instead of trying to tell secure() "hey, here are the first K bytes of the server-side handshake", we're investigating a less-messy solution, and asking for help from the wise, kind, generous, and quite attractive members of net-dev@.  

(The APIs don't provide a way to drop back down from TLS to cleartext on a socket, so we're primarily concerned with the cleartext -> TLS transition.)

I researched the implementation for an "unread/mark-consumed" api, where the application indicates to chrome that it only read K < N bytes of the N-byte buffer of data that it got in an onReceive() event, and that the remaining (N-K) bytes are to be saved for a future onReceive event (or consumed by a an upcoming TLS handshake).  Unfortunately, it'd require a synchronization between the thread that issues read() on that socket, and the thread that processes onReceive() events.  Some light hackery would be involved in working around this (specifically, that the event to be delivered to JS would have to be inspected, to see if it's an onReceive(), and a response sent back when the event is processed, saying "all N bytes consumed" or "got an API call, processed here, saying only K bytes consumed".

Which is doable, but a bit messy.  Worse, there's that nasty event queue in the middle.  So an app can start receiving data much faster that it can process, and that data would just queue up locally, instead of throttling read() and letting TCP flow control throttle the incoming transmit rate.  I donno if we care, but it could OOM-out the app's renderer thread, Unless there's some other mechanism that kicks in that I haven't noticed... 

 I suppose we could fix that issue (which AFAICT is an issue with the API today) with waiting for the "%d bytes consumed" message before issuing the next read().   This all requires a new synchronization between the two threads, which does protect against receive-side OOMing, but will probably also hurt throughput for apps that do a lot of I/O.

You present this as two extremes - either unbounded reads or sycnhronizing reads - but is there not a middle ground, the same as there is in the kernels for most modern TCP stacks?

That is, there's an internal buffer but it isn't allowed to exceed some size. Yes, this contributes to buffer bloat, but it also avoids either extreme you're talking about.

I suppose an internal buffer, mixed with an event-ack that includes how much of the buffer was actually used, may work?  So, going with this idea:  the read-loop sends events to the receiver with some fixed-size amount (currently, up to 4k, depending on how much data's available), and the receiving-side sends back Event-ACKs with how much was consumed.    We can avoid a lock-step synchrony by allowing more than one Event to be in-flight at a time (as happens now), and figure out the offset into the internal buffer upon event reception.  Then it's just a matter of making sure that we never have data in the buffer without an OnReceive event to go with it.  

We'll just have to actually start using synchronization primitives on that buffer, instead of the fully-async model used between threads right now.
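The bookkeeping described above might be sketched as follows (illustrative names only, not the real EventDispatcher code): track each in-flight event's size, and when an ACK reports partial consumption, queue the remainder for re-delivery ahead of new socket data.

```javascript
// Hypothetical tracker for the event-ACK scheme: several onReceive events
// may be in flight at once; each ACK says how much of that event's payload
// the app consumed, and any leftover bytes become carryover that must be
// re-delivered before fresh data. Sketch only.
class EventAckTracker {
  constructor() {
    this.inflight = [];   // FIFO of payload sizes awaiting ACKs
    this.carryover = 0;   // unconsumed bytes to re-deliver first
  }
  dispatch(nBytes) {
    this.inflight.push(nBytes);
  }
  // Returns the current carryover after processing this ACK.
  ack(consumed) {
    const sent = this.inflight.shift();
    if (consumed < sent) this.carryover += sent - consumed;  // "K of N" case
    return this.carryover;
  }
}
```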

How does that work for you?  Renaud?
 
 

So, we're back to considering a lower-level API (e.g., read() or something with a different name) that's enabled when the socket is paused.   Nicely, this lets apps that risk receiving OOM-level amounts of data manually control their rate of data digestion, without hurting the throughput of apps that don't feel that it's a risk.

New APIs to work around this have an immediate feel of "gross" to me :/
Well, we either have some sort of 'setConsumed()' to mark how much data we're using out of the OnReceive event, or we have a synchronous read().  I don't think we have a way to avoid this for a mixed plaintext/ciphertext* I/O situation.
 
* where the ciphertext is really the setup messaging for TLS...

Matt Menke

unread,
Jan 12, 2015, 2:26:09 PM1/12/15
to Lally Singh, Ryan Sleevi, William Chan (陈智昌), Renaud Paquay, Lucas Dixon, Lally Singh, net-dev, David Benjamin
Why not an async read?

socket.read(bytes, callback)  (Or use promises or whatever).

Lally Singh

unread,
Jan 12, 2015, 2:45:14 PM1/12/15
to Matt Menke, Ryan Sleevi, William Chan (陈智昌), Renaud Paquay, Lucas Dixon, Lally Singh, net-dev, David Benjamin
On Mon, Jan 12, 2015 at 2:26 PM, Matt Menke <mme...@chromium.org> wrote:
Why not an async read?

socket.read(bytes, callback)  (Or use promises or whatever).

Sorry, that's what I meant for the lower-level read().

Renaud Paquay

unread,
Jan 12, 2015, 2:50:16 PM1/12/15
to Lally Singh, Ryan Sleevi, William Chan (陈智昌), Renaud Paquay, Lucas Dixon, Lally Singh, net-dev, David Benjamin
On Mon, Jan 12, 2015 at 10:05 AM, Lally Singh <la...@google.com> wrote:
On Fri, Jan 9, 2015 at 7:18 PM, Ryan Sleevi <rsl...@chromium.org> wrote:


On Fri, Jan 9, 2015 at 2:00 PM, 'Lally Singh' via net-dev <net...@chromium.org> wrote:
Hello again!  Apologies for the long delay.  

As a reminder, the new sockets API has a background loop running, that continuously calls read() on the socket and issues 'onReceive' events to the JS context that owns the socket.  There can be some issues when that socket goes from a cleartext to a TLS connection via secure().  If the socket is for TLS-traffic only, it can be paused before connecting.  Pausing a socket, as a refresher, turns off the loop that calls read() and dispatches onReceive() events.   The JS can then call secure() as soon as connect() completes, and all is well.   However, if there is some cleartext conversation to happen first, the loop calling read() can send over a buffer in a 'onReceive' event that contains both cleartext and the start of a TLS handshake.  SO, instead of trying to tell secure() "hey, here are the first K bytes of the server-side handshake", we're investigating a less-messy solution, and asking for help from the wise, kind, generous, and quite attractive members of net-dev@.  

(The APIs don't provide a way to drop back down from TLS to cleartext on a socket, so we're primarily concerned with the cleartext -> TLS transition.)

I researched the implementation for an "unread/mark-consumed" api, where the application indicates to chrome that it only read K < N bytes of the N-byte buffer of data that it got in an onReceive() event, and that the remaining (N-K) bytes are to be saved for a future onReceive event (or consumed by a an upcoming TLS handshake).  Unfortunately, it'd require a synchronization between the thread that issues read() on that socket, and the thread that processes onReceive() events.  Some light hackery would be involved in working around this (specifically, that the event to be delivered to JS would have to be inspected, to see if it's an onReceive(), and a response sent back when the event is processed, saying "all N bytes consumed" or "got an API call, processed here, saying only K bytes consumed".

Which is doable, but a bit messy.  Worse, there's that nasty event queue in the middle.  So an app can start receiving data much faster that it can process, and that data would just queue up locally, instead of throttling read() and letting TCP flow control throttle the incoming transmit rate.  I donno if we care, but it could OOM-out the app's renderer thread, Unless there's some other mechanism that kicks in that I haven't noticed... 

 I suppose we could fix that issue (which AFAICT is an issue with the API today) with waiting for the "%d bytes consumed" message before issuing the next read().   This all requires a new synchronization between the two threads, which does protect against receive-side OOMing, but will probably also hurt throughput for apps that do a lot of I/O.

You present this as two extremes - either unbounded reads or sycnhronizing reads - but is there not a middle ground, the same as there is in the kernels for most modern TCP stacks?

That is, there's an internal buffer but it isn't allowed to exceed some size. Yes, this contributes to buffer bloat, but it also avoids either extreme you're talking about.

I suppose an internal buffer, mixed with an event-ack that includes how much of the buffer was actually used, may work?  So, going with this idea:  the read-loop sends events to the receiver with some fixed-size amount (currently, up to 4k, depending on how much data's available), and the receiving-side sends back Event-ACKs with how much was consumed.    We can avoid a lock-step synchrony by allowing more than one Event to be in-flight at a time (as goes on now), and figure out the offset into the internal buffer upon event reception.  Then it's just a matter of making sure that we never have data in the buffer without an OnReceive event to go with it.  

We'll just have to actually start using synchronization primitives on that buffer, instead of the fully-async model used between threads right now.

How does that work for you?  Renaud?

I think it is doable, but tricky:  we'd have to think about the interaction with the "setPause", "close", etc. calls.
 
 
 

So, we're back to considering a lower-level API (e.g., read() or something with a different name) that's enabled when the socket is paused.   Nicely, this lets apps that risk receiving OOM-level amounts of data manually control their rate of data digestion, without hurting the throughput of apps that don't feel that it's a risk.

New APIs to work around this have an immediate feel of "gross" to me :/
Well we either have some sort of 'setConsumed()' to mark how much data we're using out of the OnReceive event, or we have a synchronous read().  I don't think we have a way to avoid this for a mixed plaintext/cyphertext* I/O situation.
 
* where the cyphertext is really the setup messaging for TLS...

Yeah, maybe there was a misunderstanding of what "new API" means. I think Lally is merely talking about adding a single "read" function to the "chrome.sockets.tcp" API. Either solution we have looked at so far at least requires adding a new function to the API -- either an async read() function or a setConsumed() function.

FWIW, I prefer the option of adding a new (async) "read" function, as it has the nice property of making chrome.sockets (note the "s") a strict superset of chrome.socket. Essentially, chrome.sockets.tcp would have 2 modes: 

1) setPaused == true => application explicitly uses "read", has fine grained control on data flow.
2) setPaused == false => application waits for "onReceive" events, but has no fine grained control on data flow. 

Going back to the unbounded reads and buffering, I think it probably deserves its own tracking bug (because we should address it, notwithstanding what we choose to do for STARTTLS scenarios).

David Benjamin

unread,
Jan 12, 2015, 3:08:16 PM1/12/15
to Renaud Paquay, Lally Singh, Ryan Sleevi, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev

David Benjamin

unread,
Jan 12, 2015, 3:09:51 PM1/12/15
to Renaud Paquay, Lally Singh, Ryan Sleevi, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev
Oops. Hit the send button too early. Sorry about that. Inbox being finicky.

On Mon Jan 12 2015 at 2:50:17 PM Renaud Paquay <rpa...@chromium.org> wrote:
FWIW, I prefer the option of adding a new (async) "read" function, as it has the nice property of making chrome.sockets (note the "s") a strict superset of chrome.socket. Essentially, chrome.sockets.tcp would have 2 modes: 

1) setPaused == true => application explicitly uses "read", has fine grained control on data flow.
2) setPaused == false => application waits for "onReceive" events, but has no fine grained control on data flow. 

This would still require fixing setPaused's behavior, right? Otherwise you could receive data from onReceive while the setPaused call is still asynchronously going through.

David

 

Renaud Paquay

unread,
Jan 12, 2015, 3:20:33 PM1/12/15
to David Benjamin, Renaud Paquay, Lally Singh, Ryan Sleevi, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev
Yes.

Ryan Hamilton

unread,
Jan 12, 2015, 3:22:34 PM1/12/15
to Renaud Paquay, Lally Singh, Ryan Sleevi, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin

On Mon, Jan 12, 2015 at 11:50 AM, Renaud Paquay <rpa...@chromium.org> wrote:
FWIW, I prefer the option of adding a new (async) "read" function, as it has the nice property of making chrome.sockets (note the "s") a strict superset of chrome.socket. Essentially, chrome.sockets.tcp would have 2 modes: 

1) setPaused == true => application explicitly uses "read", has fine grained control on data flow.
2) setPaused == false => application waits for "onReceive" events, but has no fine grained control on data flow. 

It seems like you could implement setPaused on top of an async read() method. If so, I wonder if it's not simpler to remove setPaused from the API and simply provide an async read(). If some app wants to simply get OnReceive events, it should be pretty trivial to wrap read() to give them that.
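Wrapping read() to recover onReceive-style events could be roughly this simple. (A sketch: `read` stands in for whatever async read method would land, and `pumpReceiveEvents` is a hypothetical app-side helper, not a proposed API method.)

```javascript
// Hypothetical wrapper: emulate the onReceive event pattern on top of an
// async read(maxBytes) that resolves to a Uint8Array (empty on EOF).
// One read in flight at a time -- simple, at some cost in throughput.
async function pumpReceiveEvents(read, onReceive) {
  for (;;) {
    const chunk = await read(4096);
    if (chunk.length === 0) return;  // EOF: stop dispatching events
    onReceive(chunk);
  }
}
```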

Ryan Sleevi

unread,
Jan 12, 2015, 3:29:29 PM1/12/15
to Ryan Hamilton, Renaud Paquay, Lally Singh, Ryan Sleevi, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin
Doesn't this return back to the performance problem that the new API
was trying to solve though?

Ryan Hamilton

unread,
Jan 12, 2015, 3:38:01 PM1/12/15
to Ryan Sleevi, Renaud Paquay, Lally Singh, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin
It's possible that I lost the train somewhere along the way, but I thought that the performance problem was associated with using OnDataReceived exclusively. I would have thought that an async read() based API would be quite performant, since that maps well to what we have under the hood. 

My point about implementing OnDataReceived on top of read() was not to suggest that it would be the high-performance, preferred API but that if someone *really* wanted it,then they could implement it; knowing that they would sacrifice performance for some simplicity. 

But am I misremembering the performance issue?

Cheers,

Ryan


Renaud Paquay

unread,
Jan 12, 2015, 3:41:35 PM1/12/15
to Ryan Hamilton, Renaud Paquay, Lally Singh, Ryan Sleevi, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin
In theory, yes. In practice, there are a couple of things to think about

1) Deprecation: setPaused and onReceived were published a while back (chrome 33 according to https://developer.chrome.com/apps/sockets_tcp).

2) onReceive has a *subtle* property that can't (currently) be implemented using an async "read": Chrome will never suspend an application with a pending read active, whereas it may suspend an application using the "onReceived" event. This behavior is specific to the ChromeApps runtime, and not very well documented, but I believe a few apps out there take advantage of the behavior.

On Mon, Jan 12, 2015 at 12:22 PM, Ryan Hamilton <r...@chromium.org> wrote:

Matt Menke

unread,
Jan 12, 2015, 3:47:02 PM1/12/15
to Ryan Hamilton, Ryan Sleevi, Renaud Paquay, Lally Singh, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin
I don't believe async read quite maps to what we have for standard reads from the renderer.  We push data to the renderer, tracking ACKs, and don't stop reading until our read buffer is full of un-ACKed data, as I understand it.


Ryan Hamilton

unread,
Jan 12, 2015, 3:51:06 PM1/12/15
to Matt Menke, Ryan Sleevi, Renaud Paquay, Lally Singh, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin

On Mon, Jan 12, 2015 at 12:47 PM, Matt Menke <mme...@chromium.org> wrote:
I don't believe async read quite maps to what we have for standard reads from the renderer.  We push data to the renderer, tracking ACKs, and don't stop reading until our read buffer is full of un-ACKed data, as I understand it.

Ah, I see. So it maps to net/ but not to the renderer. Rats. OK, I'll bow out of this thread :>

Renaud Paquay

unread,
Jan 12, 2015, 4:02:27 PM1/12/15
to Ryan Hamilton, Matt Menke, Ryan Sleevi, Renaud Paquay, Lally Singh, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin
I think the performance problem we are talking about is this:

A ChromeApp using an async "read" suffers from the latency associated with the ChromeApps JavaScript binding. Basically, there is some latency between when the "read" JS call is made by the app and when the actual socket (net/) read operation is issued. There is also some latency between the low-level socket read callback being invoked and the corresponding JS callback being invoked inside the ChromeApp. Given that a ChromeApp can have only a single pending read, it has to wait for the callback invocation before issuing a new read call.

When we first implemented the "onReceived" pattern, we saw roughly a 10x increase in throughput, because we essentially always have an active low-level pending "read" on the socket and we asynchronously emit a JS event every time we get data.

(As an aside, Lally recently discovered the "onReceived" behavior potentially leads to the "unbounded" read behavior he describes in a previous mail).

To answer Ryan's (rsleevi@) question: yes, but I think "setConsumed" would suffer the same performance issue as "read": a naive implementation of setConsumed requires that every event be ACK'ed before a new read can be issued, so we are back to the same latency issue we have with "read".

In both cases (read and setConsumed), we could improve on the naive behavior and raise throughput by implementing some user-space buffering (tricky). So, basically, I *think* read and setConsumed would end up with the same performance characteristics.
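The latency difference between the two app-facing shapes can be made concrete with a toy socket (illustrative names only, not the real chrome.sockets API): an explicit-read socket hands back one chunk per call, so the app must make a fresh JS->C++ call for every chunk, while an event-style socket pushes every chunk to a listener with no further calls from the app.

```javascript
// Toy contrast of the two API shapes discussed above. Hypothetical names;
// the real APIs are the chrome app socket "read" call and "onReceived" event.
class FakeSocket {
  constructor(chunks) { this.chunks = chunks.slice(); }
  // Explicit-read shape: one chunk per call, at most one pending read,
  // and the app must call read() again after each callback.
  read(cb) { cb(this.chunks.shift()); }
  // Event shape: every queued chunk is pushed to the listener without any
  // further calls from the app (modeling an always-pending low-level read).
  onReceive(listener) { this.chunks.forEach(listener); this.chunks = []; }
}

// Drain a socket using the read() pattern, counting the JS->C++ calls made.
function drainWithRead(socket, out) {
  let calls = 0;
  (function loop() {
    if (socket.chunks.length === 0) return;
    calls++;
    socket.read(chunk => { out.push(chunk); loop(); });
  })();
  return calls;
}
```

Draining three chunks costs three read() calls (each with its binding round-trip latency), whereas one onReceive listener registration delivers all three.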

Ryan Sleevi

unread,
Jan 12, 2015, 4:24:06 PM1/12/15
to Renaud Paquay, Ryan Hamilton, Matt Menke, Ryan Sleevi, Lally Singh, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin
On Mon, Jan 12, 2015 at 1:02 PM, Renaud Paquay <rpa...@chromium.org> wrote:
> I think the performance problem we are talking about is this:
>
> A ChromeApp using an async "read" suffers from the latency associated with
> ChromeApps JavaScript binding. Basically, there is some latency between the
> "read" JS call is made by the app and the actual socket (net/) read
> operation is issued. There is also some latency between the low level socket
> read callback being invoked and the corresponding JS callback invoked inside
> the ChromeApp. Given a ChromeApp can have only a single pending read, it has
> to wait for the callback invocation before issuing a new read call.
>
> When we first implemented the "onReceived" pattern, we noticed an increase
> of about 10x in terms of throughput, because we essentially always have an
> active low level pending "read" on the socket, and we asynchronously emit a
> JS event every time we get data.
>
> (As an aside, Lally recently discovered the "onReceived" behavior
> potentially leads to the "unbounded" read behavior he describes in a
> previous mail).
>
> To answer Ryan (rsleevi@) question: Yes, but I think "setConsumed" would
> suffer the same performance issue as "read": naive implementation of
> setConsumed requires that every event is ACK'ed before a new read can be
> issued, so we are back to the same latency issue we have with "read".
>
> In both cases (read and setConsumed), we could improve the "naive" behavior
> and improve throughput by implementing some user-space buffering (tricky).
> So, basically, I *think* read and setConsumed would both have the same
> performance characteristics.

I think we have to implement some degree of userspace buffering, if
only to ensure there is proper backpressure for when we are *not*
consuming events. It seems like a real bug that we aren't.

Put differently, I'm assuming that, at least at the Browser<->Renderer
side, we need an explicit ACK phase between when the browser sends
bytes, the renderer gets them, and when the browser asks for more.

So assuming we have some synchronization event between these two (to
handle backpressure and the like), what differences are there between
read() and setConsumed()? The difference is that setConsumed() does
not need to be explicitly called, so it avoids an additional explicit
JS->C++ conversion. That is, in a read() system, the renderer can't
ack data unless read() is explicitly called again. In a setConsumed()
system, any tick of the JS loop that _doesn't_ get a setConsumed()
call implicitly acks all of the data.

I guess the performance tradeoff is whether it's cheaper to check (on
each down-tick after dispatching an onReceived) whether setConsumed
was called or whether it's cheaper to force the JS to call into C++
for every read. My gut is that the former is cheaper, but it's
entirely possible that they could be equal or negligible.

But my assumption is that both systems rely on solving the
backpressure issue as well.
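The implicit-ACK rule described above can be sketched as a small dispatcher (hypothetical names; the real renderer-to-browser ACK would be an IPC, not a counter): after an onReceived handler runs, the end of that JS turn ACKs the data automatically unless the handler called setConsumed() to take ownership of it.

```javascript
// Sketch of the implicit-ACK semantics: dispatching an event implicitly
// acks its data on the "down-tick" unless setConsumed() was called during
// the handler. Illustrative only, assuming names from the thread.
function makeDispatcher(onAck) {
  let deferred = false;
  return {
    // Handler opts out of the automatic ack for the current event.
    setConsumed() { deferred = true; },
    dispatch(handler, chunk) {
      deferred = false;
      handler(chunk);
      // Down-tick check: no setConsumed() call this turn => implicit ack.
      if (!deferred) onAck(chunk.length);
    },
  };
}
```

A handler that does nothing special acks its bytes for free; a handler that calls setConsumed() suppresses the ack until it explicitly releases the data, which is where the backpressure hooks in.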

Renaud Paquay

unread,
Jan 12, 2015, 7:19:42 PM1/12/15
to Ryan Sleevi, Renaud Paquay, Ryan Hamilton, Matt Menke, Lally Singh, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin
I see. Just to make sure I understand what you have in mind: Even in the case of the implicit call (from JS perspective) to "setConsumed", there would still be a renderer -> browser (c++) call for the ACK, right?

In any case, with the backpressure assumption, I agree that, performance-wise, setConsumed is likely more efficient than or equivalent to read (and I can't think of a reason it would be slower). The difference may very well be negligible indeed.

Ryan Sleevi

unread,
Jan 12, 2015, 7:21:58 PM1/12/15
to Renaud Paquay, Ryan Sleevi, Ryan Hamilton, Matt Menke, Lally Singh, William Chan (陈智昌), Lucas Dixon, Lally Singh, net-dev, David Benjamin
On Mon, Jan 12, 2015 at 4:19 PM, Renaud Paquay <rpa...@chromium.org> wrote:
> I see. Just to make sure I understand what you have in mind: Even in the
> case of the implicit call (from JS perspective) to "setConsumed", there
> would still be a renderer -> browser (c++) call for the ACK, right?

Right