Adding WebSocket protocol stack in chrome/net


Yuta Kitamura

Feb 9, 2012, 4:30:40 AM
to chromium-dev, will...@chromium.org, w...@chromium.org, Takashi Toyoshima
Hi all,

We are planning to add a WebSocket protocol stack in chrome/net and have written a document describing the motivation and detailed design.

Follow this link to read the lightweight, read-only version:

If you would like to make inline comments, use the heavier full version:
(Please limit your use of the full version, because only 50 users can view it at the same time.)

Comments, suggestions, and questions are much appreciated!

Thanks,
Yuta and Takashi

William Chan (陈智昌)

Feb 9, 2012, 2:03:31 PM
to Yuta Kitamura, chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
[ +mmenke, +simonjam ]

Thanks for sending this out!

Just to be clear, this is a refactor of the existing WebSocket protocol stack, with support for multiplexing, right?

I took a brief 15 second read and have a number of sizable comments about the organization of the classes within net/. I need to read this much more thoroughly, but I'm swamped with a bunch of other work and vacation. Just want to say, I love the elimination of the separate WebSocket codepath and the reuse of much more of the HTTP stack, but I'm very opinionated (big surprise, eh?) on the exact way WebSocket integrates into the HTTP stack. Hopefully you guys will have time to wait for high level design comments before delving into code.

Yuta Kitamura

Feb 9, 2012, 9:08:35 PM
to William Chan (陈智昌), chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
Thanks William,

On Fri, Feb 10, 2012 at 4:03 AM, William Chan (陈智昌) <will...@chromium.org> wrote:
[ +mmenke, +simonjam ]

Thanks for sending this out!

Just to be clear, this is a refactor of the existing WebSocket protocol stack, with support for multiplexing, right?

Correct.
 

I took a brief 15 second read and have a number of sizable comments about the organization of the classes within net/. I need to read this much more thoroughly, but I'm swamped with a bunch of other work and vacation. Just want to say, I love the elimination of the separate WebSocket codepath and the reuse of much more of the HTTP stack, but I'm very opinionated (big surprise, eh?) on the exact way WebSocket integrates into the HTTP stack. Hopefully you guys will have time to wait for high level design comments before delving into code.

Yes, we are happy to wait and listen to your and other net/ experts' advice. I'd love to take some time to figure out a good design.

Thanks,
Yuta

John Abd-El-Malek

Feb 9, 2012, 9:15:05 PM
to yu...@google.com, William Chan (陈智昌), chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
Small note: since this is a web platform feature, it belongs in the content layer (so content/net). See http://www.chromium.org/developers/content-module



William Chan (陈智昌)

Feb 14, 2012, 3:50:19 AM
to Yuta Kitamura, chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
As I said before, this is great! I'm glad to see the old SocketStream code die and have a tighter integration of WebSocket with the HTTP stack. Here are some high-level comments/questions:

* Unless Ws is a common abbreviation for WebSocket code, let's not use abbreviations here. Call these classes net::WebSocketFoo...
* I don't see a discussion of the desired throttling semantics and implementation. This is important for figuring out how to implement this. Please update the design doc to include this. The net/ socket pools enforce a lot of these limits for HTTP connections, so we'll have to think about what to do for WebSockets. My suspicion is the right thing to do is to simply use socket pools with much higher limits (but we'd still need limits to prevent DoS by renderers). But I'd like to hear what your desired semantics are.
* Why do we have extra userland send buffering in the browser-side classes for WebSockets? The TCP stack in the kernel will already throttle outstanding data to a certain degree, since the kernel will only allocate so much memory for socket send buffers. Also note that the SPDY layer will also throttle due to SPDY3 stream flow control semantics. Also note that WebSocket over SPDY socket buffering where multiple streams share the same socket will result in different throttling effects than a vanilla WebSocket over TCP connection.
* I think the idea of detaching ClientSocketHandles from HttpStreams is fraught with peril. I think the HttpStreamFactory should be renamed to something that encompasses both HTTP and WebSockets, and returns the appropriate stream in the appropriate situation, rather than always producing a HttpStream and trying to detach from it. Indeed, it is impossible to detach a ClientSocketHandle from a HttpPipelinedStream since there may be multiple streams using that underlying connection, so it cannot simply be hijacked for upgrading to WebSocket.
* The design doc does not discuss how we decide whether or not to do WebSocket over SPDY. Is it known prior to establishing a connection whether or not we'll be establishing it over SPDY? Or is that determined during connection establishment, and if so, how? This will have obvious implications on how HttpStreamFactoryImpl will need to change.

Sorry again for the tardy response. I'm back from vacation and once I dig myself out of email debt I should be more responsive again.

On Thu, Feb 9, 2012 at 6:08 PM, Yuta Kitamura <yu...@chromium.org> wrote:

Yuta Kitamura

Feb 18, 2012, 2:36:09 AM
to William Chan (陈智昌), chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
Thanks William for looking at the doc.

Sorry for the delay in responding; I'm busy for the rest of this week and next week, so I may not have enough time to update the doc. I will do it once I have spare time and will ping you when the new revision is ready.

Here are responses to your questions:

On Tue, Feb 14, 2012 at 5:50 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
As I said before, this is great! I'm glad to see the old SocketStream code die and have a tighter integration of WebSocket with the HTTP stack. Here are some high-level comments/questions:

* Unless Ws is a common abbreviation for WebSocket code, let's not use abbreviations here. Call these classes net::WebSocketFoo...

Sure.
 
* I don't see a discussion of the desired throttling semantics and implementation. This is important for figuring out how to implement this. Please update the design doc to include this. The net/ socket pools enforce a lot of these limits for HTTP connections, so we'll have to think about what to do for WebSockets. My suspicion is the right thing to do is to simply use socket pools with much higher limits (but we'd still need limits to prevent DoS by renderers). But I'd like to hear what your desired semantics are.

RFC 6455 requires that there be no more than one connection in the CONNECTING state (i.e. with an incomplete WebSocket handshake) per IP address. To enforce this, the WebSocket socket pool needs to be notified when each connection exits the CONNECTING state, but I'm not sure whether this should be done in the socket pool layer (maybe requests could stall somewhere else).

I think the RFC 6455 requirement is practically good enough, but we could add more throttling criteria if there are needs that cannot be addressed by RFC 6455's throttling. When designing throttling criteria, we need to avoid having any WebSocket connection stall indefinitely, because that blocks execution of user scripts; rather, we should reject the connection right away.
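
To make this concrete, here is a minimal standalone sketch (all class and function names are hypothetical, not actual net/ classes) of a tracker that allows at most one handshake in the CONNECTING state per IP address and queues the rest:

#include <map>
#include <queue>
#include <string>

// Hypothetical illustration only: tracks WebSocket handshakes per IP so that
// at most one connection per IP is in the CONNECTING state (RFC 6455, 4.1).
class ConnectingThrottle {
 public:
  // Returns true if the handshake may start now; otherwise the request is
  // queued until the pending handshake to the same IP completes.
  bool OnHandshakeStart(const std::string& ip, int request_id) {
    IpState& state = per_ip_[ip];
    if (!state.connecting) {
      state.connecting = true;
      return true;
    }
    state.waiters.push(request_id);
    return false;  // caller waits for ResumeHandshake(request_id)
  }

  // Called when a handshake leaves CONNECTING (success or failure).
  void OnHandshakeDone(const std::string& ip) {
    std::map<std::string, IpState>::iterator it = per_ip_.find(ip);
    if (it == per_ip_.end())
      return;
    if (it->second.waiters.empty()) {
      per_ip_.erase(it);
      return;
    }
    int next = it->second.waiters.front();
    it->second.waiters.pop();
    ResumeHandshake(next);  // this IP keeps exactly one CONNECTING entry
  }

 private:
  void ResumeHandshake(int request_id) { /* restart the queued handshake */ }

  struct IpState {
    IpState() : connecting(false) {}
    bool connecting;
    std::queue<int> waiters;
  };
  std::map<std::string, IpState> per_ip_;
};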

* Why do we have extra userland send buffering in the browser-side classes for WebSockets? The TCP stack in the kernel will already throttle outstanding data to a certain degree, since the kernel will only allocate so much memory for socket send buffers. Also note that the SPDY layer will also throttle due to SPDY3 stream flow control semantics. Also note that WebSocket over SPDY socket buffering where multiple streams share the same socket will result in different throttling effects than a vanilla WebSocket over TCP connection.

I wanted to avoid the extra IPC overhead caused by the following event chain: the kernel's send buffer opens up -> the browser sends an IPC requesting more data from the renderer -> the renderer sends more data -> the browser issues a send() syscall.

But I agree this is subtle and may be omitted.
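
For illustration, the buffering I had in mind amounts to something like this standalone sketch (all names hypothetical): the browser accepts frames from renderer IPCs eagerly and drains them into the socket, so the renderer never has to be asked for more data when the kernel buffer opens up:

#include <cstddef>
#include <deque>
#include <string>

// Hypothetical sketch of browser-side userland send buffering. Frames arrive
// from the renderer over IPC and are drained into the socket as writes
// complete; without this queue, every completed write would need an IPC round
// trip to fetch more data from the renderer.
class BrowserSideSendBuffer {
 public:
  BrowserSideSendBuffer() : write_in_progress_(false) {}

  // Called for each IPC from the renderer carrying encoded frame bytes.
  void OnSendFrame(const std::string& bytes) {
    pending_.push_back(bytes);
    MaybeWriteMore();
  }

  // Called when the previous asynchronous socket write finishes.
  void OnWriteComplete(size_t bytes_written) {
    write_in_progress_ = false;
    std::string& front = pending_.front();
    front.erase(0, bytes_written);  // keep any unwritten remainder
    if (front.empty())
      pending_.pop_front();
    MaybeWriteMore();
  }

 private:
  void MaybeWriteMore() {
    if (pending_.empty() || write_in_progress_)
      return;
    write_in_progress_ = true;
    WriteToSocket(pending_.front());  // async; completion calls OnWriteComplete()
  }

  void WriteToSocket(const std::string& bytes) { /* issue the async write */ }

  std::deque<std::string> pending_;
  bool write_in_progress_;
};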
 
* I think the idea of detaching ClientSocketHandles from HttpStreams is fraught with peril. I think the HttpStreamFactory should be renamed to something that encompasses both HTTP and WebSockets, and returns the appropriate stream in the appropriate situation, rather than always producing a HttpStream and trying to detach from it. Indeed, it is impossible to detach a ClientSocketHandle from a HttpPipelinedStream since there may be multiple streams using that underlying connection, so it cannot simply be hijacked for upgrading to WebSocket.

Hm, then how about not using HttpStream at all and instead giving WebSocketStream the ability to speak HTTP requests and responses using existing HTTP utilities (like HttpStreamParser)? Does this make sense to you?
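
Roughly, the shape I have in mind is something like the sketch below. The HttpStreamParser and ClientSocketHandle calls are only approximations written from memory to show the flow; they are not meant to match the real interfaces exactly or to compile as-is.

// Sketch only: a WebSocket stream that speaks the HTTP half of the handshake
// itself using the existing HTTP utilities, instead of going through an
// HttpStream that would later need to be detached from its socket.
class WebSocketBasicStream {
 public:
  // Step 1: send the handshake request and read the response headers using
  // HttpStreamParser (signatures approximate).
  int SendHandshakeRequest(const std::string& request_line,
                           const HttpRequestHeaders& headers,
                           HttpResponseInfo* response,
                           const CompletionCallback& callback) {
    parser_.reset(new HttpStreamParser(connection_.get(), &request_info_,
                                       read_buf_.get(), net_log_));
    return parser_->SendRequest(request_line, headers, response, callback);
  }

  int ReadHandshakeResponse(const CompletionCallback& callback) {
    return parser_->ReadResponseHeaders(callback);
  }

  // Step 2: after a 101 response, read and write WebSocket frames directly on
  // the underlying connection. No HttpStream object is ever involved, so there
  // is nothing to "detach" later.
  // int ReadFrames(...);
  // int WriteFrames(...);

 private:
  scoped_ptr<ClientSocketHandle> connection_;
  scoped_ptr<HttpStreamParser> parser_;
  HttpRequestInfo request_info_;
  scoped_refptr<GrowableIOBuffer> read_buf_;
  BoundNetLog net_log_;
};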
 
* The design doc does not discuss how we decide whether or not to do WebSocket over SPDY. Is it known prior to establishing a connection whether or not we'll be establishing it over SPDY? Or is that determined during connection establishment, and if so, how? This will have obvious implications on how HttpStreamFactoryImpl will need to change.

I will elaborate on this later.

William Chan (陈智昌)

Feb 18, 2012, 11:12:33 AM
to Yuta Kitamura, chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
On Fri, Feb 17, 2012 at 11:36 PM, Yuta Kitamura <yu...@chromium.org> wrote:
Thanks William for looking at the doc.

Sorry for the delay in responding; I'm busy for the rest of this week and next week, so I may not have enough time to update the doc. I will do it once I have spare time and will ping you when the new revision is ready.

Here are responses to your questions:

On Tue, Feb 14, 2012 at 5:50 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
As I said before, this is great! I'm glad to see the old SocketStream code die and have a tighter integration of WebSocket with the HTTP stack. Here are some high-level comments/questions:

* Unless Ws is a common abbreviation for WebSocket code, let's not use abbreviations here. Call these classes net::WebSocketFoo...

Sure.
 
* I don't see a discussion of the desired throttling semantics and implementation. This is important for figuring out how to implement this. Please update the design doc to include this. The net/ socket pools enforce a lot of these limits for HTTP connections, so we'll have to think about what to do for WebSockets. My suspicion is the right thing to do is to simply use socket pools with much higher limits (but we'd still need limits to prevent DoS by renderers). But I'd like to hear what your desired semantics are.

RFC 6455 requires that there be no more than one connection in the CONNECTING state (i.e. with an incomplete WebSocket handshake) per IP address. To enforce this, the WebSocket socket pool needs to be notified when each connection exits the CONNECTING state, but I'm not sure whether this should be done in the socket pool layer (maybe requests could stall somewhere else).

It requires it on a per-IP address basis? Not per-origin? Implementing this optimally may be complicated.
 

I think the RFC 6455 requirement is practically good enough, but we could add more throttling criteria if there are needs that cannot be addressed by RFC 6455's throttling. When designing throttling criteria, we need to avoid having any WebSocket connection stall indefinitely, because that blocks execution of user scripts; rather, we should reject the connection right away.

OK, thanks for the explanation. I need to think about whether or not we should try to reuse the socket pool architecture or do some more refactoring. Currently there are two fundamental concepts somewhat integrated: socket pools and connect jobs. ConnectJobs cover all the steps in establishing a connection. Socket pools manage deciding when to start connect jobs, reuse idle sockets, and implement throttling policies. I think the interface is flexible and allows for swapping out implementations, but all current socket pools use ClientSocketPoolBaseHelper to implement policies for HTTP. It's conceivable we could implement the ClientSocketPool interface with a WebSocket-specific policy that reuses our ConnectJobs. We may need to use some template magic to make the ConnectJobs be applicable to both HTTP and WebSocket socket pools. Anyway, food for thought.
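
To sketch the split I mean (toy code; all names are made up, and the real ClientSocketPool/ConnectJob interfaces are considerably richer):

#include <queue>

// Toy illustration of the division of labor: a ConnectJob knows how to
// establish one connection (DNS, TCP, TLS); the pool only decides when a job
// may start. A WebSocket-flavored pool could reuse the same jobs under a
// different admission policy (much higher limits, one handshake per IP, etc.).
class ConnectJob {
 public:
  virtual ~ConnectJob() {}
  virtual void Start() = 0;  // drives one connection attempt to completion
};

class SimplePool {
 public:
  explicit SimplePool(int max_in_flight)
      : max_in_flight_(max_in_flight), in_flight_(0) {}

  // The admission policy is the main thing that differs between an HTTP pool
  // (e.g. 6 per group) and a WebSocket pool (higher, but still capped).
  void Request(ConnectJob* job) {
    if (in_flight_ < max_in_flight_) {
      ++in_flight_;
      job->Start();
    } else {
      queued_.push(job);
    }
  }

  // Called when a job finishes, so a queued job can be started.
  void OnJobDone() {
    --in_flight_;
    if (!queued_.empty()) {
      ConnectJob* next = queued_.front();
      queued_.pop();
      ++in_flight_;
      next->Start();
    }
  }

 private:
  const int max_in_flight_;
  int in_flight_;
  std::queue<ConnectJob*> queued_;  // jobs waiting for an admission slot
};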

 

* Why do we have extra userland send buffering in the browser-side classes for WebSockets? The TCP stack in the kernel will already throttle outstanding data to a certain degree, since the kernel will only allocate so much memory for socket send buffers. Also note that the SPDY layer will also throttle due to SPDY3 stream flow control semantics. Also note that WebSocket over SPDY socket buffering where multiple streams share the same socket will result in different throttling effects than a vanilla WebSocket over TCP connection.

I wanted to avoid the extra IPC overhead caused by the following event chain: the kernel's send buffer opens up -> the browser sends an IPC requesting more data from the renderer -> the renderer sends more data -> the browser issues a send() syscall.

But I agree this is subtle and may be omitted.

I see. Your concern is reasonable, although I think it's only a problem if you expect/desire renderer IPCs to send more data than can fit into send(). I don't expect kernel send buffers to open up at unreasonably small sizes. I could be wrong. Keep it simple, then benchmark and optimize if needed.

 
* I think the idea of detaching ClientSocketHandles from HttpStreams is fraught with peril. I think the HttpStreamFactory should be renamed to something that encompasses both HTTP and WebSockets, and returns the appropriate stream in the appropriate situation, rather than always producing a HttpStream and trying to detach from it. Indeed, it is impossible to detach a ClientSocketHandle from a HttpPipelinedStream since there may be multiple streams using that underlying connection, so it cannot simply be hijacked for upgrading to WebSocket.

Hm, then how about not using HttpStream at all and instead giving WebSocketStream the ability to speak HTTP requests and responses using existing HTTP utilities (like HttpStreamParser)? Does this make sense to you?

Yes, that's roughly what I had in mind.

John Tamplin

Feb 18, 2012, 12:08:25 PM
to yu...@google.com, William Chan (陈智昌), chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
On Sat, Feb 18, 2012 at 2:36 AM, Yuta Kitamura <yu...@chromium.org> wrote:
I think the RFC 6455 requirement is practically good enough, but we could add more throttling criteria if there are needs that cannot be addressed by RFC 6455's throttling. When designing throttling criteria, we need to avoid having any WebSocket connection stall indefinitely, because that blocks execution of user scripts; rather, we should reject the connection right away.

So all apps would have to be written to expect an immediate failure and do their own backoff and retry? The open API is already async, so it doesn't seem like it would block user scripts, and other tabs may have initiated a connection to the same server -- it doesn't seem bad to allow a few requests to queue up (and that's a good thing once multiplexing is implemented, since they can go over the initial connection).
 
--
John A. Tamplin
Software Engineer (GWT), Google

Yuta Kitamura

Feb 29, 2012, 5:31:38 PM
to William Chan (陈智昌), chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
Sorry for the delayed response; I'm still fixing the doc.

On Sat, Feb 18, 2012 at 8:12 AM, William Chan (陈智昌) <will...@chromium.org> wrote:
On Fri, Feb 17, 2012 at 11:36 PM, Yuta Kitamura <yu...@chromium.org> wrote:
Thanks William for looking at the doc.

Sorry for the delay in responding; I'm busy for the rest of this week and next week, so I may not have enough time to update the doc. I will do it once I have spare time and will ping you when the new revision is ready.

Here are responses to your questions:

On Tue, Feb 14, 2012 at 5:50 PM, William Chan (陈智昌) <will...@chromium.org> wrote:
As I said before, this is great! I'm glad to see the old SocketStream code die and have a tighter integration of WebSocket with the HTTP stack. Here are some high-level comments/questions:

* Unless Ws is a common abbreviation for WebSocket code, let's not use abbreviations here. Call these classes net::WebSocketFoo...

Sure.
 
* I don't see a discussion of the desired throttling semantics and implementation. This is important for figuring out how to implement this. Please update the design doc to include this. The net/ socket pools enforce a lot of these limits for HTTP connections, so we'll have to think about what to do for WebSockets. My suspicion is the right thing to do is to simply use socket pools with much higher limits (but we'd still need limits to prevent DoS by renderers). But I'd like to hear what your desired semantics are.

RFC 6455 requires that there be no more than one connection in the CONNECTING state (i.e. with an incomplete WebSocket handshake) per IP address. To enforce this, the WebSocket socket pool needs to be notified when each connection exits the CONNECTING state, but I'm not sure whether this should be done in the socket pool layer (maybe requests could stall somewhere else).

It requires it on a per-IP address basis? Not per-origin? Implementing this optimally may be complicated.

Per IP address, according to RFC 6455.

 
 

I think the RFC 6455 requirement is practically good enough, but we could add more throttling criteria if there are needs that cannot be addressed by RFC 6455's throttling. When designing throttling criteria, we need to avoid having any WebSocket connection stall indefinitely, because that blocks execution of user scripts; rather, we should reject the connection right away.

OK, thanks for the explanation. I need to think about whether or not we should try to reuse the socket pool architecture or do some more refactoring. Currently there are two fundamental concepts somewhat integrated: socket pools and connect jobs. ConnectJobs cover all the steps in establishing a connection. Socket pools manage deciding when to start connect jobs, reuse idle sockets, and implement throttling policies. I think the interface is flexible and allows for swapping out implementations, but all current socket pools use ClientSocketPoolBaseHelper to implement policies for HTTP. It's conceivable we could implement the ClientSocketPool interface with a WebSocket-specific policy that reuses our ConnectJobs. We may need to use some template magic to make the ConnectJobs be applicable to both HTTP and WebSocket socket pools. Anyway, food for thought.

The current implementation looks like the following: SocketStream provides an "OnStartOpenConnection" callback, which is called right after DNS resolution and before connect() is attempted. SocketStream's delegate may return ERR_IO_PENDING to delay the connection attempt.

The RFC 6455 requirement may be hard to accomplish within the socket pool layer (but I could be wrong, as I haven't done much research on socket pools).
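
For reference, the hook the current throttle builds on looks roughly like this (the delegate signature is paraphrased from memory, and the other SocketStream::Delegate methods are omitted):

// Paraphrased sketch, not exact code: returning ERR_IO_PENDING from this hook
// holds connect() until the stored callback is run later by the throttle.
class ThrottlingDelegate : public SocketStream::Delegate {
 public:
  virtual int OnStartOpenConnection(SocketStream* socket,
                                    const CompletionCallback& callback) OVERRIDE {
    if (AnotherHandshakeInProgressForSameIP(socket)) {
      pending_callback_ = callback;  // run when the other handshake finishes
      return ERR_IO_PENDING;         // connect() is deferred, not failed
    }
    return OK;                       // proceed with connect() immediately
  }

 private:
  bool AnotherHandshakeInProgressForSameIP(SocketStream* socket);
  CompletionCallback pending_callback_;
};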
 

 

* Why do we have extra userland send buffering in the browser-side classes for WebSockets? The TCP stack in the kernel will already throttle outstanding data to a certain degree, since the kernel will only allocate so much memory for socket send buffers. Also note that the SPDY layer will also throttle due to SPDY3 stream flow control semantics. Also note that WebSocket over SPDY socket buffering where multiple streams share the same socket will result in different throttling effects than a vanilla WebSocket over TCP connection.

I wanted to avoid the extra IPC overhead caused by the following event chain: the kernel's send buffer opens up -> the browser sends an IPC requesting more data from the renderer -> the renderer sends more data -> the browser issues a send() syscall.

But I agree this is subtle and may be omitted.

I see. Your concern is reasonable, although I think it's only a problem if you expect/desire renderer IPCs to send more data than can fit into send(). I don't expect kernel send buffers to open up at unreasonably small sizes. I could be wrong. Keep it simple, then benchmark and optimize if needed.

OK, let's start with no buffering on the browser side first.
 

 
* I think the idea of detaching ClientSocketHandles from HttpStreams is fraught with peril. I think the HttpStreamFactory should be renamed to something that encompasses both HTTP and WebSockets, and returns the appropriate stream in the appropriate situation, rather than always producing a HttpStream and trying to detach from it. Indeed, it is impossible to detach a ClientSocketHandle from a HttpPipelinedStream since there may be multiple streams using that underlying connection, so it cannot simply be hijacked for upgrading to WebSocket.

Hm, then how about not using HttpStream at all and instead giving WebSocketStream the ability to speak HTTP requests and responses using existing HTTP utilities (like HttpStreamParser)? Does this make sense to you?

Yes, that's roughly what I had in mind.

OK.

Yuta Kitamura

Feb 29, 2012, 6:10:57 PM
to John Tamplin, William Chan (陈智昌), chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
My thoughts are:

* A connection attempt shouldn't get starved indefinitely -- for example, if we hit the maximum number of WebSocket connections (not sure we are going to have this kind of limit, though), we should abort the connection immediately rather than waiting for other random pages to close a connection. I think web app devs will get confused if some WebSocket connection randomly takes a long time depending on other tabs' WebSocket usage.

* However, we should try to minimize the chance of aborting the connection. We should only drop the connection under extraordinary conditions (such as "a new connection attempt is made when there are already 500 pending connections"). Throttling should suffice in almost all cases (a toy sketch of this policy follows below).

* It may be a good idea to have a new WebSocket close code for "connection limit reached" so that web app authors can notice and work around connection closures.
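
As a toy illustration of the policy above (the names are invented; the 500 is just the example figure from the previous bullet): throttle normally, but fail fast under extraordinary load instead of letting a request starve.

// Toy policy sketch; not real net/ code.
enum ConnectDecision {
  CONNECT_NOW,                     // no contention: start the handshake
  QUEUE_BEHIND_CURRENT_HANDSHAKE,  // normal RFC 6455 per-IP throttling
  REJECT_IMMEDIATELY               // extraordinary load: fail fast so the page
                                   // sees an error instead of a silent stall
};

ConnectDecision DecideOnNewConnection(int connecting_to_same_ip,
                                      int total_pending) {
  const int kExtraordinaryPendingLimit = 500;  // example threshold only
  if (total_pending >= kExtraordinaryPendingLimit)
    return REJECT_IMMEDIATELY;
  if (connecting_to_same_ip > 0)
    return QUEUE_BEHIND_CURRENT_HANDSHAKE;
  return CONNECT_NOW;
}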

Yuta Kitamura

Mar 2, 2012, 6:20:44 PM
to William Chan (陈智昌), chromium-dev, w...@chromium.org, Takashi Toyoshima, Matthew Menke, simo...@chromium.org
OK, I think I addressed almost all the points willchan suggested. Could you take a look again?

"Throttling and connection limits" is the biggest open issue I couldn't find the right approach for. I wrote my thoughts in "Discussions" section.

Let's move further discussion to the issue tracker so we don't spam everyone:

Thanks,
Yuta