SockJS vs engine.io

tr...@phamcom.com

Jul 3, 2012, 4:22:20 AM
to soc...@googlegroups.com
SockJS was created because socket.io had gotten bloated.

Now the socket.io team has extracted the transport layer and called it engine.io, which is the equivalent of SockJS.

So is SockJS's original goal still relevant? Should the two projects merge? If not, for what reason should one choose SockJS over engine.io, or vice versa?

Marek Majkowski

Jul 3, 2012, 2:32:47 PM
to tr...@phamcom.com, soc...@googlegroups.com
Hi,

Much has been said about the differences between SockJS and Socket.io.

I haven't looked at engine.io for a while, but to me it looks like
a simpler API for socket.io. This is very good IMO; I'm a believer
in simple APIs.

The underlying technology is similar in spirit between SockJS and
socket.io/engine.io, but the details are very different. Read the documentation
of both projects for more details (just to pique your interest,
take a look at: support for cross-domain connections, supported fallbacks,
supported proxy configurations, flash fallback, how the projects
are tested, and which programming languages are supported server-side).

Last time I checked, engine.io was using an "upgrade" way of selecting
fallback protocols. That is a perfect choice if you have a "flash" fallback
(as the Flash VM needs some time to warm up), but IMO it is not the
best design decision.

Users end up having a _very_ inconsistent experience - you start
with terribly slow jsonp and maybe upgrade to a faster protocol.
Also, there is a delay during the upgrade procedure - you're
effectively disconnected when flash or another fallback kicks in.

For comparison - a SockJS connection is established reasonably
quickly (usually a few RTTs), and from then on it just works, consistently
and predictably.
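
To give a feel for it, here is a minimal client-side sketch (the URL is just a placeholder for wherever a SockJS endpoint is mounted); by the time onopen fires, the transport has already been settled:

// Minimal SockJS client sketch; 'http://example.com/echo' is a placeholder endpoint.
var sock = new SockJS('http://example.com/echo');

sock.onopen = function() {
  // A working transport has already been selected at this point.
  sock.send('hello');
};

sock.onmessage = function(e) {
  console.log('message', e.data);
};

sock.onclose = function() {
  console.log('session closed');
};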

Whatever you choose, I'd be glad if you could describe your
experience - I'm eager to learn if I'm wrong!

Cheers,
Marek

Carl Byström

Jul 3, 2012, 3:42:14 PM
to maj...@gmail.com, tr...@phamcom.com, soc...@googlegroups.com
On Tue, Jul 3, 2012 at 8:32 PM, Marek Majkowski <maj...@gmail.com> wrote:
> User end up having _very_ inconsistent experience - you start
> with terribly slow jsonp and maybe upgrade to a faster protocol.
> Also, there is a delay during the upgrade procedure - you're
> effectively disconnected when flash or other fallback kicks in.

At first glance I like the idea of upgrading a connection. It seems reasonable to start with the lowest common denominator that should always work.
Downgrading your way to a working connection can also be problematic, especially with pesky personal firewalls and/or proxies. If you don't have a proper handshake mechanism in place (not saying SockJS suffers from this) you'll get yourself into trouble. I experienced lots of corner cases like that during the development of Beaconpush.

I just don't see why connection upgrading is so much worse than downgrading.

--
Carl

tr...@phamcom.com

Jul 3, 2012, 5:20:49 PM
to soc...@googlegroups.com, maj...@gmail.com, tr...@phamcom.com
Agreed. I want to understand why engine.io's connection upgrading is a bad idea. Thoughts?

Also, having Flash support will cover IE7/IE8/IE9, which still amount to 20% of the market share, which is not a bad thing. Between native WebSocket and Flash sockets, I think we can provide a socket-like experience to 99% of users.

Marek Majkowski

Jul 4, 2012, 6:36:25 AM
to tr...@phamcom.com, Carl Byström, soc...@googlegroups.com
On Tue, Jul 3, 2012 at 8:42 PM, Carl Byström <cgby...@gmail.com> wrote:
> At first glance I like the idea of upgrading a connection. Seems reasonable
> to start with lowest common denominator that always should work.
> Downgrading your way to a working connection can also be problematic.
> Especially with pesky personal firewalls and/or proxies. If you don't have a
> proper handshake mechanism in place (not saying SockJS suffers from this)
> you'll get yourself into trouble. Have experienced lots of corner cases with
> the development of Beaconpush that gets you.
>
> I just don't see why connection upgrading is so much worse than a
> downgrading order.

Executive summary: if you're using a flash fallback, doing an upgrade may
make sense.

In an ideal world, upgrade vs handshake makes no difference. In the
end you need to detect whether a transport works, and there is only
one way of doing it - you need to just start using the transport.

My thoughts about the 'upgrade' idea:
- During the 'upgrade' (or handshake) you can't send messages.
The problem is that the 'upgrade' will happen at random moments.
In other words, the possible scenarios are:
engine.io: works slowly, doesn't work, works slowly, doesn't work,
works fine
sockjs: doesn't work, works fine.
In an ideal world the time of the "doesn't work" moments should be equal
for both projects. I do prefer consistent behaviour. Once a SockJS
connection is established, it's done. Latency and performance should
be predictable from then on. And, btw, SockJS connection establishment
is quite fast, even in complex cases.

- The "upgrade" idea is likely to be racy - that's why
during the upgrade the "working" transport must be stopped.

- I find the "upgrade" way to be more complicated to implement.

But, if you think of using a flash fallback, doing the 'upgrade' thing
makes sense - flash is slow to start _and_ flash will take a few
seconds to time out in proxied environments (corporations).
If only the jsonp didn't have to be stopped during the flash transport
probing, doing the 'upgrade' would make sense.

Though AFAIK, when a fallback is probed the previous transport must be
stopped (in order to avoid races). So the whole benefit
vanishes and engine.io stops for a few seconds to detect
that flash doesn't work.
(Please, can someone tell me that this is not true?)

Executive summary #2:
- SockJS doesn't use flash, so it doesn't need to work around the
fact that flash is slow to initialize.
- SockJS chooses consistency and predictability when possible.
- SockJS is simple. Doing 'upgrades' correctly introduces a lot of complexity.
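
To make the 'probe before open' idea concrete, here is a rough sketch (hypothetical pseudocode, not the actual SockJS implementation): transports are tried one at a time with a timeout, and the session only opens once a probe succeeds.

// Hypothetical sketch of sequential transport probing - not real SockJS code.
function connect(transportClasses, onOpen, onFailure) {
  var remaining = transportClasses.slice();

  function tryNext() {
    if (remaining.length === 0) return onFailure();
    var Transport = remaining.shift();
    var probe = new Transport();
    var timer = setTimeout(function() {
      probe.close();   // no answer in time - give up on this transport
      tryNext();       // and fall back to the next one
    }, 5000);
    probe.onopen = function() {
      clearTimeout(timer);
      onOpen(probe);   // the session opens only once a probe has worked
    };
    probe.onerror = function() {
      clearTimeout(timer);
      tryNext();
    };
  }

  tryNext();
}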

On Tue, Jul 3, 2012 at 10:20 PM, <tr...@phamcom.com> wrote:
> Also, having flash support will cover IE7/IE8/IE9, which still amount to 20%
> of the market share. Which is not a bad thing. Between native web socket and
> flash socket, I think we can provide socket like experience to 99% of the
> users.

Ha,

SockJS uses "streaming" transports for IE7+. They are not as performant
as flash-websockets but hey, they don't require flash, they work in
corporate environments (they adhere to proxy settings), they don't require a few
seconds to time out, they don't require the magical port 843 to be open, and they work
in hosted environments like Cloud Foundry.

How cool is that?
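
For a taste of the server side, a minimal sockjs-node echo server looks roughly like this (port and prefix are just examples):

// Minimal sockjs-node echo server; port and '/echo' prefix are examples.
var http = require('http');
var sockjs = require('sockjs');

var echo = sockjs.createServer();
echo.on('connection', function(conn) {
  conn.on('data', function(message) {
    conn.write(message);   // echo back over whatever transport was negotiated
  });
});

var server = http.createServer();
echo.installHandlers(server, { prefix: '/echo' });
server.listen(9999, '0.0.0.0');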

Marek

tr...@phamcom.com

Jul 6, 2012, 4:22:34 AM
to soc...@googlegroups.com, tr...@phamcom.com, Carl Byström
Does this streaming protocol work across domains?

 

Marek Majkowski

Jul 6, 2012, 6:01:30 AM
to tr...@phamcom.com, soc...@googlegroups.com, Carl Byström
>> SockJS uses "streaming" transports for IE7+.They are not as performant
>> as flash-websockets but hey, they don't require flash, they work in
>> corporate environments (adhere to proxy settings), they won't require few
>> seconds to timeout, don't require magical port 843 to be open and work
>> in hosted environments like cloudfoundry.
>>
>> How cool is that?
>
>
> Does this streaming protocol work across domains?

Yes. SockJS is able to connect cross-domain.

Marek

ma...@hahnca.com

Jul 6, 2012, 12:56:45 PM
to soc...@googlegroups.com, tr...@phamcom.com, Carl Byström
It would seem to me that upgrading is superior if you keep the working one open and overlap them. You don't kill the inferior one until the superior one is known good. Then there would always be a working connection.

Marek Majkowski

Jul 6, 2012, 1:04:19 PM
to ma...@hahnca.com, soc...@googlegroups.com
On Fri, Jul 6, 2012 at 5:56 PM, <ma...@hahnca.com> wrote:
> It would seem to me that upgrading is superior if you keep the working one
> open and overlap them. Youy don't kill the inferior one until the superior
> is known good. Then there would always be a working connection.

a) implementing that without race conditions would be hard, and even in
a perfect world you would need at least 2*RTT of delay
when the upgrade happens.
b) having two connections open to the same host at the same time
is quite risky. Consider a situation where the old connection
is very busy sending a lot of data. That means the other connection
(the probing one) would likely go very slowly, possibly even time out, due
to network congestion.

SockJS does the upgrade on start. After that you have
a consistent experience. You send what you send, you receive
what you receive. No magic happens. You can rely on it.

Marek

Mark Hahn

Jul 6, 2012, 1:18:45 PM
to Marek Majkowski, soc...@googlegroups.com
You are probably right, but let me play devil's advocate. BTW, I doubt socket.io does this crazy thing.

> you would need at least 2*RTT time of delay when the upgrade happens.

I don't see how the delay would matter if the first one is still working.

>  implementing that without race conditions would be hard, 

Simply number the packets at the top level and sort at the receiving top level.
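
Something along these lines (a hypothetical sketch, just to illustrate the numbering idea):

// Hypothetical sketch: tag outgoing messages with a sequence number and
// re-order on the receiving side before delivering to the application.
var sendSeq = 0;

function send(transport, payload) {
  transport.send(JSON.stringify({ seq: sendSeq++, payload: payload }));
}

var nextExpected = 0;
var buffered = {};   // out-of-order messages keyed by sequence number

function receive(raw, deliver) {
  var msg = JSON.parse(raw);
  buffered[msg.seq] = msg.payload;
  while (buffered.hasOwnProperty(nextExpected)) {   // deliver strictly in order
    deliver(buffered[nextExpected]);
    delete buffered[nextExpected];
    nextExpected++;
  }
}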

> Consider a situation when the old connection  is very busy sending a lot of data. That means the other connection  (probing one) would likely go very slow, possibly even timeout, die. 

I don't understand this. Browsers handle tons of connections all the time. Also, the worst that would happen in that rare situation is that the faster one would back off and try again later.

I could simulate this by wrapping a layer on top and forcing the type of connection.  Of course this test implementation might be slower to switch.

>  SockJS does the upgrade on start.  

If you started websocket and flash at the same time, wouldn't websocket be done sooner?  My crazy algorithm would give the fastest response.

All of this will be moot soon when everyone supports websocket.  Will SockJS have any advantage over vanilla WS at that point?

Marek Majkowski

Jul 6, 2012, 2:22:23 PM
to Mark Hahn, soc...@googlegroups.com
On Fri, Jul 6, 2012 at 6:18 PM, Mark Hahn <ma...@hahnca.com> wrote:
> You are probably right, but let me play devil's advocate. BTW, i doubt
> socket.io does this crazy thing.
>
>> you would need at least 2*RTT time of delay when the upgrade happens.
>
> I don't see how the delay would matter if the first one is still working.
>
>> implementing that without race conditions would be hard,
>
> Simply number the packets at the top level and sort at the receiving top
> level.

Yeah...

>> Consider a situation when the old connection is very busy sending a lot
>> of data. That means the other connection (probing one) would likely go very
>> slow, possibly even timeout, die.
>
> I don't understand this. Browsers handle tons of connections all the time.
> Also the worst that would happen in that rare situation is the faster would
> back off and try later.

Browsers, like other applications in the world, have a lot of trouble dealing
with two parallel tcp/ip connections to one destination when one is consuming
all the bandwidth. Have you ever used SSH while downloading a large file from
the same host?

Now try repeating this experiment in a browser: do one AJAX call
for a big file, wait a sec and fire another AJAX call for some lightweight
resource, and see what happens.

Repeat the experiment and chart it.

I'd be interested in a latency distribution, what do you think?
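
Something like this would do (hypothetical sketch; both URLs are placeholders):

// Hypothetical sketch of the experiment; both URLs are placeholders.
var big = new XMLHttpRequest();
big.open('GET', '/big-file');   // something large enough to saturate the link
big.send();

setTimeout(function() {
  var start = Date.now();
  var small = new XMLHttpRequest();
  small.open('GET', '/tiny-resource');
  small.onload = function() {
    console.log('small request took', Date.now() - start, 'ms');
  };
  small.send();
}, 1000);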

> I could simulate this by wrapping a layer on top and forcing the type of
> connection. Of course this test implementation might be slower to switch.
>
>> SockJS does the upgrade on start.
>
> If you started websocket and flash at the same time, wouldn't websocket be
> done sooner? My crazy algorithm would give the fastest response.

Yup, it sounds like a great idea. Go for it. You will have the quickest
time-to-connected ever possible. But why stop at websockets and flash,
why not fire all the possible fallbacks at once, including, say, jsonp?

> All of this will be moot soon when everyone supports websocket. Will SockJS
> have any advantage over vanilla WS at that point?

SockJS never has any advantage over vanilla WS if the latter is
working. That's the point.

Hope that helps,
Marek

Mark Hahn

Jul 6, 2012, 2:25:08 PM
to Marek Majkowski, soc...@googlegroups.com
Thanks for putting up with my wild ideas.

milan.b...@sql.co.rs

Jul 6, 2012, 4:11:45 PM
to soc...@googlegroups.com, tr...@phamcom.com, Carl Byström
On Wednesday, July 4, 2012 12:36:25 PM UTC+2, majek wrote:
> Though AFAIK, when a fallback is probed previous transport must be
> stopped (in order to avoid races). So the whole benefit
> vanishes and engine.io stops for a few seconds to detect
> that flash doesn't work.
> (please, can someone tell me that is not true?)

Hello, Marek,

I did some testing with socket.io 0.9 before putting it into production, and I cannot confirm this. Socket.io tries the transports in the order they are listed when starting socket.io on the server side, i.e. if you have:

sio.set('transports', [
  'websocket',
  'flashsocket',
  'jsonp-polling',
  'xhr-polling',
  'htmlfile'
]);

...websockets would be tried first. Like I wrote, I did a lot of testing because I want my users to be happy. Using the same code I tried Firefox 13, Firefox 3.6, Opera, IE and Chrome. When Firefox 13 opens the page, it connects instantaneously. When Firefox 3.6 and other non-websocket browsers open it, it takes about 10 seconds to establish a connection if the "flashsocket" option is selected. If not, jsonp gets selected really fast.

Another example: Opera 11 has a problem with jsonp. If I only put "jsonp", it barely works. If I only put "xhr-polling", everything works fine. Now, when the order is "jsonp", "xhr", Opera still selects jsonp and works very slowly; xhr is not even tested. When the order is "xhr", "jsonp", it connects at once and works fast.

Like I wrote, this testing has been done against v0.9. I don't know if engine.io works differently.

Regards,

Milan Babuskov

Marek Majkowski

Jul 6, 2012, 4:51:21 PM
to milan.b...@sql.co.rs, soc...@googlegroups.com
On Fri, Jul 6, 2012 at 9:11 PM, <milan.b...@sql.co.rs> wrote:
> Like I wrote, this testing has been done against v0.9. I don't know if
> engine.io works differently.

It does.

Marek

Milan Babuskov

Jul 6, 2012, 7:19:20 PM
to Marek Majkowski, soc...@googlegroups.com
You're right. I just checked the engine.io docs and everything you
wrote on the mailing list seems right. If socket.io is really heading
that way, I might switch to SockJS in the future as well.

Regards,

--
Milan Babuskov
http://www.guacosoft.com

jco...@gmail.com

Jul 14, 2012, 6:15:53 AM
to soc...@googlegroups.com, tr...@phamcom.com, Carl Byström
I'm the author of Faye (whose WebSocket implementation SockJS uses), and just thought I'd chime in here since I've been following this and would like to know what problems others have faced. For background, Faye is based on the Bayeux protocol, which predates WebSocket but has adapted easily to it.

A Bayeux server is only required to support XMLHttpRequest (XHR) and JSON-P, and clients must send an initial handshake request over these transports, chosen because they work pretty much everywhere. In the response, the server advertises which transports it supports, and the client then upgrades to one of these based on browser support, cross-domain considerations, and (in the case of WebSocket) connection testing. Servers and clients are free to invent their own transports in addition to those defined by the spec.
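
Roughly, the handshake exchange looks like this (simplified, with illustrative field values):

// Simplified illustration of a Bayeux handshake; field values are examples.

// Client -> server, sent over XHR or JSON-P:
var handshakeRequest = {
  channel: '/meta/handshake',
  version: '1.0',
  supportedConnectionTypes: ['long-polling', 'callback-polling', 'websocket']
};

// Server -> client, advertising what it supports:
var handshakeResponse = {
  channel: '/meta/handshake',
  successful: true,
  clientId: 'example-client-id',
  supportedConnectionTypes: ['long-polling', 'callback-polling', 'websocket', 'eventsource']
};

// The client then picks the best transport both sides support, and keeps
// using the original transport until the new one has been tested.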

In Faye, when a new client is created, it selects XHR or JSON-P based on whether the server is cross-origin or not, and sends a handshake request. When it gets the response, it begins selecting a new transport, but does this asynchronously -- while new transports are being tested, which may involve opening sockets, etc, the original transport continues to be used, so there's no interruption in service and no delay waiting for transports to be tested. Once a new transport is selected, it simply replaces the original one and the client continues. At all times, the client has some transport object with a standard API that it can talk to.

Faye supports, in order of preference: WebSocket, EventSource, XHR, CORS and JSON-P.

In light of this I have some questions -- see below. Please understand these are questions born of curiosity and an attempt to fix my ignorance, not an attempt to suggest any project's approach is wrong.

On Wednesday, July 4, 2012 11:36:25 AM UTC+1, majek wrote:
> My thoughts about the 'upgrade' idea:
>  - during the 'upgrade' (or handshake) you can't send messages.
>    The problem is the 'upgrade' will happen in random moments.
>    In other words possible scenarios:
>     engine.io: works-slowly, doesn't work, works-slowly, doesn't work,
> works fine
>     sockjs: doesn't work, works fine.

Two things aren't clear to me:
* why the testing pattern for transports differs between the two projects
* why there needs to be any interruption in service when using an upgrade approach, given Faye's approach as described above

Why is there necessarily any phase during which you cannot send messages?
 
>   In ideal world the time of "doesn't work" moments should be equal
>   for both projects. I do prefer consistent behaviour. Once SockJS
>   connection is established, it's done. Latency and performance should
>   be predictable from now on. And, btw, SockJS connection establishment
>   is quite fast, even in complex cases.

Minor point, but could you clarify what is meant by a 'connection' here in terms of the wire protocol. If we're talking about the transport layer, very often 'connection' does not seem like the right word (i.e. if using a request-response transport). I think of it more like, at all times the client has an object it can use to send/receive data with the server, and the client should not be given such an object unless the transport selection process has proven the object works.
 
>  - The "upgrade" idea is likely to be racy - that's why
>    during the upgrade the "working" transport must be stopped.

What race conditions can occur, and why must work be stopped? The biggest race I see in Faye is that it cannot select WebSocket as fast as you'd like, and so the first events are delivered over long-polling. This is due to the async transport selection process, but it doesn't stop the client from working for any length of time; it's just sub-optimal.
 
>  - I find the "upgrade" way to be more complicated to implement.

I'd like more info on this -- what did you find hard, and how do the two approaches differ? 

> But, if you think of using flash fallback - doing 'upgrade' thing
> makes sense - flash is slow to start _and_ flash will take few
> seconds to timeout in proxied environments (corporations).
> If only the jsonp didn't have to be stopped during the flash transport
> probing doing the 'update' would make sense.

Is it the case that you need to stop JSON-P because it's hogging a network connection, or because of execution guarantees etc? We had a problem in Faye ages ago where JSON-P would block execution of other downloaded scripts, because Firefox guarantees scripts execute in the order they're inserted. This can be mitigated to some extent by:

* Trying to avoid using JSON-P if you possibly can
* Making the server end any open JSON-P polling request whenever it receives any other request from the same client, to allow the response from the second request to execute sooner (see the sketch below)
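
The second point can look roughly like this on the server (a hypothetical Node-style sketch, not Faye's actual code):

// Hypothetical sketch: close a client's open JSON-P poll as soon as any
// other request arrives from the same client, so the new response's script
// isn't stuck waiting behind the long-held one.
var openPolls = {};   // clientId -> pending JSON-P poll response

function handleRequest(clientId, req, res) {
  var pending = openPolls[clientId];
  if (pending) {
    pending.end('callback([]);');   // flush an empty batch to release the script
    delete openPolls[clientId];
  }

  if (req.isJsonpPoll) {            // hypothetical flag set by the routing layer
    openPolls[clientId] = res;      // park this response until there is data to send
  } else {
    res.end(JSON.stringify({ ok: true }));
  }
}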

> Though AFAIK, when a fallback is probed previous transport must be
> stopped (in order to avoid races). So the whole benefit
> vanishes and engine.io stops for a few seconds to detect
> that flash doesn't work.
> (please, can someone tell me that is not true?)

I'd like to know whether the fact that I've not seen any of these problems is connected with the fact that Faye doesn't use Flash. I don't understand the additional problems it introduces very well.

Thanks in advance for your help.

Marek Majkowski

Jul 26, 2012, 8:45:51 AM
to jco...@gmail.com, soc...@googlegroups.com, tr...@phamcom.com, Carl Byström
On Sat, Jul 14, 2012 at 11:15 AM, <jco...@gmail.com> wrote:
> I'm the author of Faye (which SockJS uses the WebSocket implementation
> from)

Hi! Yup, thanks for your great work!
SockJS detects working transports before marking the connection
as open. Engine.io will immediately open the connection and upgrade
it later.

The main rationale AFAICT: flash, one of the engine.io fallbacks
(and _not_ present in SockJS), loads slowly and, in environments
behind proxies, takes 3 seconds to time out.

In the previous socket.io architecture this meant at least a 3-second delay
for some users. In the new engine.io this can happen without affecting
the session.

SockJS doesn't use flash and therefore doesn't need to work around
this issue.

> * why there needs to be any interruption in service if using an upgrade
> approach, given Faye's approach I described

Maybe I'm getting it wrong, but if you have two established bidirectional
connections, then in order to pass control between them you need some
synchronization mechanism. It should mark that:
- this (old) connection is getting retired, please use the new one.

IMO this needs to be synchronous - both sides need to agree
that the old connection is flushed, that all messages were indeed delivered
correctly, and that it can be closed. New messages will be sent over
the new connection from that moment. Something like:

1) normal (old) connection established
2) messages are being exchanged
3) new connection gets established
4) the client says: I'd like to use the new connection please.
5) the client stops sending messages for a time
6) the server hears 4) and sends a response: okay, let's use the new one
7) the server stops sending messages
8) the old connection is terminated
9) the client resumes sending messages on the new connection
10) the server, after hearing a message, resumes sending messages over
the new connection.

Basically: stopping sending messages is required
to avoid a race condition where messages on the new
connection arrive before the handover message, and possibly
before older messages on the old connection.
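
A crude client-side sketch of that handover (hypothetical, not taken from any real implementation):

// Hypothetical sketch of the client side of the handover described above.
function upgrade(session, oldTransport, newTransport) {
  session.pauseSending();                          // step 5: stop sending for a while
  oldTransport.send({ type: 'upgrade-request' });  // step 4: ask to switch

  oldTransport.onmessage = function(msg) {
    if (msg.type === 'upgrade-ok') {               // step 6: server agreed, old side flushed
      oldTransport.close();                        // step 8: retire the old connection
      session.transport = newTransport;
      session.resumeSending();                     // step 9: resume on the new connection
    } else {
      session.deliver(msg);                        // ordinary messages may still arrive
    }
  };
}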

> Why is there necessarily any phase during which you cannot send messages?

To avoid a race when one connection is faster than the other.
In other words: if you have a single connection, you have
ordering. With two connections established you can't
reason about ordering at all. You need explicit synchronization.

>> In ideal world the time of "doesn't work" moments should be equal
>> for both projects. I do prefer consistent behaviour. Once SockJS
>> connection is established, it's done. Latency and performance should
>> be predictable from now on. And, btw, SockJS connection establishment
>> is quite fast, even in complex cases.
>
> Minor point, but could you clarify what is meant by a 'connection' here in
> terms of the wire protocol. If we're talking about the transport layer, very
> often 'connection' does not seem like the right word (i.e. if using a
> request-response transport).

Yup. "SockJS session" would be a better phrase in this context. Sorry.

> I think of it more like, at all times the
> client has an object it can use to send/receive data with the server, and
> the client should not be given such an object unless the transport selection
> process has proven the object works.
>
>>
>> - The "upgrade" idea is likely to be racy - that's why
>> during the upgrade the "working" transport must be stopped.
>
>
> What race conditions can occur, and why must work be stopped? The biggest
> race I see in Faye is that it cannot select WebSocket as fast as you'd like,
> and so the first events are delivered over long-polling. This is due to the
> async transport selection process, but it doesn't not stop the client from
> working for any length of time, it's just sub-optimal.

The race I talked about earlier. There may be ways to avoid it,
like buffering all the messages on the new connection
until the confirmation of the connection handover is received on the old
one.

In such a case one would need to think about:
a) the old connection failing before this handover message is received
b) whether this buffer is a possible DoS vector
c) when to close the old connection and who is responsible for it.

I'm not arguing it's all unsolvable (although engine.io
does not try to solve this - they are using explicit synchronization
AFAICT). I'm merely saying:
the SockJS approach of having the transport detection before
session establishment is way simpler, and I can't see any
reason why it would be inferior.

>> - I find the "upgrade" way to be more complicated to implement.
>
> I'd like more info on this -- what did you find hard, and how do the two
> approaches differ?

I think I've already covered this.

>> But, if you think of using flash fallback - doing 'upgrade' thing
>> makes sense - flash is slow to start _and_ flash will take few
>> seconds to timeout in proxied environments (corporations).
>> If only the jsonp didn't have to be stopped during the flash transport
>> probing doing the 'update' would make sense.
>
> Is it the case that you need to stop JSON-P because it's hogging a network
> connection, or because of execution guarantees etc? We had a problem in Faye
> ages ago where JSON-P would block execution of other downloaded scripts,
> because Firefox guarantees scripts execute in the order they're inserted.

Interesting. I think I was referring to the particular engine.io implementation.
Engine.io-client sends an 'upgrade' packet and waits for a response. During this
time the client can't communicate.

> This can be mitigated to some extent by:
>
> * Trying to avoid using JSON-P if you possibly can
> * Making the server end any open JSON-P polling request whenever it receives
> any other request from the same client, to allow the response from the
> second request to execute sooner
>
>> Though AFAIK, when a fallback is probed previous transport must be
>> stopped (in order to avoid races). So the whole benefit
>> vanishes and engine.io stops for a few seconds to detect
>> that flash doesn't work.
>> (please, can someone tell me that is not true?)
>
> I'd like to know if the fact that I've not seen any of these problems is
> connect with the fact that Faye doesn't use Flash. I don't understand the
> additional problems it introduces very well.

I'm sure you have some kind of synchronization that ensures
there aren't any outstanding messages in the old connection
before switching to a new one, and that ensures message
ordering is preserved.

Hope that helps,
Marek

Brendan R

May 27, 2013, 5:50:31 AM
to soc...@googlegroups.com, jco...@gmail.com, tr...@phamcom.com, Carl Byström
I just wanted to chime in on this interesting thread and point out that Pusher has done some good work in the area of socket vs. fallback connectivity lately...
