sockjs vs socket.io ...


StigFærch

Apr 28, 2012, 7:01:24 AM
to sockjs
I'm in the process of choosing different components for a node.js
project.
Currently I'm trying to find out whether to use socket.io or sockJS.
It's important to me that it scales well, as I hope that my project
will eventually grow big. Are socket.io and sockjs different on this
point?

Also, what known projects / products make use of sockJS?
socket.io has a list, but I haven't found much on sockJS' pages.

I know that sockJS is concentrating on the transport layer and not
getting bloated with other stuff. That sounds good.

If you (Marek) get hit by a bus (God forbid that it may happen!), or
if other things change the focus of your life - will sockJS then die?

Any other things worth mentioning comparing the two?

Thanks in advance :-)

StigFærch

Apr 28, 2012, 7:10:04 AM
to soc...@googlegroups.com
> If you (Marek) ...

Sorry about that - it's Majek - not Marek.

Sergey Koval

Apr 28, 2012, 9:28:28 AM
to sockjs
Well, I'm not Marek, but have some feedback :-)

I wrote Tornadio (and Tornadio2) - socket.io server implementations
for python. I also wrote sockjs-tornado - sockjs server implementation
for python. So I have knowledge of both projects.

Long story short..

I'll try to explain why SockJS is the better choice:

1. It is actively maintained. By "actively" I mean that all tickets
get reviewed within a day or two and you get a meaningful
response.

2. SockJS enforces certain behavior patterns for all server
implementations. Tests cover everything - from protocol to proper
error handling. So, it is easy to know if your server implementation
works according to the spec or not, which makes third-party server
implementations first-class citizens.

3. Because of the previous point, SockJS is more predictable and just
works better - tests even cover some edge cases which socket.io is not
aware of. There's a test suite for the client-side library as well:
http://sockjs.popcnt.org/

4. SockJS is designed to be horizontally scalable. Have capacity
problems? Throw in more nodes, add them to the load balancer and you're
set. There's no need to use cookie-based sticky sessions - all
information is already in the URL.

5. SockJS really works for all browsers, even Opera, even in
cross-domain scenarios. The Socket.io client is pickier about where it
works. SockJS also supports streaming transports (one persistent
connection from the server instead of hammering it with
short-lived HTTP requests, as polling transports do); socket.io
does not.

6. I benchmarked sockjs-tornado and expect that tornadio2 will be ~20%
slower than sockjs-tornado due to the more complex socket.io protocol. You
can find the benchmark here:
http://mrjoes.github.com/2011/12/15/sockjs-bench.html

The only thing that SockJS is missing, in comparison to socket.io, is
events. But it is not very hard to implement them yourself.
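
To illustrate what "implementing events yourself" could look like, here is a minimal sketch in Python. It assumes the transport only moves strings; the EventChannel class, its method names and the JSON envelope with "event" and "data" keys are all made up for this example and are not part of SockJS or socket.io.

```python
import json
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class EventChannel:
    """Hypothetical helper: named events on top of a string-only transport."""

    def __init__(self, send: Callable[[str], None]) -> None:
        # `send` stands in for the connection's raw send function.
        self._send = send
        self._handlers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[[Any], None]) -> None:
        """Register a handler for one event name."""
        self._handlers[event].append(handler)

    def emit(self, event: str, data: Any = None) -> None:
        """Wrap the event name and payload in one JSON envelope and send it."""
        self._send(json.dumps({"event": event, "data": data}))

    def handle_message(self, raw: str) -> None:
        """Call this from the transport's on-message callback."""
        envelope = json.loads(raw)
        for handler in self._handlers.get(envelope["event"], []):
            handler(envelope.get("data"))


# Loopback demo: the client's send function feeds the server directly.
received: List[Any] = []
server = EventChannel(send=lambda raw: None)
client = EventChannel(send=server.handle_message)
server.on("chat", received.append)
client.emit("chat", {"text": "hello"})
```

The same envelope idea works in any language; the only contract is that both ends agree on the field names.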

And I know a few guys who were very active in the socket.io bugtracker
asking for help, but then they gave up and switched to SockJS.

As for the socket.io problems:
1. Development focus

Right now, socket.io devs are focused on engine.io and the current
socket.io version looks abandoned. However, there were 5 minor
releases in the last month, which fixed some of the critical issues (like
this one - https://github.com/LearnBoost/socket.io/issues/438 which
was open for 8 months), so it might change in the future.

If you open the socket.io bugtracker, you'll see 20+ pull requests
and 200+ open defects. That's not a very good sign, even if 90% are not
bugs.

2. Stability

Protocol and behavior patterns are poorly documented. No unit tests.
No protocol tests. No client-side library tests.

The client still does not know how to close multiplexed connections, can't
properly fall back to polling protocols if something breaks the native
websocket connection, and so on.

The client has lots of places where race conditions can happen, which
either kill your server (like issue #438 mentioned above; good that it was
fixed) or make you lose data without knowing it. For example, for
polling transports, if the client sees a disconnect, it assumes the
disconnect was intentional and will try to reconnect to get more data,
which might lead to data loss in some cases.

To sum it up: just go with SockJS, at least until Engine.io is as
mature as SockJS.

Serge.

Marek Majkowski

Apr 28, 2012, 1:30:33 PM
to stigf...@gmail.com, sockjs
On Sat, Apr 28, 2012 at 12:01, StigFærch <stigf...@gmail.com> wrote:
> I'm in the process of choosing different components for a node.js
> project.
> Currently I'm trying to find out whether to use socket.io or sockJS.
> It's important to me that it scales well, as I hope that my project
> will eventually grow big. Are socket.io and sockjs different on this
> point?

Yes, very much. I'm not an expert on scaling socket.io, but
it seems to require having redis as a backbone. From
what I can see this is barely documented - I would
be grateful if someone finds some decent documentation
on the subject.

In contrast, SockJS does not require any magic within your backend.
That comes at a cost - for horizontal scalability SockJS
depends on sticky sessions supported by the load balancer
(or on separate domain names for every sockjs host
if you're not using a load balancer).

SockJS supports both path-based (prefix) sticky
sessions (a template haproxy config is available) and
cookie-based sticky sessions (JSESSIONID); see the READMEs
of sockjs-node and sockjs-client for details.

In practice that means:
- sockjs will scale perfectly behind haproxy
- sockjs will scale nicely on, for example, Cloud Foundry
(via the JSESSIONID cookie)
- you can't really scale sockjs to more than one host on
heroku (no sticky session support).
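
As a rough illustration of path-based stickiness, the sketch below shows how a balancer could pin a session to one backend using only the URL. It assumes session URLs of the form /<prefix>/<server>/<session>/<transport>, which is my reading of the SockJS protocol; the function name pick_backend, the "/echo" prefix and the backend addresses are invented for the example, and real balancers such as haproxy do this through configuration rather than code.

```python
import hashlib
from typing import List, Optional


def pick_backend(path: str, backends: List[str], prefix: str = "/echo") -> Optional[str]:
    """Return the backend for a session URL, or None if the path is not one."""
    if not path.startswith(prefix + "/"):
        return None
    parts = path[len(prefix) + 1:].split("/")
    if len(parts) != 3 or not all(parts):
        # Not a session URL (for example an info endpoint); any backend will do.
        return None
    server_id, session_id, _transport = parts
    # Hash only the server and session segments, so every request that
    # belongs to one session lands on the same backend whatever the transport.
    key = f"{server_id}/{session_id}".encode("utf-8")
    digest = hashlib.sha1(key).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


# Hypothetical backend addresses.
backends = ["10.0.0.1:9000", "10.0.0.2:9000", "10.0.0.3:9000"]
first = pick_backend("/echo/042/k3x9qf2a/xhr", backends)
second = pick_backend("/echo/042/k3x9qf2a/xhr_send", backends)
```

Both calls return the same backend because the transport segment is left out of the hash. No cookie is involved, which is why this style of stickiness needs nothing from the browser.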

> Also, what known projects / products make use of sockJS?
> socket.io has a list, but I haven't found much on sockJS' pages.

SockJS is much younger and, frankly, less attractive to
node.js developers. It's focused much more on operations
than on ease of use for web developers. Socket.io has many
shiny features and is therefore more attractive at first sight.

SockJS is used with success by:
- realtime.co - they have it running on a serious scale
- meteorjs project - http://meteor.com/
- online radio for live content streaming, some facebook
games and a few other smaller projects

> I know that sockJS is concentrating on the transport layer and not
> getting bloated with other stuff. That sounds good.
>
> If you (Marek) get hit by a bus (God forbid that it may happen!), or
> if other things change the focus of your life - will sockJS then die?

I do plan to live long and happy!

SockJS will die by itself, and that's a good thing - hopefully
native websockets will be stable enough in a year or two.
At that point SockJS will be irrelevant.
Fortunately SockJS has the same API as native websockets
so the transition to native websockets should be straightforward
for SockJS users.

But you're right - sockjs-client was developed mostly by me.
I wouldn't expect major problems with that code though.

The situation is better with the servers. There are many
(ten or more) people who have built or attempted to build a sockjs server.
So if you have server-side problems my absence shouldn't
be an issue.

> Any other things worth mentioning comparing the two?

SockJS is a project focused on operations, replaceable and
without any magic. SockJS was born out of frustrations
with socket.io 0.6.

I think this commit summarizes many of the practical differences:
https://github.com/meteor/meteor/commit/91f479e9d4e9a49a87b40c17a0ce2d3732d4f9ac#diff-2

Marek

Tim Fox

May 2, 2012, 5:50:43 AM
to soc...@googlegroups.com
Adding my 2c here.

We have our own SockJS server side implementation in vert.x http://vertx.io

Regarding the SockJS client and protocol: IMO SockJS is well thought out and well designed, and it's worked really well for us so far. Kudos to Marek for that.

A couple of small criticisms:

The lack of a "proper" specification document. Having said that, what there is of a spec, in the form of tests and an html document, is far better than what you get with socket.io. I guess the documentation bar is set quite low in the node.js world ;)

There is currently no flow control for non-websocket transports. This could mean the server runs out of RAM under heavy load.
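
To make the concern concrete, here is a small sketch of the kind of guard a server could add: a per-session outbound buffer with a byte limit. This is not how any SockJS server is implemented; BoundedOutbox and its methods are hypothetical names, and the 10-byte limit is only for the demo.

```python
from collections import deque
from typing import Deque, List


class BoundedOutbox:
    """Hypothetical per-session buffer of messages waiting for the next poll."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self._queue: Deque[str] = deque()
        self._size = 0

    def offer(self, message: str) -> bool:
        """Queue a message; return False when the buffer would overflow."""
        cost = len(message.encode("utf-8"))
        if self._size + cost > self.max_bytes:
            # The caller should pause the producer or close the session.
            return False
        self._queue.append(message)
        self._size += cost
        return True

    def drain(self) -> List[str]:
        """Hand all queued messages to the transport and free the space."""
        messages = list(self._queue)
        self._queue.clear()
        self._size = 0
        return messages


outbox = BoundedOutbox(max_bytes=10)
accepted = [outbox.offer("hello"), outbox.offer("world"), outbox.offer("!")]
flushed = outbox.drain()
```

The third message is refused because it would push the buffer past its limit, so a slow polling client cannot make the server buffer without bound.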

Regarding the server-side node.js SockJS implementation, there are issues under load with websockets, but this is due to problems in the websockets library that sockjs-node uses (Faye-Websocket).

qingli...@gmail.com

May 6, 2012, 10:44:46 PM
to soc...@googlegroups.com
I have one question.
Does SockJS really scale?
My understanding is that it can scale by always routing the same session to the same server.
This isn't really scalable.
Say I have an application that is used by millions of users online and I need to broadcast messages to all of them.
One server is not enough and I have to split all user sessions across two or more servers.
How is that going to be taken care of by SockJS? I don't see any solution.

AD

May 8, 2012, 12:06:25 AM
to qingli...@gmail.com, soc...@googlegroups.com
This example shows a broadcast mechanism - https://github.com/sockjs/sockjs-erlang/blob/master/examples/cowboy_test_server.erl

However, that is for one server. What you need is a separate pubsub mechanism to push a message to each sockjs server, so that each sockjs server can execute the broadcast function above.

I am doing this exact thing right now by using rabbitmq to push messages to multiple nodes. Once the nodes receive the message, each one looks up all its connections locally in the ETS table (or a filtered list if you change the structure of your ETS key) and then broadcasts to each.
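
The pattern described above can be sketched in a few lines of Python. This is only an in-memory model: MessageBus stands in for RabbitMQ (or any other broker), the connections dictionary stands in for the ETS table, and all class and method names are invented for the example.

```python
from typing import Callable, Dict, List


class MessageBus:
    """In-memory stand-in for a broker such as RabbitMQ."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, message: str) -> None:
        for callback in list(self._subscribers):
            callback(message)


class Node:
    """One server process with its own table of local connections."""

    def __init__(self, bus: MessageBus) -> None:
        # Maps session id to the messages delivered to that session.
        self.connections: Dict[str, List[str]] = {}
        self._bus = bus
        bus.subscribe(self._broadcast_locally)

    def connect(self, session_id: str) -> None:
        self.connections[session_id] = []

    def broadcast(self, message: str) -> None:
        # Any node can start a broadcast; the bus carries it to every node.
        self._bus.publish(message)

    def _broadcast_locally(self, message: str) -> None:
        for inbox in self.connections.values():
            # A real server would call the connection's send function here.
            inbox.append(message)


bus = MessageBus()
node_a, node_b = Node(bus), Node(bus)
node_a.connect("alice")
node_b.connect("bob")
node_a.broadcast("server restart at noon")
```

Each node only ever touches its own connection table; the bus is the single shared piece.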

-AD

Tim Fox

May 8, 2012, 2:30:58 AM
to soc...@googlegroups.com
You can solve this fairly easily by using vert.x (http://vertx.io) on the server side instead of the node.js SockJS server.

Vert.x fully supports SockJS (0.2.1 currently), but scales far more easily than node. A single vert.x node can do the work of many node instances. When you need more than one vert.x node, vert.x has a built-in distributed event bus so you can do stuff like pubsub as you describe over many nodes on both the server and the client.

Tim Fox

May 8, 2012, 2:32:43 AM
to soc...@googlegroups.com
On 08/05/2012 05:06, AD wrote:
> this example shows a broadcast mechanism - https://github.com/sockjs/sockjs-erlang/blob/master/examples/cowboy_test_server.erl
>
> However, that is for one server. What you need is another mechanism of pubsub to push a message to each sockjs server, so that each sockjs server can execute the broadcast function above.

If you're not hung up on using node.js on the server side, Vert.x has this built in :)

AD

May 8, 2012, 9:00:18 AM
to timv...@gmail.com, soc...@googlegroups.com
I'm not, I did it with erlang :-)

However, I do love what you are doing with vert.x. I just need it to be 1.0 final!

Marek Majkowski

May 8, 2012, 9:09:44 AM
to qingli...@gmail.com, soc...@googlegroups.com
On Mon, May 7, 2012 at 3:44 AM, <qingli...@gmail.com> wrote:
> I have one question.
> Does SockJS really scale?
> My understanding is that it can scale by always routing the same session to
> the same server.

True.

> This isn't really scalable.

It depends on the definition of "scalable". My understanding is: if, for
some reason, a single box doing SockJS is not enough (RAM, CPU, disk,
network, whatever), you can just add another box. That's called horizontal
scalability.

> Say I have an application that is used by millions of users online and I need
> to broadcast messages to all of them.
> One server is not enough and I have to split all user sessions across two
> or more servers.
> How is that going to be taken care of by SockJS? I don't see any solution.

You're right. When you have more than a single SockJS server you may need
to do some synchronization between them. This is not the layer that
SockJS solves for you.

You need to roll your own solution of choice.
You may use RabbitMQ, or redis or maybe even a database (mongo?)
to get basically an event bus that everyone is connected to.

In this scenario your scalability is limited by the message bus performance,
which is not a problem very often.

By the way, I think this is a good approach to scalability - you need to write
the message bus logic and make sure it has the characteristics you want.
As a counterexample, Socket.io uses Redis as a backbone. If redis
is not good enough for you - you need to rewrite chunks of Socket.io logic.

With SockJS those two elements are decoupled - SockJS is the transport
layer (browser to the server). The second element is communication
between the servers - that's left completely up to you.

Marek

Tim Fox

May 8, 2012, 10:43:58 AM
to AD, soc...@googlegroups.com
On 08/05/12 14:00, AD wrote:
> I'm not, i did it with erlang :-)
>
> However, I do love what you are doing with vert.x I just need it to be
> 1.0 final !

That should happen this week :)

--
Tim Fox

Vert.x - effortless polyglot asynchronous application development
http://vertx.io
twitter:@timfox

3rdEden

Jul 4, 2012, 6:46:07 AM
to soc...@googlegroups.com, stigf...@gmail.com
> Yes, very much. I'm not an expert on scaling socket.io, but
> it seems to require having redis as a backbone. From
> what I can see this is barely documented - I would
> be grateful if someone finds some decent documentation
> on the subject.

Socket.IO and Faye have the concept of Storage Engines that are used to
sync data between different node.js processes. I agree that there isn't any
"decent" documentation for it, but you can look at https://github.com/dshaw/RedisStore-Docs
for some more information.

> In contrast, SockJS does not require any magic within your backend.
> That is done at a cost - for horizontal scalability SockJS does
> depend on sticky-sessions supported by the load balancer.
> (or on separate domain names for every sockjs host
> if you're not using a load balancer).

For Socket.IO and Faye it's not a magical requirement to scale; you know just as well as I do
that you can put any real-time server behind a sticky load balancer and make it scale.

The only thing that wouldn't work anymore would be broadcasting.

jco...@gmail.com

Nov 22, 2012, 10:04:59 AM
to soc...@googlegroups.com, stigf...@gmail.com
On Wednesday, July 4, 2012 11:46:07 AM UTC+1, 3rdEden wrote:
>> Yes, very much. I'm not an expert on scaling socket.io, but
>> it seems to require having redis as a backbone. From
>> what I can see this is barely documented - I would
>> be grateful if someone finds some decent documentation
>> on the subject.
>
> Socket.IO and Faye have the concept of Storage Engines that are used to
> sync data between different node.js processes. I agree that there isn't any
> "decent" documentation for it, but you can look at https://github.com/dshaw/RedisStore-Docs
> for some more information.

Chipping in since Faye was mentioned. Not going to try to talk you into using it, but worth mentioning the two basic scaling mechanisms it uses.

The first approach is to connect the various server processes using a backend, which Faye refers to as 'engines' -- there is a documented API for writing engines, and there are two engines widely available -- one in-process (for a single server) and one based on Redis. The API is documented here: http://faye.jcoglan.com/node/engines.html

This means a cluster of Faye servers acts as a single service transparently -- a message published to one server is automatically forwarded to all the others. Sticky sessions are not required in the load balancer layer.

The other main approach is manual sharding. Faye uses channels for routing messages, and a common pattern is to shard the channel space such that a subset of the channels you use is assigned to each server. Each shard can be assigned to a single server, or a cluster connected through Redis, it's up to you. Clients connect to the right server(s) for the channels they care about, and publish in the same way.

This does result in a bit more complexity on the client side but it means you can divide the load up without sharing any state between the various servers, so there's no single bottleneck in the system.
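
A minimal sketch of the manual sharding idea, seen from the client side. It is not Faye code: the shard URLs are placeholders, and shard_for and plan_connections are hypothetical helpers that only show how a channel name could be mapped to a server in a stable way.

```python
import zlib
from typing import Dict, List

# Hypothetical shard endpoints; each could be one server or a Redis-backed cluster.
SHARDS: List[str] = [
    "http://faye-0.example.com/faye",
    "http://faye-1.example.com/faye",
    "http://faye-2.example.com/faye",
]


def shard_for(channel: str) -> str:
    """Pick the shard that owns a channel."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    return SHARDS[zlib.crc32(channel.encode("utf-8")) % len(SHARDS)]


def plan_connections(channels: List[str]) -> Dict[str, List[str]]:
    """Group the channels a client cares about by the server that owns them."""
    plan: Dict[str, List[str]] = {}
    for channel in channels:
        plan.setdefault(shard_for(channel), []).append(channel)
    return plan


plan = plan_connections(["/chat/lobby", "/chat/dev", "/scores/live"])
```

The client then opens one connection per key in the plan and subscribes to the listed channels on each. Publishers use the same shard_for mapping, so no state has to be shared between the shards.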

Broadcasting can work in a similar way -- you run multiple independent Faye servers, clients pick one at random to connect to, and the server publishes to all the servers when it has a new message (this publishing can be done using regular HTTP if the server doesn't need a long-running connection to the servers). This pattern is mainly used to overcome the network limitations of a single box, where the long-lived connections and synchronized CPU and request cycling for all clients on every message is where the load comes from.

Marek Majkowski

Nov 26, 2012, 7:18:31 AM
to jco...@gmail.com, soc...@googlegroups.com
Thanks for the explanation James!

SockJS doesn't do broadcasting out of the box, doesn't require redis
underneath and does rely on sticky sessions.

Sharding is an interesting topic. In SockJS we encourage keeping all
the servers equal and possibly having some message bus underneath
for internal communication.

That said, SockJS-client accepts a 'server' parameter which is usually
a three-digit server number. By passing this value a browser can
point to a particular SockJS server within a cluster. This gives a bit
of flexibility for application builders - for example one can easily
point all connections interested in particular data to a single server
and gain locality. The cost is more sophisticated load balancing logic.
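
As a sketch of how that locality could be arranged, the code below derives a three-digit server value from the data a connection is interested in. The helper names, the "/feed" prefix and the topic strings are made up, and the URL layout <base>/<server>/<session>/<transport> is an assumption about how the value ends up in the request path.

```python
import zlib


def server_id_for(topic: str, server_count: int) -> str:
    """Map a topic to a zero-padded three-digit server number."""
    if not 1 <= server_count <= 1000:
        raise ValueError("server_count must fit in three digits")
    index = zlib.crc32(topic.encode("utf-8")) % server_count
    return f"{index:03d}"


def session_url(base: str, topic: str, session_id: str, server_count: int) -> str:
    """Build a polling URL under the assumed <base>/<server>/<session>/<transport> layout."""
    return f"{base}/{server_id_for(topic, server_count)}/{session_id}/xhr"


# Two different sessions interested in the same topic share a server number.
url_a = session_url("/feed", "stock:ACME", "sess1", 4)
url_b = session_url("/feed", "stock:ACME", "sess2", 4)
```

The load balancer then has to route on the server segment alone, which is the "more sophisticated load balancing logic" mentioned above.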

Marek

jezt...@gmail.com

Aug 29, 2013, 4:11:07 PM
to soc...@googlegroups.com
I used Socket.IO for a long time and found it incredibly unreliable: many users were dropped frequently, and people often connected over websocket only for the connection to fall over. A few months ago I switched to SockJS. After that, users usually connected using polling, but at least the connections were stable (even though many of the users should in theory have been able to connect over websocket). A few weeks ago I swapped from ws: to wss: (TLS/SSL), and this has significantly improved the situation: now the majority of users connect correctly using websockets. My advice: if you are not using SSL, steer well clear of socket.io! I have done simple tests with SockJS and it seems far more reliable without SSL; I have not compared Socket.io (SSL) vs SockJS (SSL) though. However, the SockJS contributors do appear to be far more responsive to issues.