Socket.io + Express + Cluster. WTH?


Jordan

Oct 30, 2012, 12:53:39 AM10/30/12
to sock...@googlegroups.com
Does anyone have this working? I'm seeing way too many "client not handshaken client should reconnect" errors when running socket.io in a cluster. My app is completely unusable.

Here's some code:

os = require 'os'
cluster = require 'cluster'
redis = require 'redis'
http = require 'http'
express = require 'express'
io = require 'socket.io'

RedisStore = require 'socket.io/lib/stores/redis'


if cluster.isMaster
  numCpus = os.cpus().length
  for cpu in [0...numCpus]
    cluster.fork()
else
  expressServer = express()
  expressServer.get '/', (request, response) ->
    response.send "<script src='socket.io/socket.io.js'></script>
      <script>
        var socket = io.connect('http://' + window.location.host);
        socket.on('news', function (data) {
          console.log(data);
          socket.emit('my other event', { my: 'data' });
        });
      </script>"

  server = http.createServer expressServer
  socketio = io.listen server
  socketio.set 'store', new RedisStore
    redisClient: redis.createClient()
    redisPub: redis.createClient()
    redisSub: redis.createClient()

  socketio.sockets.on 'connection', (socket) ->
    socket.emit 'news', hello: 'world'
    socket.on 'my other event', (data) ->
      console.log 'data: ' + data

  server.listen 80
  console.log 'started server!'


Here's some output:

   debug - served static content /socket.io.js
   debug - client authorized
   info  - handshake authorized I2UamUeKVN0eiGErAAAA
   debug - setting request GET /socket.io/1/websocket/I2UamUeKVN0eiGErAAAA
   debug - set heartbeat interval for client I2UamUeKVN0eiGErAAAA
   warn  - client not handshaken client should reconnect
   info  - transport end (error)
   debug - set close timeout for client I2UamUeKVN0eiGErAAAA
   debug - cleared close timeout for client I2UamUeKVN0eiGErAAAA
   debug - cleared heartbeat interval for client I2UamUeKVN0eiGErAAAA
   debug - discarding transport
   debug - client authorized
   info  - handshake authorized 43mLEUhnS8X6J9UaAAAB
   debug - setting request GET /socket.io/1/websocket/43mLEUhnS8X6J9UaAAAB
   debug - set heartbeat interval for client 43mLEUhnS8X6J9UaAAAB
   warn  - client not handshaken client should reconnect
   info  - transport end (error)
   debug - set close timeout for client 43mLEUhnS8X6J9UaAAAB
   debug - cleared close timeout for client 43mLEUhnS8X6J9UaAAAB
   debug - cleared heartbeat interval for client 43mLEUhnS8X6J9UaAAAB
   debug - discarding transport
forever... 


Please help!!!!

Dmitry

Oct 30, 2012, 9:31:23 AM10/30/12
to sock...@googlegroups.com
It won't work out of the box. The issue is that socket.io keeps connected-client state locally, so a subsequent request can go to a different node in the cluster. You need a load balancer with consistent hashing (sticky sessions) so that each client always talks to exactly one cluster node.
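One common way to get that stickiness is hashing on the client address at the balancer. A sketch with nginx (`ip_hash` is per-client-IP hashing rather than true consistent hashing, but it gives socket.io the stickiness it needs; the backend ports are placeholders for separate Node processes, and proxying the WebSocket upgrade requires nginx 1.3.13 or newer):

```
upstream socket_nodes {
    ip_hash;               # same client IP -> same backend process
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003;
}

server {
    listen 80;
    location / {
        proxy_pass http://socket_nodes;
        proxy_http_version 1.1;                   # needed for the upgrade
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```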

Abhimanyu Saxena

Oct 30, 2012, 9:59:58 AM10/30/12
to sock...@googlegroups.com
I faced this issue some 9 months back but wasn't able to find a clean solution. There is a huge thread going on this bug.

Dmitry, can you elaborate on your solution and on your understanding of what's causing it?
--
Cheers!
Abhimanyu Saxena,
Software Architect, Fab.com

Jordan

Oct 30, 2012, 3:13:34 PM10/30/12
to sock...@googlegroups.com
Interesting. I would have assumed socket.io used Redis to retain that information, so that things would work no matter how many servers you were running. Huge bummer!!

Dmitry

Oct 31, 2012, 7:03:35 AM10/31/12
to sock...@googlegroups.com
With Redis it will broadcast room messages and keep client state across the cluster, but your load balancer must ensure that, in the case of xhr-polling, the client always polls the same server.

Dmitry

Oct 31, 2012, 7:17:05 AM10/31/12
to sock...@googlegroups.com
I'm not sure that issue is the same. If we're talking about cluster: while socket.io can use Redis to sync some data, it appears it doesn't sync everything, including local client state. I can't say for sure, since we decided not to invest more time hunting its bugs. Instead, we use amino gateway (which we use for other purposes as well) with a sticky id to route each client to the same server in the cluster. This completely solves the issue. We're also using engine.io instead, plus a self-written room-message broadcast via Redis, to accomplish the same task with less effort.
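For a single-machine cluster, the same stickiness can be achieved in the master process itself: accept raw connections in the master, pick a worker deterministically from the client's IP, and hand the socket off. A minimal sketch (the hash function and message name below are illustrative assumptions, not amino's actual protocol; the handoff pattern is the one the sticky-session npm module later popularized):

```javascript
// Deterministic worker selection: the same client IP always maps to the
// same worker index, so every poll/request from one client hits one process.
function workerIndex(ip, numWorkers) {
  var hash = 0;
  for (var i = 0; i < ip.length; i++) {
    hash = (hash * 31 + ip.charCodeAt(i)) | 0; // simple rolling hash
  }
  return Math.abs(hash) % numWorkers;
}

// In the master you would then do something like (pauseOnConnect needs a
// modern Node, >= 0.12):
//   var net = require('net');
//   net.createServer({ pauseOnConnect: true }, function (conn) {
//     var worker = workers[workerIndex(conn.remoteAddress, workers.length)];
//     worker.send('sticky:connection', conn); // worker feeds it to its http server
//   }).listen(80);
```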

Dmitry

Mar 13, 2013, 11:58:27 AM3/13/13
to sock...@googlegroups.com
AFAIK no, they don't need to be on the same server if you're using the Redis store. But that doesn't scale well, since socket.io sends too many messages through the store. Right now the only way I see is to write your own small pub/sub layer that subscribes to the room id on join, unsubscribes when the last user leaves the room, and broadcasts your messages across the cluster using the room id.
Something like this:

// sub and pub are two Redis clients: a client in subscriber mode can't publish
on join:
   sub.subscribe('room:' + roomId);

on leave:
   if (lastUser) sub.unsubscribe('room:' + roomId);

on socket.io message:
   pub.publish('room:' + roomId,
      JSON.stringify({ pid: process.pid, room: roomId, json: message }));

sub.on('message', function(channel, raw) {
   var msg = JSON.parse(raw);
   if (msg.pid !== process.pid)
      socketio.sockets.in(msg.room).json.send(msg.json);
});

Note that you'll also receive your own published messages, so tag each one with some payload like the publishing process's pid and skip the ones you published yourself. Also note that a Redis client in subscriber mode can't issue other commands, so use separate clients for subscribing and publishing.
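The echo-skipping part can be seen in isolation with a tiny self-contained sketch (no Redis involved; `makeEnvelope` and `handleIncoming` are hypothetical helper names, not socket.io API):

```javascript
// Each worker tags outgoing messages with its own pid; on receipt it drops
// messages it published itself.
function makeEnvelope(room, payload, pid) {
  return JSON.stringify({ pid: pid, room: room, json: payload });
}

function handleIncoming(raw, myPid, deliver) {
  var msg = JSON.parse(raw);
  if (msg.pid !== myPid) deliver(msg.room, msg.json); // skip our own echo
}

// Worker with pid 1 publishes; Redis echoes the message back to every
// subscriber, including the publisher itself.
var raw = makeEnvelope('room:42', { hello: 'world' }, 1);
var delivered = [];
handleIncoming(raw, 1, function (room, json) { delivered.push(room); }); // own echo: dropped
handleIncoming(raw, 2, function (room, json) { delivered.push(room); }); // other worker: delivered
console.log(delivered); // [ 'room:42' ]
```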


On Wednesday, March 13, 2013 11:50:50 AM UTC+4, Justin Meltzer wrote:
Dmitry - I've been trying to get socket.io to work with redis store and polling without sticky sessions. It seems like this does not work. However if I use sticky sessions and route each client to the same server, do all clients joined to the same room need to be connected to the same server if I want to broadcast to that room?