I personally believe this is a non-issue, as your server shouldn't even care whether there are 100 or 1000 users connecting, and the browser doesn't care either. You can open the chat example in 40 tabs and you will see that it still runs fine without any issues.
I discussed this with http://github.com/markjeee, and he had a look at
it; see his Socket.IO-rack project. The problem isn't with Ruby, but with
Rack. It basically can't be done in a Rack-based server, unless you
extend the server itself to allow websockets and flash sockets to bypass
Rack. Long-polling via Rack is feasible using Thin, for example with
async_sinatra (which modifies Thin so a Rack status code of -1 means
laterz doodz), but in typical deployment scenarios involving front-end
servers (Apache, I'm looking at you), even that doesn't work. It also
doesn't work on Heroku, with their routing grid.
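The trick async_sinatra relies on looks roughly like this in raw Rack
terms (a bare-bones sketch only; the class name and the timer are just
for illustration, and the real thing needs error handling):

# config.ru - run with "rackup -s thin"
require 'eventmachine'

class DeferredHello
  ASYNC_RESPONSE = [-1, {}, []].freeze   # tells Thin "the real response comes later"

  def call(env)
    EM.add_timer(2) do
      # When something finally happens, push the real response down the wire.
      env['async.callback'].call(
        [200, { 'Content-Type' => 'text/plain' }, ["laterz doodz\n"]])
    end
    ASYNC_RESPONSE
  end
end

run DeferredHello.new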
So what I decided, for the application I care about, is to implement
COMET with long-polling and XHR-polling, and forget about websockets
and flash sockets. This works for me because most of my users will not
have other users pushing async messages at them - the only responses
those users will receive come from their own requests, and not on the
long-polling socket, so there isn't much extra connect overhead from
the long-polling socket being set up and torn down continually. Because
I must have a Ruby server (to access a large body of Ruby code), I
implemented it using async_sinatra.
See <http://github.com/cjheath/jquery.comet>. The chat_server directory
contains "bayeux.rb", which is most of what you need for the server-side
COMET protocol.
Remaining work I haven't done:
* Detecting multiple connections and sending "advice" to xhr-poll instead of long-polling
* Timeouts for clients that stop polling, so the resources can be cleaned up.
If you use this code, I'd be delighted to receive pull requests.
Clifford Heath, Data Constellation, http://dataconstellation.com
Agile Information Management and Design
> I discussed this with http://github.com/markjeee, and he had a look at
> it, see his Socket.IO-rack project. The problem isn't with Ruby, but
> with
> Rack. It basically can't be done in a Rack-based server, unless you
> extend the server itself allow websockets and flash sockets to bypass
> Rack. Long-polling via Rack is feasible using Thin
^^^
Rainbows! by Eric Wong is the way to go with this, quote:
"It is based on Unicorn, but designed to handle applications that expect long
request/response times and/or slow clients."
I have a stagnated forever-iframe model now that falls over at 300 seconds (as currently configured) and a partial XHR model that uses the same timing system. It's certainly doable, and the RTT latency between our in-house tech and the browser is higher with an IPC-style bridge between node and ruby (although I don't anticipate this remaining true; hence the switch). It works on Chrome, Safari, FF and IE (although not Opera, hence the switch; and it's a pain to maintain). It's all done in Ruby 1.8, although there are some nice 1.9 concurrency models: http://rainbows.rubyforge.org/ .
The *current* implementation is all hand-crafted, which I want to gut out for socket.io in order to offload the fragility of transport models. However, this would incur two possible costs:
(1) fleshing out a full-featured rack/ruby implementation of socket.io, drifting from the core code base
or
(2) implementing the necessary parts of our IP in node.js
(2) looks like the solution we may end up going with, unless (1) sounds like there would be a lot of serious support and it would be immensely beneficial.
> async_sinatra (which modifies Thin so a Rack status code of -1 means
> laterz doodz), but in typical deployment scenarios involving front-end
> servers (Apache, I'm looking at you), even that doesn't work. It also
> doesn't work on Heroku, with their routing grid.
^^^^
Again, Rainbows! supports many multi-threading models and can effectively take the server-pool models (a la nginx or Apache talking over, say, domain sockets) out of the picture. We have deployed a number of commercial products that talk directly to Rainbows! and support dozens of clients simultaneously (we don't do web-scale products ... mostly just web interfaces to native apps). The core library is just pure Rack (1.2.2 or 1.3) and Ruby (1.8-p334), although you can write a web app and stack Sinatra on it if you want.
> So what I decided, for the application I care about, is to implement
> COMET with long-polling and XHR-polling, and forget about websockets
> and flash sockets. This works for me because most of my users will not
> have other users pushing async messages at them - the only responses
> those users will receive come from their own requests, and not on the
> long-polling socket, so there isn't much extra connect overhead from
> the long-polling socket being set up and torn down continually. Because
> I must have a Ruby server (to access a large body of Ruby code), I
> implemented it using async_sinatra.
^^^^
Although I haven't looked at the code, this sounds like a traditional query/response model. Many of the Ruby web servers (Mongrel, Thin, WEBrick) have two problems:
(1) They insist on buffering a bunch of data before flushing the socket
(2) They don't support HTTP/1.1's keep-alive directives
Both of these make it a non-starter, but again, Wong is the man here; he really came through (he's the guy behind git-instaweb and Sunshowers too). You can configure a large number of workers, toss it on a TCP port, use something like ThreadPool, have some out-of-band IPC mechanism, and you are good to go.
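To give a flavor, a minimal Rainbows! setup is along these lines (the
numbers are illustrative, not taken from any of our deployments):

# rainbows.conf - start with "rainbows -c rainbows.conf config.ru"
Rainbows! do
  use :ThreadPool            # one of the several concurrency models it ships with
  worker_connections 128     # simultaneous clients per worker
  keepalive_timeout 5        # HTTP/1.1 keep-alive handling
end

worker_processes 4           # plain Unicorn-style master/worker setup
listen 8080                  # a bare TCP port; no front-end server needed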
> See <http://github.com/cjheath/jquery.comet>. The chat_server directory
> contains "bayeux.rb", which is most of what you need for the server-side
> COMET protocol.
>
> Remaining work I haven't done are:
> * Detecting multiple connections and sending "advice" to xhr-poll
> instead
> of long-polling
^^^
I have an IPC method for doing this, sending things in-channel to propagate errors. Our IPC message passing is proprietary IP, so I can't really share the details, but I can share the algorithm.
> * Timeout for clients that stop polling, so the resources can be
> cleaned up.
^^^
This is a difficult thing to genuinely detect, because you don't know if the client is going to come back. You really need to fall back to a renegotiation at that point, keep timestamps of when you "last saw" a client, and then have some kind of garbage collector.
But even ignoring all of that (i.e., not having a GC and a last-seen counter), my database for this information doesn't seem to ever pass a few hundred K in the wild and grows very, very slowly; so if you are using a similar model (not web-scale, think dozens of clients), I don't know if all this work is justified.
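In sketch form (made-up names, nothing proprietary here), the bookkeeping
amounts to touching a timestamp on every poll and sweeping anyone who has
gone quiet:

require 'thread'   # Mutex, needed on 1.8

class ClientRegistry
  STALE_AFTER = 300   # seconds without a poll before we reclaim the session

  def initialize
    @last_seen = {}
    @lock = Mutex.new
  end

  def touch(client_id)            # call on every poll or request
    @lock.synchronize { @last_seen[client_id] = Time.now }
  end

  def sweep!                      # call periodically from a GC thread or timer
    cutoff = Time.now - STALE_AFTER
    @lock.synchronize do
      @last_seen.delete_if { |id, seen| seen < cutoff }   # free per-client resources here
    end
  end
end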
> If you use this code, I'd be delighted to receive pull requests.
>
> Clifford Heath, Data Constellation, http://dataconstellation.com
> Agile Information Management and Design
^^^
I can do a bit of code sharing, but I don't think we will be going forward with the Ruby model at this time; it looks like there's a lot of work involved that, unless there were a large support base, would effectively negate the immense benefit of lightning turnaround times on bug fixes that you enjoy with well-supported OSS projects. If you think there would be a lot of support, however, I'd be more than happy to reignite this discussion with my superiors.
Thanks for the thoughtful response. I appreciate it very much.
~chris.
--
sent from a regular computer.
Thin is based on EventMachine, but uses the Mongrel HTTP parser.
With James Tucker's async extensions, it works just fine for long
polling.
I was aware also of Rainbows, but it's a bit more "fringe" and I haven't
played with it.
> I have a stagnated forever iframe model now that falls over at 300
> seconds
> ... It works on Chrome, Safari, FF and IE (although no Opera, hence
> the switch
My COMET stuff works fine on Opera.
> (1) fleshing out a full-featured rack/ruby implementation of
> socket.io, drifting
> from the core code base
If you decide to do this, I'd be happy to work with you on it.
I have 20,000 lines of Ruby code that I need to use in the
pub/sub context, and would rather not use a tiered server.
>> async_sinatra (which modifies Thin so a Rack status code of -1 means
>> laterz doodz), but in typical deployment scenarios involving front-
>> end
>> servers (Apache, I'm looking at you), even that doesn't work. It also
>> doesn't work on Heroku, with their routing grid.
> ^^^^
> Again, Rainbows! supports many multi-threading models
Yes, but like Thin, it still doesn't work on shared hosting. And Thin
works for me.
> The core library is just pure Rack (1.2.2 or 1.3) and Ruby (1.8-
> p334) although you can write a web app and stack sinatra on it if
> you want.
Sinatra is pure Rack. Every Sinatra app is a Rack middleware.
>> So what I decided, for the application I care about, is to implement
>> COMET with long-polling and XHR-polling... I
>> implemented it using async_sinatra.
> ^^^^
>
> Although I haven't looked at the code, this sounds like a
> traditional query/response model.
No, it's COMET, with either *long-polling* or request-polling.
> Many of the ruby web servers (Mongrel, Thin, WebBrick) have two
> problems:
> (1) They insist on buffering a bunch of data before flushing the
> socket
> (2) They don't support HTTP/1.1's keep-alive directives
Your information on Thin is incorrect; it has neither of these problems.
>> Remaining work I haven't done are:
>> * Detecting multiple connections and sending "advice" to xhr-poll
>> instead of long-polling
> ^^^
> I have an IPC method for doing this, sending things in-channel to
> propagate errors. Our IPC message passing is IP, so I can't really
> share the details. But I can share the algorithm
The COMET protocol has this built-in, I just haven't implemented it
in my server (the client has it though). You seem to be unfamiliar with
COMET? See <http://svn.cometd.com/trunk/bayeux/bayeux.html>
and <http://cometd.org/>.
A COMET connection involves a long-poll which is used only for
asynchronous messages from the server, with normal XHR requests
for messages to the server (and their responses). In a multi-tab
situation, the server can detect the extra connection and advise
XHR-polling instead of long-polling. Mine doesn't... yet.
My code has a minimal channel-based pub/sub infrastructure on
both client and server sides.
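In outline, the server side of that amounts to per-client message queues
fed by channel subscriptions - something like this sketch (not the actual
bayeux.rb; the names here are made up):

class PubSub
  def initialize
    @subscribers = Hash.new { |h, channel| h[channel] = [] }   # channel => [client_id, ...]
    @queues      = Hash.new { |h, client|  h[client]  = [] }   # client_id => pending messages
  end

  def subscribe(client_id, channel)
    @subscribers[channel] << client_id
  end

  def publish(channel, message)
    @subscribers[channel].each { |client_id| @queues[client_id] << message }
  end

  # Called when a client's long-poll arrives: hand over everything queued so far.
  def drain(client_id)
    @queues[client_id].slice!(0..-1)
  end
end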
>> * Timeout for clients that stop polling, so the resources can be
>> cleaned up.
> ^^^
>
> This is a difficult thing to genuinely detect because you don't know
> if the client is going to come back. You really need to fall back
> to a renegotiation at that time and then keep timestamps of when you
> "last saw" a client and then have some kind of Garbage collector.
>
> But even ignoring all of that; ie, not having a GC and a last-seen
> counter, my database for this information doesn't seem to ever pass
> a few hundred K in the wild and grows very very slowly; aka, if you
> are using a similar model (not web-scale, think dozens of clients)
> then I don't know if all this work is justified.
It's not a lot of work, and it's necessary for me. Simply refreshing
the webpage creates a new session, for example. Obviously I can
do better than that, but it's still going to be a problem as I have a
*lot* of data cached server-side for a session.
>> If you use this code, I'd be delighted to receive pull requests.
> I can do a bit of code sharing, but I don't think we will be going
> forward with the ruby model at this time; it looks like there's a
> lot of work involved that unless, there would be a large support
> base, would effectively negate the immense benefit of lightning turn-
> around times to bug fixes that you enjoy with well supported OSS
> projects. If you think there would be a lot of support however, I'd
> be more than happy to reignite this discussion with my superiors.
Try out my code - I think you'll see that it's a very compact solution
to the problem. A large solution would need a large amount of support,
but this is simple enough that it doesn't need a large installed base.
If you have Thin installed, just "cd chat_server" and run "rackup --server thin",
and visit <http://localhost:8080> in multiple browsers.
Clifford Heath, Data Constellation, http://dataconstellation.com
Agile Information Management and Design
Skype: cjheath, Ph: (+61/0)401-533-540
Replies below:
----- Original Message -----
From: "Clifford Heath" <cliffor...@gmail.com>
To: "socket io" <sock...@googlegroups.com>
Sent: Monday, May 30, 2011 4:37:11 PM
Subject: Re: Ruby Backend
> Thin is based on EventMachine, but uses the Mongrel HTTP parser.
> With James Tucker's async extensions, it works just fine for long
> polling.
^^^
From my experience, long polling is when you make an HTTP request to a server
and it blocks until a message is available; then the connection is closed (or,
in an HTTP keep-alive sense, a new request can be made) and the process repeats.
If this is correct, and you need to potentially tear down and rebuild a
connection for each asynchronous message, then the problem I've faced is that
you don't have a reliable transport above about 10 messages a second (and
that's with a lot of coercion); and even then, things can get shaky, with you
missing a message or having to use some sophisticated indexed counting system
to ensure full delivery.
If you mean something else by long-polling then please forgive me. The
methodology I was talking about was to emit <script></script> tags in a hidden
iframe that would pass messages up, so that you could get maybe 100-200x the
throughput with 0% drop and none of the sophisticated overhead. If this is
what you meant, then I think we are just at a clash of terminology.
If this is also what you meant, then my problem with using Thin, at least
back in October, is that even though you do the
def call env
loop {
... do something ...
yield message
}
approach, if you decided to do, say, just a simple sleep there and a yield of
a single character, and then do a curl on the host to take the browser out of
the equation, I wasn't able to get Thin into a configuration where it would
emit timely single-character chunks.
Instead, I would receive large buffers in chunks, meaning that I would need to
pad each message with lots of data to ensure timely delivery. Unicorn,
Zbatery, and Rainbows! did not have this problem.
Although 8 months is a lot of time in the Ruby world and things could have changed.
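The kind of trickle test I mean, in config.ru form (reconstructed as a
sketch, not the exact code I ran back then):

# config.ru - hit it with "curl -N" and watch when the chunks actually arrive
class Trickle
  def each
    loop do
      sleep 1       # ... do something ...
      yield "x\n"   # one tiny chunk per second
    end
  end
end

run lambda { |env| [200, { 'Content-Type' => 'text/plain' }, Trickle.new] }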
> I was aware also of Rainbows, but it's a bit more "fringe" and I haven't
> played with it.
> I have a stagnated forever iframe model now that falls over at 300
> seconds
> ... It works on Chrome, Safari, FF and IE (although no Opera, hence
> the switch
> My COMET stuff works fine on Opera.
^^^
Awesome!
> If you decide to do this, I'd be happy to work with you on it.
> I have 20,000 lines of Ruby code that I need to use in the
> pub/sub context, and would rather not use a tiered server.
^^^^
Great!
---8<---
> The COMET protocol has this built-in, I just haven't implemented it
> in my server (the client has it though). You seem to be unfamiliar with
> COMET? See <http://svn.cometd.com/trunk/bayeux/bayeux.html>
> and <http://cometd.org/>.
^^^^^
As far as I've seen, "COMET" is a neologism that encompasses a broad set of
techniques for doing duplexed message passing, as opposed to a specific
protocol or methodology.
I think that the cometd project you are talking about shouldn't be referred
to as "COMET". I was genuinely confused by your terminology, and I believe
that the term as defined by Wikipedia here:
http://en.wikipedia.org/wiki/Comet_%28programming%29
is probably how most people use it: as an encompassing term, like "AJAX" or
"Cloud Computing".
I was not aware of the cometd project.
---8<---
> Try out my code - I think you'll see that it's a very compact solution
> to the problem. A large solution would need a large amount of support,
> but this is simple enough that it doesn't need a large installed base.
>
> If you have Thin installed, just "cd chat_server" and run "rackup --
> server thin"
> and visit <http://localhost:8080> in multiple browsers.
^^^^^
I am trying to be the kindest I can here but maybe the best way for me to do so
will be with some code:
I took your demo and added a button that would send 300 messages via a setInterval with a timeout of 0; here's the code:
var counter = 0,
    ival = setInterval(function () {
      document.getElementById('phrase').value = counter; // put the message number in the input box
      $("#sendB").trigger('click');                      // press the demo's send button
      counter++;
      if (counter == 300) { clearInterval(ival); }       // stop after 300 messages
    }, 0);
I then modified the reporting code to report the delta time since the beginning of the test along with how many messages
had come through:
var n = new Date();
from = [n.getTime() - startTime.getTime(), ++messageCount, "..."].join(' ');
Here are some numbers I got back for a 0ms interval, trying to test raw speed in a way that avoids trivially constructed burst payloads.
Time | message count | messages per second | % dropped
6.376s | 122 | 19.13 | 59
6.506s | 130 | 19.98 | 57
Then I put in a timeout of 30ms and repeated the test. In this scenario, we'd be expecting 33 messages per second and a runtime of
under 4 seconds.
Time | message count | messages per second | % dropped
11.327s | 214 | 18.89 | 29
11.253s | 219 | 19.46 | 27
And from my experience, my hypothesis is that the drop is asymptotic so even when going within the reported range, you'll
still not get 100%. I tried this using a 100ms timeout, expecting 10 messages/second, with hopefully a 100% delivery,
hopefully taking about 30 seconds:
Time | message count | messages per second | % dropped
32.141s | 292 | 9.08 | 2.67%
32.120s | 292 | 9.09 | 2.67%
Even at 0.25 seconds between the messages (a throughput of 4 messages a second), on localhost, with two clients,
on a machine with 32GB of RAM and 12 cores, there was STILL about a 1.2% drop.
If you want a reliable message transport, this approach simply does not work. I'm sorry, not trying to kick sand in your
face, it just can't be used for my scenario, where I need reliable delivery of at least a few hundred messages a second to multiple
hosts. :-( apologies.
Thanks for your helpful responses! More below...
On 31/05/2011, at 10:58 AM, Chris McKenzie wrote:
> From my experience long polling is when you make an HTTP request to
> a server and
> it blocks until a message is available, the connection is closed (or
> in an HTTP-Keep alive
> sense, a new request can be made) and the process repeats.
>
> If this is correct, that you need to potentially tear down and
> rebuild a connection
> for each asynchronous message, then the problems I've faced in this
> is that you don't
> have a reliable transport above about 10 messages a second (and
> that's with a lot of coercion);
> and even then, things can get shaky with you missing a message or
> having to use some sophisticated
> indexed counting system to ensure full delivery.
I have queueing on both the client and server-side, so you can get many
messages/second without many connections/second. It's true I haven't
heavily stress-tested it, but I think it's a good deal better than the
naive situation you describe. Your test apparently breaks my queueing, but
I think I can fix it... Plus I don't expect such a high message rate anyhow.
> The methodologies
> I was talking about was to do something like emit <script></script>
> tags in a hidden
> iframe that would pass messages up so that you could get maybe
> 100-200x the throughput with
> 0% drop and none of the sophisticated overhead.
Yes, I'm familiar with that, but it's not what I'm doing.
> If this is also what you meant, then my problem with using thin, at
> least back in october,
> is that even though you do the
>
> def call env
> loop {
> ... do something ...
> yield message
> }
>
> approach, if you decided to do say, just a simple sleep there and a
> yield of a single character and then do
> a curl on the host, to take the browser out of the equation , I
> wasn't able to get Thin in a configuration
> so that it would emit timely single character chunks.
Hmmph, I see, annoying. Is it trying to aggregate packets at the TCP
level? Often this single-char stuff requires you to disable that using
setsockopt(TCP_NODELAY) - could that have been the cause of your
issues?
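(In Ruby that's just the usual setsockopt call - shown here on a raw
TCPSocket for illustration; whether Thin lets you get at its sockets to
do this is another question:)

require 'socket'

sock = TCPSocket.new('localhost', 8080)                       # any connected socket
sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)  # turn off Nagle's batching
sock.write("x")                                               # tiny writes now go out immediately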
In any case, I'm not talking about a game where I need real-time
performance - 2-3 second latency is acceptable. This is where 2 or 3
people are working in a diagram editor, and their changes are being
synchronised to each other by a server that decides a single definitive
timeline from the various messages (it can correct conflicts and
reverse them). Obviously I can't deal with 30 second latency though.
Also, I don't know what the async extensions to Thin change, if
anything.
I don't know whether it can be used with multi-part responses yet.
You just get to emit a single body for a delayed response. But that's
possible to fix, now that Rack has updated to allow enumerable
(multi-part) responses.
>> The COMET protocol has this built-in, I just haven't implemented it
>> in my server (the client has it though). You seem to be unfamiliar
>> with
>> COMET? See <http://svn.cometd.com/trunk/bayeux/bayeux.html>
>> and <http://cometd.org/>.
> ^^^^^
>
> As far as I've seen, "COMET" is a neologism that
> encompasses a broad stroke of techniques for doing duplexed message
> passing as
> opposed to a specific protocol or methodology.
>
> I think that the cometD project that you are talking about shouldn't
> be
> referred to as "COMET". I was genuinely confused by your
> terminology and I believe that
> the term as defined by wikipedia here:
>
> http://en.wikipedia.org/wiki/Comet_%28programming%29
>
> is probably how most people use it; as an encompassing term, like
> "AJAX" or
> "Cloud Computing".
>
> I was not aware of the cometd project.
I'm not sure how the term came about, despite what Wikipedia says.
I thought it was first used in a blog post by someone who went on to
create the cometd project at Dojo, and that others had adopted the
term for any method of doing server-push over XHR. Perhaps that's
wrong, but it's what I had in mind.
>> Try out my code - I think you'll see that it's a very compact
>> solution
>> to the problem. A large solution would need a large amount of
>> support,
>> but this is simple enough that it doesn't need a large installed
>> base.
>>
>> If you have Thin installed, just "cd chat_server" and run "rackup --
>> server thin"
>> and visit <http://localhost:8080> in multiple browsers.
> ^^^^^
>
> I am trying to be the kindest I can here
Hey, I freely admit I'm learning here... I've done a lot of socket
programming
at a low level, but I'm less familiar with the browser quirks.
> but maybe the best way for me to do so will be with some code:
>
> I took your demo and added a button in that would send 300 messages
> in a setInterval with a timeout of 0,
If you had written a simple loop, it would have only sent one array of
queued messages, since the actual send is done on a 0 timeout after
the queue becomes non-empty. If I set that to 10ms, even your code
will make only a single connection for all 300 messages.
The intention is that a client can have at most two active XHR's at a
time; one to poll, and one to send. If it's sending when new messages
get queued, they wait for the current send to complete. I don't really
see how that can cause the symptoms you show in the results below.
Can you explain why it's happening? Or show that my code opened
more than two XHRs at a time? IOW, are you sure that the technique
cannot be made to work, and that it's not just a bug in the way I've
done it?
It really annoys me the way the tools make this stuff so hard to
diagnose...
If messages are being dropped, it should be possible to find out exactly
where and why...
> And from my experience, my hypothesis is that the drop is asymptotic
> so even when going within the reported range, you'll
> still not get 100%. I tried this using a 100ms timeout, expecting
> 10 messages/second, with hopefully a 100% delivery,
> hopefully taking about 30 seconds:
> Time | message count | messages per second | % dropped
> 32.141s | 292 | 9.08 | 2.67%
> 32.120s | 292 | 9.09 | 2.67%
>
> Even at 0.25 seconds between the messages, having a throughput of 4
> messages a second on localhost, with two clients,
> on a machine with 32GB of ram and 12 cores, there was STILL about a
> 1.2% drop.
>
> If you want a reliable message transport, this approach simply does
> not work. I'm sorry, not trying to kick sand in your
> face, it just can't be used for my scenario, where I need reliable
> delivery of at least a few hundred messages a second to multiple
> hosts. :-( apologies.
Not at all! If you're right, you've done me a great service by stopping
me going too far down this path... but I'm still not convinced that it
can't work - even though you've shown that so far, it doesn't.
I'd love to see the hacked code with your performance tests added.
Can you email me a zip please?
Clifford Heath.
Some notes below:
> I have queueing on both the client and server-side, so you can get many
> messages/second without many connections/second. It's true I haven't
> heavily stress-tested it, but I think it's a good deal better than the
> naive
> situation you describe. Your test apparently breaks my queueing, but
> I think I can fix it... Plus I don't expect such a high message rate
> anyhow.
^^^^^^^^
Your queuing system is fundamentally sound, as you can see from the current
GitHub snapshot, where I take a number of setInterval'd functions with a
timeout of 0 (or 1, or any low number). The sequence numbers are absolutely
incrementing (and they should be, from the code), which is a good step;
messages are not coming in out of order. So that part of the queuing checks out.
----8<----
> Hmmph, I see, annoying. Is it trying to aggregate packets at the TCP
> level? Often this single-char stuff requires you disable that using
> setsockopt(TCP_NODELAY) - could that have been the cause of your
> issues?
^^^^^^
Whatever it was, I wasn't willing to fork Thin to get it working ... I didn't
want to maintain a webserver.
> In any case, I'm not talking about a game where I need real-time
> performance - 2-3 second latency is acceptable. This is where 2 or 3
> people are working in a diagram editor, and their changes are being
> synchronised to each other by a server that decides a single definitive
> timeline from the various messages (it can correct conflicts and
> reverse them). Obviously I can't deal with 30 second latency though.
^^^^^^^
That's fine; it's the drop that is concerning with this approach, and the
fact that reliability appears to be asymptotic (see below about addressing this).
> Also, I don't know what the async extensions to Thin change, if
> anything.
> I don't know whether it can be used with multi-part responses yet.
> You just get to emit a single body for a delayed response. But that's
> possible to fix, now that Rack has updated to allow enumerable
> (multi-part) responses.
^^^^^^^^
Worth looking into; I'm in an environment where I have more control over my
server solutions, so I'd probably just stay the course with Wong's Rainbows!
project. :-\
> I'm not sure how the term came about, despite what Wikipedia says.
> I thought it was first used in a blog post by someone who went on to
> create the cometd project at DOJO, and that others had adopted the
> term for any method of doing server-push over XHR. Perhaps that's
> wrong, but it's what I had in mind.
^^^^^^^^
No problem. I just wanted to make sure that our language was in sync.
> Hey, I freely admit I'm learning here... I've done a lot of socket
> programming
> at a low level, but I'm less familiar with the browser quirks.
^^^^^^^^
The biggest gotcha is that the WebKit browsers need like a few hundred bytes
of data before they will start streaming. It's really annoying.
> If you had written a simple loop, it would have only sent one array of
> queued messages, since the actual send is done on a 0 timeout after
> the queue becomes non-empty.
^^^^^^^^
Correct. That's what I referred to as packet bursting ... it's a hard problem.
> If I set that to 10ms, even your code
> will make only a single connection for all 300 messages.
^^^^^^^^
Agreed. Wireshark confirms this. There are of course two kinds of connections
though: the low-level HTTP connection and the connection as the browser sees it.
> The intention is that a client can have at most two active XHR's at a
> time; one to poll, and one to send.
^^^^^^^^
This appears to be true.
> If it's sending when new messages
> get queued, they wait for the current send to complete. I don't really
> see how that can cause the symptoms you show in the results below.
>
> Can you explain why it's happening? Or show that my code opened
> more than two XHRs at a time?
^^^^^^^^
I don't think it does; although I didn't write a test for this, it doesn't
appear that way.
> IOW, are you sure that the technique
> cannot be made to work, and that it's not just a bug in the way I've
> done it?
> It really annoys me the way the tools make this stuff so hard to
> diagnose...
> If messages are being dropped, it should be possible to find out exactly
> where and why...
^^^^^^^^
Probably because when you do a POST and wait for the response, a bunch of
stuff goes to the server in that interim that you don't get back, and it then
gets dropped. But that was just a quick look.
> Not at all! If you're right, you've done me a great service by stopping
> me going too far down this path... but I'm still not convinced that it
> can't work - even though you've shown that so far, it doesn't.
^^^^^^^^
It won't. I mean, OK, it will. If you maintain your indices and queues
correctly you can make sure to get 100% throughput (what I initially referred
to as a lot of accounting), but you probably won't be seeing more than about
10 pps on a multi-user system.
I don't know your internals exactly, but you need to somehow abstract the
current session away from the user and then keep index counters on a
per-instance basis, so you know what you have sent to which user; that way,
when a given instance of the application in a specific browser window or tab
requests more information, you know exactly where to start sending from. I'm
guessing from my tests that you aren't doing this.
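Roughly this shape (a sketch with made-up names; our real code is the IP
I mentioned):

class MessageLog
  def initialize
    @messages = []            # messages[i] is message number i
    @cursors  = Hash.new(0)   # instance_id => index of the next message to deliver
  end

  def append(message)
    @messages << message
  end

  # Called on each poll from a specific browser window or tab.
  def deliver_to(instance_id)
    from  = @cursors[instance_id]
    batch = @messages[from..-1] || []
    @cursors[instance_id] = @messages.length
    batch
  end
end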
But this is *exactly* why socket.io exists; because this problem is Fundamentally Difficult to get
right. That's why we are here. It's a huge time sink to do it well.
> I'd love to see the hacked code with your performance tests added.
> Can you email me a zip please?
^^^^^^^^
https://github.com/kristopolous/jquery.comet << the hacked version
~chris
The 10 PPS I mentioned referred to the idea that there'd be a low number of
payloads that could be delivered through this mechanism, ignoring the fact
that you can package multiple messages in a single payload.
If you have all of the indexing problems fixed, this will probably become a factor of server
RTT + overhead, or the TTLB, which depending on the setup may exceed 10 payloads per second, but I don't think would reasonably exceed 40.
This probably won't be a factor in the Ruby implementation you have, since
your latency requirements are quite liberal, but in applications where you are
doing, say, mouse movements, it will probably become a large factor.
As far as the drops go, I realize now that I may be populating the DOM
element with the text being transited too quickly, and that what we may
actually be seeing is just some DOM latency, in which case your system may be OK.
It certainly bears the symptoms of things being dropped, but I don't think
that is a reasonable diagnosis at this time. I would probably recommend taking
my code and doing a direct deposit of the messages instead of going through
the DOM and triggering a click event through jQuery like I did.
After looking a bit deeper into the code, I realized I was rather quick to be dismissive. I don't really think this transport mechanism is a particularly strong one; however, it appears to be rather well implemented.
As far as replacing cometd with socket.io goes, there are mechanisms for pulling in the data via socket.io and then using some type of IPC bridge, like ZeroMQ or libevent or just TCP/domain sockets, in order to do message passing between the two languages.
If you are in a position where you really can't be reasonably expected to switch to node.js for your
solution, then this may be the best option for you going forward.
----- Original Message -----
From: "Chris McKenzie" <ch...@oblong.com>
To: "socket io" <sock...@googlegroups.com>
Sent: Monday, 30 May, 2011 10:52:51 PM
Subject: Re: Ruby Backend
Totally agree! If I can't find where the missing messages are dropping
I'll need to rethink somewhat. Not entirely, because I already have an
ability to correct things (at the application level) if I detect
drops, but
still...
>> Hey, I freely admit I'm learning here... I've done a lot of socket
>> programming
>> at a low level, but I'm less familiar with the browser quirks.
> ^^^^^^^^
>
> The biggest gotcha is that the webkit browsers need like a few
> hundred bytes
> of data before they will start streaming. It's really annoying.
I'm not sure exactly what you mean here. A few hundred sent, or received,
or...? And by "streaming" do you mean using keep-alive? So they won't do that
until they detect it being probably-worthwhile?
>> If I set that to 10ms, even your code
>> will make only a single connection for all 300 messages.
> Agreed. Wireshark confirms this. There's of course two kinds of
> connections though,
> the low-level HTTP connection and the connection as the browser sees
> it.
Yes, but if the low-level HTTP connection lasts longer than what the browser
sees, that won't cause any problems, just help performance by reducing
connect overheads. But yeah, I meant "only a single request".
>> The intention is that a client can have at most two active XHR's at a
>> time; one to poll, and one to send.
> This appears to be true.
I believe that messages can only be lost if one end times out the connection
after the other end sends data. I set up my solution with a long-poll channel
and a request channel, and I never mix up the return traffic. It should be set
up so that it's always the client and never the server which times out a
long-poll (otherwise the client might time out just as the server sends data),
and so that it's the server and never the client which times out the request
channel.
If this was done right (and as yet, it's not) there should be no messages lost
without an XHR failure being reported; and the messages will be re-sent.
Some might get doubled up, but none should get dropped.
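Concretely, I mean an ordering of timeouts like this (the numbers are only
illustrative):

CLIENT_LONG_POLL_TIMEOUT = 25   # seconds; the client abandons and re-issues the poll first
SERVER_LONG_POLL_TIMEOUT = 30   # the server only reclaims a poll the client has already dropped
SERVER_REQUEST_TIMEOUT   = 10   # the server gives up on a stuck request first...
CLIENT_REQUEST_TIMEOUT   = 15   # ...so the client always sees the XHR failure and can re-send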
> Probably because when you do a post and ask for the response, in
> that interim a bunch
> of stuff goes to the server that you don't get and then gets
> dropped. But that was
> just a quick look.
I don't think that's happening, though it certainly was prior to the
last rewrite.
>> Not at all! If you're right, you've done me a great service by
>> stopping
>> me going too far down this path... but I'm still not convinced that
>> it
>> can't work - even though you've shown that so far, it doesn't.
> It won't. I mean, ok it will. If you maintain your indices and
> queues correctly
I don't want to implement a full sliding window protocol, but I do
want to know
if messages have been lost so an application-level resynchronise can be
triggered.
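Per-sender sequence numbers are enough for that - something like this
sketch of the idea (not code from the repo):

class GapDetector
  def initialize
    @expected = 0
  end

  # True if one or more messages were skipped before this sequence number.
  def lost_before?(seq)
    gap = seq > @expected
    @expected = seq + 1
    gap
  end
end

d = GapDetector.new
[0, 1, 3].each { |seq| puts "resync needed before #{seq}" if d.lost_before?(seq) }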
> https://github.com/kristopolous/jquery.comet
Thanks so much! I'll have a play...
There's still a limit for cross-domain, where it falls back to JSONP,
and the messages must be packed into the URL's query string.
I don't detect and handle that yet...
> This won't probably be a factor in the ruby implementation you have
> since your latency requirements
> are quite liberal, but in applications where you are doing say,
> mouse movements, this will probably
> become a large factor.
I don't want ever to do mouse movements. The events I've chosen
are such that a user couldn't generate more than one group of messages
per second. Group, because if objects are grouped, you get a "move"
command for each object, packed into a start-end envelope, when you
drop - no dynamic motion.
> As far as the drops go, I realize now that I may be populating the
> DOM element of the text being
> transited too quickly and that what we may actually be seeing is
> just some DOM latency and your system may be ok.
Ahhh, that would explain it. I'll do some more careful testing and see
whether it holds up.
> ... I don't really think this transport mechanism is a particularly
> strong one; however, it appears to be rather well implemented.
Well, thanks for the compliment! It is the 3rd complete rewrite, and I
was surprised how happy I felt with the last version. The examples
I found on the web before starting out were both badly written and
visibly buggy.
> As far as replacing cometd with socket.io, there are mechanisms for
> pulling in the data via socket.io and then using some type of IPC
> bridge,
Of course, there are plenty of ways to do that. I'd probably explore
using Redis in fact, since I may end up using that for the session
state.
> If you are in a position where you really can't be reasonably
> expected to switch to node.js for your
> solution, then this may be the best option for you going forward.
I have a choice, but I'm biased in favour of fewer moving parts.
On re-reading, I found one line that might be causing loss:
<https://github.com/kristopolous/jquery.comet/blob/master/jquery.comet.js#L274>
It's a hang-over from before I changed the code to move the
contents of messagesQueued into messagesToSend, before
the ajax call. It will wipe any messages queued while the current
request was executing... oops!
If you were motivated to re-run your test with this line deleted,
I'd be happy to know the result. It should be a good deal more
reliable.
That checks out ... it's just a matter of latency now. We should probably
do the Ruby backend for socket.io given this; we both appear to have a business
interest here; I just fear getting out of alignment with the node implementation.
Excuse my ignorance, but is there any sort of protocol documentation to go off of?
----- Original Message -----
From: "Clifford Heath" <cliffor...@gmail.com>
To: "socket io" <sock...@googlegroups.com>
Sent: Monday, May 30, 2011 9:32:52 PM
Subject: Re: Ruby Backend
Chris,
Do you mean that all messages get delivered?
Or that it explains the symptoms you saw but you haven't verified it?
> it's just a matter of latency now.
Do you mean the connect-time overheads associated with starting
a new poll every time you receive async messages (as opposed to
response messages) from the server? Multi-part responses or
infinite iframes should help with that, but I'm not experienced in
the latter.
> We should probably do the ruby backend for socket.io given this
My concern is that the Ruby backend still won't do websockets
or flash sockets (without more server hackery), so the standard
socket.io client would need to be configured that way. Or can the
server respond to an initial contact saying what transports it has
available?
> ; we both appear to have a business interest here;
> I just fear getting out of alignment with the node implementation.
The socket.io protocol is very different from any kind of XHR JSON
message passing. Do you propose to switch? Because for my
purposes, I think I can be happy with the code more-or-less as
it stands, modulo disconnect detection and ensuring that the
two connections are timed-out from opposite ends, as I said.
It'll also need short-polling too, in the case of multiple open tabs.
Clifford Heath.
> > That checks out ...
>
> Do you mean that all messages get delivered?
>Or that it explains the symptoms you saw but you haven't verified it?
^^^^^^
It works.
> > it's just a matter of latency now.
>
> Do you mean the connect-time overheads associated with starting
> a new poll every time you receive async messages (as opposed to
> response messages) from the server? Multi-part responses or
> infinite iframes should help with that, but I'm not experienced in
> the latter.
^^^^^^
Basically, yes. If you were to send out messageCount messages at
interval messageInterval and expect messageCount responses at some
latency RTTLatency, then the following is *ideal*:
1. The time from the first message going out to the last message coming
   back should be about equal to
       messageCount * (messageInterval + RTTLatency)
   for any value of messageCount and messageInterval.
2. You should be able to take random sample points, startSample and
   endSample, with times startTime and endTime associated with them,
   and be able to do something like this:
       (endSample - startSample) * (messageInterval + RTTLatency) ~= endTime - startTime
Some transports are naturally better at achieving these ideals than others.
I know that it's not important for your application, but it is, in general,
important for other people's (specifically mine). It helps form an assertion
as to what kind of load and latency is to be expected, can help give symptoms
of problems, and dictates which problems simply can and cannot be solved with
a given system.
A whole different class of interaction and interface is available when you have
high throughput, low latency message passing.
> We should probably do the ruby backend for
> socket.io given this
>
> My concern is that the Ruby backend still won't do websockets
> or flash sockets (without more server hackery), so the standard
> socket.io client would need to be configured that way. Or can the
> server respond to an initial contact saying what transports it has
> available?
^^^^^^
No. We would do it right. Whatever wizardry it takes, we'd get it
done and have a fully featured Ruby alternative for people who, for
one reason or another, love Socket.IO but cannot use node.js in their
solution. I'm talking full, maintained support.
It *will be done by someone eventually*, and I don't mean a proxy; I mean
the real deal. I think that once someone (if it's just me then so be it)
starts working on it, and does so as a serious effort, more people will
hop on quickly and contribute. Someone just needs to start getting this
ball rolling.
There's been interest expressed in a Ruby backend before. I'm sure that
a team of 3 or 4 competent programmers could bang it out in a few days at
most. I want to start this now.
**** The only thing I'm waiting on **** is how the LearnBoost group wants
it to be done, since it's fundamentally their project.
> > ; we both appear to have a business interest here;
> > I just fear getting out of alignment with the node implementation.
>
> The socket.io protocol is very different from any kind of XHR JSON
> message passing. Do you propose to switch? Because for my
> purposes, I think I can be happy with the code more-or-less as
> it stands, modulo disconnect detection and ensuring that the
> two connections are timed-out from opposite ends, as I said.
> It'll also need short-polling too, in the case of multiple open tabs.
^^^^^^
Exactly. Every day Socket.IO becomes more and more feature-complete. If it's
not the time to use it today, then it will be very soon. I venture to say that
in a year or so it will be expected for any serious implementation of this sort,
in the same way that jQuery is expected to be used on almost any serious website.
Offloading the responsibility of implementing and maintaining such a thing would
save you time and give you a more stable and robust product that you can be
confident in. You also have a place to turn to if things go wrong and can benefit
from lightning-fast turnaround times.
Have you personally downloaded Firefox 5.0b3 and used socket.io with it? What about
the IE 10 platform preview?
I bet you someone here has. I bet you they will identify any issues, if they
exist, well before release, and they will be fixed before you install the
browser for the first time. You will never see the issues because they will
have long since been taken care of.
What if you need to do a handoff of your project one day because you move on to
greener pastures? I bet you anything that within 12 months there will be a book
on Socket.IO available for purchase on Amazon. This is just how it goes. Would
you want to write a book on your stuff? Me neither.
What if a competitor comes around and is using Socket.IO? Do you want to
independently feature-match socket.io yourself? The competitor's supported
browser list will be bigger than yours, and they will probably have spent a lot
less time getting there. Their sales pitch of "We are using this well-supported
library" versus your "I am using one I built myself" is a much lower risk for
customers evaluating a product.
What about yourself? Wouldn't you like to get a piece of the action on a big
project like this? Be an early adopter and a core contributor to a project
whose growth is a hockey-stick graph, gaining 10 more people watching it just
today alone? How would something like that look versus "I developed my own" on
a resume or CV? Even from a purely self-interested standpoint, taking your
clients and bosses out of it, switching over to socket.io and contributing
to the project is a move that will position *you* better in the future.
For all these reasons and more it is a better solution for commercial product deployment.
All the ruby backend needs is a go-ahead from the founders on how to move forward.
Thanks! :)
~chris.
Thanks for the clarifications.
>> My concern is that the Ruby backend still won't do websockets
>> or flash sockets (without more server hackery),
> ^^^^^^
> No. We would do it right. Whatever wizardry it takes, we'd get it
> done and have a fully featured ruby alternative
OK, sounds fair. I think that uptake will always be limited, however,
by the sheer number of deployed Rack-based hosting providers
and technologies, which will get in the way of deploying the server(s)
that make it possible.
> It *will be done by someone eventually*,
Probably, yes.
> I'm sure that
> a team of 3 or 4 competent programmers could bang it out in a few
> days at
> most. I want to start this now.
You might see if James Tucker @raggi is interested - he's the guy
behind async_sinatra and various related work.
While your arguments about getting involved are convincing in a
general sense, I'm working on such a wide range of novel technologies
here that I need to limit my commitment to each one. IOW I'm
interested and willing to help, but I don't want another job.
> What if a competitor comes around and is using Socket.IO?
My technology won't stand or fall on the browser communications,
simple as that. That's at least three tiers down in importance. Take
a look at my website if you want to understand that better.
> What about yourself? Wouldn't you like to get a piece of the action
> on a big project like
> this?
I have nothing to prove, except that the semantic technology I'm
working on can change the IT industry fundamentally. Supporting
another language for a browser communication protocol which I
only use in one part of one related tool is just not a big deal for me.
Maybe for someone, but not for me.
> ...taking your clients and bosses out of it
I am my own boss, and so far, my client too, and I have enough on
my plate without starting any new projects. Happy to help, and hope
you succeed, but can't drop other things to do it.
Thanks for the reply. Given everything, I'll just be submitting pull requests to the unofficial one here:
https://github.com/rahearn/Socket.IO-rack/tree/development
Oh, nice - I wasn't aware markjeee's repo had been forked.
He started the original one after I tried to contract him to
develop it for me :). He didn't get far enough to want to take
my $$ though...
If you manage to beat it into submission, and rename various
classes from Tagalog to English so I can understand it, you
may see pull requests from me also :).
Clifford Heath.
It's a common situation on forums for users to open 10-20 pages simultaneously. AFAIK, a browser will not allow more than 8 connections to a single domain. Does socket.io have any abstraction to share connections between tabs/windows? Or does it solve this problem in other ways (activating/deactivating io on focus change and so on)?