Hookbox keeps stopping


tim

Nov 20, 2010, 4:49:00 PM
to Hookbox User Group
I have got the following in my log files:


Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.13-py2.6.egg/eventlet/greenpool.py", line 80, in _spawn_n_impl
    func(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.13-py2.6.egg/eventlet/wsgi.py", line 510, in process_request
    proto = self.protocol(socket, address, self)
  File "/usr/lib/python2.6/SocketServer.py", line 615, in __init__
    self.handle()
  File "/usr/lib/python2.6/BaseHTTPServer.py", line 331, in handle
    self.handle_one_request()
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.13-py2.6.egg/eventlet/wsgi.py", line 195, in handle_one_request
    self.raw_requestline = self.rfile.readline(MAX_REQUEST_LINE)
  File "/usr/lib/python2.6/socket.py", line 430, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.13-py2.6.egg/eventlet/greenio.py", line 217, in recv
    return fd.recv(buflen, flags)
error: [Errno 113] No route to host


And then:



Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.13-py2.6.egg/eventlet/greenpool.py", line 80, in _spawn_n_impl
    func(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.13-py2.6.egg/eventlet/wsgi.py", line 510, in process_request
    proto = self.protocol(socket, address, self)
  File "/usr/lib/python2.6/SocketServer.py", line 615, in __init__
    self.handle()
  File "/usr/lib/python2.6/BaseHTTPServer.py", line 331, in handle
    self.handle_one_request()
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.13-py2.6.egg/eventlet/wsgi.py", line 195, in handle_one_request
    self.raw_requestline = self.rfile.readline(MAX_REQUEST_LINE)
  File "/usr/lib/python2.6/socket.py", line 430, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.13-py2.6.egg/eventlet/greenio.py", line 217, in recv
    return fd.recv(buflen, flags)
error: [Errno 113] No route to host


Both times this caused hookbox to stop. Any ideas on what would cause
this?
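
For reference, errno 113 on Linux is EHOSTUNREACH, and in the traceback it comes out of a plain recv() on a client socket. Whether that error is what actually stops hookbox is not established in this thread. As a minimal sketch only (this is not eventlet's or hookbox's code), a read helper could treat that class of error as a dropped client instead of letting it propagate. The name `recv_or_none` and the chosen set of errno values are assumptions for illustration, and the code is written for Python 3, where socket.error is an alias of OSError:

```python
import errno

# Errors that usually mean "this client is gone", not "the server is broken".
# The exact set is an assumption; adjust it for your deployment.
CLIENT_GONE = {
    errno.EHOSTUNREACH,  # "No route to host" (113 on Linux)
    errno.ECONNRESET,
    errno.ETIMEDOUT,
    errno.EPIPE,
}

def recv_or_none(sock, bufsize=4096):
    """Return received bytes, or None if the peer is unreachable or gone."""
    try:
        return sock.recv(bufsize)
    except OSError as exc:
        if exc.errno in CLIENT_GONE:
            return None
        # Anything else is unexpected and should still surface.
        raise
```

A caller would close the socket and move on when it gets None back.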

Michael Carter

Nov 21, 2010, 1:43:41 PM
to hoo...@googlegroups.com
This is pretty weird; I don't really understand the error... but it looks like an issue with the eventlet wsgi server. I would suggest asking on the eventlet list.

-Michael

tim

Nov 21, 2010, 9:14:23 PM
to Hookbox User Group
Sure, I can do that, but sometimes clients can't connect even if I
don't have those messages...

It just seems that after a while hookbox gives up, even though I can see
connections coming in:


2010-11-22 01:50:04,997 - access - INFO - Incoming WebSocket connection 89.16.234.145 89.16.234.146:8000
2010-11-22 01:50:25,922 - access - INFO - Incoming WebSocket connection 89.16.234.145 89.16.234.146:8000
2010-11-22 01:50:48,109 - access - INFO - Incoming CSP connection 89.16.234.145 89.16.234.146:8000
2010-11-22 01:53:36,371 - access - INFO - Incoming CSP connection 89.16.234.145 89.16.234.146:8000

Those are from Safari and then Firefox. The admin view says no users,
no authentication is happening against the backend, and the clients are
not connected.

My Post from Firefox:
[[1,0,"[1,\"CONNECT\",{\"cookie_string\":
\"csrftoken=7c2ac4247951a8b8fbb72563869cfb2a;
sessionid=cabc120bee51babb363da9be0e128db3\"}]\u000d\u000a"]]

Hookbox response:
("OK")

A Monkey

Nov 22, 2010, 10:16:50 AM
to hoo...@googlegroups.com

Hi Tim,

Is it possible that you're running afoul of your CSRF protection mechanism? Without knowing anything about your setup it's hard to say, but those tokens are often good for one request only. If hookbox is receiving the initial connection ("incoming websocket connection") but your connect hook never executes, it may be that something is preventing the "connect" message from reaching your connect hook.

The "OK" response is a little confusing, though.


tim

Nov 22, 2010, 4:22:14 PM
to Hookbox User Group
No, it's not that. I have an exception in place to handle that.

I downgraded hookbox to a pre-RestKit version and so far so good.
Haven't seen any problems at all after that.

Would suggest that more testing goes into that version.

Thanks,

Tim

jordo

Nov 28, 2010, 10:38:23 PM
to Hookbox User Group
I'm seeing the same thing - both with and without the RestKit changes.
When it happens, hookbox has 900+ open connections according to
netstat, which is far more than the number of connected users (~300).
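
To see what those extra connections are doing, it can help to break them down by TCP state. Here is a rough sketch that tallies states from `netstat -tan` style output for one local port. The function name, the sample text, and port 8000 are illustrative assumptions, and the column layout assumed is the Linux one (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State):

```python
from collections import Counter

def count_tcp_states(netstat_text, port):
    """Tally TCP states for connections whose local address ends in :port."""
    states = Counter()
    suffix = ":%d" % port
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[3].endswith(suffix):
            states[fields[5]] += 1
    return states

# Made-up sample output, for illustration only.
SAMPLE = """\
tcp        0      0 0.0.0.0:8000        0.0.0.0:*           LISTEN
tcp        0      0 10.0.0.5:8000       10.0.0.9:51000      ESTABLISHED
tcp        1      0 10.0.0.5:8000       10.0.0.9:51001      CLOSE_WAIT
tcp        1      0 10.0.0.5:8000       10.0.0.9:51002      CLOSE_WAIT
tcp        0      0 10.0.0.5:22         10.0.0.9:51003      ESTABLISHED
"""

print(count_tcp_states(SAMPLE, 8000))
```

CLOSE_WAIT means the remote side has closed and the local process has not yet closed its end, so a large CLOSE_WAIT count would point at sockets the server never released. Nothing in this thread confirms that is what is happening here.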

Ergo

Dec 8, 2010, 5:31:07 AM
to Hookbox User Group
This also happens to my instance of hookbox. Is there any chance this
problem gets fixed soon?

crusherdestroyer

Dec 8, 2010, 10:40:02 AM
to Hookbox User Group
I had this happen as well, and like tim, I just rolled it back to the
previous version and it works fine for me. It doesn't stop.

Ergo

Dec 8, 2010, 1:41:42 PM
to Hookbox User Group
What exact version are you using?

crusherdestroyer

Dec 8, 2010, 3:08:11 PM
to Hookbox User Group
It tells me I'm running 0.3.4dev. I downloaded it from git on
10-29-2010. The only reason I upgraded was because I thought maybe
the private messaging feature worked in the newest stuff on git. I
held onto the version I downloaded on 10-29-2010 in case I ran into
problems.

Ergo

Dec 12, 2010, 5:46:09 PM
to Hookbox User Group
Unfortunately, when we have 300-500 users online it still stops
listening after a few minutes, literally :(

crusherdestroyer

Dec 12, 2010, 7:08:36 PM
to Hookbox User Group
With the newest available on git?

Ergo

Dec 13, 2010, 5:34:24 PM
to Hookbox User Group


On 13 Dec, 01:08, crusherdestroyer <ryanhe...@gmail.com> wrote:
> With the newest available on git?

Both the 10-29-2010 revision and the latest trunk. It listens for a few
minutes and then dies; it seems more users online make the problem
appear sooner :(

desmaj

Dec 15, 2010, 1:07:58 PM
to Hookbox User Group
Hi Folks,

If anyone can put together a very simple test case for this I'll be
glad to look into it.

Thanks,
Matthew

Eric

Dec 27, 2010, 7:41:53 PM
to Hookbox User Group
This happens to me if I just leave hookbox running overnight (no users
connected) then come back in the morning. It won't subscribe users. No
error messages. The log acknowledges the connection attempt as in the
above cases with a line like this:

2010-12-28 00:20:55,133 - access - INFO - Incoming WebSocket connection client_ip_addr hookbox_ip_addr:8001

but the log stops there, no subscription ever happens, the browser
gets no response. Restarting hookbox puts everything back in a
functional state, subscriptions accepted, all working.

I recommend setting up a hookbox test app, then just leave it running
for a day or so. Come back and see if it accepts subscriptions.
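
Rather than checking by hand the next morning, a small probe could be run on a schedule. This is a sketch under assumptions: it only checks that the server sends back some HTTP response within a timeout, the path "/" is a placeholder, and a plain GET may not exercise the same code path as a WebSocket or CSP subscription. The function name `responds` is made up for illustration (Python 3):

```python
import http.client

def responds(host, port, path="/", timeout=5.0):
    """Return True if the server sends any HTTP response within timeout."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        conn.getresponse()
        return True
    except (OSError, http.client.HTTPException):
        # Refused, timed out, or the reply was not valid HTTP.
        return False
    finally:
        conn.close()
```

Logging the time of the first False result would at least narrow down when the process stops answering.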

Ergo, I'm concerned about your case as I am likely to be in heavy
production (hundreds of users) soon, and I can't afford to have to
rebuild once it gets going. Is it possible there is some other problem
on your server, some sort of deadlock that is wedging the hookbox
process? I've downgraded hookbox, but I need to wait and see if it
worked for me or not (and who knows how it will respond to real load).

Is this project still alive? I don't see much action here, and I'm
wondering if I should really rely on it. If not I need to go put
together RabbitMQ, socket.io, etc. in a hurry.

Thanks!
Eric

Ziga Ham

Dec 27, 2010, 7:44:50 PM
to hoo...@googlegroups.com
I have exactly the same problems and concerns as Eric.
--
Best regards,
Žiga Ham

marie_dk

Dec 28, 2010, 2:48:37 AM
to Hookbox User Group


On Dec 28, 1:41 am, Eric <edr...@gmail.com> wrote
> Ergo, I'm concerned about your case as I am likely to be in heavy
> production (hundreds of users) soon, and I can't afford to have to
> rebuild once it gets going. Is it possible there is some other problem
> on your server, some sort of deadlock that is wedging the hookbox
> process? I've downgraded hookbox, but I need to wait and see if it
> worked for me or not (and who knows how it will respond to real load).
>

Just before Christmas I wrote an email to giantbomb.com, as I know they
are running hookbox in production with thousands of users. I asked
whether they are still happy with hookbox, and what version they are
running.

They seem satisfied, but it turns out they are running a very old
version (doesn't even have logging) with a few patches. It's on github
under user HonzaKral.

Running an old version is not an option for me, since I am using some
of the new features. But on the other hand I don't have problems with
hookbox stopping overnight.

I am using 0.3.4dev (20101024) running on 5 FreeBSD servers (version 7
and 8).

> Is this project still alive? I don't see much action here, and I'm
> wondering if I should really rely on it. If not I need to go put
> together RabbitMQ, socket.io, etc. in a hurry.
>

This is also a concern of mine. I am going live some time in January
and I would really like to see this project keep going. The fact that
hookbox is scalable and communicates with an existing application is
very important to me, and I have not seen this feature on any other
comet server.

/marie_dk

Ergo

Dec 28, 2010, 3:33:07 PM
to Hookbox User Group
On 28 Dec, 01:44, Ziga Ham <ziga....@gmail.com> wrote:
> I have exactly same problems and concerns as Eric.
>

We all have these concerns, and it's not the bug that concerns me, it's
the lack of communication that worries me.

Even "sorry, I'm very busy right now" or "I lost interest in the project"
would be better than silence, because we are all confused and worried.

That way we would at least know what's going on.

salma...@asti-usa.com

Dec 28, 2010, 3:55:14 PM
to hoo...@googlegroups.com
Hi all,

I have seen the bug intermittently too although I cannot reproduce it.

Besides letting it run over a 24 hr period, what other conditions are
necessary to reproduce the bug?

How many active connections, etc?

If the bug could be reproduced reliably, then it would be much easier to
track down the problem.

Thanks,
Salman

Ergo

Dec 29, 2010, 9:03:53 AM
to Hookbox User Group
Salman,

do you visit #hookbox?

I have a test case built where we successfully exposed the error, but
it needs a live server and 3-5 people who can open their browsers
to stress the server.

if you have a moment to talk with me on IRC i can try to help debug
the issue with you.

marie_dk

Dec 29, 2010, 9:28:46 AM
to Hookbox User Group
Salman, I am going live in January and if this bug spans multiple
versions, a few of us will run into deep trouble soon.

We will all be very very grateful if you could look into this, as it
is a real showstopper. You are our only hope at the moment.

And also, do you know what has happened to Michael Carter? He has been
very silent for a while, and it makes us wonder about the future of
hookbox.

/marie_dk

salma...@asti-usa.com

Dec 29, 2010, 11:45:55 AM
to hoo...@googlegroups.com
Ergo, Marie,

>
> On 29 Dec., 15:03, Ergo <erg...@gmail.com> wrote:
>> Salman,
>>
>> do you visit #hookbox?

Not usually.

>>
>> i have a test case built where we successfully exposed the error - but
>> it needs a live server and few 3-5 folks who can open their browsers
>> to stress the server.

Good. Have you described your test on the mailing list previously?

>>
>> if you have a moment to talk with me on IRC i can try to help debug
>> the issue with you.

The earliest I can come on the IRC channel is Monday.
I'm in the eastern timezone in the US. Let me know a time when I am
likely to find you there.

>
> Salman, I am going live in January and if this bug spans multiple
> versions, a few of us will run into deep trouble soon.

I have seen this problem with version 0.3.4 commit id d35a777.

>
> We will all be very very grateful if you could look into this, as it
> is a real showstopper. You are our only hope at the moment.
>

I suspect the problem might not be in the hookbox python code but in the
ancillary libraries, e.g. csp_eventlet, rtjp_eventlet, pyjsiocompile, etc.
If that is the case, then we would really need the author's help.

> And also, do you know what has happened to michael carter? He has been
> very silent for a while, and it makes us wonder about the future of
> hookbox.

No, I don't. Try emailing him or Martin Hunt (mgh) directly to see if that
helps.
Or ask on the Orbited mailing list.

Best,
Salman

>
> /marie_dk
>


Ergo

Dec 29, 2010, 12:41:37 PM
to Hookbox User Group


On 29 Dec, 17:45, salman....@asti-usa.com wrote:
> Good. Have you described your test on the mailing list previously?

My test basically consists of a backend that hookbox uses, and a webpage
that opens 30 iframes initially, and then opens and closes one every
few ms. That simulates tons of connects/disconnects to hookbox.

After a short while the server stops listening.

Any backend should be fine for replicating the issue: people who use PHP
reported it and I use Python, so it's backend agnostic.

One thing that I find important is that it needs to be tested with a few
remote IPs. Testing on localhost I couldn't make the error appear: I got
600 users running without issue, with server and clients on the same
machine, all from 127.0.0.1.
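
A browser-free approximation of that churn, as a hedged sketch: it only opens and closes raw TCP connections, so it does not speak WebSocket or CSP and may not hit the same code path as the iframe page. Going by the description above, it would also need to run from a few remote machines rather than localhost. The function name and all parameter values are placeholders (Python 3):

```python
import socket
import time

def churn_connections(host, port, count, hold_seconds=0.0, timeout=5.0):
    """Open and close `count` TCP connections; return how many connected."""
    connected = 0
    for _ in range(count):
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
        except OSError:
            # Refused or timed out: a hint that the server stopped listening.
            continue
        try:
            connected += 1
            time.sleep(hold_seconds)
        finally:
            sock.close()
    return connected
```

For example, `churn_connections("hookbox.example", 8000, count=500, hold_seconds=0.01)` against a test instance (the host name is made up); a return value that drops below `count` partway through a long run would mark roughly when the server stopped accepting.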

A Monkey

Dec 29, 2010, 2:01:04 PM
to hoo...@googlegroups.com
Hi Folks,

On Wed, Dec 29, 2010 at 12:41 PM, Ergo <erg...@gmail.com> wrote:
>
[snip the description of the test case]


>
> One thing that i find important is that it needs to be tested with few
> remote ips - testing localhost i couldnt make the error appear. got
> 600 users running without issue, having server and clients on same
> machine all from 127.0.0.1

As we've discussed before, these requirements really limit your test
case's usefulness. I spent a couple of nights working on it, but was
unable to reliably reproduce the problem. Additionally, I wasn't able
to get a good indication of when the problem actually surfaced. It
would be nice if the test case reported that as well.

Right now a simple test case that can clearly and reliably reproduce
the problem for one person is the thing we most need.

Thanks,
Matthew

Ergo

Dec 29, 2010, 2:31:11 PM
to Hookbox User Group
Matthew

I have an idea for a good test that would work, but I'm not sure how it
can be implemented.

Can the hookbox source be altered to substitute REMOTE_ADDR with some
randomly generated IP?

I wouldn't bet on it, but then I'd hope any test case would expose the
issue.
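
If hookbox takes the client address from the WSGI environ (an assumption, not checked against the hookbox source), one way to try this without editing hookbox internals would be a small WSGI middleware wrapped around the app. A sketch for test setups only; `RandomRemoteAddr` is a hypothetical name:

```python
import random

class RandomRemoteAddr(object):
    """WSGI middleware that replaces REMOTE_ADDR with a random IPv4 address."""

    def __init__(self, app, rng=None):
        self.app = app
        self.rng = rng if rng is not None else random.Random()

    def __call__(self, environ, start_response):
        # Overwrite whatever address the server put in the environ.
        octets = [self.rng.randint(1, 254) for _ in range(4)]
        environ["REMOTE_ADDR"] = ".".join(str(o) for o in octets)
        return self.app(environ, start_response)
```

This only changes what the application sees. The real TCP peer stays the same, so network-level errors such as the Errno 113 at the top of the thread would not be reproduced this way.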