Chat on hookbox live on giantbomb.com

Honza Král

Aug 26, 2010, 18:31:54

to hoo...@googlegroups.com

it's just a prototype, stress testing now...

http://www.giantbomb.com/chat/

Honza Král
E-Mail: Honza...@gmail.com
Phone: +1-415-797-8453

Michael Carter

Aug 26, 2010, 21:31:48

to hoo...@googlegroups.com

Looks great.

Something I'll note though is that you've got presenceful enabled
which may be a bad idea if you've got a really large churn rate in a
channel with 500+ users. 90% of your bandwidth/cpu is going to be
spent sending subscribe/unsubscribe frames.
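
To make the cost concrete, here is a rough back-of-the-envelope sketch. It assumes that every join or leave in a presenceful channel is announced to every current subscriber; the load figures are hypothetical and were picked only to show how a 90% share can arise.

```python
def presence_frames_per_second(subscribers, churn_per_second):
    """Frames sent per second just to announce joins and leaves.

    Assumes each subscribe or unsubscribe event is broadcast to every
    current subscriber of a presenceful channel.
    """
    return subscribers * churn_per_second


def chat_frames_per_second(subscribers, messages_per_second):
    """Frames sent per second to deliver published chat messages."""
    return subscribers * messages_per_second


# Hypothetical load: 500 users, 9 joins/leaves and 1 chat message per second.
presence = presence_frames_per_second(500, 9)
chat = chat_frames_per_second(500, 1)
share = presence / (presence + chat)
print(presence, chat, round(share, 2))  # prints: 4500 500 0.9
```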

-Michael Carter

Honza Král

Aug 26, 2010, 22:54:34

to hoo...@googlegroups.com

Thanks!

we certainly ran into some performance issues (lags) when we had over
1k users in one channel, we will have to:

- turn off logging (or make it async)
- track presence ourselves (exactly the reason)
- look into scaling into more than one OS thread (maybe using some
message queue, either internal - Queue or like RabbitMQ)
- eliminate nginx as a proxy (no websockets)
- ... ?
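
As an illustration of the "make it async" option, here is a minimal sketch using the standard library's `QueueHandler` and `QueueListener` from modern Python. It is not how Hookbox configures its own logging; the logger name and the helper `make_async_logger` are made up for the example. The caller only pays for putting a record on a queue, and a background thread does the slow console write.

```python
import logging
import logging.handlers
import queue


def make_async_logger(name, target_handler):
    """Return a logger whose records are written by a background thread."""
    log_queue = queue.Queue(-1)  # unbounded queue between caller and writer
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid a second, synchronous write via the root logger
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener


logger, listener = make_async_logger("chat", logging.StreamHandler())
logger.info("connection lost for user %s", "example")
listener.stop()  # processes records still in the queue, then joins the thread
```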

but we are determined to make it work, this was just a test to see
what we can do with default and it wasn't bad at all.

Thanks for the tips and for hookbox in the first place...

btw I haven't forgotten about the REST interface, things have just
been fairly busy recently.

Honza Král
E-Mail: Honza...@gmail.com
Phone: +1-415-797-8453

Michael Carter

Aug 27, 2010, 03:04:24

to hoo...@googlegroups.com

Few quick questions so I can help you scale this as best I can.

On Thu, Aug 26, 2010 at 7:54 PM, Honza Král <honza...@gmail.com> wrote:

> Thanks!
>
> we certainly ran into some performance issues (lags) when we had over
> 1k users in one channel, we will have to:
>
> - turn off logging (or make it async)
> - track presence ourselves (exactly the reason)
> - look into scaling into more than one OS thread (maybe using some
> message queue, either internal - Queue or like RabbitMQ)
> - eliminate nginx as a proxy (no websockets)
> - ... ?

1) Do you know the cpu/memory usage of a) hookbox and b) the webhooks implementor (django?) during the test when you ran into lag issues?

2) In the webhook logic, were you saving all events to a database as they happened (connect, disconnect, subscribe, unsubscribe, publish) ?

3) Did you have anyone logged into the admin panel during the test; particularly, were you looking at either the webhooks or console log screen?

> but we are determined to make it work, this was just a test to see
> what we can do with default and it wasn't bad at all.
>
> Thanks for the tips and for hookbox in the first place...

Cool, looking forward to helping you get this working for your use case. Your target is 5k concurrent users, right?

I assume your test involved live users? I've had some ideas about testing scaling without live users. Let me know if you're still interested in that.

> btw I haven't forgotten about the REST interface, things have just
> been fairly busy recently.

No worries, the docs are already behind as it is (with the addition of private user -> user messaging). We gotta get back on top of docs as the first priority; second priority is more features.

Cheers,

Michael Carter

Honza Král

Aug 27, 2010, 11:14:32

to hoo...@googlegroups.com

Hi,
thanks for the tips, we found that the biggest boost was to disable
logging: we saw a lot of exceptions due to lost connections, and writing
them to the console took a long time.

I will keep this list posted on our progress, thanks for the support.

On Fri, Aug 27, 2010 at 12:04 AM, Michael Carter
<carter...@gmail.com> wrote:
> Few quick questions so I can help you scale this as best I can.
>
> On Thu, Aug 26, 2010 at 7:54 PM, Honza Král <honza...@gmail.com> wrote:
>>
>> Thanks!
>>
>>
>> we certainly ran into some performance issues (lags) when we had over
>> 1k users in one channel, we will have to:
>>
>> - turn off logging (or make it async)
>> - track presence ourselves (exactly the reason)
>> - look into scaling into more than one OS thread (maybe using some
>> message queue, either internal - Queue or like RabbitMQ)
>> - eliminate nginx as a proxy (no websockets)
>> - ... ?
>>
>
> 1) Do you know the cpu/memory usage of a) hookbox and b) the webhooks
> implementor (django?) during the test when you ran into lag issues?

django was fine, we had a couple of servers though (the normal
giantbomb frontend farm)

> 2) In the webhook logic, were you saving all events to a database as they
> happened (connect, disconnect, subscribe, unsubscribe, publish) ?

nothing of that sort, we only pulled the user from DB (by primary key)
to check if they are logged in

> 3) Did you have anyone logged into the admin panel during the test;
> particularly, were you looking at either the webhooks or console log screen?

no, not at all, the admin was disabled

>> but we are determined to make it work, this was just a test to see
>> what we can do with default and it wasn't bad at all.
>>
>> Thanks for the tips and for hookbox in the first place...
>
> Cool, looking forward to helping you get this working for your use case.
> Your target is 5k concurrent users, right?

that and more, we will see how that goes

> I assume your test involved live users? I've had some ideas about testing
> scaling without live users. Let me know if you're still interested in that.

yes, real users, we shouted out to our users and they helped us. Having
an automated way to stress test the system would indeed be awesome and
allow us to move much quicker.

>>
>> btw I haven't forgotten about the REST interface, things have just
>> been fairly busy recently.
>>
>
> No worries, The docs are already behind as it is (with the addition of
> private user -> user messaging.) We gotta get back on top of docs first
> priority; second priority is more features.

have a look at my branch on github at the user handling - we are
storing additional information with users so I changed hookbox to
store users as json objects, not just strings

>
>
> Cheers,
>
> Michael Carter
>

steve hermes

Aug 27, 2010, 11:53:45

to hoo...@googlegroups.com

Got this at the url

Honza Král

Aug 27, 2010, 11:57:01

to hoo...@googlegroups.com

Yes, we took the chat down for some time. It was a limited time demo to accompany our live event, we will bring it back once we iron out the details.

sorry for the confusion

Honza Král
E-Mail: Honza...@gmail.com
Phone: +1-415-797-8453

Honza Král

Sep 1, 2010, 20:50:53

to hoo...@googlegroups.com

And we are testing again on two sites this time:
http://www.giantbomb.com/chat/
and
http://www.screened.com/chat/

We have 10 hookbox instances working side by side, load balancing
courtesy of nginx, replication between them courtesy of webhooks and
django. So far we only have 400 users altogether and are seeing no lag and
no load on the boxes. The real test is going to be tomorrow at 4pm PDT
during our regular live broadcast (usually around 3k users watch and
chat online) on giantbomb.

The chat can go down anytime, if you are getting 404s, it's because we
took the chat down, it's still only a demo.

Honza Král
E-Mail: Honza...@gmail.com
Phone: +1-415-797-8453

On Fri, Aug 27, 2010 at 8:57 AM, Honza Král <honza...@gmail.com> wrote:
>
> Yes, we took the chat down for some time. It was a limited time demo to accompany our live event, we will bring it back once we iron out the details.
>
> sorry for the confusion
>
> Honza Král
> E-Mail: Honza...@gmail.com
> Phone: +1-415-797-8453
>
>
> On Fri, Aug 27, 2010 at 8:53 AM, steve hermes <steve...@gmail.com> wrote:
>>
>> Got this at the url
>>
>>

marie_dk

Sep 3, 2010, 03:20:26

to Hookbox User Group

On 2 Sep., 02:50, Honza Král <honza.k...@gmail.com> wrote:
> And we are testing again on two sites this time:
> http://www.giantbomb.com/chat/
> and
> http://www.screened.com/chat/
>
> We have 10 hookbox instances working side by side, load balancing
> courtesy of nginx, replication between them courtesy of webhooks and
> django. So far we only have 400 users altogether and seeing no lag and
> no load on the boxes. Real test is going to be tomorrow at 4pm PDT
> during our regular live broadcast (usually around 3k users watch and
> chat online) on giantbomb.

Could you tell a bit more about how replication works between hookbox
servers?

Do you have an estimate of how many users one hookbox server can
handle?

/marie_dk

Honza Král

Sep 3, 2010, 03:33:34

to hoo...@googlegroups.com

On Fri, Sep 3, 2010 at 12:20 AM, marie_dk <derr...@gmail.com> wrote:
> On 2 Sep., 02:50, Honza Král <honza.k...@gmail.com> wrote:
>> And we are testing again on two sites this time:
>> http://www.giantbomb.com/chat/
>> and
>> http://www.screened.com/chat/
>>
>> We have 10 hookbox instances working side by side, load balancing
>> courtesy of nginx, replication between them courtesy of webhooks and
>> django. So far we only have 400 users altogether and seeing no lag and
>> no load on the boxes. Real test is going to be tomorrow at 4pm PDT
>> during our regular live broadcast (usually around 3k users watch and
>> chat online) on giantbomb.
>
> Could you tell a bit more about how replication works between hookbox
> servers?

hookbox doesn't have replication, we basically hacked around it: sticky
sessions on a load balancer in front of the hookboxes, and when a user
publishes, we send the message to all other hookboxes in the pool using a
REST call (AKA a very quick and very dirty replication solution).
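
A minimal sketch of what such a fan-out could look like on the web app side, assuming it runs wherever the publish webhook is handled. The host names, the port, the `/web/publish` path and the form field names are all assumptions for illustration and should be checked against the web API of the Hookbox version in use; `post_publish` and `replicate` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical pool of Hookbox instances; hosts and port are placeholders.
POOL = ["http://hookbox1:8001", "http://hookbox2:8001", "http://hookbox3:8001"]


def post_publish(base_url, channel, payload, token):
    """POST one message to one instance.

    The path and the form field names are assumptions, not confirmed API.
    """
    data = urllib.parse.urlencode({
        "security_token": token,
        "channel_name": channel,
        "payload": json.dumps(payload),
    }).encode("utf-8")
    with urllib.request.urlopen(base_url + "/web/publish", data, timeout=2) as resp:
        return resp.read()


def replicate(origin, channel, payload, token, pool=POOL, send=post_publish):
    """Forward a message to every instance except the one that received it."""
    failed = []
    for base_url in pool:
        if base_url == origin:
            continue  # the origin already delivers to its own subscribers
        try:
            send(base_url, channel, payload, token)
        except OSError:  # urllib's URLError and HTTPError are OSError subclasses
            failed.append(base_url)  # keep going so one dead peer does not block the rest
    return failed
```

The `send` parameter is injectable so the loop can be exercised without a network, and the function returns the peers that could not be reached so the caller can decide whether to retry.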

> Do you have an estimate of how many users one hookbox server can
> handle?

really depends on the traffic. Basically for every message hookbox must send
NUMBER_OF_USERS messages. On the first try we were able to handle
approximately 1k users in one chatroom before we started to accumulate lag.

With five hookboxes for this chatroom and some optimizations
(presenceful=False, turn off logging) we have been able to handle 3k active
users without any lag, some of the servers were on 100% cpu but not constantly,
occasionally it dropped.
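
The arithmetic behind those numbers can be sketched as follows. The message rate is hypothetical, and the model assumes users are spread evenly over the instances and that every instance receives every published message.

```python
def deliveries_per_instance(total_users, instances, messages_per_second):
    """Outbound frames per second on each instance.

    Assumes an even spread of users and that every instance sees every
    published message, so each one sends it to its own share of the users.
    """
    users_per_instance = -(-total_users // instances)  # ceiling division
    return users_per_instance * messages_per_second


rate = 5  # hypothetical chat messages per second in the room
single = deliveries_per_instance(1000, 1, rate)  # first test: one instance, 1k users
pooled = deliveries_per_instance(3000, 5, rate)  # later test: five instances, 3k users
print(single, pooled)  # prints: 5000 3000
```

Under these assumptions each of the five instances sends fewer frames for 3k users than the single instance did for 1k, which is consistent with the lag disappearing.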

We will be looking into a more robust solution that would enable us to
dynamically add hookbox instances to our pool and do smarter load balancing.
But currently we have other priorities so I can't tell you when this will
happen :(.

> /marie_dk

marie_dk

Sep 3, 2010, 04:09:53

to Hookbox User Group

On 3 Sep., 09:33, Honza Král <honza.k...@gmail.com> wrote:

> On Fri, Sep 3, 2010 at 12:20 AM, marie_dk <derri...@gmail.com> wrote:
> > Could you tell a bit more about how replication works between hookbox
> > servers?
>
> hookbox doesn't have replication, we basically hacked around it - sticky
> session on a load balancer in front of hookboxes and when a user publishes, we
> send the message to all other hookboxes in the pool using REST call. (AKA very
> quick and very dirty replication solution)

Aaah, yes... of course. Well, who cares if it's a bit dirty, as long as
it works! :-)

>
> > Do you have an estimate of how many users one hookbox server can
> > handle?
>
> really depends on the traffic. Basically for every message hookbox must send
> NUMBER_OF_USERS messages. On the first try we have been able to handle
> approximately 1k users in one chatroom when we started to accumulate lag.
>
> With five hookboxes for this chatroom and some optimizations
> (presenceful=False, turn off logging) we have been able to handle 3k active
> users without any lag, some of the servers were on 100% cpu but not constantly,
> occasionally it dropped.

That is good enough for me. I have 4 servers available and max 1500
users online.

> We will be looking into more robust solution that would enable us to
> dynamically add hookbox instances to our pool and doing smarter load balancing.
> But currently we have other priorities so I can't tell you when this will
> happen :(.

Thank you very much for sharing :-)

/marie_dk

Salman Haq

Sep 3, 2010, 10:17:53

to hoo...@googlegroups.com

> approximately 1k users in one chatroom when we started to accumulate lag.
>
> With five hookboxes for this chatroom and some optimizations
> (presenceful=False, turn off logging) we have been able to handle 3k active
> users without any lag, some of the servers were on 100% cpu but not constantly,
> occasionally it dropped.
>
>

I can't help but wonder: if, instead of using HTTP hooks for
inter-process communication, Hookbox used web sockets (or even plain
tcp/udp) for IPC, would we get a performance boost?

Shaq

Michael Carter

Sep 3, 2010, 14:29:39

to hoo...@googlegroups.com

I think the idea behind this insight is a good one: Right now hookbox opens a *new socket* for every http callback to the web app. Simply switching to restkit and creating a pool of http clients will significantly cut down on overhead.
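
The pooling idea can be sketched with the standard library alone. This uses `http.client` from modern Python instead of restkit, purely to show the technique; the class `ConnectionPool`, the helper `call_webhook`, and the host and port are hypothetical.

```python
import http.client
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Keep a fixed number of connections open and hand them out for reuse."""

    def __init__(self, factory, size=4):
        self._idle = queue.LifoQueue()
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks if all connections are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it


# Hypothetical address of the web app that implements the webhooks.
pool = ConnectionPool(lambda: http.client.HTTPConnection("127.0.0.1", 8000, timeout=2))


def call_webhook(path, body):
    """POST a callback over a pooled, keep-alive connection."""
    with pool.connection() as conn:
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "application/x-www-form-urlencoded"})
        response = conn.getresponse()
        return response.status, response.read()  # read fully so the socket can be reused
```

A production pool would also need to discard connections that failed mid-request; that is left out here to keep the sketch short.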

Likewise, I imagine many clients are just using urllib or their language's equivalent. We should probably work together on building hookbox client libraries; at the very least we should build a good python client library.

-Michael Carter

Salman Haq

Sep 9, 2010, 18:03:39

to hoo...@googlegroups.com

Michael,

Did you have a general idea about the Python library, so we can start prototyping something? I haven't used restkit, but I like the idea of it.

Also, I know there are several developers contributing to Hookbox, and the programmer's documentation and code examples have already fallen behind. Or maybe I'm out of the loop because of my absence on the IRC channel. But is there something we can do (wiki page?) to outline the ongoing development efforts?

Thanks,
Salman

Salman Haq

Sep 9, 2010, 18:30:44

to hoo...@googlegroups.com

On 09/09/2010 06:03 PM, Salman Haq wrote:
> On 09/03/2010 02:29 PM, Michael Carter wrote:
>> On Fri, Sep 3, 2010 at 7:17 AM, Salman Haq <salma...@asti-usa.com> wrote:
>>>> approximately 1k users in one chatroom when we started to accumulate lag.
>>>>
>>>> With five hookboxes for this chatroom and some optimizations
>>>> (presenceful=False, turn off logging) we have been able to handle 3k active
>>>> users without any lag, some of the servers were on 100% cpu but not constantly,
>>>> occasionally it dropped.
>>>
>>> I can't help but wonder, if instead of using HTTP hooks for inter-process-communication,
>>> Hookbox used web sockets (or even plain tcp/udp) for IPC, would we get a performance boost?
>>
>> I think the idea behind this insight is a good one: Right now hookbox opens a *new socket* for every http callback to the web app. Simply switching to restkit and creating a pool of http clients will significantly cut down on overhead.
>>
>> Likewise, I imagine many clients are just using urllib or their language's equivalent. We should probably work together on building hookbox client libraries; at the very least we should build a good python client library.
>
> Michael,
>
> Did you have a general idea about the Python library so we can start prototyping something. I haven't used restkit, but I like the idea of it.

Clarification: By "Python library" I was referring to a "Python client library for Hookbox".

Ziga Ham

Sep 9, 2010, 19:03:59

to hoo...@googlegroups.com

I'm just about to make a python client using http://github.com/mtah/python-websocket and some json parser.

(I need it for some different purposes)

--
Best regards,
Žiga Ham

Michael Carter

Sep 10, 2010, 02:06:59

to hoo...@googlegroups.com

On Thu, Sep 9, 2010 at 3:03 PM, Salman Haq <salma...@asti-usa.com> wrote:
> On 09/03/2010 02:29 PM, Michael Carter wrote:
>> On Fri, Sep 3, 2010 at 7:17 AM, Salman Haq <salma...@asti-usa.com> wrote:
>>>> approximately 1k users in one chatroom when we started to accumulate lag.
>>>>
>>>> With five hookboxes for this chatroom and some optimizations
>>>> (presenceful=False, turn off logging) we have been able to handle 3k active
>>>> users without any lag, some of the servers were on 100% cpu but not constantly,
>>>> occasionally it dropped.
>>>
>>> I can't help but wonder, if instead of using HTTP hooks for inter-process-communication,
>>> Hookbox used web sockets (or even plain tcp/udp) for IPC, would we get a performance boost?
>>
>> I think the idea behind this insight is a good one: Right now hookbox opens a *new socket* for every http callback to the web app. Simply switching to restkit and creating a pool of http clients will significantly cut down on overhead.
>>
>> Likewise, I imagine many clients are just using urllib or their language's equivalent. We should probably work together on building hookbox client libraries; at the very least we should build a good python client library.
>
> Michael,
>
> Did you have a general idea about the Python library so we can start prototyping something. I haven't used restkit, but I like the idea of it.

I basically envision a class, called HookboxAPI, lets say, which exposes a method for each of the web api calls. At a glance:

import hookboxapi
from hookboxapi import HookboxAPI

hookbox = HookboxAPI(security_token='secret')

# create the channel foo
hookbox.create_channel(name='foo', moderated=False, presenceful=True)

try:
    hookbox.subscribe('mcarter', 'foo')  # subscribes mcarter to foo
except hookboxapi.InvalidUser:
    print("mcarter must not be signed on to hookbox now...")
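
One way such a class might be structured is sketched below. This is not an existing library: the `transport` argument, the `(ok, details)` return convention, the parameter names sent to the server and the `"invalid_user"` error code are all assumptions made for the example.

```python
class HookboxError(Exception):
    """Raised when the server reports a failed call."""


class InvalidUser(HookboxError):
    """Raised when the named user is not connected."""


class HookboxAPI:
    def __init__(self, security_token, transport):
        # transport(method_name, params) -> (ok, details). It is supplied by
        # the caller, e.g. a function that POSTs to the Hookbox web API.
        self._token = security_token
        self._transport = transport

    def _call(self, method, **params):
        """Send one web API call and turn a failure into an exception."""
        params["security_token"] = self._token
        ok, details = self._transport(method, params)
        if not ok:
            # Hypothetical error mapping: how the server signals an unknown
            # user is an assumption.
            if details.get("code") == "invalid_user":
                raise InvalidUser(details.get("msg", ""))
            raise HookboxError(details.get("msg", ""))
        return details

    def create_channel(self, name, **options):
        return self._call("create_channel", channel_name=name, **options)

    def subscribe(self, user, channel_name):
        return self._call("subscribe", name=user, channel_name=channel_name)
```

Routing every method through one `_call` helper keeps the token handling and error mapping in a single place, so adding a method per web API call stays a one-liner.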

> Also, I know there are several developers contributing to Hookbox and the programmer's documentation and code examples have already fallen behind. Or maybe I'm out of the loop because of my absence on the IRC channel. But is there something we can do (wiki page?) to outline the ongoing development efforts?

Yeah, you're right. I think the real problem is that we haven't scheduled a release and so the documentation is falling behind and features are being added sort of as people implement them.

Let's schedule a release 0.4.0 for September 30th. I'll go ahead and update the documentation and do my best to create a changelog.

At that point I can lay out a description of the enhancements that are most obviously needed so we can have some kind of roadmap.

Also, if you are ever wondering about something, just ask here, and don't be shy about updating the documentation and sending a pull request.

-Michael Carter

marie_dk

Sep 10, 2010, 02:07:56

to Hookbox User Group

On 10 Sep., 00:03, Salman Haq <salman....@asti-usa.com> wrote:
> Also, I know there are several developers contributing to Hookbox and
> the programmer's documentation and code examples have already fallen
> behind. Or maybe I'm out of the loop because of my absence on the IRC
> channel. But is there something we can do (wiki page?) to outline the
> ongoing development efforts?

I vote for that wiki page... I can't contribute code, but I would
be more than happy to help keep the Hookbox docs up to date.

/marie_dk
