Severe scalability problems - getting HTTP 500 server errors under heavy load.

Anonymous Coderrr

Apr 15, 2009, 2:07:15 AM
to Google App Engine
I have a fairly simple app - it looks up a couple of objects from the
Google datastore and then creates a page from a Django template - pure
vanilla.

I wanted to see how my app would perform under heavy load, so I set up
a simulation where 500 virtual web-browsers would attempt to request
my page twice - exactly at the same time.
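For anyone wanting to reproduce this, a burst test of that shape might look roughly like the sketch below. It is not the script used here: the names `fetch_status` and `run_burst` are made up for illustration, and it assumes plain Python 3 threads are a good enough stand-in for "virtual web-browsers".

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url, timeout=30):
    """Return the HTTP status code for one GET request (0 on a network failure)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered with an error status, e.g. 500
    except (urllib.error.URLError, OSError):
        return 0  # no HTTP answer at all: refused connection, timeout, DNS failure


def run_burst(fetch, url, clients=500, requests_per_client=2):
    """Issue clients * requests_per_client requests with up to `clients` in flight.

    `fetch` is any callable taking a URL and returning a status code, so the
    harness can be exercised with a stub instead of a live server.
    """
    total = clients * requests_per_client
    with ThreadPoolExecutor(max_workers=clients) as pool:
        codes = list(pool.map(fetch, [url] * total))
    return Counter(codes)  # maps status code -> number of responses
```

Calling `run_burst(fetch_status, url)` against your own app's URL returns a tally of status codes, which makes the share of 500s easy to read off. Threads only approximate "exactly at the same time"; the requests start within a short window rather than at one instant.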


The results were dismal!! Nearly 50% of the requests resulted in a
"HTTP response code: 500" from GAE - not my application, but
apparently GAE itself.

I checked my dashboard logs - no errors from my application. No errors
anywhere I could find.

What can I do? I'm not expecting 500 requests per second, but
certainly maybe 100. The best rate I can get according to the
dashboard is about 4.5 requests per second. The dev server running on
my laptop does better than that!!!

Thanks

Anonymous Coderrr

Apr 15, 2009, 2:17:35 AM
to Google App Engine
Additionally, under this load, the time it takes to service a request
grows from 2 seconds per request (no load) to 15 seconds per request
(high load).

T.J. Crowder

Apr 15, 2009, 3:03:18 AM
to Google App Engine
Hi,

Since this is just a test app, can you post the code (perhaps to Pastie
[1], for syntax coloring and the like) and your mechanism for storing
the data? That might help people help you figure out why it isn't
performing as you would hope.

[1] http://pastie.org

FWIW,
--
T.J. Crowder
tj / crowder software / com
Independent Software Engineer, consulting services available

T.J. Crowder

Apr 15, 2009, 4:03:25 AM
to Google App Engine
Hi again,

> ...mechanism for storing the data...

Wow, I found a convoluted way to say that! ;-) Let me try again: And
your test data. E.g., the things people would need to replicate the
result.

Sorry for the poor phrasing...

-- T.J.

Barry Hunter

Apr 15, 2009, 5:19:25 AM
to google-a...@googlegroups.com
One thing that has become apparent is that App Engine is designed to
scale under real-world usage.

So if your app went from 0 or 1 users to 500 in a matter of seconds,
the system won't work well. You need to ramp up the usage slowly.

Even a slashdotting would ramp up the usage rather than hit all at once.

Also, 500 users coming from one source might look a bit suspicious, and
App Engine could be wary of a DoS attack.
--
Barry

- www.nearby.org.uk - www.geograph.org.uk -

Alex

Apr 15, 2009, 12:51:04 PM
to Google App Engine
I would echo Barry's point -- my guess is that, if all of these
requests came from the same IP in a matter of seconds, GAE was using a
security measure to turn away the connections. They probably also had a
similar signature, coming from the same script and requesting the same
information from the same place -- classic DoS.

Alex Foley

Anonymous Coderrr

Apr 15, 2009, 4:33:54 PM
to Google App Engine
Good points.

I rewrote the test so it fires off 20 requests from 20 distinct IP
addresses simultaneously, once a second, 10 times.

In effect 20 concurrent requests, once a second.

I had about 15% loss, and the request-time degradation was still there
(2 seconds to fulfill a request on an idle system, 15 seconds under
load).
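Figures like these can be computed from the raw results in a few lines. The sketch below is only an illustration: `timed` and `summarize` are hypothetical helper names, "loss" is assumed to mean any response other than HTTP 200, and `fetch` stands for whatever function your test uses to issue one request and return its status code.

```python
import time


def timed(fetch, url):
    """Call fetch(url) and return (status_code, elapsed_seconds)."""
    start = time.monotonic()
    code = fetch(url)
    return code, time.monotonic() - start


def summarize(results):
    """Summarize a list of (status_code, elapsed_seconds) pairs.

    Returns (loss_fraction, mean_seconds_for_successes). The mean is None
    when no request succeeded, and loss is 0.0 for an empty list.
    """
    if not results:
        return 0.0, None
    ok_times = [secs for code, secs in results if code == 200]
    loss = 1 - len(ok_times) / len(results)
    mean_ok = sum(ok_times) / len(ok_times) if ok_times else None
    return loss, mean_ok
```

Running `summarize` once on an idle-system run and once on a loaded run gives the two numbers to compare: the fraction of requests lost and the mean time per successful request.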

This is still nowhere near the advertised load rates.



boson

Apr 15, 2009, 8:54:28 PM
to Google App Engine
You need to ramp up your tests over many minutes to allow GAE to spawn
enough instances to handle the traffic. I don't know their exact
algorithm, but I know it takes time to scale up.

Anonymous Coderrr

Apr 16, 2009, 1:08:08 PM
to Google App Engine
Hi,

I changed my test so it sends a batch of requests every 2 seconds.
The batch size starts out as 1 and increases by 1 every x seconds.
On my last run I had it increase by 1 every 100 seconds. This ramped
things up slowly.
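The ramp described above can be written down as a small schedule, which makes it easier to see how long a run takes to reach a given rate. This is a sketch with hypothetical names (`batch_size_at`, `ramp_schedule`, `seconds_to_reach`), and it assumes the request rate is simply batch size divided by the 2-second interval; the dashboard may count requests differently.

```python
import math


def batch_size_at(elapsed_seconds, step_seconds=100):
    """Batch size at a given time: starts at 1, grows by 1 every step_seconds."""
    return 1 + int(elapsed_seconds // step_seconds)


def ramp_schedule(duration_seconds, batch_interval=2, step_seconds=100):
    """Yield (send_time, batch_size) pairs, one batch every batch_interval seconds."""
    t = 0
    while t < duration_seconds:
        yield t, batch_size_at(t, step_seconds)
        t += batch_interval


def seconds_to_reach(rate_per_second, batch_interval=2, step_seconds=100):
    """Elapsed time before the ramp first sends at rate_per_second.

    Assumes rate = batch_size / batch_interval, which is a simplification.
    """
    target_batch = math.ceil(rate_per_second * batch_interval)
    return (target_batch - 1) * step_seconds
```

Under that assumption, 16 requests per second corresponds to a batch of 32, which this schedule reaches after 3100 seconds with a 100-second step, so a full run to that rate takes a little under an hour.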

I still started getting errors around 16-20 requests per second. My
dashboard says I don't get anything above 20 requests per second or
so.

Am I still ramping things up too fast? I'll try an increase of 1
request every 200 seconds a bit later.

Thanks