wsgiserver and keep-alive timeouts
From: JeffB <jeff.bart...@gmail.com>
Date: Thu, 30 Jul 2009 10:50:40 -0700 (PDT)
Local: Thurs, Jul 30 2009 1:50 pm
Subject: wsgiserver and keep-alive timeouts
This post builds upon what was noted here:

http://groups.google.com/group/cherrypy-devel/browse_thread/thread/b3...

but that was nearly a year ago and I thought I might offer up some
fresh thoughts.

In short, I am seeing the same issues mentioned in the above post,
where the WSGI server goes into a "blocked" state if the number of
concurrent connections exceeds the size of the thread pool.  This
leads to 10 second (or longer) delays on the browser-side.  I have a
very short example if one is interested but this behavior can be
easily duplicated in 3.1.2 simply by reducing the "server.thread_pool"
value down to 2 (or even better, 1) and attempting to load a page with
a lot of static content.

Again, as stated in the aforementioned post, the crux of the issue is
how the WSGI server deals with keep-alive connections.

Having looked over the code it appears that the problem comes down to
the use of blocking sockets.  By blocking on a recv, the calling
thread is tied up until data arrives or a timeout occurs.  On a keep-
alive connection, once all pipelined requests have been received and
processed, it's unlikely that any further data will arrive.  However,
we can't know this for sure so we have to try to keep reading - for
how long seems to be an arbitrary decision - the current value is 10
seconds (modified by "server.socket_timeout").
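
To make the blocking behaviour concrete, here is a minimal sketch using only the standard library. It is not taken from wsgiserver; the helper name blocked_seconds is made up for illustration, and socket.socketpair() stands in for an idle keep-alive connection whose peer never sends anything.

```python
import socket
import time


def blocked_seconds(timeout):
    """Return how long recv() ties up the caller on an idle connection."""
    # The peer is connected but silent, like a browser holding a
    # keep-alive connection open after its last request.
    server_side, client_side = socket.socketpair()
    try:
        server_side.settimeout(timeout)
        start = time.monotonic()
        try:
            server_side.recv(4096)
        except socket.timeout:
            pass  # no data arrived; the calling thread was stuck until now
        return time.monotonic() - start
    finally:
        server_side.close()
        client_side.close()


if __name__ == "__main__":
    # 0.5 s instead of the 10 s default so the demo finishes quickly.
    print("recv() blocked for %.2f s" % blocked_seconds(0.5))
```

With the 10 second value mentioned above, the same call would hold its thread for roughly 10 seconds per idle connection.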

This architecture fundamentally limits the number of concurrent
connections that can be processed in a reasonable amount of time to,
at most, the number of threads in the thread pool.  One can increase
the number of threads but this is not efficient, for various reasons.
One could also reduce the timeout value but on a slow connection, this
could lead to dropped requests and other ugliness.
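
As a rough worked example of that limit, the sketch below estimates how long the last connection waits for a free thread. It assumes a deliberately simplified model that is not measured from CherryPy: every connection ahead in the queue holds its thread for one full timeout, and the function name worst_case_wait is hypothetical.

```python
import math


def worst_case_wait(connections, threads, timeout):
    """Seconds the last connection waits before any thread picks it up.

    Simplified model: connections are served in batches of `threads`,
    and each batch holds its threads for one full `timeout`.
    """
    batches_ahead = max(0, math.ceil(connections / threads) - 1)
    return batches_ahead * timeout


if __name__ == "__main__":
    # A browser opening 6 connections against a pool of 2 threads.
    print(worst_case_wait(connections=6, threads=2, timeout=10))    # 20
    # One connection more than a pool of 10 threads can hold.
    print(worst_case_wait(connections=11, threads=10, timeout=10))  # 10
```

Under this model the delay is zero as long as the connections fit in the pool, and it grows by one full timeout for every additional batch.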

I suppose one could argue that if performance is important, use Apache
+ the CherryPy application framework and be done with it.  However, I
happen to like the compactness and simplicity of the all-in-one
solution CherryPy provides out-of-the-box - it's what drew me to it in
the first place.  So, I want something that will scale reasonably well
(beyond 10 concurrent users, which is where the system would choke in
the default config).

To fix this, I think two fundamental changes to the WSGI server are
required:

1. Use non-blocking sockets.  This will allow threads to return
immediately if there is no pending data to be read or if data cannot
be immediately sent, thereby potentially improving thread
utilization.  However, this requires a second change...

2. The use of non-blocking sockets fundamentally changes the flow of
certain parts of the server logic.  Specifically, there is no longer a
guarantee that any data will arrive when recv() is called.  In fact,
it may be tens of milliseconds (or longer) before something useful
arrives.  We don't want the thread just sitting there idle while it
waits.  Instead, it should relinquish control of the connection and
come back to it some time later, moving on to other pending
connections.  However, this means each connection needs to keep track
of its state so that when data finally does arrive, it can continue
where it left off.  This can be accomplished with a state machine
inside of the HTTPRequest class.
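
The two changes above could look something like the following sketch. It is only an illustration of the idea, not the actual wsgiserver code: the RequestState class and its state names are hypothetical, and it only tracks whether the request headers have fully arrived.

```python
import select
import socket

READING_HEADERS, READY, CLOSED = "reading_headers", "ready", "closed"


class RequestState:
    """Hypothetical per-connection state; not a wsgiserver class."""

    def __init__(self, sock):
        sock.setblocking(False)  # recv() now returns at once or raises
        self.sock = sock
        self.buffer = b""
        self.state = READING_HEADERS

    def pump(self):
        """Consume whatever has arrived and return the current state.

        Never blocks: if nothing is pending, the state is unchanged and
        the calling thread is free to service another connection.
        """
        if self.state != READING_HEADERS:
            return self.state
        try:
            chunk = self.sock.recv(4096)
        except BlockingIOError:
            return self.state  # no data yet; pick up here next time
        if not chunk:
            self.state = CLOSED  # peer closed the connection
            return self.state
        self.buffer += chunk
        if b"\r\n\r\n" in self.buffer:
            self.state = READY  # full header block received
        return self.state


if __name__ == "__main__":
    client, server = socket.socketpair()
    conn = RequestState(server)
    print(conn.pump())  # nothing sent yet, returns immediately

    client.sendall(b"GET / HTTP/1.1\r\n")
    select.select([server], [], [], 1.0)  # wait until data is readable
    print(conn.pump())  # partial request, still reading headers

    client.sendall(b"Host: example\r\n\r\n")
    select.select([server], [], [], 1.0)
    print(conn.pump())  # headers complete
```

In a real server, a single select() or poll() call over all open connections would tell the worker threads which connections are worth pumping, so no thread ever waits on an idle keep-alive socket.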

The WSGI server is already pretty compact.  While the changes above
are intrusive, they can easily be implemented without a full rewrite
of the file.  The majority of the changes would be to three classes:
HTTPConnection, HTTPRequest and CP_fileobject.

I would be interested in hearing the thoughts of the CP team
regarding this problem.  I'm sure there are issues that I have not
considered in my analysis.

Thanks,

Jeff

Related tickets (there are probably others):

http://www.cherrypy.org/ticket/764
http://cherrypy.org/ticket/539 (btw, I don't think this is a good idea...)


From: Christian Wyglendowski <christ...@dowski.com>
Date: Thu, 30 Jul 2009 15:15:33 -0400
Local: Thurs, Jul 30 2009 3:15 pm
Subject: Re: [cherrypy-devel] wsgiserver and keep-alive timeouts

On Thu, Jul 30, 2009 at 1:50 PM, JeffB<jeff.bart...@gmail.com> wrote:
> This architecture fundamentally limits the number of concurrent
> connections that can be processed in a reasonable amount of time to,
> at most, the number of threads in the thread pool.  One can increase
> the number of threads but this is not efficient, for various reasons.

I'm interested to hear why you think this isn't efficient.  I know
there are limits and async servers can handle many more concurrent
connections, but there are tradeoffs when you go that route.

How many concurrent connections are you looking to maintain?  I'd
suggest bumping up the # of threads and doing some load testing to see
if there are real issues (maybe you already have?).  I think that
might make for an interesting analysis.

Christian
http://www.dowski.com


From: JeffB <jeff.bart...@gmail.com>
Date: Thu, 30 Jul 2009 13:10:20 -0700 (PDT)
Local: Thurs, Jul 30 2009 4:10 pm
Subject: Re: wsgiserver and keep-alive timeouts

On Jul 30, 12:15 pm, Christian Wyglendowski <christ...@dowski.com>
wrote:

> I'm interested to hear why you think this isn't efficient.  I know
> there are limits and async servers can handle many more concurrent
> connections, but there are tradeoffs when you go that route.

There are tradeoffs in both cases, to be sure, but I believe that, in
general, most servers out there (web or otherwise) that expect large
numbers of concurrent connections (>100, say) would not use a
one-thread-per-connection model.  The overhead of managing that many
threads becomes onerous and the CPU starts spending more time
task-switching than doing real work.  I mean, it's small, I don't want to
overstate the impact, but like all things, it adds up over time and
ultimately leads to a system that is less efficient than it could be.

For my work, 20 threads would probably be more than adequate for the
expected load - this is a simple in-house app, nothing fancy.  I could
just bump up the thread count and move on.

I guess it's a matter of principle.  The idea of a thread sitting
blocked for 10 seconds waiting on data that is never going to arrive,
tying up a critical resource, just seems wrong.

Jeff

