HTTP Thread Usage Saturation with Persistent Connections

mbendiksen

Aug 30, 2008, 2:04:12 PM
to cherrypy-devel
I am seeing instances of poor performance (hangs up to the default 10
second socket timeout) when the HTTP server threads all have active
persistent connections open. I believe this case can happen more
frequently than anticipated. I added debug logging to wsgiserver to
see what was occurring in these situations. Below are my analysis and
suggestions. Note I just dove into this issue yesterday and I'm not an
HTTP expert. So feel free to correct / critique my analysis and
results. :-)

Let's walk through an example. Using Safari or Firefox, I refresh a
CherryPy page containing multiple embedded image URLs. As an
optimization, browsers by default perform several HTTP requests
concurrently (in Firefox the limit is the
network.http.max-persistent-connections-per-server setting). If
server.thread_pool is less than 4 (to definitely see it happen, set it
to 1), the HTTP requests quickly saturate the available threads, and
the overflow is pushed onto the ThreadPool's _queue. This in itself
sounds okay, but the problem is that the browsers (Safari and Firefox
for sure, not sure about IE) leave these HTTP/1.1 connections open
even after the server has fulfilled all of the active requests on a
given connection. At this point all of our request threads are sitting
idle on rfile.readline(), yet there are still pending requests in the
ThreadPool's _queue that have not been assigned to a thread. All the
threads are saturated, but they are just blocked and not doing any
useful work, even though there are other requests waiting to be
fulfilled. The page load appears hung in the browser until the socket
finally times out (10 seconds by default) and the thread grabs the
next request from the _queue. The resulting appearance is that of a
server which is very overloaded, even though the threads are blocked
and not consuming any CPU cycles.
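
Here is the kind of page I'm talking about, boiled down to a repro
sketch (CherryPy 3.x assumed; the handlers and the dummy image
response are made up for illustration, not my actual app):

    # repro.py -- minimal sketch of the scenario above.
    import cherrypy

    class Root(object):
        @cherrypy.expose
        def index(self):
            # One page refresh makes the browser fetch these
            # sub-resources over several concurrent persistent
            # connections.
            return "<html><body>%s</body></html>" % "".join(
                '<img src="/img?n=%d" />' % n for n in range(8))

        @cherrypy.expose
        def img(self, n="0"):
            # The image content doesn't matter for the hang; the
            # extra concurrent requests do.
            cherrypy.response.headers["Content-Type"] = "image/gif"
            return ""

    cherrypy.config.update({
        "server.thread_pool": 1,      # 1 makes the hang easy to see
        "server.socket_timeout": 10,  # the default 10-second timeout
    })
    cherrypy.quickstart(Root())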

I do not believe the answer to this thread saturation is simply to
increase the thread count. In addition to seeing this problem with a
single browser page refresh (where the page contains multiple embedded
images and thread_pool is set to a low value), I think it will also
occur in other scenarios with higher thread_pool values. For example,
I have another page that uses AJAX to make a simple HTTP request to
CherryPy every few seconds. In these cases neither the browser nor
CherryPy closes the HTTP connection, because it is kept alive by the
requests that occur at intervals less than 10 seconds apart. This
means that thread saturation will occur once the number of connected
AJAX clients equals thread_pool. These connections are idle a majority
of the time, with each thread only fulfilling a single AJAX request
every few seconds. But once the number of connected browsers equals
the thread_pool count, further connections will be queued but not
handled, even though the load per thread is extremely low. Example: an
AJAX application that polls the server every 8 seconds will saturate
30 threads with 30 connected browser clients even though it is only
handling 30 / 8 = 3.75 requests per second.

CherryPy currently makes no attempt to control how long these
persistent connections stay open. While increasing thread_pool to a
much higher value is an option, it still seems suboptimal to have
requests sit on the _queue unfulfilled while a majority of the threads
are blocked on rfile.readline() doing no work. Additionally, there may
be instances where the developer wants thread_pool set to 1 (if their
handlers aren't thread safe). With thread_pool set to 1, I believe one
is almost certain to run into this problem with any HTTP/1.1 browser.

Another possible solution is dynamically creating more worker threads
as discussed here:

http://cherrypy.org/ticket/539

However, this isn't fully implemented yet, and even once it is, the
same problem (requests waiting in the _queue while the threads sit
blocked) can occur once the thread pool hits thread_pool_max. And
again, if the developer only wants a single thread, then this isn't an
option.

One possible solution is to not allow persistent HTTP connections when
the thread pool is near saturation. To do this I added a new environ
key to tick():

environ["PREVENT_PERSISTENT_CONN"] =
(self.requests._queue.qsize() > self.requests.idle - 2)

And then in send_headers() force the server to return a connection
close in this case:

    self.close_connection = self.environ["PREVENT_PERSISTENT_CONN"]

Having done that CherryPy will allow persistent connections to stay
open until it gets down to its last idle thread. That thread then does
not allow the connection to stay open, which frees up the thread to
get the next request from the _queue. The result is the 10 second
timeout deadlock never occurs. This works when thread_pool is any
value, including 1. If it is 1, then CherryPy never allows the
connection to stay open, which is what is required to prevent the 10
second delays with browsers that make concurrent HTTP requests.
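
To make the condition easier to reason about, here is the same check
as a standalone sketch (the real inputs would be
self.requests._queue.qsize() and self.requests.idle):

    # Standalone restatement of the keep-alive decision in the
    # two-line fix above.
    def prevent_persistent_conn(queued, idle, reserve=2):
        """True when the pool is near enough to saturation that the
        response should go out with the connection marked for closing."""
        return queued > idle - reserve

    # With thread_pool = 1 the connection is never kept open; with a
    # larger pool it is only closed as the last idle thread is about
    # to be consumed.
    assert prevent_persistent_conn(queued=0, idle=1) is True
    assert prevent_persistent_conn(queued=0, idle=10) is False
    assert prevent_persistent_conn(queued=3, idle=2) is True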

I'm not sure how bulletproof the 2 line fix above is. It might be
possible for a thread synchronization issue where requests.idle
doesn't report the value we exactly need? I'm not sure, but based on
my limited testing it appears to work pretty well.

Another possible solution would be to somehow have _parse_request()
not block (instead throw a timeout error immediately) in cases where
all the threads are busy and there is a pending elem in the ThreadPool
_queue. Not sure exactly how one would go about implementing that. :-)

I think the description on ticket #764
(http://www.cherrypy.org/ticket/764) might also be another instance of
this same problem. The ticket
author seems to think it is related to socket's readline changing, but
the HTTP server isn't using readline anymore. The basic symptom is
identical to what I originally saw, which is that stylesheets and
images can be very slow to load (pauses up to the 10 second default
socket timeout).

Regards,
Matt Bendiksen

Sylvain Hellegouarch

Aug 30, 2008, 2:10:11 PM
to cherryp...@googlegroups.com
mbendiksen wrote:
Hi Matt,

Just skimmed through your message. Apache has an interesting
alternative: it lets you decide how many requests can be served over a
persistent connection before closing it down to free it. Pipelining is
mostly interesting when loading a whole page, as browsers will request
each element of the page either concurrently or in a pipelined
fashion. Once that's done, I don't believe the browser should keep the
connection open until the next batch of elements needs to be fetched.
In other words, it would be interesting to let CP know that after X
pipelined requests on a persistent connection it can close the
connection down.
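
For comparison, these are the Apache directives I have in mind (the
values below are only examples, not recommendations):

    # httpd.conf excerpt, shown only for comparison: close a
    # persistent connection after this many requests, or after this
    # many idle seconds.
    KeepAlive On
    MaxKeepAliveRequests 100
    KeepAliveTimeout 5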

- Sylvain

Robert Brewer

Aug 30, 2008, 3:54:35 PM
to cherryp...@googlegroups.com
mbendiksen wrote:
> I am seeing instances of poor performance (hangs up to the default 10
> second socket timeout) when the HTTP server threads all have active
> persistent connections open. I believe this case can happen more
> frequently than anticipated. I added debug logging to wsgiserver to
> see what was occurring in these situations. Below are my analysis and
> suggestions. Note I just dove into this issue yesterday and I'm not an
> HTTP expert. So feel free to correct / critique my analysis and
> results. :-)

I think your analysis is right on the mark.

If the developer only wants one thread, there's not much we can do about
saturation at that point. The developer can set
environ["HTTP_CONNECTION"] = "close" for all responses, and wsgiserver
will dutifully close the conn. The environ is inherited from
HTTPConnection, which is inherited from CherryPyWSGIServer, and
currently you have to set that environ entry before the app is called if
you want the server to close the conn; if you set the header after that
point, the client will probably close the conn. So there are two
improvements:

1. Inside send_headers, make wsgiserver set close_connection (close
the conn itself) if HTTP_CONNECTION == "close".
2. Perhaps do that automatically if thread_pool is 1?
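
With #1 in place, the single-thread workaround could be wired up
roughly like this (a sketch from memory of the wsgiserver API;
my_wsgi_app is just a placeholder):

    # Sketch of the "close everything" workaround for thread_pool = 1.
    from cherrypy import wsgiserver

    def my_wsgi_app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return ["hello"]

    server = wsgiserver.CherryPyWSGIServer(("0.0.0.0", 8080), my_wsgi_app)
    # The server's base environ is copied into each request's environ
    # before the app runs, so this marks every response for closing.
    server.environ["HTTP_CONNECTION"] = "close"
    server.start()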

> One possible solution is to not allow persistent HTTP connections when
> the thread pool is near saturation. To do this I added a new environ
> key to tick():
>
> environ["PREVENT_PERSISTENT_CONN"] =
> (self.requests._queue.qsize() > self.requests.idle - 2)
>
> And then in send_headers() force the server to return a connection
> close in this case:
>
> self.close_connection =
> self.environ["PREVENT_PERSISTENT_CONN"]

I don't think you need a new environ key.
Just set environ["HTTP_CONNECTION"] = "close".

> Having done that CherryPy will allow persistent connections to stay
> open until it gets down to its last idle thread. That thread then does
> not allow the connection to stay open, which frees up the thread to
> get the next request from the _queue. The result is the 10 second
> timeout deadlock never occurs. This works when thread_pool is any
> value, including 1. If it is 1, then CherryPy never allows the
> connection to stay open, which is what is required to prevent the 10
> second delays with browsers that make concurrent HTTP requests.
>
> I'm not sure how bulletproof the 2 line fix above is. It might be
> possible for a thread synchronization issue where requests.idle
> doesn't report the value we exactly need? I'm not sure, but based on
> my limited testing it appears to work pretty well.

It should be accurate within a few milliseconds, which is enough. As
long as it doesn't crash I think we're OK with slightly stale data. :)

> Another possible solution would be to somehow have _parse_request()
> not block (instead throw a timeout error immediately) in cases where
> all the threads are busy and there is a pending elem in the ThreadPool
> _queue. Not sure exactly how one would go about implementing that. :-)

Even if that were possible, it sounds worse than just blocking or
closing the conn on all requests.

Regardless of the mechanism, I'd really like to see this improved via a
ThreadPool alternative that's more dynamic, since that's the API-blessed
extension point; developers could then plug it in without having to
patch wsgiserver. If you'd like to contribute one, a lot of people would
appreciate it.

Please open tickets for any of the above. :)

> I think the description on ticket #764
> (http://www.cherrypy.org/ticket/764) might also be another instance
> of this same problem. The ticket
> author seems to think it is related to socket's readline changing, but
> the HTTP server isn't using readline anymore. The basic symptom is
> identical to what I originally saw, which is that stylesheets and
> images can be very slow to load (pauses up to the 10 second default
> socket timeout).

Seems reasonable.


Robert Brewer
fuma...@aminus.org

Mr. Green

Aug 30, 2008, 8:28:27 PM
to cherryp...@googlegroups.com
> ...if you set the header after that point, the client will probably
> close the conn. So there are two improvements:
>
>     1. Inside send_headers, make wsgiserver set close_connection (close
>     the conn itself) if HTTP_CONNECTION == "close".

I tried adding this to send_headers():

    if self.environ.get("HTTP_CONNECTION", "") == "close":
        self.close_connection = True


>     2. Perhaps do that automatically if thread_pool is 1?

If we also add the tick() change below, then #2 isn't needed: with thread_pool set to 1, idle - 2 is at most -1, and the queue size (which is never negative) will always be greater than that, so the connection is always closed.


>> To do this I added a new environ key to tick():
>>
>>     environ["PREVENT_PERSISTENT_CONN"] = (self.requests._queue.qsize() > self.requests.idle - 2)
>
> I don't think you need a new environ key.
> Just set environ["HTTP_CONNECTION"] = "close".

And I added this inside tick():

    if self.requests._queue.qsize() > self.requests.idle - 2:
        environ["HTTP_CONNECTION"] = "close"

But it didn't work. HTTPRequest's read_headers() ended up stomping on environ's HTTP_CONNECTION with the browser's request header, leaving the value "close, keep-alive". Unless you have another approach, I think we may need a new environ key or some other mechanism for tick() to tell the request not to persist.
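
To illustrate what I mean, here is a toy sketch (not wsgiserver code;
the key name is made up, and merge_request_header() only mimics the
comma-joining behavior I observed):

    # A key outside the HTTP_* namespace survives header parsing,
    # unlike HTTP_CONNECTION.
    def merge_request_header(environ, name, value):
        key = "HTTP_" + name.upper().replace("-", "_")
        if key in environ:
            environ[key] = environ[key] + ", " + value
        else:
            environ[key] = value

    environ = {
        "HTTP_CONNECTION": "close",                # set by the tick() patch above
        "cherrypy.prevent_persistent_conn": True,  # dedicated key, same condition
    }
    merge_request_header(environ, "Connection", "keep-alive")

    print(environ["HTTP_CONNECTION"])  # "close, keep-alive" -- stomped
    print(environ["cherrypy.prevent_persistent_conn"])  # True -- still trustworthy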


>> I'm not sure how bulletproof the 2 line fix above is. It might be
>> possible for a thread synchronization issue where requests.idle
>> doesn't report the value we exactly need? I'm not sure, but based on
>> my limited testing it appears to work pretty well.
>
> It should be accurate within a few milliseconds, which is enough. As
> long as it doesn't crash I think we're OK with slightly stale data. :)

Agreed. It seems safe. If it ever fails then the worst that will happen is a 10 second socket timeout (like we have now) or an HTTP connection being closed that could have stayed open. Based on my testing (with the new environ key), it appears to work.


> Regardless of the mechanism, I'd really like to see this improved via a
> ThreadPool alternative that's more dynamic, since that's the API-blessed
> extension point; developers could then plug it in without having to
> patch wsgiserver. If you'd like to contribute one, a lot of people would
> appreciate it.

I'd love to, but I'm afraid my CherryPy-fu isn't quite to novice Jedi level yet. Anything beyond a few-line patch is probably asking for trouble. :-)


> Please open tickets for any of the above. :)

Sure. I probably should have started this as a ticket, but I couldn't see how to log in for access to create one. I've figured that out now though... so let me know what you want to do about the new environ key and I'll post a ticket.

Matt Bendiksen