Just skimmed through your message. Apache has an interesting
alternative: it lets you decide how many requests can be pipelined
before the connection is closed to free it. Pipelining is mostly
interesting when loading a whole page, as browsers will request each
element of the page either concurrently or in a pipelined fashion. Once
that's done, I don't believe the browser should keep the connection open
until the next batch of elements needs to be fetched. In other words,
it'd be interesting to let CP know that after X pipelined requests on a
persistent connection it can close the connection.
- Sylvain
I think your analysis is right on the mark.
If the developer only wants one thread, there's not much we can do about
saturation at that point. The developer can set
environ["HTTP_CONNECTION"] = "close" for all responses, and wsgiserver
will dutifully close the conn. The environ is inherited from
HTTPConnection, which is inherited from CherryPyWSGIServer, and
currently you have to set that environ entry before the app is called if
you want the server to close the conn; if you set the header after that
point, the client will probably close the conn. So there are two
improvements:
1. Inside send_headers, make wsgiserver set close_connection (close
the conn itself) if HTTP_CONNECTION == "close".
2. Perhaps do that automatically if thread_pool is 1?
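The two improvements above could look something like this. This is a
hedged sketch, not wsgiserver's actual code: the HTTPRequest class here
is heavily simplified, and the thread_pool_size attribute is an
assumption standing in for however the request would learn the pool
size. Only the names send_headers, close_connection, and environ follow
wsgiserver's conventions.

```python
# Hypothetical sketch of the two improvements; the surrounding class is
# simplified and thread_pool_size is an assumed attribute.
class HTTPRequest:
    def __init__(self, environ, thread_pool_size=10):
        self.environ = environ
        self.close_connection = False
        self.thread_pool_size = thread_pool_size

    def send_headers(self):
        # Improvement 1: honor an app-set "close" even when it is set
        # after the app has been called, instead of relying on the
        # client to close the conn.
        if self.environ.get("HTTP_CONNECTION") == "close":
            self.close_connection = True
        # Improvement 2: with a single worker thread, persistent
        # connections can only starve other clients, so always close.
        if self.thread_pool_size == 1:
            self.close_connection = True
```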
> One possible solution is to not allow persistent HTTP connections when
> the thread pool is near saturation. To do this I added a new environ
> key to tick():
>
> environ["PREVENT_PERSISTENT_CONN"] = (
>     self.requests._queue.qsize() > self.requests.idle - 2)
>
> And then in send_headers() force the server to return a connection
> close in this case:
>
> self.close_connection = self.environ["PREVENT_PERSISTENT_CONN"]
I don't think you need a new environ key.
Just set environ["HTTP_CONNECTION"] = "close".
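A minimal sketch of that simpler approach, assuming the same saturation
test the quoted patch uses. The helper name mark_connection and the
queued/idle parameters are illustrative stand-ins for
self.requests._queue.qsize() and self.requests.idle; this is not
wsgiserver's actual API.

```python
# Hypothetical helper: reuse the existing HTTP_CONNECTION environ key
# instead of introducing PREVENT_PERSISTENT_CONN. `queued` and `idle`
# stand in for self.requests._queue.qsize() and self.requests.idle.
def mark_connection(environ, queued, idle):
    """Request a connection close when the pool is near saturation."""
    if queued > idle - 2:
        environ["HTTP_CONNECTION"] = "close"
    return environ
```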
> Having done that CherryPy will allow persistent connections to stay
> open until it gets down to its last idle thread. That thread then does
> not allow the connection to stay open, which frees up the thread to
> get the next request from the _queue. The result is the 10 second
> timeout deadlock never occurs. This works when thread_pool is any
> value, including 1. If it is 1, then CherryPy never allows the
> connection to stay open, which is what is required to prevent the 10
> second delays with browsers that make concurrent HTTP requests.
>
> I'm not sure how bulletproof the 2 line fix above is. There might be
> a thread synchronization issue where requests.idle doesn't report
> exactly the value we need? I'm not sure, but based on my limited
> testing it appears to work pretty well.
It should be accurate within a few milliseconds, which is enough. As
long as it doesn't crash I think we're OK with slightly stale data. :)
> Another possible solution would be to somehow have _parse_request()
> not block (instead throw a timeout error immediately) in cases where
> all the threads are busy and there is a pending elem in the ThreadPool
> _queue. Not sure exactly how one would go about implementing that. :-)
Even if that were possible, it sounds worse than just blocking or
closing the conn on all requests.
Regardless of the mechanism, I'd really like to see this improved via a
ThreadPool alternative that's more dynamic, since that's the API-blessed
extension point; developers could then plug it in without having to
patch wsgiserver. If you'd like to contribute one, a lot of people would
appreciate it.
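For anyone tempted to try, here is a toy sketch of the shape such a
pluggable pool might take. This is purely illustrative: the method
names (put, stop) and the grow-on-backlog policy are my assumptions,
not wsgiserver's ThreadPool API, and a real replacement would need to
match whatever interface the server actually calls.

```python
import queue
import threading

class DynamicThreadPool:
    """Toy pool that grows when the work queue backs up.

    Illustrative only; not a drop-in replacement for wsgiserver's
    ThreadPool.
    """

    def __init__(self, handler, min_threads=2, max_threads=10):
        self.handler = handler          # callable invoked per connection
        self.max = max_threads
        self._queue = queue.Queue()
        self._threads = []
        for _ in range(min_threads):
            self._spawn()

    def _spawn(self):
        t = threading.Thread(target=self._worker, daemon=True)
        t.start()
        self._threads.append(t)

    def _worker(self):
        while True:
            conn = self._queue.get()
            if conn is None:            # sentinel: shut this worker down
                return
            self.handler(conn)

    def put(self, conn):
        # Grow when work is already waiting and we still have headroom.
        if self._queue.qsize() > 0 and len(self._threads) < self.max:
            self._spawn()
        self._queue.put(conn)

    def stop(self):
        # One sentinel per worker, then wait for them all to exit.
        for _ in self._threads:
            self._queue.put(None)
        for t in self._threads:
            t.join()
```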
Please open tickets for any of the above. :)
> I think the description on ticket #764
> (http://www.cherrypy.org/ticket/764) might also be another instance
> of this same problem. The ticket
> author seems to think it is related to socket's readline changing, but
> the HTTP server isn't using readline anymore. The basic symptom is
> identical to what I originally saw, which is that stylesheets and
> images can be very slow to load (pauses up to the 10 second default
> socket timeout).
Seems reasonable.
Robert Brewer
fuma...@aminus.org