I am ready to try the switch, already compiled and run mod_wsgi 4.0 on
a test machine with the same config as usual.
Before going to production I need to know the following (Graham, I understand
you have a full-time job at New Relic, so thanks in advance for whenever
you have the time to jump on this!):
* What are the new mod_wsgi settings for the apache2 conf?
* What about "listen-backlog"
  (http://groups.google.com/group/modwsgi/browse_thread/thread/b6d66d3fe5a53d2c/)?
* What about "blocked-requests" and "blocked-timeout"
  (http://groups.google.com/group/modwsgi/msg/2a968d820e18e97d)?
In Graham's answer to my serverfault question:
> if number of processes/threads across Apache child worker processes is less than 100, the daemon process listener backlog, then all those threads can also get stuck and you will not know
I currently use these settings on a quadcore:
<IfModule mpm_worker_module>
StartServers 2
ServerLimit 4
MinSpareThreads 2
MaxSpareThreads 4
ThreadLimit 32
ThreadsPerChild 16
MaxClients 64#128
MaxRequestsPerChild 10000
</IfModule>
WSGIDaemonProcess subdomain.domain user=www-data group=www-data threads=25
Is this sensible?
After reading http://groups.google.com/group/modwsgi/browse_thread/thread/edffb22b2eac134b
and again http://groups.google.com/group/modwsgi/browse_thread/thread/b6d66d3fe5a53d2c/
I see that my thread settings might not be suitable...
Anything else I'd need to know?
Most of the information is covered in those posts you link to.
First up, I'd just suggest using a blocked-timeout of 60 as a fail-safe, to at
least trigger a restart when everything has been blocked up for 60 seconds.
I wouldn't worry about blocked-requests yet, as I need to tweak stuff
related to that option.
As to listen-backlog, I'm still trying to work out what the best thing to
do is with the new ability to change it. Even if it is dropped to a low
value, a retry mechanism kicks in within mod_wsgi when it tries to connect
to the daemon processes. One needs this to ensure that when the daemon
processes all restart at the same time, and a new process is not quite in a
state to accept new connections, the request doesn't fail straight away. I'm
not totally sure that is valid though, and will have to dig into it further.
The retry may in part not be needed, as strictly speaking it may only kick
in when the listen backlog of the daemon is full.
So, have to do some further analysis.
BTW, make sure you have:
LogLevel info
in order to get stack trace dumps. I still need to change things so
that they are logged at error level, and so are always visible, and change
the message about why they are being logged.
Graham
> --
> You received this message because you are subscribed to the Google Groups "modwsgi" group.
> To post to this group, send email to mod...@googlegroups.com.
> To unsubscribe from this group, send email to modwsgi+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en.
>
This is a very timely answer for a side project!
I'll try this then and post my experience back here, though I expect
it will not be for a few days (we don't want to put anything on the
production server on a Friday, for obvious reasons, and I don't think
I'll manage today).
I'll be monitoring the mailing list in case you come up with any
updates!
thanks
Stefano
On Dec 8, 4:33 am, Graham Dumpleton <graham.dumple...@gmail.com>
wrote:
> Sorry for not getting to this in a timely manner. Am having a bit of a
> mind block on this stuff at the moment. I still need to go in and
> tweak some stuff in mod_wsgi and not able to get my head around it.
>
> On 6 December 2011 21:47, stefanoC <stefano.cro...@gmail.com> wrote:
>
> > Finally managed to jump from
> > http://serverfault.com/questions/335633/apachemod-wsgi-configuration-...
> > After reading http://groups.google.com/group/modwsgi/browse_thread/thread/edffb22b2...
> > and again http://groups.google.com/group/modwsgi/browse_thread/thread/b6d66d3fe...
Interesting thread. We might be hitting a similar issue.
> I am ready to try the switch, already compiled and run mod_wsgi 4.0 on
> a test machine with the same config as usual.
We were about to try it, but some other stuff got in the way. We plan to do it,
but I'm not sure when :(
If you do it before us, please let us know how it went and how you did it.
Are you using Debian/Ubuntu? Did you make a Debian package? I was planning to
make one (we are using Ubuntu) based on Debian's package, using
uupdate (I think it should be that easy, although I haven't tried it yet).
Thanks,
Rodrigo
I'm currently running some tests on the pre-production machine before
we move to production, and the biggest headache is still finding the best
apache2 worker conf that will not overrun the machine.
But I'll keep you posted!
On Dec 11, 7:54 pm, Rodrigo Campos <rodr...@sdfg.com.ar> wrote:
> On Tue, Dec 06, 2011 at 02:47:49AM -0800, stefanoC wrote:
> > Finally managed to jump from
> > http://serverfault.com/questions/335633/apachemod-wsgi-configuration-...
Yep, I am aware of the issue. So long as the request threads don't
block completely, which is a different issue handled with the
blocked-requests option, it isn't so much that it locks Apache up, but
that it causes an internal backlog of requests. Even when the
long requests finish, the daemon processes will still process the
backlog, even though the original users may have given up. In processing
the big backlog, because you then get a big influx of requests
together, you might again end up with a lot of longer requests all
coinciding, and so it starts over. Thus it can take a while for
things to stabilise, although if the resources of the system as a
whole aren't sufficient, it could simply make the whole box grind to a
halt.
This can, to a degree, also happen when nginx is used as a front end,
as any multi-hop solution will introduce these potential backlog
points solely due to the socket listen queue size for each socket.
Apache/mod_wsgi currently makes it a bit worse and easier to
trigger, though.
For Apache/mod_wsgi you have the default (but configurable) listen
backlog of 100. So, if all processes/threads were busy, 100 more
requests could still queue up before clients start getting connection
refused. At the same time, you will have as many requests as you have
processes/threads in an accepted state and being handled within Apache
child worker processes themselves, or if using daemon mode, being
proxied to the daemon process.
Normally the number of Apache MPM processes/threads would be more than
the number of mod_wsgi daemon processes/threads, but because the daemon
processes also have a listen backlog of 100, when all daemon
processes/threads are busy, proxied requests will queue up internally and,
depending on how the numbers work, it all acts as a big funnel with no way
for things to break out. In other words, if the Apache MPM threads in total
across all processes number less than daemon threads + 100, you will never
get a connection refused from the daemon process.
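
As a rough sketch of that arithmetic, using the worker MPM settings and threads=25 posted earlier in the thread: the function name and parameters below are made up for illustration, it assumes a single daemon process (no processes= option was given), and it treats the listen backlog as an exact limit, which the kernel does not strictly guarantee.

```python
def daemon_can_refuse(mpm_processes, mpm_threads_per_process,
                      daemon_processes, daemon_threads, listen_backlog=100):
    """Return True if Apache can hold more in-flight requests than the
    daemon side can absorb, so a connection refused becomes possible."""
    # Most requests Apache can have accepted and be proxying at once.
    mpm_capacity = mpm_processes * mpm_threads_per_process
    # Requests the daemon side absorbs: those being handled plus those
    # waiting in the daemon socket's listen queue.
    daemon_capacity = daemon_processes * daemon_threads + listen_backlog
    return mpm_capacity > daemon_capacity


# ServerLimit 4 x ThreadsPerChild 16 = 64 MPM threads, against
# 25 daemon threads + 100 backlog = 125: the daemon never refuses.
print(daemon_can_refuse(4, 16, 1, 25))
```

With those numbers every request Apache accepts fits into the daemon threads plus backlog, which is the funnel case described above.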
Even if it was exceeded and you got a connection refused, the proxy
code for talking to the daemon process makes further attempts to
connect to the daemon process. This was done to cope with issues where
daemon processes are not quite ready due to restarts or otherwise.
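
To make the retry behaviour concrete, here is a minimal sketch of the general pattern. It is not mod_wsgi's actual proxy code (which is written in C); the function name, the attempt count and the delay are all invented for illustration.

```python
import time


def connect_with_retry(connect, attempts=5, delay=0.1, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect is any callable that raises ConnectionRefusedError while the
    other end is not yet accepting connections.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionRefusedError:
            if attempt == attempts:
                raise  # out of attempts: let the caller report the failure
            sleep(delay)  # give a restarting process time to come up
```

The catch is that a retry like this cannot tell a daemon that is restarting from one whose listen queue is simply full, so it also keeps requests waiting in the overloaded case.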
The problem here is the combination of the large daemon listen backlog
(which hasn't even been configurable until 4.0) as well as the retry
mechanism.
What I have started playing with, but never got a chance to finish
what I was doing, was to make the daemon listen backlog configurable,
or automatically adjust based on daemon and MPM config, but also
change when/how reconnect attempts are made.
The eventual aim was to introduce a way out of the funnel so you don't
get the backlogging problem and the issues it causes with daemon
processes getting overwhelmed and not being able to catch up again.
So, one way or another, the aim is for 503 errors to be generated
when an internal backlog occurs, with the 503 going back to the client.
That way, if the daemon processes get overwhelmed, new requests
coming in will time out and get thrown away. Yes, it will mean users see
errors, but at least you don't get a backlog, and so when the daemon
recovers, it doesn't have a pipe stuffed full of requests that it will
not be able to handle.
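
The idea of shedding load instead of queueing it can be sketched as a bounded queue that answers 503 once it is full. This is only an illustration of the principle, not how mod_wsgi does or will implement it; the class name and the max_pending parameter are hypothetical.

```python
from collections import deque


class SheddingQueue:
    """Hold at most max_pending waiting requests; reject the rest."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.pending = deque()

    def submit(self, request):
        """Queue the request and return None, or return 503 if full."""
        if len(self.pending) >= self.max_pending:
            return 503  # fail fast rather than add to the backlog
        self.pending.append(request)
        return None
```

Because the queue never grows past max_pending, a recovering daemon only has a small, bounded amount of stale work to get through.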
Graham
Either way, just don't have the time right now to even work on
mod_wsgi 4.0 and make that behaviour better.
Hopefully things will ease off a little after PyCon and I will have some
more time then to work on mod_wsgi and also to follow up on questions on
the mailing list properly.
Graham