--
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To post to this group, send email to mod...@googlegroups.com.
To unsubscribe from this group, send email to modwsgi+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en.
Sorry to correct you Carl, but that isn't quite how it works.
I'll respond in more detail later to original question. Still 7am here
and just got off phone from a work meeting. So need to wake up a bit
more first. :-)
Graham
They likely aren't being killed because there isn't actually a
deadlock of a single thread which hasn't released the GIL.
In other words, what the deadlock timeout will not protect against is
threads calling into C code, releasing the GIL and then deadlocking in
C code.
In your case, the problem is going to be the lxml module. This module
is known not to work in Python sub interpreters properly.
Specifically, lxml can release the GIL and then attempt to do a
callback into Python code. To do this, it uses the simplified GIL
state API in Python to reacquire the GIL, but that API is only
supposed to be used if running in the main Python interpreter and not
a sub interpreter. When used in a sub interpreter, the code will
deadlock on trying to reacquire the Python GIL.
That lxml is a problem is documented in:
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Multiple_Python_Sub_Interpreters
The solution, since you are only delegating one application to that
mod_wsgi daemon process group, is to add:
WSGIApplicationGroup %{GLOBAL}
This will force the application to run in the main Python interpreter
and avoid the shortcomings of the lxml module.
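As a rough sketch of where that directive sits, assuming a single application delegated to one daemon process group. The group name "example-site" and the script path are hypothetical placeholders, not taken from the original poster's configuration:

```apache
# Hypothetical group name and path; substitute the real ones.
WSGIDaemonProcess example-site threads=15
WSGIScriptAlias / /path/to/site.wsgi

# Delegate the application to the daemon process group ...
WSGIProcessGroup example-site
# ... and run it in the main Python interpreter rather than a sub interpreter.
WSGIApplicationGroup %{GLOBAL}
```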
As to how you might protect against this sort of deadlock in C code when
the GIL isn't held, the only way is to use 'inactivity-timeout'. This
will cause a restart when there have been no new requests and/or no
reading of request content or generation of response content for that
timeout period. So, this could be used as a fail safe, but if your
application is used infrequently, it will also have the effect of
causing your idle process to be restarted after the timeout period as
well.
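For illustration only, the option is set on the WSGIDaemonProcess directive. The group name and the 120 second value below are assumptions chosen for the example, not recommendations:

```apache
# Hypothetical group name; timeout values are in seconds.
# inactivity-timeout=120 is an illustrative value only.
WSGIDaemonProcess example-site threads=15 inactivity-timeout=120
```

A shorter timeout recovers faster from a hung process but also restarts an idle process more often, so the value is a trade-off that depends on how busy the site is.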
BTW, in worst cases, for detecting what the process is doing, one can use either:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Extracting_Python_Stack_Traces
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_Crashes_With_GDB
> I'm thinking of switching to MPM/prefork, but I'm not sure if that
> should have any effect, given that I'm in daemon mode already.
Prefork for some people has been causing subtle problems and I would
avoid it if you can.
Graham
The small correction is that once that number of requests is reached
for the whole process, irrespective of how many threads are running in
the process, then the process as a whole is killed off and restarted.
It isn't done at the individual thread level within an ongoing process.
The maximum-requests option should be avoided in production processes
if at all possible because the quicker requests come through the more
frequently the process will restart, which is likely the last thing
you want to happen when under load.
As to number of processes/threads, as Carl pointed out, OP should
avoid having high numbers of threads in a single process and instead
create multiple processes with a small number of threads.
For most people, the default of 15 threads per process is likely
overkill with that many concurrent requests never actually occurring,
so increasing it with no good reason is not a good idea. If you have
the memory available, possibly better off going to 3 processes each
with 5 threads only.
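A sketch of that layout, with a hypothetical group name. Three processes of five threads each can handle 3 x 5 = 15 concurrent requests, the same as the default single process with 15 threads, but spread across separate processes:

```apache
# Hypothetical group name. 3 processes x 5 threads = 15 concurrent requests.
WSGIDaemonProcess example-site processes=3 threads=5
WSGIProcessGroup example-site
```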
Graham
From memory, the problem with lxml is because it uses SWIG to generate
Python wrappers for C internals and it is SWIG that uses the simplified
GIL state API when doing callbacks.
Thus, this problem generally affects anything that uses SWIG and which
is doing callbacks from C code into Python.
Even if my memory is bad and lxml doesn't use SWIG, the issue with
SWIG still stands.
Graham
That section was more relevant when Django 1.0 had only just come out,
which was the first version of Django for which the core was
supposedly thread safe.
Anyway, the MPM you use isn't particularly relevant as you are using
daemon mode and not embedded mode. Which MPM you use is only critical
if you are using embedded mode.
In daemon mode you have the arbitrary ability to control
processes/threads based on whether your application is thread safe.
For related reading see:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
BTW, the IntegrationWithDjango page in the wiki is likely to be
completely removed at some point in the near future and I will stop
providing details for specific frameworks to cover where frameworks
don't themselves provide enough information. I have already removed
the pages for most of the other frameworks. End result is that
the frameworks themselves will need to provide decent documentation
themselves to cover any idiosyncrasies that exist in setting up their
framework to work with mod_wsgi which are due to issues or design
decisions related to their framework and which are nothing to do with
mod_wsgi. I have had enough of trying to document these framework
specific subtleties and framework authors tend to express a belief
that their own documentation is already more than adequate even though
from what I have seen people still get tripped up when they follow
only the documentation provided by the framework. So, I will be
devoting my time elsewhere now and not worrying about documenting
stuff related to the frameworks or actively assisting users of
frameworks on forums related to those frameworks or on general forums
such as StackOverflow. Instead, if it is a framework specific issue,
you will need to seek help from the developers or the community for
that framework.
Graham
A possible reason why you are seeing fewer problems with only 5 threads
in a process is that your code or a third party C extension is not
thread safe and is perhaps deadlocking.
You really need to ascertain when process threads are starting to hang and use:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Extracting_Python_Stack_Traces
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_Crashes_With_GDB
to work out what it is doing at that time.
Graham
On 6 June 2011 05:24, rwman <someuni...@gmail.com> wrote:
> Is there a way to make apache work even when such deadlock occur?
When using daemon mode, mod_wsgi has for some time had the ability to
detect deadlocks, and it should kill off the process after 300 seconds. This
doesn't apply if using embedded mode, so when explaining your original
problem you should explain the configuration you are using and
preferably post the mod_wsgi bits from the Apache configuration.
There are some extreme cases where a third party Python extension
module might defeat the deadlock detection, but the extension module
would need to be doing things it probably shouldn't be doing. The
deadlock timeout also will not kick in if your code is simply looping or
stuck in database queries that take a long time.
> Can a process be killed and restarted automatically?
For true deadlocks, that is what the deadlock detection of daemon mode
does. There is also an optionally enabled inactivity timeout failsafe
as well that can be turned on which helps to recover from non deadlock
cases where request handlers are looping or stuck in database queries.
> I know, it is not a
> solution for actual problem and should be solved by eliminating
> deadlock, but the goal is to make production server work while
> debugging the problem.
> I tried all options of modwsgi that seemed relevant, but could not
> achieve stable apache configuration. It stuck after some time for
> about 5 hours.
Without an explanation of your original problem, it isn't clear that
you are having a deadlock problem. It could be that you have request
handlers that are getting stuck in loops and never completing, thereby using
up all the request handler threads.
So, give your current configuration and what other variations you have
used, so we can see what you are doing and confirm whether you are using
embedded mode or daemon mode. Also indicate if using Apache prefork or
worker MPM and whether PHP is being used in the same Apache web server.
Indicate whether you have looked at inactivity-timeout option for
WSGIDaemonProcess and whether you have at least seen deadlock-timeout
option, although the latter defaults to on anyway.
Also indicate whether you have tried adding any variant of the code as
explained in:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Extracting_Python_Stack_Traces
to try and get the daemon process to dump Python stack traces when it
does get stuck so you might work out what it is doing.
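The general idea can be sketched as below. This is not the code from the wiki page, only a minimal illustration built on the standard library; the function names are hypothetical, and how the dump gets triggered (for example from a background monitor thread) is left as an assumption. It can only run while some thread is able to acquire the GIL, so it helps with threads stuck in C code that have released the GIL, not with a thread that hangs while holding it.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text report of the current Python stack of every thread."""
    # Map thread identifiers to names so the report is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
        lines.append("")
    return "\n".join(lines)


def dump_thread_stacks(stream=None):
    """Write the report to the given stream, or to sys.stderr by default."""
    # Under mod_wsgi, output written to sys.stderr ends up in the Apache error log.
    stream = stream or sys.stderr
    stream.write(format_thread_stacks())
    stream.flush()
```

Calling dump_thread_stacks() from a thread that is still running shows which line each request handler thread is sitting on at that moment.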
You could also try extracting C stack traces as explained in:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_Crashes_With_GDB
Graham