First let me state the versions of the programs involved:
- mod_wsgi 1.3
- Pylons 0.9.6.1 and its stack of dependencies
- Python 2.4.4
- Ubuntu Linux 7.10
- DBXML 2.3.10
- Berkeley DB 4.5.20
I think that's all. Note: the same Pylons app does work when the
DBXML initialization step is left out.
The section "Python Simplified GIL State API" in
http://code.google.com/p/modwsgi/wiki/ApplicationIssues
kind of put me on the right path, but there's surely something I'm
missing. I also read this thread:
http://groups.google.com/group/modwsgi/browse_frm/thread/ddf4f4c62b39d3a
Finally, I tried all the debugging techniques listed here:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques
The mod_wsgi section of my Apache configuration is:
WSGIScriptAlias /wsgi /home/loluyede/scripts/demo.wsgi
WSGIApplicationGroup %{GLOBAL}
WSGIDaemonProcess www.foobaz.com maximum-requests=10000
<Directory /home/loluyede/scripts>
Order allow,deny
Allow from all
</Directory>
(it is placed inside a virtual host directive, the default one)
Any suggestions?
--
Lawrence, oluyede.org - neropercaso.it
"It is difficult to get a man to understand
something when his salary depends on not
understanding it" - Upton Sinclair
http://code.google.com/p/modwsgi/wiki/IssuesWithExpatLibrary
BTW, you can't use the gdb debugging technique when using
WSGIDaemonProcess, but then, you don't seem to have used
WSGIProcessGroup to select the daemon process anyway.
Graham
The symptom is the same, but sadly I do not have issues importing
expat, and I noticed that Apache 2 and Python use the same
version of expat.
> BTW, you cant use the gdb debugging technique when using
> WSGIDaemonProcess,
I tried debugging once again with daemon mode disabled, but got nothing.
> but then, you hadn't seemed to have used
> WSGIProcessGroup to select the daemon process anyway.
If the WSGIProcessGroup is specified like this:
WSGIDaemonProcess www.sis.com maximum-requests=10000
WSGIProcessGroup %{GLOBAL}
(again, in the virtual host)
nothing changes. If I use "www.sis.com" I get a "premature end of
script" error before the segmentation fault
Thanks for the help
What you will need to do is work out the process ID of the
mod_wsgi daemon process by looking at the Apache error log with the
Apache LogLevel directive set to 'info'. Then use gdb to attach to the
process ID explicitly. Depending on the platform this may need to be
done as root. Thus something like:
sudo gdb /usr/bin/httpd 1234
where '1234' is the process ID.
Type 'cont' to continue in the debugger and then access the
application from the browser and see if it crashes.
Graham
I tried using httpd -X to start the web server and in the logs it says:
[Wed Nov 21 09:23:59 2007] [info] mod_wsgi (pid=10297): Starting
process 'www.foobaz.com' with uid=33, gid=33 and threads=15.
[Wed Nov 21 09:23:59 2007] [info] mod_wsgi (pid=10297): Attach interpreter ''.
[Wed Nov 21 09:23:59 2007] [info] mod_wsgi (pid=10296): Attach interpreter ''.
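Two pids show up in this log and only one of them is the daemon process. As a rough sketch (this is not part of mod_wsgi, and the name `daemon_pids` is made up for illustration), the pid to hand to gdb could be pulled out of the 'Starting process' message like this. It assumes only the log format shown above.

```python
import re

# Matches the info-level message mod_wsgi logs when it starts a daemon
# process, e.g. "mod_wsgi (pid=10297): Starting process 'www.foobaz.com'".
STARTING = re.compile(r"mod_wsgi \(pid=(\d+)\): Starting\s+process '([^']+)'")


def daemon_pids(log_text):
    """Return {daemon process group name: [pid, ...]} found in log text."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    pids = {}
    for pid, group in STARTING.findall(flat):
        pids.setdefault(group, []).append(int(pid))
    return pids


# The log lines quoted above, including the line wrap.
sample = """\
[Wed Nov 21 09:23:59 2007] [info] mod_wsgi (pid=10297): Starting
process 'www.foobaz.com' with uid=33, gid=33 and threads=15.
[Wed Nov 21 09:23:59 2007] [info] mod_wsgi (pid=10297): Attach interpreter ''.
[Wed Nov 21 09:23:59 2007] [info] mod_wsgi (pid=10296): Attach interpreter ''.
"""

print(daemon_pids(sample))  # {'www.foobaz.com': [10297]}
```

Only 10297 logged 'Starting process', so that is the one treated as the daemon here; 10296 only attached an interpreter.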
> Then use gdb to attach to the
> process ID explicitly. Depending on the platform this may need to be
> done as root. Thus something like:
>
> sudo gdb /usr/bin/httpd 1234
Then I tried to connect to the 10297 process.
> where '1234' is the process ID.
>
> Type 'cont' to continue in the debugger and then access the
> application from the browser and see if it crashes.
It sadly does. No info whatsoever though
You didn't need to start with -X option when doing it this way.
If it crashed and you were attached to the correct process, ie., the
one that crashed, then gdb should have picked it up and dropped you
back to the prompt. Doing 'where' should give the stack trace for
where it crashed. Are you saying this isn't what happened? Was the
process that crashed definitely the one that gdb was attached to?
BTW, using 'threads=1' with WSGIDaemonProcess will be better if there
is a need to look at stack traces for all threads. By rights though,
the one that crashes should be the primary one that gdb looks at after
the crash.
Graham
http://www.modpython.org/pipermail/mod_python/2005-January/017126.html
Another possibility is the Berkeley DB library itself, as that may be
used by Apache or some other Apache module, even PHP if that is also
being loaded.
http://www.modpython.org/pipermail/mod_python/2005-August/018851.html
Thus, run ldd on Apache and all .so files in the Apache modules
directory and see if any appear to reference Berkeley DB libraries.
Also run ldd on the dbxml Python modules to see what they use and see
if there are any conflicts in shared library versions.
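The ldd check could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names `bdb_versions` and `scan` are made up, the regular expression assumes the usual `libdb-X.Y.so` / `libdb_cxx-X.Y.so` naming, and the sample ldd lines (versions and addresses included) are invented for illustration, not taken from the system in this thread. It is meant to be run with a current Python 3 outside the application.

```python
import re
import subprocess

# Assumed naming convention for Berkeley DB shared objects.
BDB = re.compile(r"\b(libdb(?:_cxx)?-(\d+\.\d+)\.so)\b")


def bdb_versions(ldd_output):
    """Return the set of Berkeley DB versions named in one ldd listing."""
    return set(version for _name, version in BDB.findall(ldd_output))


def scan(paths):
    """Run ldd on each path; return {path: set of Berkeley DB versions}."""
    found = {}
    for path in paths:
        proc = subprocess.run(["ldd", path], capture_output=True, text=True)
        versions = bdb_versions(proc.stdout)
        if versions:
            found[path] = versions
    return found


# Invented ldd output, for illustration only.
sample_apache_module = (
    "\tlibdb-4.4.so => /usr/lib/libdb-4.4.so (0xb7c5e000)\n"
    "\tlibexpat.so.1 => /usr/lib/libexpat.so.1 (0xb7c3d000)\n"
)
sample_dbxml_module = (
    "\tlibdb_cxx-4.5.so => /usr/local/lib/libdb_cxx-4.5.so (0xb7a00000)\n"
)

print(bdb_versions(sample_apache_module))  # {'4.4'}
print(bdb_versions(sample_dbxml_module))   # {'4.5'}
```

Calling `scan()` with the Apache binary, the files in the Apache modules directory and the dbxml/bsddb extension modules, then comparing the sets, would show whether more than one Berkeley DB version ends up in the same process.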
There have also been issues with dbxml not handling multithreading
and sub interpreters properly, but you used %{GLOBAL} for the
application group, so it shouldn't be that. Plus, in mod_python it
would still crash because of bugs in how mod_python handles threading
for the main interpreter. In mod_wsgi I fixed this as far as I know,
and it should at least work with %{GLOBAL} as the application group
once all other shared library conflicts are eliminated.
Graham
I finally got it. The problem seems to be in Berkeley DB's
set_timeout() function. This is the raw stacktrace:
#0 0x00000000 in ?? ()
#1 0xb4c29ad3 in DBEnv_set_timeout (self=0x8883e80, args=0x88837cc,
kwargs=0x0) at extsrc/_bsddb.c:3964
#2 0xb767641d in PyCFunction_Call () from /usr/lib/libpython2.4.so.1.0
#3 0xb76b2429 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#4 0xb76b33e9 in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0
#5 0xb7662f6a in ?? () from /usr/lib/libpython2.4.so.1.0
#6 0x088820a0 in ?? ()
#7 0x0888824c in ?? ()
#8 0x00000000 in ?? ()
I commented out the set_timeout call and it doesn't segfault anymore.
I still get other exceptions (all related to DBXML), but this is not
the right place to discuss them further.
I really thank you Graham, you helped me a lot. By the way, DBXML
is kind of tricky to use from Python.
Cheers
Most likely then, it is what I suggested in another email: you
might find that Apache or some other Apache module is linking in a
different version of the Berkeley DB library than the one the dbxml
Python module uses. Thus, try running 'ldd' on various things as
mentioned, or even run 'nm' on things looking for somewhere else that
defines the symbol 'DBEnv_set_timeout'.
Graham
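For the 'nm' route, a small filter over its output may save some reading. The sketch below assumes the default nm layout of "address type name" for defined symbols and "type name" for undefined ones; the function name `defined_symbols` is made up, and the sample lines (address included) are invented for illustration.

```python
def defined_symbols(nm_output, needle):
    """Return (type, name) for defined symbols whose name contains needle."""
    hits = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols carry an address, so they have three fields;
        # undefined ones ("U name") have only two and are skipped.
        if len(fields) == 3 and needle in fields[2]:
            hits.append((fields[1], fields[2]))
    return hits


# Invented nm output for a single shared object, for illustration only.
sample = """\
00020c10 t DBEnv_set_timeout
         U malloc
         U db_env_create
"""

print(defined_symbols(sample, "set_timeout"))  # [('t', 'DBEnv_set_timeout')]
```

Running nm over each candidate .so and feeding the text through this filter would list every file that defines a matching symbol.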