Linux version 2.6.32-71.29.1.el6.x86_64
(mock...@c6b5.bsys.dev.centos.org) (gcc version 4.4.4 20100726 (Red
Hat 4.4.4-13) (GCC) ) #1 SMP Mon Jun 27 19:49:27 BST 2011
Python2.6.5 (the server had it installed before),
mod_wsgi 3.3 compiled from source,
django 1.2
Oracle 11g client libraries, database is in a windows 2003 server.
Apache version is:
Server version: Apache/2.2.15 (Unix)
Server built: Jul 7 2011 11:30:37
Server's Module Magic Number: 20051115:24
Server loaded: APR 1.3.9, APR-Util 1.3.9
Compiled using: APR 1.3.9, APR-Util 1.3.9
Architecture: 64-bit
Server MPM: Prefork
threaded: no
forked: yes (variable process count)
Server compiled with....
-D APACHE_MPM_DIR="server/mpm/prefork"
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_SYSVSEM_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D DYNAMIC_MODULE_LIMIT=128
-D HTTPD_ROOT="/etc/httpd"
-D SUEXEC_BIN="/usr/sbin/suexec"
-D DEFAULT_PIDLOG="run/httpd.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_LOCKFILE="logs/accept.lock"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="conf/mime.types"
-D SERVER_CONFIG_FILE="conf/httpd.conf"
This is the first time that this application has been installed in
RHEL 6 and oracle, because we have the same application in another
server with RHEL 5 and database postgres 8.4 and it's functioning
properly.
In the apache error_log , with loglevel debug, the only messages that
appear are:
[notice] SIGHUP received. Attempting to restart
[info] mod_wsgi (pid=10403): Terminating Python.
[info] mod_wsgi (pid=10403): Python has shutdown.
[notice] Digest: generating secret for digest authentication ...
[notice] Digest: done
[debug] util_ldap.c(2058): LDAP merging Shared Cache conf:
shm=0x7f799694cd38 rmm=0x7f799694cd90 for VHOST: myvhost.com
[debug] util_ldap.c(2058): LDAP merging Shared Cache conf:
shm=0x7f799694cd38 rmm=0x7f799694cd90 for VHOST: myvhost2.com
[info] APR LDAP: Built with OpenLDAP LDAP SDK
[info] LDAP: SSL support available
[info] Init: Seeding PRNG with 256 bytes of entropy
[info] Init: Generating temporary RSA private keys (512/1024 bits)
[info] Init: Generating temporary DH parameters (512/1024 bits)
[debug] ssl_scache_shmcb.c(253): shmcb_init allocated 512000 bytes of
shared memory
[debug] ssl_scache_shmcb.c(272): for 511920 bytes (512000 including
header), recommending 32 subcaches, 133 indexes each
[debug] ssl_scache_shmcb.c(306): shmcb_init_memory choices follow
[debug] ssl_scache_shmcb.c(308): subcache_num = 32
[debug] ssl_scache_shmcb.c(310): subcache_size = 15992
[debug] ssl_scache_shmcb.c(312): subcache_data_offset = 3208
[debug] ssl_scache_shmcb.c(314): subcache_data_size = 12784
[debug] ssl_scache_shmcb.c(316): index_num = 133
[info] Shared memory session cache initialised
[info] Init: Initializing (virtual) servers for SSL
[info] mod_ssl/2.2.15 compiled against Server: Apache/2.2.15, Library:
OpenSSL/1.0.0-fips
[info] mod_wsgi (pid=10403): Initializing Python.
[debug] mod_wsgi.c(11355): mod_wsgi (pid=10403): Reset default user
for daemon process group 'quicklearn' to uid=48.
[debug] mod_wsgi.c(11364): mod_wsgi (pid=10403): Reset default group
for daemon process group 'quicklearn' to gid=48.
[debug] mod_wsgi.c(9958): mod_wsgi (pid=10403): Socket for
'quicklearn' is '/etc/httpd/run/wsgi.10403.10.1.sock'.
[info] mod_wsgi (pid=8334): Starting process 'quicklearn' with uid=48,
gid=48 and threads=16.
[info] mod_wsgi (pid=8335): Starting process 'quicklearn' with uid=48,
gid=48 and threads=16.
[info] mod_wsgi (pid=8336): Starting process 'quicklearn' with uid=48,
gid=48 and threads=16.
[info] mod_wsgi (pid=8337): Starting process 'quicklearn' with uid=48,
gid=48 and threads=16.
[info] mod_wsgi (pid=8338): Starting process 'quicklearn' with uid=48,
gid=48 and threads=16.
[info] mod_wsgi (pid=8339): Starting process 'quicklearn' with uid=48,
gid=48 and threads=16.
In the virtualhost error log it shows:
[error] [client 172.16.0.2] Premature end of script headers:
pinax.wsgi, referer: http://xxxx/lms/courses/
[info] mod_wsgi (pid=25667): Attach interpreter ''.
[info] mod_wsgi (pid=25667): Adding '/var/quicklearn/env/lib/python2.6/
site-packages' to path.
[info] mod_wsgi (pid=25667): Adding '/var/quicklearn/trunk' to path.
[info] mod_wsgi (pid=25667): Create interpreter 'myvhost2.com|'.
[info] mod_wsgi (pid=25667): Adding '/var/quicklearn/env/lib/python2.6/
site-packages' to path.
[info] mod_wsgi (pid=25667): Adding '/var/quicklearn/trunk' to path.
[info] [client 172.16.0.2] mod_wsgi (pid=25667, process='quicklearn',
application='myvhost2.com|'): Loading WSGI script '/var/quicklearn/
trunk/deploy/pinax.wsgi'.
The virtualhost configuration for myvhost2.com is:
WSGIDaemonProcess quicklearn processes=6 threads=16 python-path=/var/
quicklearn/env/lib/python2.6/site-packages:/var/quicklearn/trunk
WSGIProcessGroup quicklearn
WSGIScriptAlias / /var/quicklearn/trunk/deploy/pinax.wsgi
<Directory /var/quicklearn/trunk>
Order deny,allow
Allow from all
</Directory>
mod_python is not installed.
I've tried most of the debugging techniques for mod_wsgi but i've not
found anything relevant for solving this.I verified shared libraries,
expat library etc. Maybe i'm overlooking something, but for a week
i've been dedicated to this issue and i'm lost.
What can i do to solve this?
Sorry for my bad english
Thank you very much
Thanks for your quick response.
This message appears randomly, i'm not doing apache restart. I have
been doing stress tests on this application and this is the only way
that i can easily reproduce this issue on this server.
First i thougth that it was a problem with concurrent users, but when
there was only one user logged in , the message appeared too.
--
Caro
On 21 nov, 15:01, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
> Can you clarify whether the message about premature end of script headers
> is on every request or only when you perform an apache restart? If on
> restart confirm that you are doing a graceful restart.
>
> Thanks.
>
> Graham
>
>
>
>
>
>
>
>
>
> On Tuesday, 22 November 2011, caroline <carolina.sam...@gmail.com> wrote:
> > Hello
> > I have a pinax application that have been installed in a server with
> > the following features:
>
> > Linux version 2.6.32-71.29.1.el6.x86_64
> > (mockbu...@c6b5.bsys.dev.centos.org) (gcc version 4.4.4 20100726 (Red
Have a very good look through your main Apache error log (not virtual
host) for messages like 'segmentation fault'.
For it to happen randomly, would indicate that the mod_wsgi daemon
process is crashing.
If this is the only Python web application being hosted add to Apache
configuration:
WSGIApplicationGroup %{GLOBAL}
This can work around issues with third party C extension modules for
Python which aren't written to work properly in Python sub
interpreters within a process. The directive will force use of main
Python interpreter within process.
Graham
I've made the change you mentioned on the main apache configuration
file, but now the error message of 'premature end of script headers'
appears almost constantly. It appears like this in the vhost error
log:
[Tue Nov 22 09:21:19 2011] [error] [client 172.16.0.2] Premature end
of script headers: pinax.wsgi, referer: http:///www.myvhost2.com/account/login/?next=/
[Tue Nov 22 09:21:19 2011] [error] [client 172.16.0.2] Premature end
of script headers: pinax.wsgi, referer: http://www.myvhost2.com/lms/courses/
[Tue Nov 22 09:21:19 2011] [error] [client 172.16.0.2] Premature end
of script headers: pinax.wsgi, referer: http://www.myvhost2.com/lms/courses/
[Tue Nov 22 09:21:20 2011] [info] mod_wsgi (pid=25450): Attach
interpreter ''.
[Tue Nov 22 09:21:20 2011] [info] mod_wsgi (pid=25450): Adding '/var/
quicklearn/env/lib/python2.6/site-packages' to path.
[Tue Nov 22 09:21:20 2011] [info] mod_wsgi (pid=25450): Adding '/var/
quicklearn/trunk' to path.
[Tue Nov 22 09:21:21 2011] [info] [client 172.16.0.2] mod_wsgi
(pid=25450, process='quicklearn', application=''): Loading WSGI script
'/var/quicklearn/trunk/deploy/pinax.wsgi'.
And in the apache main error log at the same time :
[Tue Nov 22 09:20:31 2011] [info] mod_wsgi (pid=25149): Process
'quicklearn' has died, restarting.
[Tue Nov 22 09:20:31 2011] [info] mod_wsgi (pid=25413): Starting
process 'quicklearn' with uid=48, gid=48 and threads=16.
[Tue Nov 22 09:21:20 2011] [info] mod_wsgi (pid=25145): Process
'quicklearn' has died, restarting.
[Tue Nov 22 09:21:20 2011] [info] mod_wsgi (pid=25450): Starting
process 'quicklearn' with uid=48, gid=48 and threads=16.
[Tue Nov 22 09:21:32 2011] [debug] proxy_util.c(1818): proxy: grabbed
scoreboard slot 0 in child 25475 for worker proxy:reverse
[Tue Nov 22 09:21:32 2011] [debug] proxy_util.c(1837): proxy: worker
proxy:reverse already initialized
[Tue Nov 22 09:21:32 2011] [debug] proxy_util.c(1934): proxy:
initialized single connection worker 0 in child 25475 for (*)
[Tue Nov 22 09:22:31 2011] [debug] proxy_util.c(1818): proxy: grabbed
scoreboard slot 0 in child 25493 for worker proxy:reverse
[Tue Nov 22 09:22:31 2011] [debug] proxy_util.c(1837): proxy: worker
proxy:reverse already initialized
[Tue Nov 22 09:22:31 2011] [debug] proxy_util.c(1934): proxy:
initialized single connection worker 0 in child 25493 for (*)
I have this on the main apache configuration file:
WSGILazyInitialization Off
WSGIRestrictEmbedded On
WSGIApplicationGroup %{GLOBAL}
But it's not working. And i've searched for the segmentation fault
message but it doesn't appear anywhere.
What more could i do?
Thanks
On 21 nov, 18:52, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
Anything prior to this in same time frame?
> And in the apache main error log at the same time :
>
> [Tue Nov 22 09:20:31 2011] [info] mod_wsgi (pid=25149): Process
> 'quicklearn' has died, restarting.
Were there any messages in main Apache error log prior to this in same
time frame.
It may not be segmentation fault. It is generally though a message
which doesn't have the date/time prefix as output at a lower level
than Apache logging API. Provide the preceding messages so I can look
through.
If you operating system has a crash logs directory you might look in
there for evidence of crashing apache/httpd processes.
If the problem is now more frequent, try and work out if it is a
specific URL. Look in Apache access logs to see if 500 error status
for same URL a lot of the time. In other words, try and reliably
duplicate it.
Being able to duplicate it, you could then attach gdb to a mod_wsgi
daemon process and try and catch the crash. Details on how to do that
can be found in:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_Crashes_With_GDB
> [Tue Nov 22 09:20:31 2011] [info] mod_wsgi (pid=25413): Starting
> process 'quicklearn' with uid=48, gid=48 and threads=16.
> [Tue Nov 22 09:21:20 2011] [info] mod_wsgi (pid=25145): Process
> 'quicklearn' has died, restarting.
> [Tue Nov 22 09:21:20 2011] [info] mod_wsgi (pid=25450): Starting
> process 'quicklearn' with uid=48, gid=48 and threads=16.
> [Tue Nov 22 09:21:32 2011] [debug] proxy_util.c(1818): proxy: grabbed
> scoreboard slot 0 in child 25475 for worker proxy:reverse
> [Tue Nov 22 09:21:32 2011] [debug] proxy_util.c(1837): proxy: worker
> proxy:reverse already initialized
> [Tue Nov 22 09:21:32 2011] [debug] proxy_util.c(1934): proxy:
> initialized single connection worker 0 in child 25475 for (*)
> [Tue Nov 22 09:22:31 2011] [debug] proxy_util.c(1818): proxy: grabbed
> scoreboard slot 0 in child 25493 for worker proxy:reverse
> [Tue Nov 22 09:22:31 2011] [debug] proxy_util.c(1837): proxy: worker
> proxy:reverse already initialized
> [Tue Nov 22 09:22:31 2011] [debug] proxy_util.c(1934): proxy:
> initialized single connection worker 0 in child 25493 for (*)
>
> I have this on the main apache configuration file:
>
> WSGILazyInitialization Off
> WSGIRestrictEmbedded On
BTW, you sure you don't mean:
WSGILazyInitialization On
WSGIRestrictEmbedded On
You generally wouldn't want to turn off lazy initialisation as Python
interpreter leaks memory into Apache parent process on restarts if
lazy initialisation is turned off.
Graham
Is there a list of known/suspected third party C extension modules
that don't work well with multiple sub interpreters due to use of the
simplified threading API? I'm using numpy and scipy, and so was the OP
of the thread from a few weeks ago entitled "scipy import problem in
apache-mod_wsgi environment". The mod_wsgi docs mention a few
extensions that may have problems (dbxml, Xapian, possibly Subversion)
though they do not list scipy/numpy:
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API
Thanks,
William
On Nov 21, 6:52 pm, Graham Dumpleton <graham.dumple...@gmail.com>
wrote:
On 22 nov, 17:20, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
In the apache error log file i only found messages like this:
[Wed Nov 23 17:01:59 2011] [info] mod_wsgi (pid=25954): Process
'quicklearn' has died, restarting.
[Wed Nov 23 17:01:59 2011] [info] mod_wsgi (pid=14133): Starting
process 'quicklearn' with uid=48, gid=48 and threads=16.
[Wed Nov 23 17:02:09 2011] [debug] proxy_util.c(1818): proxy: grabbed
scoreboard slot 0 in child 14165 for worker proxy:reverse
[Wed Nov 23 17:02:09 2011] [debug] proxy_util.c(1837): proxy: worker
proxy:reverse already initialized
[Wed Nov 23 17:02:09 2011] [debug] proxy_util.c(1934): proxy:
initialized single connection worker 0 in child 14165 for (*)
[Wed Nov 23 17:02:44 2011] [debug] proxy_util.c(1818): proxy: grabbed
scoreboard slot 0 in child 14169 for worker proxy:reverse
[Wed Nov 23 17:02:44 2011] [debug] proxy_util.c(1837): proxy: worker
proxy:reverse already initialized
[Wed Nov 23 17:02:44 2011] [debug] proxy_util.c(1934): proxy:
initialized single connection worker 0 in child 14169 for (*)
[Wed Nov 23 17:03:38 2011] [debug] proxy_util.c(1818): proxy: grabbed
scoreboard slot 0 in child 14187 for worker proxy:reverse
[Wed Nov 23 17:03:38 2011] [debug] proxy_util.c(1837): proxy: worker
proxy:reverse already initialized
[Wed Nov 23 17:03:38 2011] [debug] proxy_util.c(1934): proxy:
initialized single connection worker 0 in child 14187 for (*)
[Wed Nov 23 17:04:20 2011] [info] mod_wsgi (pid=25953): Process
'quicklearn' has died, restarting.
[Wed Nov 23 17:04:20 2011] [info] mod_wsgi (pid=14198): Starting
process 'quicklearn' with uid=48, gid=48 and threads=16.
[Wed Nov 23 17:04:23 2011] [debug] proxy_util.c(1818): proxy: grabbed
scoreboard slot 0 in child 14222 for worker proxy:reverse
[Wed Nov 23 17:04:23 2011] [debug] proxy_util.c(1837): proxy: worker
proxy:reverse already initialized
[Wed Nov 23 17:04:23 2011] [debug] proxy_util.c(1934): proxy:
initialized single connection worker 0 in child 14222 for (*)
[Wed Nov 23 17:04:48 2011] [debug] proxy_util.c(1818): proxy: grabbed
scoreboard slot 0 in child 14227 for worker proxy:reverse
[Wed Nov 23 17:04:48 2011] [debug] proxy_util.c(1837): proxy: worker
proxy:reverse already initialized
[Wed Nov 23 17:04:48 2011] [debug] proxy_util.c(1934): proxy:
initialized single connection worker 0 in child 14227 for (*)
[Wed Nov 23 17:04:51 2011] [debug] proxy_util.c(1818): proxy: grabbed
scoreboard slot 0 in child 14228 for worker proxy:reverse
[Wed Nov 23 17:04:51 2011] [debug] proxy_util.c(1837): proxy: worker
proxy:reverse already initialized
[Wed Nov 23 17:04:51 2011] [debug] proxy_util.c(1934): proxy:
initialized single connection worker 0 in child 14228 for (*)
[Wed Nov 23 17:05:05 2011] [info] mod_wsgi (pid=14198): Process
'quicklearn' has died, restarting.
[Wed Nov 23 17:05:05 2011] [info] mod_wsgi (pid=14238): Starting
process 'quicklearn' with uid=48, gid=48 and threads=16.
>
> It may not be segmentation fault. It is generally though a message
> which doesn't have the date/time prefix as output at a lower level
> than Apache logging API. Provide the preceding messages so I can look
> through.
>
I've searched for messages like this on logs but i haven't found
anything like you mention.
> If you operating system has a crash logs directory you might look in
> there for evidence of crashing apache/httpd processes.
>
> If the problem is now more frequent, try and work out if it is a
> specific URL. Look in Apache access logs to see if 500 error status
> for same URL a lot of the time. In other words, try and reliably
> duplicate it.
>
> Being able to duplicate it, you could then attach gdb to a mod_wsgi
> daemon process and try and catch the crash. Details on how to do that
> can be found in:
>
> http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_C...
>
>
>
I'm going to apply this technique, and i'll inform you about its
results.
Thanks about that! i changed this too
> ...
>
> leer más »
--
Caro
Only what you found. I have added:
http://code.google.com/p/modwsgi/issues/detail?id=252
to remind me to do such a thing and for people to add others to.
Graham
Only other thing can think of is that something is sending a single to
daemon processes and killing them. This can't just be SIGINT or SIGHUP
though as the daemon process in those cases would log something and it
isn't that I can see from the parts of the virtual host log file you
have posted.
BTW, you might consider New Relic (www.newrelic.com) trial to see if
it that sheds any light. Both the Python agent and server monitoring.
Disclaimer. I work at New Relic and am writing the Python agent. The
pro trial you get on initial sign up and before it drops down to free
lite subscription still could be useful to help track things down,
although if processes are dying suddenly then may not.
Graham
I've made some more tests, using the debugging technique that you
wrote (with gdb), and when i put threads=1 the premature end of script
headers message doesn't appear anymore, so i couldn't see anything
relevant on gdb, Is this normal?. I repeatedly tested the application
without any values or with threads greater than 1 in the wsgi daemon
process definition (without using gdb of course) but the message
appeared again, so now i'm kind of confused.
Why could be the reason for the message doesn't appearing when
threads=1?
Thanks for your help
--
On 23 nov, 18:01, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
> Do you run any sort of monitoring system which kills processes
> automatically if memory exceeds some set amount?
>
> Only other thing can think of is that something is sending a single to
> daemon processes and killing them. This can't just be SIGINT or SIGHUP
> though as the daemon process in those cases would log something and it
> isn't that I can see from the parts of the virtual host log file you
> have posted.
>
> BTW, you might consider New Relic (www.newrelic.com) trial to see if
> it that sheds any light. Both the Python agent and server monitoring.
>
> Disclaimer. I work at New Relic and am writing the Python agent. The
> pro trial you get on initial sign up and before it drops down to free
> lite subscription still could be useful to help track things down,
> although if processes are dying suddenly then may not.
>
> Graham
>
> ...
>
> leer más »
BTW, you didn't need to force single thread to attach gdb to try and
capture crash, but perhaps fortuitous you did as has narrowed down
what the problem is. :-)
Graham
What can i do for detecting which c extension module isn't thread
safe ?
Caro
On 26 nov, 23:21, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
> You could be using a C extension module which isn't thread safe.
>
> BTW, you didn't need to force single thread to attach gdb to try and
> capture crash, but perhaps fortuitous you did as has narrowed down
> what the problem is. :-)
>
> Graham
>
> >> >> >> On 22 November 2011 09:35, caroline...
>
> leer más »
Also list what third party Python packages you are using which have a
C extension component. Can do this by looking for any packages in your
site-packages which have a .so file.
Graham
I've finally found the solution to this issue. It was cx_Oracle module
that was causing the process to crash. Because the default value of
this module is threaded=False, then i had to add this parameter in
the database configuration of our application:
'OPTIONS': {
'threaded': True,
}
It was documented in the django-oracle documentation but i didn't
remember to put this.
Thank you very much for all your help.
Caro
On 27 nov, 17:30, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
> Use gdb as discussed before to try and capture a stack trace of where
> problem occurs. Only way you can even start to work it out.
>
> Also list what third party Python packages you are using which have a
> C extension component. Can do this by looking for any packages in your
> site-packages which have a .so file.
>
> Graham
>
> >> >> >> > proxy:reverse already...
>
> leer más »