Bleeding Edge Update: 228


Graham Dumpleton

May 20, 2007, 7:02:38 AM
to mod...@googlegroups.com
For those who like to walk on the wild side, I have a significant
bleeding edge update to mod_wsgi you might be interested in trying. If
you want something that is stable, keep using 215 for now.

The big change in this update is the first pass at support for running
your WSGI applications in one or more separate daemon processes,
optionally running as a different user from the one Apache would
normally run its child processes as when started as root. In other
words, it offers functionality similar to that of the FASTCGI and SCGI
solutions, but is hopefully easier to set up and manage.

Note that although the intention is to eventually support being able
to use multiple threads within the daemon processes, this first go at
it only supports single threading. Thus you will probably want to
start multiple processes in the daemon process group as shown below so
that multiple requests can still be handled in parallel.

Two new Apache directives are provided in the mod_wsgi module to
configure and set up use of the daemon processes. The names of the
directives might yet change, but at the moment they are as described
below.

WSGIStartDaemon - This is used to start up a group of daemon
processes. This directive must be defined at global scope within the
main Apache configuration files, i.e., outside of any VirtualHost
directives or other container directives. An example of its use would
be:

WSGIStartDaemon django user=grahamd group=grahamd processes=5 threads=1

This will result in five processes being created in a group called
'django', each with a single thread for now, and running as user/group
of grahamd.

WSGIProcessGroup - Indicates which daemon process group to use for an
application. This directive can be defined in the main Apache
configuration file at global scope, within a VirtualHost container, or
in either Directory or Location containers. An example of its use
would be:

WSGIProcessGroup django

This will result in any requests going to a WSGI application managed
by mod_wsgi within that scope being handled by the separate daemon
processes created in the group named 'django'.

A complete configuration for a virtual host implemented with Django
might therefore be:

WSGIStartDaemon django user=grahamd group=grahamd processes=5 threads=1

<VirtualHost www.mysite.com:80>

    WSGIProcessGroup django

    AliasMatch ^/([^/]+)/media/(.*) /usr/local/django/$1/media/$2

    <DirectoryMatch ^/usr/local/django/([^/]+)/media>
        Order deny,allow
        Allow from all
    </DirectoryMatch>

    WSGIScriptAliasMatch ^/([^/]+) /usr/local/django/$1/apache/$1.wsgi

    <DirectoryMatch ^/usr/local/django/([^/]+)/apache>
        Order deny,allow
        Allow from all
    </DirectoryMatch>

</VirtualHost>
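
For reference, each .wsgi file matched by WSGIScriptAliasMatch is an ordinary Python file that exposes a WSGI callable named 'application'. A minimal sketch follows, written for a current Python 3 (the Python 2.x versions discussed in this thread would return a plain str body); the greeting text is purely illustrative and is not taken from any real site:

```python
def application(environ, start_response):
    """Minimal WSGI callable; mod_wsgi looks for the name 'application'."""
    body = b"Hello from a mod_wsgi daemon process\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # A WSGI application returns an iterable of byte strings.
    return [body]
```

Whether the script runs in an Apache child process or in a daemon process group is decided by the configuration alone, so the script itself should not need to change.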

Note that this feature is only available if using Apache 2.X and is
not available if using Apache 1.3. The feature will also not be
supported on Windows even if someone steps up to help me out with some
build scripts for Windows, as it relies on UNIX specific features.
There is a chance the feature also may not be available on Apache 2.X,
but that would only be the case if someone had deliberately built the
APR library used by Apache with certain features for managing child
processes disabled.

When this feature is used, if any of the daemon processes die or exit,
the Apache parent process should automatically start a new process
in its place just like it would for any other Apache child process. If
you yourself need to force your daemon processes to exit and thus be
restarted, without Apache as a whole needing to be restarted, you can
send a SIGTERM signal to each of the daemon processes.

Because the daemon processes can be run as a distinct user, sending
signals in this way gives you, as that user and without requiring root
privileges, the ability to trigger a reload of your application in
order to pick up changes to any Python code. You could even code a
page within your application which looks for all Apache httpd
processes owned by the same user and sends them all SIGTERM to effect
the restart of your WSGI application.
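
A rough sketch of that idea as a standalone helper, written for a current Python 3. It assumes a 'ps' that accepts repeated -o options (as on Linux and Mac OS X) and that the daemon processes show up under the command name 'httpd'; the function names are hypothetical and are not part of mod_wsgi:

```python
import os
import signal
import subprocess


def parse_ps_output(text, uid, command="httpd"):
    """Return the PIDs in 'pid uid command' lines owned by uid and
    whose command name (ignoring any directory part) equals command."""
    pids = []
    for line in text.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid, owner_uid, comm = fields
        if int(owner_uid) == uid and os.path.basename(comm.strip()) == command:
            pids.append(int(pid))
    return pids


def terminate_daemons(uid, command="httpd"):
    """Send SIGTERM to every matching process; return the PIDs signalled."""
    output = subprocess.check_output(
        ["ps", "-e", "-o", "pid=", "-o", "uid=", "-o", "comm="], text=True
    )
    signalled = []
    for pid in parse_ps_output(output, uid, command):
        try:
            os.kill(pid, signal.SIGTERM)
            signalled.append(pid)
        except ProcessLookupError:
            # The process exited between listing and signalling.
            pass
    return signalled
```

Running terminate_daemons(os.getuid()) as the daemon user would signal every httpd process that user owns, so it only makes sense when that user runs nothing else under the same command name.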

At the moment the code has only been tested on Apache 2.2.4 (worker)
on Mac OS X with Python 2.3.5. Tomorrow I'll do some further testing
on Ubuntu and later some testing with prefork Apache.

As well as adding support for multiple threads in daemon processes,
still need to work out how to get logging in daemon processes to go to
the correct virtual host log file, currently will go to main log file,
or allow a special log file just for the process group to be defined.
Also need to look at issues such as setting working directory for
daemon process and simulating user environment so stuff like HOME
directory resolves correctly thereby allowing Python eggs to unpack
into a user account cache without special action.

Hopefully these changes are of interest and don't crash too much for
you. If you have any comments, feedback or problems, then please let
me know through the list.

Have fun.

Graham

Graham Dumpleton

May 20, 2007, 6:34:26 PM
to mod...@googlegroups.com
On 20/05/07, Graham Dumpleton <graham.d...@gmail.com> wrote:
> For those who like to walk on the wild side, I have a significant
> bleeding edge update to mod_wsgi you might be interested in trying. If
> you want something that is stable, keep using 215 for now.

Too bleeding edge. Although it works fine on MacOS X there are a
number of problems on Ubuntu ranging from undefined symbols to accept
mutex locking problems. :-(

Thus, perhaps hold off a little for now.

Graham

Graham Dumpleton

May 21, 2007, 1:23:49 AM
to mod...@googlegroups.com

Okay, problem was embarrassingly stupid and obvious, although why Mac
OS X didn't have a problem with it I don't know.

In short, it doesn't help if when dealing with a mutex one goes:

unlock
do stuff
unlock

instead of:

lock
do stuff
unlock
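
In Python terms the correct ordering looks like the sketch below. It uses threading.Lock purely as an analogy for the C-level mutex in mod_wsgi; the shared counter and the thread counts are illustrative only:

```python
import threading

counter_lock = threading.Lock()
counter = 0


def increment(times):
    """Increment the shared counter, holding the lock around each update."""
    global counter
    for _ in range(times):
        counter_lock.acquire()      # lock
        try:
            counter += 1            # do stuff
        finally:
            counter_lock.release()  # unlock


threads = [threading.Thread(target=increment, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Python's threading.Lock raises RuntimeError if it is released while unlocked, which makes the unlock/unlock mistake easy to spot there.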

Revision 230 should be okay to give a go now. :-)

Graham

Manuzhai

May 21, 2007, 5:40:15 AM
to mod...@googlegroups.com
On 5/20/07, Graham Dumpleton <graham.d...@gmail.com> wrote:
> For those who like to walk on the wild side, I have a significant
> bleeding edge update to mod_wsgi you might be interested in trying. If
> you want something that is stable, keep using 215 for now.

Very cool! I tried it out, but it didn't work for me (this is actually
using the new feature):

[Mon May 21 11:35:00 2007] [error] mod_wsgi (pid=5497): Process 'trac' has died, restarting.
[Mon May 21 11:35:00 2007] [notice] mod_wsgi (pid=5505): Starting process 'trac' with uid=1005 and gid=1005.
[Mon May 21 11:35:00 2007] [notice] mod_wsgi (pid=5505): Attach interpreter ''.
[Mon May 21 11:35:00 2007] [notice] mod_wsgi (pid=5498): Create interpreter 'trac.xavamedia.nl|'.
Fatal Python error: PyEval_AcquireThread: non-NULL old thread state
[Mon May 21 11:35:00 2007] [error] [client 89.98.241.141] Premature end of script headers: trac.wsgi
[Mon May 21 11:35:01 2007] [notice] child pid 5498 exit signal Aborted (6)
[Mon May 21 11:35:01 2007] [error] mod_wsgi (pid=5498): Process 'trac' has died, restarting.
[Mon May 21 11:35:01 2007] [notice] mod_wsgi (pid=5506): Starting process 'trac' with uid=1005 and gid=1005.
[Mon May 21 11:35:01 2007] [notice] mod_wsgi (pid=5506): Attach interpreter ''.

My config:

WSGIStartDaemon trac user=trac group=trac processes=2 threads=1
WSGIScriptAlias / /var/trac/wsgi/trac.wsgi
WSGIProcessGroup trac

(I just added the new directives, the first at the global level, the
second was already in the vhost, and I added the third to the vhost
conf.)

This is with r232, by the way.

I was wondering if you could add the new directives to the
ConfigurationDirectives wiki page (probably with some big fat
disclaimer) so I can see what the default options are. Also, I first
tried to use user=root, that obviously doesn't work. And, have you
tried to benchmark this against "normal" mod_wsgi performance? The
permanently running daemons should make mod_wsgi faster still, right?

Regards,

Manuzhai

Graham Dumpleton

May 21, 2007, 6:59:11 AM
to mod...@googlegroups.com
On 21/05/07, Manuzhai <manu...@gmail.com> wrote:
>
> On 5/20/07, Graham Dumpleton <graham.d...@gmail.com> wrote:
> > For those who like to walk on the wild side, I have a significant
> > bleeding edge update to mod_wsgi you might be interested in trying. If
> > you want something that is stable, keep using 215 for now.
>
> Very cool! I tried it out, but it didn't work for me (this is actually
> using the new feature):
>
> [Mon May 21 11:35:00 2007] [error] mod_wsgi (pid=5497): Process 'trac'
> has died, restarting.
> [Mon May 21 11:35:00 2007] [notice] mod_wsgi (pid=5505): Starting
> process 'trac' with uid=1005 and gid=1005.
> [Mon May 21 11:35:00 2007] [notice] mod_wsgi (pid=5505): Attach interpreter ''.
> [Mon May 21 11:35:00 2007] [notice] mod_wsgi (pid=5498): Create
> interpreter 'trac.xavamedia.nl|'.
> Fatal Python error: PyEval_AcquireThread: non-NULL old thread state
> [Mon May 21 11:35:00 2007] [error] [client 89.98.241.141] Premature
> end of script headers: trac.wsgi
> [Mon May 21 11:35:01 2007] [notice] child pid 5498 exit signal Aborted (6)
> [Mon May 21 11:35:01 2007] [error] mod_wsgi (pid=5498): Process 'trac'
> has died, restarting.
> [Mon May 21 11:35:01 2007] [notice] mod_wsgi (pid=5506): Starting
> process 'trac' with uid=1005 and gid=1005.
> [Mon May 21 11:35:01 2007] [notice] mod_wsgi (pid=5506): Attach interpreter ''.

Hmmm, Python threading problems are the last thing I would have expected to see.

What operating system are you using? What version of Python?

Did you do a 'stop' and then 'start' of Apache as opposed to a
'restart' to ensure that old mod_wsgi was unloaded properly?

Is mod_python being loaded at the same time? Can you disable
mod_python if it is?

> My config:
>
> WSGIStartDaemon trac user=trac group=trac processes=2 threads=1
> WSGIScriptAlias / /var/trac/wsgi/trac.wsgi
> WSGIProcessGroup trac
>
> (I just added the new directives, the first at the global level, the
> second was already in the vhost, and I added the third to the vhost
> conf.)
>
> This is with r232, by the way.

If that is the latest, it should be okay, as I added some changes to
fix issues with the accept mutex when shutting down.

> I was wondering if you could add the new directives to the
> ConfigurationDirectives wiki page (probably with some big fat
> disclaimer) so I can see what the default options are.

Didn't want to document it until confident it worked okay across
various platforms and until am sure about names I'll use for
directives.

> Also, I first
> tried to use user=root, that obviously doesn't work

Correct, am currently blocking processes from being run as root
because of security implications. Not sure whether I really need to be
that cautious.

> And, have you
> tried to benchmark this against "normal" mod_wsgi performance? The
> permanently running daemons should make mod_wsgi faster still, right?

It will not be quicker as request is still initially handled by Apache
child process. The request is then proxied across to the daemon
process. Thus, operates similar to fastcgi and scgi albeit perhaps
with a bit less overhead.

Thus you sacrifice some speed to get the ability to run stuff as a
separate user and dedicate processes to specific WSGI applications.
I'm on the wrong computer to do further tests at this point but, from
memory, it was running at similar speeds to mod_python on Mac OS X,
although that really depends on the process configuration used. In the
context of a large web framework, the difference is still going to be
negligible, as the web framework or database is always the bottleneck.

I'll see if I can hop on to my other computer and get onto my FC2 box
and see if it works there.

Anyone else have any success or otherwise?

Graham

Manuzhai

May 21, 2007, 7:29:54 AM
to mod...@googlegroups.com
On 5/21/07, Graham Dumpleton <graham.d...@gmail.com> wrote:
> What operating system are you using? What version of Python?

This is on Gentoo Linux (up-to-date 2007.0). Python 2.4.4 on Apache 2.0.58.

> Did you do a 'stop' and then 'start' of Apache as opposed to a
> 'restart' to ensure that old mod_wsgi was unloaded properly?

Yep (Gentoo has its own startup scripts with a restart option that
doesn't do the -k graceful that seems to be the thing you are
referring to).

> Is mod_python being loaded at the same time? Can you disable
> mod_python if it is?

That fixed the problem. With mod_python it still gives the same error,
without it it seems to work.

> It will not be quicker as request is still initially handled by Apache
> child process. The request is then proxied across to the daemon
> process. Thus, operates similar to fastcgi and scgi albeit perhaps
> with a bit less overhead.

Okay. For my understanding: does mod_wsgi keep the WSGI script and
everything it is referring to in memory? So that I could do some
expensive things at startup only and it will only do it once per
interpreter, and in principle work for a large number of requests?

Regards,

Manuzhai

Graham Dumpleton

May 21, 2007, 7:48:40 AM
to mod...@googlegroups.com
On 21/05/07, Manuzhai <manu...@gmail.com> wrote:
>
> On 5/21/07, Graham Dumpleton <graham.d...@gmail.com> wrote:
> > What operating system are you using? What version of Python?
>
> This is on Gentoo Linux (up-to-date 2007.0). Python 2.4.4 on Apache 2.0.58.
>
> > Did you do a 'stop' and then 'start' of Apache as opposed to a
> > 'restart' to ensure that old mod_wsgi was unloaded properly?
>
> Yep (Gentoo has its own startup scripts with a restart option that
> doesn't do the -k graceful that seems to be the thing you are
> referring to).

A 'restart', whether it be 'graceful' or not, doesn't actually shut
down the main Apache process; instead it tries to unload the Apache
modules it has loaded and then reload them. This can sometimes cause
strange problems when mod_python is also loaded, as mod_python doesn't
correctly shut itself down and clean up the Python interpreter. Also,
if there are significant changes to an Apache module, a 'restart' can
give problems. Thus it is always a good idea when upgrading a module
version to perform an actual 'stop' and 'start' so that Apache is
completely shut down.

Still not sure how that relates to your Gentoo startup script. :-)

> > Is mod_python being loaded at the same time? Can you disable
> > mod_python if it is?
>
> That fixed the problem. With mod_python it still gives the same error,
> without it it seems to work.

I am going to guess that it is the problem discussed in:

http://groups.google.com/group/modwsgi/browse_thread/thread/2d577e1bee151500

and also mentioned in latest README and InstallationIssues pages:

http://modwsgi.googlecode.com/svn/trunk/README
http://code.google.com/p/modwsgi/wiki/InstallationIssues

The issue is summarised in 'Using ModPython and ModWsgi' of the
InstallationIssues documentation.

In short, your Python version possibly doesn't have a shared library,
so a separate static copy of the Python library ends up in both
mod_python.so and mod_wsgi.so. This would result in the errors you
were seeing.

If you need to run both mod_python and mod_wsgi at same time, use the
workaround of not linking the Python library into mod_wsgi as
described in that document.

> > It will not be quicker as request is still initially handled by Apache
> > child process. The request is then proxied across to the daemon
> > process. Thus, operates similar to fastcgi and scgi albeit perhaps
> > with a bit less overhead.
>
> Okay. For my understanding: does mod_wsgi keep the WSGI script and
> everything it is referring to in memory?

Yes.

> So that I could do some
> expensive things at startup only and it will only do it once per
> interpreter, and in principle work for a large number of requests?

Except that the WSGI script is only loaded on the first request that
arrives for it, not when the process is started. Once loaded it stays
put for life of the process.
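
A minimal sketch of what that means for a script, written for a current Python 3. The expensive_setup function and the LOAD_COUNT counter are hypothetical stand-ins; the point is only that module-level code runs once, when the script is first loaded, and not again for later requests in the same interpreter:

```python
import time

LOAD_COUNT = 0


def expensive_setup():
    """Stand-in for costly start-up work (hypothetical)."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return {"loaded_at": time.time()}


# Module-level code runs once, when the script is first loaded.
STATE = expensive_setup()


def application(environ, start_response):
    # Every request reuses the state computed at load time.
    body = ("loaded at %.0f\n" % STATE["loaded_at"]).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

With several daemon processes in a group, each process pays the setup cost on its own first request.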

In other words, there is no equivalent to the PythonImport directive
in mod_python to force a module to be loaded when the process first
starts. I am not overly keen on adding such a feature, as it is useless
in web hosting environments where you will not have access to the main
Apache configuration anyway.

Graham

Graham Dumpleton

May 21, 2007, 8:29:47 AM
to mod...@googlegroups.com
On 21/05/07, Graham Dumpleton <graham.d...@gmail.com> wrote:
> > That fixed the problem. With mod_python it still gives the same error,
> > without it it seems to work.
>
> I am going to guess that it is the problem discussed in:
>
> http://groups.google.com/group/modwsgi/browse_thread/thread/2d577e1bee151500
>
> and also mentioned in latest README and InstallationIssues pages:
>
> http://modwsgi.googlecode.com/svn/trunk/README
> http://code.google.com/p/modwsgi/wiki/InstallationIssues
>
> The issue is summarised in 'Using ModPython and ModWsgi' of the
> InstallationIssues documentation.
>
> In short, your Python version possibly doesn't have a shared library,
> so a separate static copy of the Python library ends up in both
> mod_python.so and mod_wsgi.so. This would result in the errors you
> were seeing.
>
> If you need to run both mod_python and mod_wsgi at same time, use the
> workaround of not linking the Python library into mod_wsgi as
> described in that document.

Not the same problem as I get the same thing on FC2 and I am already
using the workaround of not linking in Python library to mod_wsgi.so.

[Mon May 21 08:15:00 2007] [notice] Apache/2.0.51 (Fedora) configured -- resuming normal operations
[Mon May 21 08:15:16 2007] [warn] (95)Operation not supported: apr_socket_opt_set: (TCP_NODELAY)
[Mon May 21 08:15:16 2007] [notice] mod_wsgi (pid=11879): Create interpreter 'www.dscpl.com.au|/~grahamd/hello.wsgi'.

Fatal Python error: PyEval_AcquireThread: non-NULL old thread state

[Mon May 21 08:15:16 2007] [error] [client 121.44.46.5] Premature end of script headers: hello.wsgi
[Mon May 21 08:15:16 2007] [notice] child pid 11879 exit signal Aborted (6)
[Mon May 21 08:15:16 2007] [error] mod_wsgi (pid=11879): Process 'wsgi' has died, restarting.
[Mon May 21 08:15:16 2007] [notice] mod_wsgi (pid=12010): Starting process 'wsgi' with uid=501 and gid=501.
[Mon May 21 08:15:16 2007] [notice] mod_wsgi (pid=12010): Attach interpreter ''.

It only happens, though, when trying to run the WSGI application in the
separate daemon process, not in the main Apache child processes.

I think I know what the problem is. That it is occurring probably
means I never tested mod_python and mod_wsgi together in this
configuration before, which is bad of me.

In short, the way mod_python initialises Python leaves an active
thread state around, which would normally be eliminated when the
mod_python child process init function is run. In the mod_wsgi daemon
process that mod_python function isn't run, but mine is, and it
assumes there is no active thread state. It thus crashes when it does
something that will only work with no active thread state.

Getting late now, so will have to fix this tomorrow.

Graham

Graham Dumpleton

May 21, 2007, 7:25:21 PM
to mod...@googlegroups.com
The problem with mod_python and mod_wsgi coexisting when daemon
process is being used should now be fixed in revision 234.

Graham

Graham Dumpleton

May 22, 2007, 6:31:18 AM
to mod...@googlegroups.com
On 22/05/07, Graham Dumpleton <graham.d...@gmail.com> wrote:
> The problem with mod_python and mod_wsgi coexisting when daemon
> process is being used should now be fixed in revision 234.

FYI, I have noted, though, that if an unhandled exception propagates
back to mod_wsgi from Python code, mod_wsgi logs the error details but
then promptly crashes the daemon process. Because the main Apache
child process sees a truncated response from the daemon, the result
sent back to the web browser is still a 500 error response, so from
the point of view of the web browser nothing looks different. :-)
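
One defensive option on the application side, sketched here for a current Python 3, is a small WSGI middleware that turns unhandled exceptions into a 500 response before they ever reach the server. The name catch_errors is hypothetical and this is not something mod_wsgi provides; it only illustrates the general WSGI technique:

```python
import sys
import traceback


def catch_errors(app):
    """Wrap a WSGI app so unhandled exceptions become a 500 response."""
    def wrapper(environ, start_response):
        try:
            result = app(environ, start_response)
            try:
                # Materialise the body so errors raised while
                # iterating over it are caught as well.
                return list(result)
            finally:
                if hasattr(result, "close"):
                    result.close()
        except Exception:
            # Log the traceback to the WSGI error stream.
            traceback.print_exc(file=environ.get("wsgi.errors", sys.stderr))
            body = b"Internal Server Error\n"
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain"),
                            ("Content-Length", str(len(body)))],
                           sys.exc_info())
            return [body]
    return wrapper
```

A script would then expose application = catch_errors(real_application), where real_application stands for whatever callable the script already defines. Buffering the whole body this way is a simplification that would not suit large streamed responses.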

Graham

Graham Dumpleton

May 24, 2007, 3:00:53 AM
to mod...@googlegroups.com

Fixed this issue now so it doesn't crash on 500 errors. Revision to
use is now 234 (head) to play with this fixed version.

The main thing I need to still sort out before I try and add
multithreading is getting logging to go to the correct virtual host
log file for stuff logged through the wsgi error object in the daemon
process. A solution is proving rather elusive at the moment. :-(

Graham

Clodoaldo

May 26, 2007, 6:34:46 PM
to modwsgi
I'm trying revision 237 and receiving an error:

[Sat May 26 19:16:45 2007] [notice] [client 10.1.1.101] (13)Permission denied: mod_wsgi (pid=4725): Unable to connect to WSGI daemon process 'hello' on '/etc/httpd/logs/wsgi.4717.0.1.sock' after multiple attempts.

Permissions for the sock file:

srwx------ 1 apache root 0 May 26 19:16 wsgi.4717.0.1.sock

The config:

WSGIStartDaemon hello user=cpn group=cpn processes=5 threads=1
<VirtualHost 10.1.1.101:80>
    ServerName wsgiapp.dkt
    DocumentRoot /var/www/html/wsgiapp
    WSGIScriptAlias / /var/www/html/wsgiapp/
    WSGIProcessGroup hello
    <Directory /var/www/html/wsgiapp>
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>

I'm on FC 5 and, BTW, the error in the browser is a 503 Service
Temporarily Unavailable.

Regards, Clodoaldo Pinto Neto


Graham Dumpleton

May 26, 2007, 7:26:25 PM
to mod...@googlegroups.com
Set at global scope the configuration directive:

WSGISocketPrefix run/wsgi

On some platforms the 'logs' directory isn't readable to others and so
the Apache child processes cannot access the sockets in the default
Apache runtime directory (i.e., logs).

These platforms often have a 'run' directory, which through a symlink
maps to /var/run/httpd or similar. Thus, above directive tells it to
use that directory instead.

If there isn't a run directory, then simply set it to:

WSGISocketPrefix /tmp/wsgi

so the sockets are put in /tmp instead.
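
A quick way to check a candidate directory before pointing WSGISocketPrefix at it, sketched for a current Python 3. It only inspects the 'other' search (execute) bit, on the assumption that the process needing access matches neither the owner nor the group of the directory; the function name is hypothetical:

```python
import os
import stat


def others_can_search(directory):
    """True if directory exists and its 'other' execute (search) bit is
    set, which a process owned by an unrelated user needs in order to
    reach a socket file inside it."""
    mode = os.stat(directory).st_mode
    return stat.S_ISDIR(mode) and bool(mode & stat.S_IXOTH)
```

For example, others_can_search("/etc/httpd/logs") returning False would be consistent with the permission error reported above, although the permissions on the socket file itself matter as well.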

Graham

Clodoaldo

May 26, 2007, 7:56:54 PM
to mod...@googlegroups.com
2007/5/26, Graham Dumpleton <graham.d...@gmail.com>:

>
> Set at global scope the configuration directive:
>
> WSGISocketPrefix run/wsgi
>
> On some platforms the 'logs' directory isn't readable to others and so
> the Apache child processes cannot access the sockets in the default
> Apache runtime directory (i.e., logs).
>
> These platforms often have a 'run' directory, which through a symlink
> maps to /var/run/httpd or similar. Thus, above directive tells it to
> use that directory instead.
>
> If there isn't a run directory, then simply set it to:
>
> WSGISocketPrefix /tmp/wsgi
>
> so the sockets are put in /tmp instead.

It is working now. Fedora Core has a /etc/httpd/run symlink pointing
to /var/run.

Clodoaldo

Graham Dumpleton

May 27, 2007, 7:05:24 AM
to mod...@googlegroups.com
On 20/05/07, Graham Dumpleton <graham.d...@gmail.com> wrote:
> Two new Apache directives are provided in the mod_wsgi module to
> configure and set up use of the daemon processes. The names of the
> directives might yet change, but at the moment they are as described
> below.
>
> WSGIStartDaemon - This is used to start up a group of daemon
> processes. This directive must be defined at global scope within the
> main Apache configuration files, i.e., outside of any VirtualHost
> directives or other container directives. An example of its use would
> be:
>
> WSGIStartDaemon django user=grahamd group=grahamd processes=5 threads=1

Please note that I have decided to rename this directive to
WSGIDaemonProcess instead. Actually, I think that is what I called it
in the first place, before changing it to the above for a while. Thus:

WSGIDaemonProcess django user=grahamd group=grahamd processes=5 threads=1

This change applies from revision 239.

> As well as adding support for multiple threads in daemon processes,
> still need to work out how to get logging in daemon processes to go to
> the correct virtual host log file, currently will go to main log file,

Logging should now go to the correct log file when virtual hosts are being used.

Time now to get multithreading working in daemon processes as
everything else seems to be working okay, unless I screwed something
up with last lot of changes which I haven't found yet. I will also now
start documenting new directives.

FWIW, using a daemon process fits in at about 500 on my performance
scale, whereby a static page was at 1000 and mod_wsgi in embedded mode
was 900. This value of 500 still places it above mod_python. My tests
with fastcgi/flup had it performing worse than mod_python, so although
mod_wsgi daemon mode is somewhat below the faster embedded mode, it is
still better than the other options out there which work in a similar
way. :-)
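
To put those figures in relative terms, the short calculation below uses only the three scores quoted above; no figure was given for mod_python or fastcgi/flup beyond their ordering, so they are left out:

```python
# Relative scores quoted above (static page = 1000 is the baseline).
scores = {
    "static page": 1000,
    "mod_wsgi embedded": 900,
    "mod_wsgi daemon": 500,
}


def relative(name, baseline):
    """Score of name as a fraction of the score of baseline."""
    return scores[name] / scores[baseline]


print("daemon vs static:   %.0f%%"
      % (100 * relative("mod_wsgi daemon", "static page")))
print("daemon vs embedded: %.0f%%"
      % (100 * relative("mod_wsgi daemon", "mod_wsgi embedded")))
```

On these numbers daemon mode runs at half the static-page rate and a little over half the embedded-mode rate.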

Graham
