Configuring multiple Django installs transparently via FastCGI multiplexing


Brian

Nov 26, 2008, 10:06:29 PM
to Django users
Hi everyone,

Here's a question I just posted on stackoverflow.com (because I like
that forum's layout) - but I thought posting it here might lead to
more / better coverage. See: http://stackoverflow.com/questions/322694/

Multiple installs of Django - How to configure transparent multiplex
through the webserver (Lighttpd)?

Hi Everyone,

This question flows from the answer to: How does one set up multiple
accounts with separate databases for Django on one server? (http://
stackoverflow.com/questions/314515)

I haven't seen anything like this on Google or elsewhere (perhaps I
have the wrong vocabulary), so I think input could be a valuable
addition to the internet discourse.

How could one configure a server like so:

* One installation of Lighttpd
* Multiple Django projects running as FastCGI
* The Django projects may be added/removed at will, and ought not to
require restarting the webserver
* Transparent redirection of all requests/responses to a particular
Django installation depending on the current user

I.e. Given Django projects (with corresponding FastCGI socket):

* Bob (/tmp/bob.fcgi)
* Sue (/tmp/sue.fcgi)
* Joe (/tmp/joe.fcgi)

The Django projects would be started with an (oversimplified) script
like so:

#!/bin/sh
NAME=bob

SOCKET=/tmp/$NAME.fcgi

PROTO=fcgi
DAEMON=true

/django_projects/$NAME/manage.py runfcgi protocol=$PROTO socket=$SOCKET daemonize=$DAEMON
# ---- end

I want traffic to http://www.example.com/ to direct the request to the
correct Django application depending on the user that is logged in.

In other words, http://www.example.com should "be" /tmp/bob.fcgi
if bob is logged in, /tmp/joe.fcgi if joe is logged in, /tmp/sue.fcgi
if sue is logged in. If no-one is logged in, it should redirect to a
login page.

I've contemplated a demultiplexing "plexer" FastCGI script with the
following algorithm:

1. If the cookie $PLEX is set, pipe request to /tmp/$PLEX.fcgi

2. Otherwise redirect to login page (which sets the cookie PLEX based
on a many-to-one mapping of Username => PLEX)

Of course as a matter of security $PLEX should be taint checked, and
$PLEX shouldn't give rise to any presumption of trust.
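
The lookup step of that algorithm could be sketched in Python roughly as follows. This is only an illustration under assumptions: the helper name `resolve_socket`, the allowed-name pattern and the length limit are invented here, the `/tmp/<name>.fcgi` layout is taken from the example above, and the actual piping of the FastCGI request is not shown.

```python
import re
from http.cookies import SimpleCookie

# Hypothetical whitelist: lowercase letters, digits and underscore only.
_SAFE_NAME = re.compile(r"[a-z0-9_]{1,32}")


def resolve_socket(cookie_header, socket_dir="/tmp"):
    """Return the FastCGI socket path named by the PLEX cookie, or None.

    None means "no usable cookie": the caller should redirect to the
    login page instead of forwarding the request.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("PLEX")
    if morsel is None:
        return None
    name = morsel.value
    # Taint check: reject anything that could point outside socket_dir.
    if not _SAFE_NAME.fullmatch(name):
        return None
    return "%s/%s.fcgi" % (socket_dir, name)
```

A syntactically valid cookie only selects a socket; the Django instance behind that socket would still have to authenticate the user itself.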

A Lighttpd configuration would be like so (though Apache, Nginx, etc.
could be used just as easily):

fastcgi.server = ( "plexer.fcgi" =>
  ( "localhost" =>
    (
      "socket" => "/tmp/plexer.fcgi",
      "check-local" => "disable"
    )
  )
)
# ---- end

Input, thoughts, helpful links, and pointers on how to properly
implement the FastCGI plexer would all be appreciated.

Thank you.

Graham Dumpleton

Nov 26, 2008, 10:32:21 PM
to Django users


On Nov 27, 2:06 pm, Brian <brianmh...@gmail.com> wrote:
> Hi everyone,
>
> Here's a question I just posted on stackoverflow.com (because I like
> that forum's layout) - but I thought posting it here might lead to
> more / better coverage. See: http://stackoverflow.com/questions/322694/
>
> Multiple installs of Django - How to configure transparent multiplex
> through the webserver (Lighttpd)?

Are you stuck with using Lighttpd?

Can you explain the background of the situation you have that requires
such a setup? May help in working out what to suggest.

Graham

Brian

Nov 26, 2008, 11:17:11 PM
to Django users
Hi Graham,

Thanks for the reply. No, I'm not stuck using Lighttpd at all - I just
like it because it's simple and fast. :)

Here's a link to a description of what I'd like to see:
http://stackoverflow.com/questions/314515

The situation is this: I'm creating a web-site with a bunch of
accounts - each account having its own database. I want people to go
to the site (i.e. www.example.com/) and when they log in the
application gets all its requests/responses from their account's
database. For the moment, all accounts will be using the same Django
applications, though that is subject to change so I'd prefer not to
rely on a solution that precludes that possibility.

I suspect (but am happy to be corrected...) that the easiest and
safest way to do this is to have a Django instance running in FastCGI
mode with a socket for each account. When a user is logged in, their
requests/responses are mapped to/from the proper Django socket via the
multiplexing solution I've suggested in my original post.

As mentioned, accounts may crop up and disappear, and shouldn't
require restarting the web-server. There could be dozens of accounts
(which means lots of Django instances).

Is there any more information that would be helpful?

Cheers & thank you,
Brian



Graham Dumpleton

Nov 26, 2008, 11:31:46 PM
to Django users


On Nov 27, 3:17 pm, Brian <brianmh...@gmail.com> wrote:
> Hi Graham,
>
> Thanks for the reply. No, I'm not stuck using Lighttpd at all - I just
> like it because it's simple and fast. :)
>
> Here's a link to a description of what I'd like to see: http://stackoverflow.com/questions/314515
>
> The situation is this: I'm creating a web-site with a bunch of
> accounts - each account having its own database. I want people to go
> to the site (i.e. www.example.com/) and when they log in the
> application gets all its requests/responses from their account's
> database. For the moment, all accounts will be using the same Django
> applications, though that is subject to change so I'd prefer not to
> rely on a solution that precludes that possibility.
>
> I suspect (but am happy to be corrected...) that the easiest and
> safest way to do this is to have a Django instance running in FastCGI
> mode with a socket for each account. When a user is logged in, their
> requests/responses are mapped to/from the proper Django socket via the
> multiplexing solution I've suggested in my original post.
>
> As mentioned, accounts may crop up and disappear, and shouldn't
> require restarting the web-server. There could be dozens of accounts
> (which means lots of Django instances).

How often would accounts be changed and if not that often, why would
restarting the web server be a problem?

Graham

Brian

Nov 26, 2008, 11:45:54 PM
to Django users
Accounts could be created as often as hourly. It'd be very bad to have
the webserver go down while people use the system (unless it was for
less than a second or two... but even then, it'd still be very
bad :) ).



Malcolm Tredinnick

Nov 27, 2008, 12:01:40 AM
to django...@googlegroups.com

On Wed, 2008-11-26 at 20:17 -0800, Brian wrote:
[...]

> The situation is this: I'm creating a web-site with a bunch of
> accounts - each account having its own database. I want people to go
> to the site (i.e. www.example.com/) and when they log in the
> application gets all its requests/responses from their account's
> database. For the moment, all accounts will be using the same Django
> applications, though that is subject to change so I'd prefer not to
> rely on a solution that precludes that possibility.

Whilst I'm all in favour of attempting to solve problems given arbitrary
constraints as a thought exercise, I think this one isn't really the
best practical solution to anything.

You are proposing having a single addressable URL that points to vastly
different content based on some other, transparent piece of information
(a cookie). Why not use the URL space as it's been designed and give
each instance its own URL? We have the domain namespace for that at the
topmost level, as well as the full URL namespace for subdivisions at a
different layer: www.example.com/user1/ and www.example.com/user2/,
for example. You can do the authentication checks as an addendum to
that, for example at the individual lighttpd (or other webserver of
choice) level.

I'll go so far as to claim that your proposed setup breaks the web.
Not a single request to that site will be effectively long-term
cacheable, since they will all have to vary on cookie. And you're using
the same resource name for an arbitrarily large number of different
resources. The web performance and behaviour of people using such a
setup is actively harmed when it's so easily avoidable.

If you really, really wanted to go down the one name to rule them all
path (for example, it wins you a really large bet that you accepted by
accident), you could use mod_rewrite to do an internal redirect to the
individually named URLs (maybe combined with some other modules).

I'm not going to participate much more in this, since there isn't really
any Django content here (you mention the word Django a few times, but
nothing is specific to Django or even uses Django in the solution you've
proposed). I think re-evaluating your initial design would be
beneficial, though.

Regards,
Malcolm

Graham Dumpleton

Nov 27, 2008, 12:12:18 AM
to Django users


On Nov 27, 3:45 pm, Brian <brianmh...@gmail.com> wrote:
> Accounts could be created as often as hourly. It'd be very bad to have
> the webserver go down while people use the system (unless it was for
> less than a second or two... but even then, it'd still be very
> bad :) ).

I don't know how lighttpd works, but if one does a graceful restart
(or even a restart) with Apache, in the main it isn't noticeable to
the user as the listener socket is never released and so new
connections just queue up and aren't outright refused, ie., server
isn't actually completely stopped. The issue is more the startup time
of Django instances and whether a restart will cause active login
sessions to be terminated based on how application is written. This is
because on a restart, active instances which are still required are
restarted regardless.

A few more questions.

The actual Django application is something the users themselves are
just a user of? There is no requirement for them to be able to make
changes to a segment of code base and force their own restarts of
their instance to pick up changes.

For each account, through what do they login initially? Are you
expecting to use Django based login mechanisms for that, or do you
front it all with HTTP Basic Authentication. If you are going to
somehow switch based on their identity it presumably needs to be done
outside of the context of the target Django instance else you will not
know which to go to.

Does the account have a distinct UNIX account associated with it, or
would all Django instances run as same user and you are then just
mapping a logical account name to a specific instance attached to a
specific database.

Would there be a calculable cap on the number of accounts you would
have active at any one time. Or would it at least be acceptable that
if there is a preconfigured number of instances you can switch between
and that limit is reached, that restarting web server would then be
seen as okay?

Sorry if I seem to be asking a lot of questions, but I believe I might
have a manageable solution for you, and I want to be clear on these
things so I know whether it will work and what configuration would be
needed.

Graham

Brian

Nov 27, 2008, 1:19:08 AM
to Django users
Malcolm:

Thanks for the reply.

> Whilst I'm all in favour of attempting to solve problems given arbitrary
> constraints as a thought exercise, I think this one isn't really the
> best practical solution to anything.

I'm sure that's a fault of my explanation, not the design. ;o)

> You are proposing having a single addressable URL that points to vastly
> different content based on some other, transparent piece of information
> (a cookie). Why not use the URL space as it's been designed and give
> each instance its own URL? We have the domain namespace for that at the
> topmost level, as well as the full URL namespace for subdivisions at a
> different layer: www.example.com/user1/ and www.example.com/user2/,
> for example. You can do the authentication checks as an addendum to
> that, for example at the individual lighttpd (or other webserver of
> choice) level.
>
> I'll go so far as to claim that your proposed setup breaks the web.
> Not a single request to that site will be effectively long-term
> cacheable, since they will all have to vary on cookie. And you're using
> the same resource name for an arbitrarily large number of different
> resources. The web performance and behaviour of people using such a
> setup is actively harmed when it's so easily avoidable.

You're right that clandestinely compacting the URL (especially with
disregard for plausible collisions) and relying on a cookie to decide
which webserver resource is served generally flies in the face of
fundamental principles of web design. However, this choice has been
carefully contemplated, though it may be reconsidered.

For edification, the project is not a public web-site, but has limited
authenticated-only access, with all transmissions encrypted. There is
a one-to-one mapping between an authenticated user's account (and,
notably, a cookie of theirs) and the content they may access on the
site; i.e. user1 may have example.com/link, and user2 may have
example.com/link. They are mutually exclusive, and neither is ever
accessible by the public. Thus the userx/ is superfluous. As well,
due to the nature
of the site, it is extraordinarily unlikely that it would be accessed
by the same web browser for different accounts. It's all rather
unconventional, I admit, but certainly not arbitrary.

As well, while there will be object caching, page caching is not the
right answer; the content is highly dynamic and it is a relatively
low-volume site, so caching is not a major concern.

All to say, I'm cognizant of the concerns expressed, and while I'd
agree with your concerns for conventional web-sites, I'm quite
confident in the defensibility of the design choices made, poorly
though I may have explained them. I'm rather certain it won't break
the web. In any event URL compaction is not relevant to the crux of
the real problem, and as you suggest below, URL compaction is just a
mod_rewrite. Please disregard any references to URL mapping or
compaction.


> If you really, really wanted to go down the one name to rule them all
> path (for example, it wins you a really large bet that you accepted by
> accident), you could use mod_rewrite to do an internal redirect to the
> individually named URLs (maybe combined with some other modules).
>
> I'm not going to participate much more in this, since there isn't really
> any Django content here (you mention the word Django a few times, but
> nothing is specific to Django or even uses Django in the solution you've
> proposed). I think re-evaluating your initial design would be
> beneficial, though.

Quite alright. The question at hand is where to demultiplex a user
request to their respective datasets. It can happen either in Django
or at the web-server. I believe the crux of the problem, if Django is
the demultiplexer, could be expressed (making some presumptions about
plausible solutions) as follows:

How could one have a Django installation, where (for example) one may,
based on the account of the logged-in User:

(a) Add a "prefix_" to each database table; or

(b) Change the database.

I'd be particularly interested in seeing, for example, middleware that
takes a Request's User and changes django.conf.settings.DATABASE_NAME
to the database for this user's account, or alternatively sets a
prefix for all subsequent database access.

However, this middleware approach raises red flags (and I don't know
Django well enough to overcome them), viz.
* Is changing django.conf.settings thread-safe?
* Is there just an inherent danger in changing django.conf.settings --
would the DB connection be set up already?
* Is changing the DB connection part of the public API?

I'm also conscious that these could require User authentication to be
stored and accessed in a different mechanism than the accounts'
respective databases. That's easy enough thanks to the delightfully
decoupled user authentication system in Django.

I hold out hope that there is a brilliant, simple solution to this
problem via Django. However, my searches and inquiries suggest that
it's not a common problem, nor especially obvious. Failing a safe,
sensible way to change the database settings in a running Django
process (or some other intelligent way to demultiplex user requests to
their respective datasets; essentially semantic, mutually exclusive
sharding), I believe you've astutely pointed out that this isn't a
Django issue, and I should meander over to the webserver forums. I'd
prefer a Django solution to a webserver one, though, because it's the
right place to do it.

I hope that clarifies the explanation some, and makes this vein of
inquiry of some value to the body of Django knowledge. If it's not
something that's a workable Django solution, I'd be happy to have some
certitude about that, too.

Thank you, and best regards,

Brian

Graham Dumpleton

Nov 27, 2008, 5:37:29 PM
to Django users
Since haven't seen a response to my other questions and with Malcolm's
rebuff to your idea, do I take it you aren't interested any more?

Graham

Brian

Nov 27, 2008, 8:06:31 PM
to Django users
Hi Graham,

Sorry I didn't respond earlier - for some reason your last reply with
the questions didn't show up until late today. Very odd -- didn't mean
to ignore your question or give the impression that I was ignoring it.
I most certainly am interested in potential solutions.

> I don't know how lighttpd works, but if one does a graceful restart
> (or even a restart) with Apache, in the main it isn't noticeable to
> the user as the listener socket is never released and so new
> connections just queue up and aren't outright refused, ie., server
> isn't actually completely stopped. The issue is more the startup time
> of Django instances and whether a restart will cause active login
> sessions to be terminated based on how application is written. This is
> because on a restart, active instances which are still required are
> restarted regardless.

I know little about webservers, but I'm relieved to hear that the
restarts are graceful. I believe this gracefulness was a recent
addition to Lighttpd, also.

Incidentally, I simply chose Lighttpd because I've never used it
before, and I figure it'd be fun to learn how to set it up.

I would think the Django processes ought to be persistent in the
background (unless the webserver is starting/stopping them).

> The actual Django application is something the users themselves are
> just a user of? There is no requirement for them to be able to make
> changes to a segment of code base and force their own restarts of
> their instance to pick up changes.

Yes, they're just Users, and will never restart the instance.

> For each account, through what do they login initially? Are you
> expecting to use Django based login mechanisms for that, or do you
> front it all with HTTP Basic Authentication. If you are going to
> somehow switch based on their identity it presumably needs to be done
> outside of the context of the target Django instance else you will not
> know which to go to.

This is a good question, and I haven't come to a conclusion. Probably
it'll be the Django built-in application-based authentication,
particular to each Django instance. There will be a master map of all
users to their respective Django instance.

> Does the account have a distinct UNIX account associated with it, or
> would all Django instances run as same user and you are then just
> mapping a logical account name to a specific instance attached to a
> specific database.

I'm thinking they'll all run as the same Unix user, and it's just a
logical mapping from an account to a database.

This could change.

> Would there be a calculable cap on the number of accounts you would
> have active at any one time. Or would it at least be acceptable that
> if there is a preconfigured number of instances you can switch between
> and that limit is reached, that restarting web server would then be
> seen as okay?

Yes, a cap on the number of accounts is definitely possible in the
short term.

> Sorry if I seem to be asking a lot of questions, but I believe I might
> have a manageable solution for you, and I want to be clear on these
> things so I know whether it will work and what configuration would be
> needed.

That's quite exciting. :o) Thank you, in advance for any solution you
may be able to suggest.

Brian

Graham Dumpleton

Nov 27, 2008, 10:11:13 PM
to Django users
On Nov 28, 12:06 pm, Brian <brianmh...@gmail.com> wrote:
> > For each account, through what do they login initially? Are you
> > expecting to use Django based login mechanisms for that, or do you
> > front it all with HTTP Basic Authentication. If you are going to
> > somehow switch based on their identity it presumably needs to be done
> > outside of the context of the target Django instance else you will not
> > know which to go to.
>
> This is a good question, and I haven't come to a conclusion. Probably
> it'll be the Django built-in application-based authentication,
> particular to each Django instance. There will be a master map of all
> users to their respective Django instance.

That seems a bit like a chicken and egg problem. At the point that you
need to make the decision as to which Django instance to use, you
haven't yet logged them in. Thus if you are going to try and use
their own instance to log them in, that can't work.

If you use one special Django instance to handle login, then issue is
having any session information in that instance also used by the other
instances such that when you go to actual instance on subsequent
requests, it knows you are allowed to access it.

I'll describe any solution in terms of HTTP Basic authentication first
and then can let you think about authentication when you see how the
multiplexing is achieved.

Graham

Brian

Nov 27, 2008, 10:55:12 PM
to Django users
> > This is a good question, and I haven't come to a conclusion. Probably
> > it'll be the Django built-in application-based authentication,
> > particular to each Django instance. There will be a master map of all
> > users to their respective Django instance.
>
> That seems a bit like a chicken and egg problem. At the point that you
> need to make the decision as to which Django instance to use, you
> haven't yet logged them in. Thus if you are going to try and use
> their own instance to log them in, that can't work.

If usernames are unique, e.g. email addresses, there can be a map,
e.g.:
a...@example.com => DjangoInstanceA,
d...@example.com => DjangoInstanceB,

This User-Instance mapping could happen prior to authentication if,
when the user submits login-information from a form, a process
(demultiplexer) decides based upon the username which Django instance
to direct the login-request to. The Django instance then handles all
subsequent requests.

I don't know if this will actually work.
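
As a rough Python sketch of that pre-authentication lookup (the addresses, the instance names and the `pick_instance` helper are all made up for illustration, and nothing here authenticates anyone):

```python
# Hypothetical master map of unique usernames to Django instances.
INSTANCE_MAP = {
    "user.one@example.com": "DjangoInstanceA",
    "user.two@example.com": "DjangoInstanceB",
}


def pick_instance(username, instance_map=INSTANCE_MAP):
    """Return the instance that should receive this login request, or None."""
    # Normalise so that differently cased spellings of an address agree.
    return instance_map.get(username.strip().lower())
```

A result of None would mean the username is unknown, in which case the demultiplexer has no instance to forward the login request to.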

> If you use one special Django instance to handle login, then issue is
> having any session information in that instance also used by the other
> instances such that when you go to actual instance on subsequent
> requests, it knows you are allowed to access it.

Perhaps HTTP Basic authentication might be the simplest solution. :)

> I'll describe any solution in terms of HTTP Basic authentication first
> and then can let you think about authentication when you see how the
> multiplexing is achieved.

That'd be great!

Graham Dumpleton

Nov 28, 2008, 5:31:01 PM
to Django users
Just letting you know off list that I will respond, just need to find
some uninterrupted time to do so. Reply will be back to the list.

Graham

Graham Dumpleton

Nov 28, 2008, 5:32:47 PM
to Django users


On Nov 29, 9:31 am, Graham Dumpleton <Graham.Dumple...@gmail.com>
wrote:
> Just letting you know off list that I will respond, just need to find
> some uninterrupted time to do so. Reply will be back to the list.

Hmmm, wrong button. No matter ......

Graham

Graham Dumpleton

Nov 29, 2008, 6:44:56 AM
to Django users
On Nov 28, 2:55 pm, Brian <brianmh...@gmail.com> wrote:
Okay, the solution I am going to describe uses Apache/mod_wsgi. This
is an Apache module specifically designed for hosting Python WSGI
applications within Apache. The mod_wsgi module provides two modes of
operation. The first is embedded mode, which works similarly to
mod_python in that applications run within the actual Apache child
worker processes. The second mode is daemon mode, which is similar in
some respects to fastcgi solutions, with applications running in
distinct processes from Apache child worker processes, and with Apache
child worker processes merely acting as a proxy to the WSGI
application daemon processes. We will be using mod_wsgi daemon mode in
this case, as it will allow for each Django instance to run in a
separate process.

How one would normally set up Apache/mod_wsgi for Django is described
at:

http://code.google.com/p/modwsgi/wiki/IntegrationWithDjango

What we will be doing here goes beyond that, but please ensure you
read that first and perhaps get a single Django instance running that
way before contemplating the multiplexing arrangement described here.

Getting back to what you wanted, you wanted multiple users to see a
Django instance mounted at same URL, but for each to actually get a
distinct instance associated with a distinct database.

For this, as would normally be done, still use WSGIScriptAlias to
mount the WSGI script file at the appropriate URL. Here we assume it
will be root of web server. Thus:

WSGIScriptAlias / /usr/local/django/mysite/apache/django.wsgi

By default, this will have Django instance running in embedded mode so
we want to override that. This would normally be done by configuring a
daemon process group and delegating application to run in it.

WSGIDaemonProcess django1 display-name=%{GROUP}

WSGIProcessGroup django1

The 'display-name' option here means that 'ps' command will show
'(wsgi:django1)' in output rather than 'httpd' process name. This will
be important for later on.

The way WSGIProcessGroup is used here means it is a static mapping, so
although we might be able to create more daemon process groups, you
would have to change the Apache configuration and restart Apache to be
able to delegate application to run in different daemon process group.
Obviously this isn't what we want.

So, instead of a static mapping, we use the ability of mod_wsgi for
the process group to which the application is delegated to be
specified dynamically. There are actually a number of ways this can be
done when using mod_wsgi, but we will use a method which uses
mod_rewrite as a helper.

You indicated that there would be a cap on the number of instances of
Django that need to be running at any one time. Thus, what we will do
is predefine that many daemon process groups.

WSGIDaemonProcess django1 display-name=%{GROUP}
WSGIDaemonProcess django2 display-name=%{GROUP}
WSGIDaemonProcess django3 display-name=%{GROUP}
...
WSGIDaemonProcess djangon display-name=%{GROUP}

We also want the daemon process group that is used to be chosen based
on the identity of the logged-in user. We will use HTTP Basic
authentication, as that is the easiest. As long as you run everything
over HTTPS, using HTTP Basic authentication shouldn't be an issue.

Before we get onto how to use user identity from HTTP Basic
authentication, let's look at the dynamic mapping issue. To do this
what we are going to define is:

WSGIProcessGroup %{ENV:PROCESS_GROUP}

What this says is that the name of the process group should instead be
sourced from a request environment variable called 'PROCESS_GROUP'. To
set that, we are going to use a rewrite rule and source the value from
a mapping file in the file system. A mapping file is used here because
it allows the value to be set outside of the Apache configuration,
with Apache automatically picking up the change. It is also key when we
move on to dealing with user identity. Adding that, we then have:

RewriteEngine On
RewriteMap procmap txt:/usr/local/django/mysite/apache/procmap.txt
RewriteRule . - [E=PROCESS_GROUP:${procmap:django|undefined}]

WSGIProcessGroup %{ENV:PROCESS_GROUP}

The 'procmap.txt' file will contain:

django django1

This file is read by Apache and cached, but will be reread when it
changes.

With the file written as is, the Django instance will be
delegated to daemon process group called 'django1'. If you wanted it
instead to run in daemon process group called 'django2', you would
simply edit the 'procmap.txt' file and change it to:

django django2

If for some reason the file didn't contain the key 'django' used in
the rewrite rule, the value 'undefined' would be used for the process
group name. Since no such daemon process group is defined, mod_wsgi
would return a 500 error for the request, indicating that there is no
valid daemon process group.
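
To make the lookup semantics concrete, here is a small Python sketch that mimics what a lookup with a default, written `${procmap:KEY|undefined}` in the rewrite rule, evaluates to for a plain text map. It is an illustration only and not how Apache implements it; `parse_map` and `lookup` are hypothetical names, and keeping the first entry when a key is repeated is an assumption of this sketch.

```python
def parse_map(text):
    """Parse the 'key value' lines of a plain text map into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split()
        if len(parts) >= 2:
            # Assumption of this sketch: the first entry for a key is kept.
            mapping.setdefault(parts[0], parts[1])
    return mapping


def lookup(mapping, key, default="undefined"):
    """Return the mapped value, or the default when the key is missing."""
    return mapping.get(key, default)
```

With a map containing only `django django1`, looking up `django` gives `django1`, while any other key falls back to `undefined`, which is the case that ends in the 500 error.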

This sort of setup where delegation is manual may be a way of handling
swapping between application versions when upgrading a site, but we
need to introduce the identity of the user.

I will not show how to set up HTTP Basic authentication as you just
need to follow the Apache documentation for that. The important thing
to know is that when using HTTP Basic authentication, the identity of
the user is then available to rewrite rules as 'REMOTE_USER'. To use
that, we then change the above to:

RewriteEngine On
RewriteMap procmap txt:/usr/local/django/mysite/apache/procmap.txt
RewriteRule . - [E=PROCESS_GROUP:${procmap:%{REMOTE_USER}|undefined}]

WSGIProcessGroup %{ENV:PROCESS_GROUP}

The difference here is that instead of lookup key in rewrite map being
fixed value of 'django' we use the identity of the logged in user. The
'procmap.txt' file would then contain multiple entries, one per user
who could access the site.

graham django1
brian django2
malcolm django3

Looking at that all together, what we have is a pool of daemon process
groups that can be used and a way of dynamically, via the mapping
file, mapping different users' requests into Django instances running
in those different daemon process groups.
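
Because the mapping lives in a plain file, assigning an account to a process group can be scripted without touching the Apache configuration. The following Python sketch is one hypothetical way to do that. `assign_user` is an invented helper, and writing to a temporary file before renaming it into place is a precaution added for this sketch so that a half-written map is never visible. Note that it rewrites the file from scratch, so any comment lines in the map are dropped.

```python
import os
import tempfile


def assign_user(map_path, user, group):
    """Point `user` at daemon process group `group` in the map file."""
    entries = {}
    if os.path.exists(map_path):
        with open(map_path) as handle:
            for line in handle:
                parts = line.split()
                if len(parts) >= 2 and not parts[0].startswith("#"):
                    entries[parts[0]] = parts[1]
    entries[user] = group

    # Write the new map next to the old one, then rename it into place.
    directory = os.path.dirname(os.path.abspath(map_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        for name in sorted(entries):
            handle.write("%s %s\n" % (name, entries[name]))
    # mkstemp creates the file owner-only; make it readable by the server.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, map_path)
```

For example, `assign_user('/usr/local/django/mysite/apache/procmap.txt', 'brian', 'django2')` would add or update the entry for 'brian'.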

That is the multiplexing done, but it doesn't address how we have the
instance in each daemon process group use a different database.

Normally when using Django with Apache/mod_wsgi, the WSGI script file
would contain:

import os, sys
sys.path.append('/usr/local/django')
sys.path.append('/usr/local/django/mysite')
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'

import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()

We don't want this though, as that would result in same Django
settings module and thus same database configuration being used for
each.

So, what we are going to do is to split this into two parts. The WSGI
script file will now just be:

import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()

All this is doing is setting the Django WSGI application entry point.
It is not setting up sys.path or defining what the Django settings
module is. For the latter, we will instead use a separate code file
and load a unique one for each process group. To do that we use the
WSGIImportScript directive to preload the configuration into the
daemon process group at startup.

One issue with WSGIImportScript though is that you have to specify both
the process group and the application group (sub interpreter) into which
the file should be loaded. Since at the moment the application group will
depend on the name of the host, let us instead force Django to run in the
main interpreter so we can use a known name. This is done using the
WSGIApplicationGroup directive.

What we will then have is:

WSGIScriptAlias / /usr/local/django/mysite/apache/django.wsgi

WSGIApplicationGroup %{GLOBAL}

RewriteEngine On
RewriteMap procmap txt:/usr/local/django/mysite/apache/procmap.txt
RewriteRule . - [E=PROCESS_GROUP:${procmap:%{REMOTE_USER}|undefined}]

WSGIProcessGroup %{ENV:PROCESS_GROUP}

WSGIDaemonProcess django1 display-name=%{GROUP}
WSGIImportScript /usr/local/django/mysite/apache/django1.wsgi \
process-group=django1 application-group=%{GLOBAL}

WSGIDaemonProcess django2 display-name=%{GROUP}
WSGIImportScript /usr/local/django/mysite/apache/django2.wsgi \
process-group=django2 application-group=%{GLOBAL}

WSGIDaemonProcess django3 display-name=%{GROUP}
WSGIImportScript /usr/local/django/mysite/apache/django3.wsgi \
process-group=django3 application-group=%{GLOBAL}

...

WSGIDaemonProcess djangon display-name=%{GROUP}
WSGIImportScript /usr/local/django/mysite/apache/djangon.wsgi \
process-group=djangon application-group=%{GLOBAL}

Presuming we only want to override database setting, 'django1.wsgi'
would then have:

import os, sys
sys.path.append('/usr/local/django')
sys.path.append('/usr/local/django/mysite')
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'

import mysite.settings

# Override per instance settings.
mysite.settings.DATABASE_HOST = ...
mysite.settings.DATABASE_NAME = ...
...

The file for 'django2.wsgi' would be similar but with different override
values.

Instead of importing Django settings module and overriding them, you
could also have distinct settings modules and just set
DJANGO_SETTINGS_MODULE differently for each.

What will now happen is that when Apache/mod_wsgi starts up a daemon
process, it will import that configuration script which will set
DJANGO_SETTINGS_MODULE and specify any overrides.

When a request actually arrives, then the normal WSGI script file is
loaded, along with actual Django code, but where configuration is
based on the preloaded script.

In this arrangement, the Django code will actually be lazily loaded,
that is, only on first request. If you wanted to preload that as well,
you would just need to import the necessary Django modules at the end of
the config script.

That is basically it and hopefully it makes sense. It may seem a bit
complex, but what you want to do isn't normal. Also, it could all have
been done a bit more simply if you were using the mod_wsgi 3.0 development
code from subversion trunk, as that has a way of avoiding the use of
WSGIImportScript. This is because in that next version, the name of the
daemon process group is actually available from within the WSGI script
file at global scope. Thus you could set DJANGO_SETTINGS_MODULE in the WSGI
script file such that it used the name of the process group. For example:

import os
import mod_wsgi
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.%ssettings' % mod_wsgi.process_group

Then all we need to do is have separate Django settings modules for
the instance in each process group.
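
As a slightly fuller sketch of the same idea, the naming rule can be pulled out into a helper so it can be checked outside of Apache. The helper name settings_module_for is made up for this example, and the fallback to an empty group name when mod_wsgi is not importable is an assumption added here so the file can also be run standalone.

```python
import os


def settings_module_for(process_group, package='mysite'):
    """Apply the 'mysite.%ssettings' naming rule to a process group name."""
    return '%s.%ssettings' % (package, process_group)


try:
    import mod_wsgi  # normally only importable when running under mod_wsgi
    process_group = getattr(mod_wsgi, 'process_group', '')
except ImportError:
    process_group = ''  # assumption: treat "not under mod_wsgi" as no group

os.environ['DJANGO_SETTINGS_MODULE'] = settings_module_for(process_group)
```

With this rule the group 'django1' needs a module 'mysite/django1settings.py', and an empty group name falls back to plain 'mysite.settings'.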

The only thing left to cover is what to do when introducing and/or
removing users.

For that the steps would be:

1. Modify appropriate 'djangon.wsgi' config file referenced by
WSGIImportScript.

2. Use 'ps' to identify the PID of that daemon process and send it
SIGINT. This will cause it to shut down and restart.

3. Modify the 'procmap.txt' file to add an entry mapping the new user to
the appropriate 'djangon' process group.

4. Add new user into authentication user database.

For deletion, do the opposite.
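
Steps 2 and 3 could be scripted along the following lines. This is an untested sketch under assumptions: the helper names add_user_mapping and restart_daemon are made up here, the PID still has to be found by hand with 'ps', and steps 1 and 4 are left to be done manually.

```python
import os
import signal


def add_user_mapping(procmap_path, user, process_group):
    """Append 'user process_group' to the map file unless the user is mapped."""
    content = ''
    if os.path.exists(procmap_path):
        with open(procmap_path) as f:
            content = f.read()
    for line in content.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == user:
            raise ValueError('%s is already mapped to %s' % (user, parts[1]))
    with open(procmap_path, 'a') as f:
        if content and not content.endswith('\n'):
            f.write('\n')  # keep the new entry on its own line
        f.write('%s %s\n' % (user, process_group))


def restart_daemon(pid):
    """Send SIGINT to the daemon process, as in step 2."""
    os.kill(pid, signal.SIGINT)
```

Usage would be something like restart_daemon(pid) with the PID found via 'ps', followed by add_user_mapping('/usr/local/django/mysite/apache/procmap.txt', 'newuser', 'django4'), where 'newuser' and 'django4' are placeholder values.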

Anyway, work through that and see if it makes any sense at all. As I
said, could be a bit cleaner if using mod_wsgi 3.0 development
version, or if you want to make a 4 line code change to mod_wsgi 2.X
version you use. Could well be worth making the change just to make it
that little bit simpler.

BTW, also note this is all just typed in out of my head. I haven't
actually gone and tested it, although I believe it should work. The only
issue you might run across is that the WSGIImportScript directive at
the moment actually has to go outside of any VirtualHost containers,
whereas the WSGIDaemonProcess it applies to can be inside. There is a
ticket for mod_wsgi to address this, but I haven't got around to fixing
it.

If there are any questions, let me know.

Graham







Graham Dumpleton

Nov 29, 2008, 6:15:29 PM
to Django users
Graham Dumpleton wrote:
> That is basically it and hopefully it makes sense. It may seem a bit
> complex, but what you want to do isn't normal. Also, it all could have
> been done a bit simpler if were using mod_wsgi 3.0 development code
> from subversion trunk as with that have a way of avoiding use of
> WSGIImportScript.

There is actually a simpler way it could be done even when
using mod_wsgi 2.X. It does do something that I tend to discourage,
and which I complain about the Django/mod_python integration doing,
but in this case we do it for a single variable and not many like
Django/mod_python. :-)

What we will do this time is use a second rewrite map file to hold
which Django settings module should be used. This will be passed as
WSGI environment variable, but then forced into process environment
variables.

WSGIScriptAlias / /usr/local/django/mysite/apache/django.wsgi

RewriteEngine On

RewriteMap procmap txt:/usr/local/django/mysite/apache/procmap.txt
RewriteRule . - [E=PROCESS_GROUP:${procmap:%{REMOTE_USER}|undefined}]

RewriteMap configmap txt:/usr/local/django/mysite/apache/configmap.txt
RewriteRule . - [E=DJANGO_SETTINGS_MODULE:${configmap:%{ENV:PROCESS_GROUP}|mysite.settings}]

WSGIProcessGroup %{ENV:PROCESS_GROUP}

WSGIDaemonProcess django1 display-name=%{GROUP}
WSGIDaemonProcess django2 display-name=%{GROUP}
WSGIDaemonProcess django3 display-name=%{GROUP}
...
WSGIDaemonProcess djangon display-name=%{GROUP}

The WSGI script file would then be:

import os, sys
sys.path.append('/usr/local/django')
sys.path.append('/usr/local/django/mysite')

import django.core.handlers.wsgi

_application = django.core.handlers.wsgi.WSGIHandler()

def application(environ, start_response):
    os.environ['DJANGO_SETTINGS_MODULE'] = environ['DJANGO_SETTINGS_MODULE']
    return _application(environ, start_response)

The bit I don't like here is that we are setting a process environment
variable on every request. For it to work, we are relying on the fact that
the Django settings module is actually lazily loaded only when first
required (at least I believe it is). Thus, we use the value of
DJANGO_SETTINGS_MODULE derived from the very first request passed in by
Apache. After that request, for the life of the daemon process, even
if the value changes, it is ignored.
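
If setting the variable on every request bothers you, one possible variation is to set it only when it is not already present. The sketch below shows that "first request wins" behaviour on its own. It assumes Python 3 style WSGI (byte strings in the response), and demo_app is a stand-in for django.core.handlers.wsgi.WSGIHandler() so the example runs without Django. The names make_application and demo_app are made up for this example.

```python
import os


def make_application(inner_app, default='mysite.settings'):
    """Wrap a WSGI app so DJANGO_SETTINGS_MODULE comes from the first request."""
    def application(environ, start_response):
        if 'DJANGO_SETTINGS_MODULE' not in os.environ:
            os.environ['DJANGO_SETTINGS_MODULE'] = environ.get(
                'DJANGO_SETTINGS_MODULE', default)
        return inner_app(environ, start_response)
    return application


def demo_app(environ, start_response):
    """Stand-in for the Django handler: report the settings module in use."""
    body = os.environ['DJANGO_SETTINGS_MODULE'].encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


application = make_application(demo_app)
```

The effect on Django should be the same as the version above, since only the value seen on the first request matters either way.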

The 'configmap.txt' file would then contain:

django1 mysite.settings1
django2 mysite.settings2
...

All this allows us to then get rid of the WSGIImportScript and the
associated preconfig scripts. It does mean having distinct settings
module files for each instance. They could be completely separate, or
could import 'mysite.settings' and then override only what is
required.

from settings import *

DATABASE_HOST = ...
DATABASE_NAME = ...

Adding a new site would then be:

1. Add new instance settings file and configure it. Setup database
etc.

2. Edit configmap.txt and map instance to new settings file.

3. Use 'ps' to determine PID of instance and send it SIGINT.

4. Edit procmap.txt and map user to instance.
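
Since adding a site now touches two map files, a small consistency check may be handy before step 4. This is just a sketch: the helper names read_map and unmapped_groups are made up here, and the sample entries are illustrative.

```python
def read_map(text):
    """Parse 'key value' lines, skipping blank lines and '#' comments."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and not parts[0].startswith('#'):
            mapping[parts[0]] = parts[1]
    return mapping


def unmapped_groups(procmap, configmap):
    """Return process groups that users are routed to but that have no
    settings module entry in the config map."""
    return sorted(set(procmap.values()) - set(configmap))


procmap = read_map("graham django1\nbrian django2\n")
configmap = read_map("django1 mysite.settings1\n")
print(unmapped_groups(procmap, configmap))
```

Any group reported here would fall back to 'mysite.settings', the default given in the second rewrite rule, which is probably not what you want for a per-user instance.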

Give that one a whirl as well and let me know if you have questions or
if something doesn't quite work as advertised. The only bit I am not sure
about is whether %{ENV:PROCESS_GROUP} will already be available for use in
the second rewrite rule.
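
One way to check that would be to temporarily point WSGIScriptAlias at a tiny diagnostic WSGI script that echoes the relevant values back. This is a sketch assuming Python 3 style WSGI. The 'mod_wsgi.process_group' key is assumed to be present in the WSGI environment under mod_wsgi, and is simply reported as None if it is not.

```python
def application(environ, start_response):
    """Report the values the rewrite rules were supposed to set."""
    keys = ('REMOTE_USER', 'PROCESS_GROUP', 'DJANGO_SETTINGS_MODULE',
            'mod_wsgi.process_group')
    lines = ['%s = %r' % (key, environ.get(key)) for key in keys]
    body = '\n'.join(lines).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]
```

If DJANGO_SETTINGS_MODULE always shows the default 'mysite.settings' while PROCESS_GROUP is correct, then the second rule is not seeing the variable set by the first.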

Graham


