[modwsgi] Serving multiple branches


paryl

May 2, 2010, 3:07:50 PM
to modwsgi
Hi All,

At work we maintain a large-scale PHP project. For our dev
environment, each developer is constantly creating new svn checkouts
for revisions, each one in a unique directory. We have Apache
configured for a virtual hosting environment, and the URL determines
which branch to serve from.

We made the decision to slowly make the transition to Python, and we
are working on getting things set up in a similar way for our
developers. We are using repoze.bfg for the framework.

If we create a separate vhost file for each directory, things work
fine. The problem is, 5-10 new directories are created per day, and
developers don't have permissions to restart/reload apache. Because
of that, I've been trying to figure out how to get mod_wsgi configured
once in a vhost file, and the rest of the configuration controlled by
the .htaccess file in each dev directory. The files are (at this point
in time):

70_mod_wsgi.conf
++++++
<IfDefine WSGI>
LoadModule wsgi_module modules/mod_wsgi.so
WSGIDaemonProcess %{ENV:APPLICATION_GROUP} processes=1 threads=1
</IfDefine>

.htaccess
++++++
DirectoryIndex bfg.wsgi
AddHandler wsgi-script .wsgi
SetEnv APPLICATION_GROUP unique-name

This works... basically. It serves each branch, and all we have to do
for new branches is change the APPLICATION_GROUP variable
in .htaccess, which is the desired behavior. The problem we are
seeing is that each bfg process is interfering with the others,
serving another branch's static files, etc.

I've tried different variations of the above Apache config using
WSGIProcessGroup and WSGIApplicationGroup, and I've tried using
different environment variables for all of them. The only combination
that seems to work is the above, though it doesn't make sense because
the documentation doesn't indicate that expanding variables are
acceptable for WSGIDaemonProcess. Setting processes and threads
higher still gives the issues, it just takes a little longer to see
them.

I've searched and searched, but I can't find anyone having the same
problems. Do you guys know of a way to do what I'm trying to do?

Thanks for your help!
-phillip

--
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To post to this group, send email to mod...@googlegroups.com.
To unsubscribe from this group, send email to modwsgi+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en.

Jason Garber

May 2, 2010, 7:16:53 PM
to mod...@googlegroups.com
Hi Phillip,

A bit of a side-answer, but we have this problem 100% solved for multiple developers using either PHP or Python and version control, tons of branches, and separate Live/Preview/Dev sites.  Also worked around the headache of hand-writing <VirtualHost /> entries all the time.

I won't bore the rest of the list with the details, but if you'd like to review, let me know.

Sincerely,
Jason Garber

Graham Dumpleton

May 2, 2010, 7:24:22 PM
to mod...@googlegroups.com
On 3 May 2010 05:07, paryl <pry...@gmail.com> wrote:
> Hi All,
>
> At work we maintain a large-scale PHP project.  For our dev
> environment, each developer is constantly creating new svn checkouts
> for revisions, each one in a unique directory.  We have Apache
> configured for a virtual hosting environment, and the URL determines
> which branch to serve from.
>
> We made the decision to slowly make the transition to Python, and we
> are working on getting things set up in a similar way for our
> developers.  We are using repoze.bfg for the framework.

How much memory does a single instance of your Python web application
take up in memory?

> If we create a separate vhost file for each directory, things work
> fine.  The problem is, 5-10 new directories are created per day,

Are the old revision directories cleaned up and no longer used, or are
you expecting old revisions to be accessible forever?

> and
> developers don't have permissions to restart/reload apache.  Because
> of that, I've been trying to figure out how to get mod_wsgi configured
> once in a vhost file, and the rest of the configuration controlled by
> the .htaccess file in each dev directory. The files are (at this point
> in time):
>
> 70_mod_wsgi.conf
> ++++++
> <IfDefine WSGI>
> LoadModule wsgi_module modules/mod_wsgi.so
> WSGIDaemonProcess %{ENV:APPLICATION_GROUP} processes=1 threads=1

Can't do that. The value '%{ENV:APPLICATION_GROUP}' will be used literally.

> </IfDefine>
>
> .htaccess
> ++++++
> DirectoryIndex bfg.wsgi

Generally you can't do that. When you use DirectoryIndex you cannot
supply any additional path information in a URL beyond that used to
map to that directory.

If the bfg.wsgi file is also accessed directly with a URL of the form
'/some/url/bfg.wsgi', you may also hit an issue where access via the
directory goes to a separate instance in memory. This is because the
perceived mount point of each will be different.

> AddHandler wsgi-script .wsgi
> SetEnv APPLICATION_GROUP unique-name

You shouldn't be using application group. That specifies which
interpreter within a process is used, not which process. If you are
setting all instances to have same application group, that will be why
you are having a problem. Because of the defaults, you shouldn't have
to touch application group normally as instances will be correctly
separated based on URL mount point.

> This works... basically.  It serves each branch, and all we have to do
> for new branches is change the APPLICATION_GROUP variable
> in .htaccess, which is the desired behavior.  The problem we are
> seeing is that each bfg process is interfering with the others,
> serving another branch's static files, etc.

How are static files being served, by the WSGI application itself, or
are you using some other Apache configuration to have Apache serve
them?

> I've tried different variations of the above Apache config using
> WSGIProcessGroup and WSGIApplicationGroup, and I've tried using
> different environment variables for all of them.  The only combination
> that seems to work is the above, though it doesn't make sense because
> the documentation doesn't indicate that expanding variables are
> acceptable for WSGIDaemonProcess.  Setting processes and threads
> higher still gives the issues, it just takes a little longer to see
> them.
>
> I've searched and searched, but I can't find anyone having the same
> problems.  Do you guys know of a way to do what I'm trying to do?

Can you answer the questions above? Especially address that about how
long older revisions need to be available.

Do that and I will describe some things you can try.

Graham

paryl

May 3, 2010, 8:27:53 AM
to modwsgi
Hi Jason,

That would be fantastic. Please share!

-phillip

paryl

May 3, 2010, 8:39:22 AM
to modwsgi
Hi Graham,

Great questions!

On May 2, 6:24 pm, Graham Dumpleton <graham.dumple...@gmail.com>
wrote:
> On 3 May 2010 05:07, paryl <pry...@gmail.com> wrote:
>
> How much memory does a single instance of your Python web application
> take up in memory?

At this point, not much. Since only a few of the branches will be in
use at any given time, memory hasn't been a huge concern.

>
> > If we create a separate vhost file for each directory, things work
> > fine.  The problem is, 5-10 new directories are created per day,
>
> Are the old revision directories cleaned up and no longer used, or are
> you expecting old revisions to be accessible forever?

They get cleaned up over time. They might be there from a few minutes
to many months, but eventually they go away.

>
> > 70_mod_wsgi.conf
> > ++++++
> > <IfDefine WSGI>
> > LoadModule wsgi_module modules/mod_wsgi.so
> > WSGIDaemonProcess %{ENV:APPLICATION_GROUP} processes=1 threads=1
>
> Cant do that. The value '%{ENV:APPLICATION_GROUP}' will be used literally.

Aha, I figured as much, but it's good to have confirmation.

>
> > This works... basically.  It serves each branch, and all we have to do
> > for new branches is change the APPLICATION_GROUP variable
> > in .htaccess, which is the desired behavior.  The problem we are
> > seeing is that each bfg process is interfering with the others,
> > serving another branch's static files, etc.
>
> How are static files being served, by the WSGI application itself, or
> are you using some other Apache configuration to have Apache serve
> them?

They are being served by the wsgi app, as in my example above.

>
> Can you answer the questions above? Especially address that about how
> long older revisions need to be available.
>
> Do that I will describe some things you can try.
>
> Graham

Thanks for your help!
-phillip

Graham Dumpleton

May 3, 2010, 9:32:09 PM
to mod...@googlegroups.com
On 3 May 2010 05:07, paryl <pry...@gmail.com> wrote:
> Hi All,
>
> At work we maintain a large-scale PHP project.  For our dev
> environment, each developer is constantly creating new svn checkouts
> for revisions, each one in a unique directory.  We have Apache
> configured for a virtual hosting environment, and the URL determines
> which branch to serve from.

Sorry, some more questions.

How are you currently setting up virtual hosts when using PHP?

If you are using VirtualHost directive you would have same issue with
needing to restart Apache to have it recognise the new virtual host.

Does the system where you are running Python also need to support PHP
in parallel and thus whatever you use for Python has to interwork with
supporting PHP?

Need to know how PHP comes into the picture or whether PHP support can
simply be ignored, thus simplifying how Python might be handled.

Graham

Graham Dumpleton

May 4, 2010, 7:45:29 AM
to mod...@googlegroups.com
On 4 May 2010 11:32, Graham Dumpleton <graham.d...@gmail.com> wrote:
> On 3 May 2010 05:07, paryl <pry...@gmail.com> wrote:
>> Hi All,
>>
>> At work we maintain a large-scale PHP project.  For our dev
>> environment, each developer is constantly creating new svn checkouts
>> for revisions, each one in a unique directory.  We have Apache
>> configured for a virtual hosting environment, and the URL determines
>> which branch to serve from.
>
> Sorry, some more questions.
>
> How are you current setting up virtual hosts when using PHP?
>
> If you are using VirtualHost directive you would have same issue with
> needing to restart Apache to have it recognise the new virtual host.
>
> Does the system where you are running Python also need to support PHP
> in parallel and thus whatever you use for Python has to interwork with
> supporting PHP?
>
> Need to know how PHP comes into the picture or whether PHP support can
> simply be ignored, thus simplifying how Python might be handled.

Pending an answer on the above, I'll assume that you don't have to
worry about hosting PHP and that this Apache instance is used only for
hosting Python applications.

First thing to note is that if you use WSGIScriptAlias, then to add a
new application you must restart Apache. We thus need to avoid that.

The way of doing that is to use AddHandler instead. This will allow
one to drop a .wsgi file corresponding to a new application instance
into a directory and will be immediately available.

Using AddHandler does however have the consequence that by default the
name of the .wsgi file becomes part of the URL. If your application is
relocatable and not dependent on being hosted at the root of the web
server, this would be more than adequate in a development and test
environment. It even allows you to do it within a single virtual host.

<VirtualHost *:80>
ServerName example.com

DocumentRoot /var/www/htdocs

<Directory /var/www/htdocs>
Order deny,allow
Allow from All
Options ExecCGI
AddHandler wsgi-script .wsgi
</Directory>
</VirtualHost>

So, for each instance, add a file into DocumentRoot directory. For
example, you might have 'rev101.wsgi', 'rev102.wsgi', 'rev103.wsgi',
etc. These would be accessed as:

http://example.com/rev101.wsgi
http://example.com/rev102.wsgi
http://example.com/rev103.wsgi

To bring a new instance online, just add a new .wsgi file.

The issue is now separation between the instances.

By default each application instance is run within a distinct sub
interpreter of a process. The name of the sub interpreter is created
from combination of server name and application mount point
(SCRIPT_NAME). Thus for above, the sub interpreters are named as
follows:

example.com|/rev101.wsgi
example.com|/rev102.wsgi
example.com|/rev103.wsgi

So separation should hopefully not be an issue as each application
will run in separate sub interpreter.
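
The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not mod_wsgi's own code; the function name is made up, and leaving the port out for 80 and 443 is an assumption about the default behaviour.

```python
def default_application_group(server_name, port, script_name):
    """Approximate the default sub interpreter name: server name plus mount point."""
    # Assumption: the port only appears in the name for non-standard ports.
    if port in (80, 443):
        host = server_name
    else:
        host = "%s:%d" % (server_name, port)
    return "%s|%s" % (host, script_name)


# The three example scripts each end up with a distinct name.
for rev in (101, 102, 103):
    print(default_application_group("example.com", 80, "/rev%d.wsgi" % rev))
```

Because the mount point is part of the name, two scripts only share a sub interpreter if both the server name and SCRIPT_NAME match.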

You could still have problems with third party C extension modules or
issues with process global configuration at C level, such as time
zones or language locale.

Thus, process separation, rather than separation by way of sub
interpreters within a process, would be a preferable solution.

Process separation of a different type is also an issue. This is that
at present we are running in embedded mode. This means to drop an in
memory instance of an application as it is no longer needed, or we
want to reload an application because we changed the code, we still
need to restart Apache.

First step at least is thus to use daemon mode. For that, add
WSGIDaemonProcess/WSGIProcessGroup.

<VirtualHost *:80>
ServerName example.com

DocumentRoot /var/www/htdocs

WSGIDaemonProcess apps
WSGIProcessGroup apps

<Directory /var/www/htdocs>
Order deny,allow
Allow from All
Options ExecCGI
AddHandler wsgi-script .wsgi
</Directory>
</VirtualHost>

This gets the applications out of the Apache server child processes
albeit they are still in the same process.

We can at least though cause that process to be restarted if one
application changes by touching the .wsgi file for that application.

For actual process separation, we need a
WSGIDaemonProcess/WSGIProcessGroup for each, but adding a new one for
each .wsgi script file again requires restarting Apache.

To avoid that, we can create a pool of daemon process groups and
dynamically delegate a new .wsgi file to an available process.

<VirtualHost *:80>
ServerName example.com

DocumentRoot /var/www/htdocs

WSGIDaemonProcess wiggles display-name=%{GROUP}

WSGIDaemonProcess sam display-name=%{GROUP}
WSGIDaemonProcess murray display-name=%{GROUP}
WSGIDaemonProcess anthony display-name=%{GROUP}
WSGIDaemonProcess jeff display-name=%{GROUP}
WSGIDaemonProcess dorothy display-name=%{GROUP}
WSGIDaemonProcess henry display-name=%{GROUP}
WSGIDaemonProcess wags display-name=%{GROUP}

WSGIProcessGroup %{ENV:PROCESS_GROUP}
SetEnv PROCESS_GROUP wiggles

<Directory /var/www/htdocs>
Order deny,allow
Allow from All
Options ExecCGI
AddHandler wsgi-script .wsgi
AllowOverride FileInfo
</Directory>
</VirtualHost>

The WSGIProcessGroup directive you see sets a default, so if you don't
override it for a specific .wsgi file it will end up in the 'wiggles'
process.

In the .htaccess file of the DocumentRoot directory, we can now do the
following.

<Files rev101.wsgi>
SetEnv PROCESS_GROUP sam
</Files>

<Files rev102.wsgi>
SetEnv PROCESS_GROUP murray
</Files>

<Files rev103.wsgi>
SetEnv PROCESS_GROUP anthony
</Files>

So, as you add new .wsgi script files, you modify .htaccess file to
say what process to run it in. You can still run multiple applications
in one process if you want to.
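
If editing the .htaccess file by hand gets tedious, the blocks above could be generated. The following is a rough sketch only: the helper name and the idea of keeping the assignments in a dictionary are assumptions, and the pool list simply repeats the daemon process names from the configuration above.

```python
# Daemon process groups declared in the Apache configuration above
# ('wiggles' is the default and needs no <Files> block).
POOL = ["sam", "murray", "anthony", "jeff", "dorothy", "henry", "wags"]


def render_htaccess(assignments, pool=POOL):
    """Return .htaccess text pinning each .wsgi file to a daemon process group."""
    blocks = []
    for script, group in sorted(assignments.items()):
        # A name that was never declared with WSGIDaemonProcess cannot be used.
        if group not in pool:
            raise ValueError("%s is not a configured daemon process group" % group)
        blocks.append("<Files %s>\nSetEnv PROCESS_GROUP %s\n</Files>\n" % (script, group))
    return "\n".join(blocks)


print(render_htaccess({
    "rev101.wsgi": "sam",
    "rev102.wsgi": "murray",
    "rev103.wsgi": "anthony",
}))
```

Rejecting unknown group names up front is a design choice in this sketch; it catches a typo before it turns into a failed request.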

If you remove a .wsgi script file, just remember to send a SIGINT to
the process so it will recycle and discard old application. Having
used display-name option, you can easily use 'ps' on most systems to
identify which process to send signal. For example '(wsgi:sam)' for
rev 101.
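
Finding the process and sending the signal could be scripted roughly as follows. This is a sketch under assumptions: it relies on the '(wsgi:name)' display name described above, the 'ps -eo pid,args' invocation may need adjusting for the local 'ps', and the user running it must be allowed to signal the daemon process. The function names are made up for illustration.

```python
import os
import signal
import subprocess


def find_daemon_pids(ps_output, group):
    """Return PIDs whose command line starts with the '(wsgi:<group>)' display name."""
    marker = "(wsgi:%s)" % group
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0].isdigit() and fields[1].startswith(marker):
            pids.append(int(fields[0]))
    return pids


def recycle(group):
    """Send SIGINT to every daemon process in the given group."""
    # Assumption: this ps supports '-eo pid,args'.
    ps_output = subprocess.check_output(["ps", "-eo", "pid,args"]).decode()
    for pid in find_daemon_pids(ps_output, group):
        os.kill(pid, signal.SIGINT)


# Parsing demonstrated on made-up ps output; recycle("sam") would do it for real.
sample = "  PID COMMAND\n 4211 (wsgi:sam)\n 4212 (wsgi:murray)\n 4300 /usr/sbin/httpd -k start\n"
print(find_daemon_pids(sample, "sam"))
```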

If you simply want to free up a process but still keep the .wsgi file,
you can take out the SetEnv for that .wsgi file and send the signal; the
next time the .wsgi file is used, it will run in the default process.

One can take this further if you really needed a separate virtual host
for each even with the application mounted at root of web server.

The start of this would be to not use VirtualHost, but use other
methods for constructing virtual hosts. One way of doing that is to
use mod_vhost_alias instead. I haven't tried this, but believe you
could use:

VirtualDocumentRoot /var/www/vhosts/%0/htdocs

WSGIDaemonProcess wiggles display-name=%{GROUP}

WSGIDaemonProcess sam display-name=%{GROUP}
WSGIDaemonProcess murray display-name=%{GROUP}
WSGIDaemonProcess anthony display-name=%{GROUP}
WSGIDaemonProcess jeff display-name=%{GROUP}
WSGIDaemonProcess dorothy display-name=%{GROUP}
WSGIDaemonProcess henry display-name=%{GROUP}
WSGIDaemonProcess wags display-name=%{GROUP}

WSGIProcessGroup %{ENV:PROCESS_GROUP}
SetEnv PROCESS_GROUP wiggles

<Directory /var/www/vhosts/*/htdocs>
Order deny,allow
Allow from All
Options ExecCGI
AddHandler wsgi-script .wsgi
AllowOverride FileInfo
</Directory>

Each virtual host then has a separate document root directory. You
would still stick .wsgi files in that directory like before and use
SetEnv to dynamically dictate which daemon process group to use.

Finally, if you want application to appear at root of host, you should
be able to have in .htaccess something like the following.

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /site.wsgi/$1 [QSA,PT,L]

with the application file being called site.wsgi.

You will need to play with that though, as I recollect that is what one
would use in the main Apache configuration. That, or you may be able to
even put it in the main Apache configuration.

<Directory /var/www/vhosts/*/htdocs>

Order deny,allow
Allow from All
Options ExecCGI
AddHandler wsgi-script .wsgi
AllowOverride FileInfo

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /site.wsgi/$1 [QSA,PT,L]

</Directory>

If VirtualDocumentRoot doesn't work, then one can instead look at
doing dynamic virtual hosts using mod_rewrite. Can look at that later
if need be.

Hope that is of use. :-)

If you try out VirtualDocumentRoot let me know what works, as haven't
tried that one before. Just remember that if you use that, you
shouldn't use VirtualHost in same Apache configuration.

Bernd Zeimetz

May 4, 2010, 7:46:49 AM
to mod...@googlegroups.com
On 05/03/2010 01:16 AM, Jason Garber wrote:
> Hi Phillip,
>
> A bit of a side-answer, but we have this problem 100% solved for multiple
> developers using either PHP or Python and version control, tons of branches,
> and separate Live/Preview/Dev sites. Also worked around the headache of
> hand-writing <VirtualHost /> entries all the time.
>
> I won't bore the rest of the list with the details, but if you'd like to
> review, let me know.

I'm not sure if the list would really be bored with such details - I'd be
interested to see ideas how such a solution could work.


--
Bernd Zeimetz Debian GNU/Linux Developer
http://bzed.de http://www.debian.org
GPG Fingerprints: 06C8 C9A2 EAAD E37E 5B2C BE93 067A AD04 C93B FF79
ECA1 E3F2 8E11 2432 D485 DD95 EB36 171A 6FF9 435F

Graham Dumpleton

May 4, 2010, 7:54:49 AM
to mod...@googlegroups.com
Missed one thing. As documented in:

http://code.google.com/p/modwsgi/wiki/ConfigurationGuidelines#The_Apache_Alias_Directive

you need to do a SCRIPT_NAME fixup in .wsgi script file when using
that rewrite trick to have application mapped using AddHandler to root
of site. Thus:

def _application(environ, start_response):
    # The original application.
    ...

import posixpath

def application(environ, start_response):
    # Wrapper to set SCRIPT_NAME to actual mount point.
    environ['SCRIPT_NAME'] = posixpath.dirname(environ['SCRIPT_NAME'])
    if environ['SCRIPT_NAME'] == '/':
        environ['SCRIPT_NAME'] = ''
    return _application(environ, start_response)
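
To see what the wrapper does to SCRIPT_NAME without going through Apache, it can be called with a hand-built environ. This is just a quick check harness; the stand-in '_application' and the 'seen_script_name' helper are made up for the purpose.

```python
import posixpath


def _application(environ, start_response):
    # Stand-in for the real application: report the mount point it sees.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("SCRIPT_NAME=%r" % environ["SCRIPT_NAME"]).encode("utf-8")]


def application(environ, start_response):
    # Same wrapper as above: strip the .wsgi file name from the mount point.
    environ["SCRIPT_NAME"] = posixpath.dirname(environ["SCRIPT_NAME"])
    if environ["SCRIPT_NAME"] == "/":
        environ["SCRIPT_NAME"] = ""
    return _application(environ, start_response)


def seen_script_name(script_name):
    """Run the wrapper on a minimal environ and return the adjusted SCRIPT_NAME."""
    environ = {"SCRIPT_NAME": script_name, "PATH_INFO": "/some/page"}
    application(environ, lambda status, headers: None)
    return environ["SCRIPT_NAME"]


print(repr(seen_script_name("/site.wsgi")))      # mounted at root of site
print(repr(seen_script_name("/sub/site.wsgi")))  # mounted below a sub directory
```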

Graham

paryl

May 4, 2010, 11:09:02 AM
to modwsgi
Thank you Graham,

That is a great writeup, and it gave me a lot of things to try out.
Unfortunately, I feel like I'm back at square one with a few points.

To answer your previous questions...

> How are you current setting up virtual hosts when using PHP?
>
> If you are using VirtualHost directive you would have same issue with
> needing to restart Apache to have it recognise the new virtual host.

Our vhost file looks like this:

Listen *:80
<Directory /var/www/htdocs>
Options FollowSymLinks
Options +ExecCGI
AllowOverride All
</Directory>

NameVirtualHost *:80
<VirtualHost *:80>
VirtualDocumentRoot /var/www/htdocs/%-5/%-6/customer_web/
</VirtualHost>

> Does the system where you are running Python also need to support PHP
> in parallel and thus whatever you use for Python has to interwork with
> supporting PHP?
>
> Need to know how PHP comes into the picture or whether PHP support can
> simply be ignored, thus simplifying how Python might be handled.

Yes. For at least a few years (until all code is converted to Python)
we will need to support both languages.

As you can probably gather from the vhost file, each developer has a
'home' directory under htdocs, and the individual branches are set up
in their own directories under that developer directory. When hitting
a directory on the server, the URL is parsed to a given internal
directory using VirtualDocumentRoot. That works beautifully, btw.
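
For anyone not familiar with the %-N notation, here is a rough Python sketch of how the hostname parts are picked out. It is an approximation of mod_vhost_alias, not its actual code: it handles only %0, %N and %-N (no character ranges, no %% or %p), the underscore for a missing part is my reading of the Apache documentation, and the six-label hostname is a hypothetical example rather than one of our real names.

```python
import re


def interpolate(template, host):
    """Expand %0, %N and %-N in a VirtualDocumentRoot-style template."""
    name = host.lower()
    parts = name.split(".")

    def expand(match):
        n = int(match.group(1))
        if n == 0:
            return name  # %0 is the whole name
        index = n - 1 if n > 0 else n  # %1 is the first part, %-1 the last
        try:
            return parts[index]
        except IndexError:
            return "_"  # assumption: a missing part becomes an underscore

    return re.sub(r"%(-?\d+)", expand, template)


template = "/var/www/htdocs/%-5/%-6/customer_web/"
print(interpolate(template, "branch.phillip.dev.corp.example.com"))
```

Counting from the right means the developer and branch labels are found no matter how they are named, as long as the domain suffix always has the same number of parts.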

The issue with your solution in our scenario is that it assumes a
single directory that all of the wsgi files reside under, and it
assumes a finite number of branches. At current count, we have around
150 branches that can be served, each one needs to be a standalone
'copy' of trunk, and the number is constantly fluctuating. Keeping
them straight using reserved group names, especially if it meant
editing a central set of files, would be impossible.

So based on your awesome explanation, I played around with some
different combinations to see what I could get it to do. One issue is
that the server name for a vhost doesn't take the subdomain into
account. For instance, if the server name is "dev.domain.com" and my
branch resides at "branch.phillip.dev.domain.com", the ServerName
environment variable is always the main domain. So with the default
settings, I get process='', application='www.domain.com|bfg.wsgi' for
all of the subdomains that I try. Because of that, I was setting an
environment variable in .htaccess and using it for
WSGIApplicationGroup. This works as far as the group itself goes...
the process has the right application name for each process (from the
env set in .htaccess), but they all run using the same process,
causing the "RuntimeError: class.__dict__ not accessible in restricted
mode" error.

It seems no matter what I try, I always hit a wall. Is there
something fundamentally wrong with what I'm doing, or am I just trying
to get mod_wsgi to do something it wasn't designed to do?

Thanks!
-phillip

Jason Garber

May 4, 2010, 8:00:24 PM
to mod...@googlegroups.com
Ok, I'll post it here.  The problem boiled down to a 16,000 line apache configuration file that covered scores of domains and a dozen developers, as well as the issues with staging and production.  

At some point, I decided to more fully integrate Apache into our stack.  Prior to that, it was just a webserver.  After that, we started taking advantage of more modules - mod_rewrite, etc...  But with that comes the potential for mis-configured virtual hosts introducing violent bugs...

The goal was to have 100% consistent configurations based on the same code in dev/preview/production, and have the configuration update automatically based on where the checked out project was.  Furthermore, we wished to have project information available statically, rather than always determining it at runtime (simple stuff like project path, path to save files, credit-card-in-test-mode, etc...) as in including/importing a single file.

--
So I wrote a little python module called ConfTree.  It is used to compile an in-memory configuration tree based on at least 1 _Twig.py file in each project.  Here is a sample _Twig.py:

# Create the project
Project = Instance.Project.Add('MCorp')

# Database password.
Project.MySQL.Password = '134573495872349754'

# NoSSL
VH1 = Project.VirtualHost.Add('http:mcorp.com')
VH1.ServerName = 'mcorp.com'
VH1.DocumentRoot = './Web'

WSGI_Daemon_Name = VH1.ServerName
Instance.Apache.Conf += [
  'WSGIDaemonProcess ' + WSGI_Daemon_Name + ' threads=5 python-path=' + abspath('./Python')
  ]

# SSL
VH2 = Project.VirtualHost.Add('https:secure.mcorp.com')
VH2.Protocol = 'https'
VH2.ServerName = 'secure.mcorp.com'
VH2.DocumentRoot = './Web'

extra = [
  'AddDefaultCharset utf8',

  'RewriteEngine on',
  'RewriteRule ^/$ /User/Login/ [R,L]',
  'RewriteRule ^/[aA]dmin$ /Admin/Login/ [R,L]',

  'WSGIScriptAlias /Admin ' + abspath('./WSGI/__init__.wsgi'),
  'WSGIScriptAlias /User ' + abspath('./WSGI/__init__.wsgi'),
  'WSGIScriptAlias /Main ' + abspath('./WSGI/__init__.wsgi'),
  'WSGIProcessGroup ' + WSGI_Daemon_Name,
  ]

if Instance.DevLevel > 0:
  extra += [
    'LogLevel info',
    'ErrorLog ' + abspath('./apache-error.log'),
    ]

VH1.Extra += extra
VH2.Extra += extra

# Specify what _Auto files to run (later)
AutoConf('_Auto.py', Project=Project)

Since certain variables are determined automatically by ConfTree (such as DevLevel, DevUser, InstancePath, ProjectPath, etc...), the result of the in-memory configuration is 100% customized to the actual project path / dev level / developer / etc...

To support auto configurations, we have an additional file in each project called _Auto.py.  Here is an example:

F = FileWriter()

# Init the data
F['InstancePath']       = Instance.Path
F['DevLevel']           = Instance.DevLevel
F['DevName']            = Instance.DevName
F['ProjectIdentifier']  = Project.ID
F['ProjectPath']        = Project.Path

F['MySQL']              = Project.MySQL

F['App_EmailDatabase']  = Project.OtherData['EmailDatabase']

F['URL_Standard']       = Project.VirtualHost['http:www.twittertwenius.com'].URL
F['URL_Secure']         = Project.VirtualHost['https:secure.twittertwenius.com'].URL

# PHP needs to read it
F.Write('AutoConf/Local.php', """<?php
# Generated by AutoConf system

define('WB4_Path', {InstancePath});
define('WB4_DevLevel', {DevLevel});
define('WB4_DevName', {DevName});

App::$Identifier = {ProjectIdentifier};
App::$Path = {ProjectPath};

App::$URL_Standard = {URL_Standard};
App::$URL_Secure = {URL_Secure};

""")


# Python needs to read it
F.Write('AutoConf/Local.py', """
# Generated by AutoConf system

InstancePath = {InstancePath}
DevLevel = {DevLevel}
DevName = {DevName}

EmailDatabase = {App_EmailDatabase}

Identifier = {ProjectIdentifier}
Path = {ProjectPath}

MySQL = {MySQL}

URL_Standard = {URL_Standard}
URL_Secure = {URL_Secure}
""")

You can see that the in-memory project data is now used to output 2 files -- Local.py and Local.php.  These files are stored within a directory that is ignored by version control, so they are always localized to the developer.  Here is a sample output:

Local.php
<?php
# Generated by AutoConf system

define('WB4_Path', "/home/jason/Code/WhiteBoot4/DevLevel.2");
define('WB4_DevLevel', 2);
define('WB4_DevName', NULL);

App::$Identifier = "Twenius";
App::$Path = "/home/jason/Code/WhiteBoot4/DevLevel.2/Twenius";

and Local.py
# Generated by AutoConf system

InstancePath = "/home/jason/Code/WhiteBoot4/DevLevel.2"
DevLevel = 2
DevName = None
EncryptKey = "48cdsfsdfsdfsdfsds"

EmailDatabase = "ModEmail_2"

Identifier = "Twenius"
Path = "/home/jason/Code/WhiteBoot4/DevLevel.2/Twenius"

MySQL = {
    "Username": "Twenius",
    "Host": "localhost",
    "Password": "yeah-right",
    "Port": 3306,
    "Database": "Twenius_2",
    }

(this given project uses BOTH php and python)
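
For readers wondering how one template can produce both a PHP and a Python literal for the same value, the idea can be sketched as below. To be clear, this is not ConfTree's actual implementation; the function names are made up, and the sketch only handles strings, integers and None.

```python
import json


def php_literal(value):
    # For plain strings and integers, JSON syntax is also valid PHP source.
    # Caveat: a '$' inside a double-quoted PHP string would be interpolated.
    return "NULL" if value is None else json.dumps(value)


def python_literal(value):
    # For plain strings and integers, JSON syntax is also valid Python source.
    return "None" if value is None else json.dumps(value)


def fill(template, data, literal):
    """Substitute {Name} placeholders with language-specific literals."""
    return template.format(**{key: literal(val) for key, val in data.items()})


data = {
    "InstancePath": "/home/jason/Code/WhiteBoot4/DevLevel.2",
    "DevLevel": 2,
    "DevName": None,
}
print(fill("define('WB4_Path', {InstancePath});\ndefine('WB4_DevName', {DevName});", data, php_literal))
print(fill("InstancePath = {InstancePath}\nDevName = {DevName}", data, python_literal))
```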
Lastly, the final output of this in-memory data is a valid apache configuration file.  Here is a snippet from a very large apache conf file:

WSGISocketPrefix run/wsgi
WSGIApplicationGroup %{GLOBAL}
WSGIDaemonProcess mcorp.com.2.jason.star.appcove.net threads=5 python-path=/home/jason/Code/WhiteBoot4/DevLevel.2/MCorp/Python
...snip...
# mcorp.com.2.jason.star.appcove.net|NoSSL (DevLevel.2)
<VirtualHost *>
   DocumentRoot /home/jason/Code/WhiteBoot4/DevLevel.2/MCorp/Web
   RewriteEngine on
   RewriteOptions inherit
   AddDefaultCharset utf8
   RewriteEngine on
   RewriteRule ^/$ /User/Login/ [R,L]
   RewriteRule ^/[aA]dmin$ /Admin/Login/ [R,L]
   WSGIScriptAlias /Admin /home/jason/Code/WhiteBoot4/DevLevel.2/MCorp/WSGI/__init__.wsgi
   WSGIScriptAlias /User /home/jason/Code/WhiteBoot4/DevLevel.2/MCorp/WSGI/__init__.wsgi
   WSGIScriptAlias /Main /home/jason/Code/WhiteBoot4/DevLevel.2/MCorp/WSGI/__init__.wsgi
   WSGIProcessGroup mcorp.com.2.jason.star.appcove.net
   LogLevel info
   ErrorLog /home/jason/Code/WhiteBoot4/DevLevel.2/MCorp/apache-error.log
</VirtualHost>
   
--------------
From a developer's perspective:

cd /path/to/project-folder/
git clone ssh://.../MCorp.git
wb4-conf
sudo acn-apache-restart

Of course, the sudo only has access to the acn-apache-restart script, which rebuilds the configuration, and then tests it, before doing a graceful restart.  
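
The test-then-restart step is the part that keeps a bad configuration from taking the server down. A minimal sketch of that step, assuming the stock 'apachectl configtest' and 'apachectl graceful' commands, could look like this. It is not the acn-apache-restart script itself, and the function name is made up; the command runner is passed in so the logic can be tried without touching a real server.

```python
import subprocess


def safe_restart(run=subprocess.call, apachectl="apachectl"):
    """Gracefully restart Apache only if the configuration test passes."""
    if run([apachectl, "configtest"]) != 0:
        return False  # leave the running server alone
    return run([apachectl, "graceful"]) == 0


def dry_run(command):
    # Fake runner for demonstration: print the command and report success.
    print("would run: " + " ".join(command))
    return 0


print(safe_restart(run=dry_run))
```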

-------------
The nice thing is that the exact same setup is used to go to Preview, and to Live.  I cannot recall a single instance of ever having a configuration problem since we started doing it this way.  Our developers don't run over each other either.

By the way, we dumped svn and switched to git, and never looked back.  It's a snap to just switch branches and work on that, and back.

Notice the URL above ended with DOMAIN.COM.2.jason.star.appcove.net.
That means DevLevel=2 (development), user=jason, server=star.  

A preview URL would be DOMAIN.COM.1.preview.appcove.net.  
That means DevLevel=1 (preview), etc...

-----------
I've been meaning to bundle this up into an OSS project, but haven't gotten to it yet.  It is all available right now, albeit with a few warts.

Feel free to get in touch. 
Jason Garber


Graham Dumpleton

May 4, 2010, 8:35:42 PM
to mod...@googlegroups.com
No, the final variant I presented doesn't.

The first option did as it used a single virtual host with each
application at a distinct mount point.

The latter option, which used VirtualDocumentRoot, allows the .wsgi file
for a specific application to be in the document root for that
instance; the files don't need to be in the same directory.

> At current count, we have around
> 150 branches that can be served, each one needs to be a standalone
> 'copy' of trunk, and the number is constantly fluctuating. Keeping
> them straight using reserved group names, especially if it meant
> editing a central set of files, would be impossible.

Then mod_wsgi as it stands now isn't going to make your job easy as it
only supports static daemon process groups and not dynamic ones.
Because it only supports static, the only way to handle any
dynamicity, without restarting Apache, is to prespecify a huge pool of
daemon process groups and dynamically map to a required one. In the
example I present I did it in .htaccess so user could do it, but could
also be handled by a centralised mod_rewrite rewrite map file.
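
As a rough sketch of the pool idea, the directives and a host-to-group map could be generated by a script like the one below. The pool naming scheme, the function names and the 'key value' map layout (the plain text RewriteMap format) are assumptions for illustration; how the map lookup is wired into the process group selection is not shown.

```python
def pool_directives(size, prefix="pool"):
    """Return (names, lines) for a pool of WSGIDaemonProcess directives."""
    names = ["%s%03d" % (prefix, i) for i in range(size)]
    lines = ["WSGIDaemonProcess %s display-name=%%{GROUP}" % name for name in names]
    return names, lines


def assign(hosts, names, existing=None):
    """Give each host its own group, keeping assignments for hosts still present."""
    mapping = dict((h, g) for h, g in (existing or {}).items() if h in hosts)
    free = [n for n in names if n not in mapping.values()]
    for host in sorted(hosts):
        if host not in mapping:
            if not free:
                raise ValueError("daemon process pool exhausted")
            mapping[host] = free.pop(0)
    return mapping


names, lines = pool_directives(3)
print("\n".join(lines))

# Hypothetical branch hostnames, one map line per host.
mapping = assign({"b1.phillip.dev.example.com", "b2.phillip.dev.example.com"}, names)
for host in sorted(mapping):
    print("%s %s" % (host, mapping[host]))
```

Keeping existing assignments stable matters here: a branch that silently moved to another process group would leave its old instance sitting in memory until that process was recycled.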

> So based on your awesome explanation, I played around with some
> different combinations to see what I could get it to do.  One issue is
> that the server name for a vhost doesn't take the subdomain into
> account.  For instance, if the server name is "dev.domain.com" and my
> branch resides at "branch.phillip.dev.domain.com", the ServerName
> environment variable is always the main domain.  So with the default
> settings, I get process='', application='www.domain.com|bfg.wsgi' for
> all of the subdomains that I try.

Hmmm, I overlooked that virtual host name will not be right, but we
can fix that up by using a mod_rewrite rule. Alternatively, we can use
a WSGIDispatchScript to dictate what application group should be. This
might be moot though given issues below with multiple sub
interpreters.

What you shouldn't get is process=''. If you are getting this you
haven't followed the instructions I gave and used daemon mode, instead
you are still using embedded mode which you want to avoid for various
reasons some of which I explained previously.
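
A small diagnostic .wsgi script along these lines can confirm which mode a request actually ran in. It reads the 'mod_wsgi.process_group' and 'mod_wsgi.application_group' keys that mod_wsgi adds to the WSGI environ; the 'describe' helper is just a name chosen for this sketch.

```python
def describe(environ):
    """Summarise where mod_wsgi says this request is running."""
    process_group = environ.get("mod_wsgi.process_group")
    application_group = environ.get("mod_wsgi.application_group")
    if process_group is None:
        mode = "not running under mod_wsgi"
    elif process_group == "":
        mode = "embedded mode"  # an empty process group means embedded mode
    else:
        mode = "daemon mode"
    return "mode=%s process=%r application=%r\n" % (mode, process_group, application_group)


def application(environ, start_response):
    body = describe(environ).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Dropping this in next to the real .wsgi file and requesting it through the same URL scheme shows whether the WSGIProcessGroup setting is being picked up.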

> Because of that, I was setting an
> environment variable in .htaccess and using it for
> WSGIApplicationGroup.  This works as far as the group itself goes...
> the process has the right application name for each process (from the
> env set in .htaccess), but they all run using the same process,
> causing the "RuntimeError: class.__dict__ not accessible in restricted
> mode" error.

If you are getting that error, it means you are using a C extension
module which has not been written so as to work properly from multiple
sub interpreters at the same time.

As a result, that C extension module is going to force you to use a
separate daemon process group for every application instance, you
cannot run more than one in a process.

If that C extension module doesn't implement that correctly, it is
quite possible that it isn't even guaranteed to work in a sub
interpreter. This means to ensure it works, you would need to force
application to run in the main interpreter of the process.

The following section in documentation describes this problem with C
extension modules.

http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Multiple_Python_Sub_Interpreters

> It seems no matter what I try, I always hit a wall.  Is there
> something fundamentally wrong with what I'm doing, or am I just trying
> to get mod_wsgi to do something it wasn't designed to do?

Transient daemon process groups, be they named or dynamic, are
certainly not supported. I have posted about the possibility of
support for this in the past at:

http://blog.dscpl.com.au/2007/07/commodity-shared-hosting-and-modwsgi.html

I have held back from implementing it as overall supplying something
that makes people think that one can safely use mod_wsgi in mass
public virtual hosting is probably a bad idea. For such mass public
virtual hosting, a purpose built solution should really be developed,
or simply ways of using Python with FASTCGI improved. So, still useful
for self hosted/managed systems, but opens a big can of worms where it
could be easily abused.

Anyway, it seems you are going to be quite restricted if your C
extension module is going to force only one application instance per
process. What third party C extension modules are you using?

Graham