Setting up a little shared hosting with mod_wsgi

Fabien

Apr 28, 2008, 8:31:50 AM
to mod...@googlegroups.com
Hello,

I'm currently trying to set up a little shared hosting on my dedicated
server so that various friends can host their Django apps. To keep one
user from being able to see the files of another user, I'm planning to
use mod_wsgi. Are there any best practices or resources for doing this?

Thanks in advance.

--
Fabien SCHWOB
_____________________________________________________________
There is nothing more permanent than a temporary solution

Graham Dumpleton

Apr 28, 2008, 7:09:39 PM
to mod...@googlegroups.com
2008/4/28 Fabien <xphu...@gmail.com>:

>
> Hello,
>
> I'm currently trying to set up a little shared hosting on my dedicated
> server so that various friends can host their Django apps. To keep one
> user from being able to see the files of another user, I'm planning to
> use mod_wsgi. Are there any best practices or resources for doing this?

Yes. Give me some time and I'll go through it all.

Although others have posted some recipes for doing this, they don't go
to the full extent of what they should/could.

One recent example is:

http://www.davidcramer.net/code/django/108/setup-mod_wsgi-for-django-and-shared-hosting.html

To get a really good environment though, I'd be doing a range of other
things as well. Until I get a chance to post a full description, start
playing with what they describe, plus also ensure you look over the
mod_wsgi documentation. Some documents worth noting are:

http://code.google.com/p/modwsgi/wiki/ConfigurationGuidelines
http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives
http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode

Graham

Graham Dumpleton

Apr 28, 2008, 7:12:34 PM
to mod...@googlegroups.com
Also read my post in:

http://groups.google.com/group/modwsgi/browse_frm/thread/60cb0ec3041ac1bc

In writing that post I was focusing more on issues related to the
memory constraints of VPS systems than on shared web hosting for
disparate users. Most of what was mentioned applies, but there is more
to do than just that.

Graham

Graham Dumpleton

Apr 29, 2008, 4:07:13 AM
to mod...@googlegroups.com
2008/4/29 Graham Dumpleton <graham.d...@gmail.com>:

Okay, here is my guidance on how one can use mod_wsgi to implement
limited shared hosting. I'll probably break this up over a few mail
messages, as I don't have time to do it all in one sitting. By giving
it in parts you can start playing and try to work out where I am going
with it all. :-)

Step 1. Use Python virtual environments so that each user has their
own site-packages directory and no one is actually using the main
Python installation's site-packages directory.

To do this, use 'virtualenv' from:

http://pypi.python.org/pypi/virtualenv

As per:

http://code.google.com/p/modwsgi/wiki/VirtualEnvironments

create a baseline virtual environment which is disassociated from the
main Python installation's site-packages directory.

mkdir /usr/local/pythonenv
cd /usr/local/pythonenv

virtualenv --no-site-packages BASELINE

This virtual environment is not used by any specific user. Instead,
mod_wsgi is pointed at it so that it is used as the Python home by the
Python interpreter running in mod_wsgi. This is done by adding:

WSGIPythonHome /usr/local/pythonenv/BASELINE

at global scope within the Apache configuration.

Because the '--no-site-packages' argument is used when creating the
virtual environment, the site-packages directory will be empty, except
for the setuptools package.

For each user account that you are going to set up, create a Python
virtual environment. This should again be disassociated from the main
Python installation's site-packages directory.

To set this up, each user should themselves run the following from
inside their home directory:

virtualenv --no-site-packages PYTHONENV-1

When they want to install their own Python modules/packages, they
would first run:

source PYTHONENV-1/bin/activate

This will set up their shell environment so that packages are installed
into their personal Python virtual environment. This might be put in
their login scripts so it is done automatically.
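Once the environment is activated, a user may want to confirm that Python really is picking up their own site-packages directory rather than the system one. A minimal sketch of such a check follows; the helper name site_packages_dirs is made up for illustration and is not part of virtualenv or mod_wsgi.

```python
import sys


def site_packages_dirs(path=None):
    """Return the module search path entries that are site-packages directories."""
    if path is None:
        path = sys.path
    return [entry for entry in path
            if entry.rstrip('/').endswith('site-packages')]


if __name__ == '__main__':
    # sys.prefix should point inside the activated environment,
    # e.g. somewhere under the user's PYTHONENV-1 directory.
    print('prefix: %s' % sys.prefix)
    for entry in site_packages_dirs():
        print('site-packages: %s' % entry)
```

With an environment created using '--no-site-packages', one would expect only the environment's own site-packages directory to be listed.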

Step 2. Set up mod_wsgi daemon processes so they use the Python virtual
environment of the user whose VirtualHost it is. An example of a
VirtualHost configuration would then be:

<VirtualHost *:80>
    ServerName username.cheaphosting.com

    WSGIDaemonProcess username-1 threads=15 \
        python-path=/home/username/PYTHONENV-1/lib/python2.5/site-packages

    WSGIProcessGroup username-1

    ...
</VirtualHost>

This would provide a single daemon process running 15 threads for the
user of that site to run their Python web applications.

I deliberately haven't specified WSGIApplicationGroup here, as leaving
it out allows multiple WSGI applications to be run in the same process
but in different sub interpreters. I'll get to this more when I explain
where each user would put their files.

I'll deal with where users' files go in the next message. I'll also
cover the various options to WSGIDaemonProcess that I have already
described in the other posts I referenced. Just because I haven't
mentioned them yet doesn't mean they don't apply; I just want to
introduce one concept at a time.

Hopefully I'll get to the next installment in a few hours.

Graham

Graham Dumpleton

May 4, 2008, 8:09:53 AM
to mod...@googlegroups.com
Sorry for the delay in the next installment of this. A few things got
me sidetracked.

See below for Step 3. For Steps 1 and 2 see prior message.

2008/4/29 Graham Dumpleton <graham.d...@gmail.com>:

Step 3. Set up the document root for each VirtualHost.

I suggest that the directory you use as the document root for each
VirtualHost be in a common area. In other words, it is preferable not
to put it in a user's account. The reason is that this directory must
be readable by the user that Apache runs as. If you put the document
root in a user's own home directory, you can't lock down that user's
account as securely as you may want to. I'll go over this issue of
directory/file permissions and security in a future installment.

The VirtualHost definition would thus be as follows. Note that I have
left the Python virtual environment settings out of this; you should
add them back in as explained previously.

<VirtualHost *:80>

    ServerName www.site.com

    DocumentRoot /usr/local/sites/www.site.com/htdocs

    ErrorLog "/usr/local/sites/www.site.com/logs/error_log"
    CustomLog "/usr/local/sites/www.site.com/logs/access_log" common
    LogLevel info

    WSGIDaemonProcess username-1 user=username group=groupname \
        display-name=%{GROUP}
    WSGIProcessGroup username-1

    <Directory /usr/local/sites/www.site.com/htdocs>

        Order deny,allow
        Allow from all

        Options MultiViews ExecCGI SymLinksIfOwnerMatch

        AddHandler wsgi-script .wsgi

        RewriteEngine On
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteRule ^(.*)$ /site.wsgi/$1 [QSA,PT,L]

    </Directory>

</VirtualHost>

Replace 'www.site.com' with the name of the site for the specific
VirtualHost definition. Replace 'username' and 'groupname' with the
values corresponding to the owner of that site, that is, the UNIX user
for the account they use. Ideally, each user would have a unique UNIX
group for themselves.

If you give a specific user more than one VirtualHost, then increment
the count in the name given to WSGIDaemonProcess, that is,
'username-2', 'username-3', etc. Keep that name short so that when
displayed in 'ps' output it doesn't get truncated.
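If there are more than a handful of users, the per-site directives could be generated rather than typed by hand. The following is only a sketch: the function name daemon_directives, the assumption that home directories live under /home, and the choice to combine the Step 2 options (threads, python-path) with the Step 3 ones on one directive are all mine, not something prescribed above.

```python
def daemon_directives(username, groupname, index=1,
                      env_name='PYTHONENV-1', python_version='2.5'):
    """Build the WSGIDaemonProcess/WSGIProcessGroup lines for one site."""
    # Process group name such as 'username-1', 'username-2', ...
    name = '%s-%d' % (username, index)
    # site-packages of the user's own virtual environment, using the
    # layout from Step 2 (the /home/<username> location is an assumption).
    python_path = '/home/%s/%s/lib/python%s/site-packages' % (
        username, env_name, python_version)
    return '\n'.join([
        'WSGIDaemonProcess %s user=%s group=%s \\' % (name, username, groupname),
        '    display-name=%{GROUP} threads=15 \\',
        '    python-path=%s' % python_path,
        'WSGIProcessGroup %s' % name,
    ])


if __name__ == '__main__':
    print(daemon_directives('username', 'groupname', 1))
```

The output is meant to be pasted inside the matching VirtualHost block; review it before use, as nothing here validates the user or group names.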

Note that no WSGIScriptAlias or Alias directive is supplied to mount a
WSGI application at root of site. This is intentional, so don't add
one.

What is done instead is that AddHandler is used to indicate that any
script file in the document root directory with a .wsgi extension
should be processed as a WSGI application. The 'ExecCGI' option in
Options directive is required for that to work.

Thus, if you have a script file in that directory called 'hello.wsgi',
it can be accessed using the URL '/hello.wsgi'. The script can be
supplied additional path information. For example, for the URL
'/hello.wsgi/some/path', '/some/path' gets passed as PATH_INFO to the
WSGI application.
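For anyone who wants something to try this with, a 'hello.wsgi' along the following lines would do. It is just a minimal sketch of a WSGI application that reports the PATH_INFO it was given; the response text is arbitrary.

```python
def application(environ, start_response):
    # PATH_INFO holds whatever followed the script name in the URL,
    # e.g. '/some/path' for a request to '/hello.wsgi/some/path'.
    path_info = environ.get('PATH_INFO', '')
    body = ('Hello! PATH_INFO is %r\n' % path_info).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

mod_wsgi looks for a callable named 'application' in the script file, which is why the function has that name.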

Because 'MultiViews' has been listed in Options directive, the '.wsgi'
extension in the URL actually becomes optional. Thus, one could also
use '/hello' and '/hello/some/path' and the same result would occur.

The reason AddHandler is used, relying on every WSGI application script
file having a .wsgi extension, is so that you don't need to do anything
special to also host static files. Thus, you can place in the same
directory a 'favicon.ico' file, HTML files, style sheets, image files
etc. You could even create subdirectories to hold such static files.

You could also put WSGI application scripts in those subdirectories.
Thus you have the ability to freely mix static files and multiple WSGI
applications.

As an example, if someone wanted to run a Django instance to run some
blog software, they might create a WSGI script file called 'blog.wsgi'
in the site document root. The blog would then be accessed as '/blog'.
If that Django instance had media files, they would just create a
subdirectory under the site document root called 'media' and put any
static files in it, setting up Django settings file to say media files
are located at '/media'.

I'll go into it more when I talk about permissions in a subsequent
post, but the .wsgi script files should just be very small wrappers.
Thus with Django, for example, the Django site code directory would
actually be off in the user's home directory somewhere and not located
in the site document root. The WSGI script file would internally set up
the Python path so that code is imported from the other location. The
media files could be copied into the site document root, or a symlink
used. Symlinks work because of 'SymLinksIfOwnerMatch' in the Options
directive, although that is really there to allow mod_rewrite to be
used.
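To illustrate how thin such a wrapper can be, here is a framework-neutral sketch. The helper name load_application and the directory and module names in the comment are hypothetical; with Django the final step would instead create Django's own WSGI handler after pointing it at the settings module.

```python
import importlib
import sys


def load_application(code_dir, module_name, attr='application'):
    """Import a WSGI callable from code kept outside the document root."""
    # Make the user's code directory importable, but only add it once.
    if code_dir not in sys.path:
        sys.path.insert(0, code_dir)
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# In the .wsgi file itself one would then write something like the
# following (directory and module names are placeholders):
#
#   application = load_application('/home/username/code', 'blogapp')
```

The point of keeping the script this small is that only the wrapper needs to live in the Apache-readable document root, while the real code stays in the user's own account.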

Thus, so far one can host multiple WSGI applications by supplying .wsgi
script files as appropriate, and can also host static files without
doing anything special. So far, though, it would appear that one can't
mount a WSGI application at the root of the web site. This is where the
rewrite rules come into play.

What the rewrite rule in the configuration does is look at each
request. If the request doesn't match a static file, or a WSGI
application via its .wsgi script file (extension in URL still
optional), in other words, if the request would otherwise result in
Not Found, then the request is rewritten and routed through the WSGI
application hosted via the 'site.wsgi' script file.
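As a mental model only, the precedence just described can be written down as a small function. This is not how Apache implements it, and it ignores edge cases such as directories and extra path information after a static file; the function name route and the example file set are made up.

```python
def route(url_path, files, fallback='site.wsgi'):
    """Return (kind, target, path_info) for a request path.

    `files` is a set of paths relative to the document root, e.g.
    {'favicon.ico', 'blog.wsgi', 'media/style.css', 'site.wsgi'}.
    """
    relative = url_path.lstrip('/')
    # 1. An exact match on a static file wins.
    if relative in files and not relative.endswith('.wsgi'):
        return ('static', relative, '')
    # 2. Otherwise look for a .wsgi script matching a leading part of
    #    the path; the '.wsgi' extension is optional because of MultiViews.
    segments = relative.split('/') if relative else []
    for i in range(1, len(segments) + 1):
        prefix = '/'.join(segments[:i])
        for candidate in (prefix, prefix + '.wsgi'):
            if candidate.endswith('.wsgi') and candidate in files:
                # Whatever follows the script name becomes PATH_INFO.
                return ('wsgi', candidate, relative[len(prefix):])
    # 3. Nothing matched: the rewrite rule hands the request to site.wsgi.
    return ('wsgi', fallback, url_path)


if __name__ == '__main__':
    files = {'favicon.ico', 'blog.wsgi', 'media/style.css', 'site.wsgi'}
    for path in ('/favicon.ico', '/blog/2008/05', '/abcd'):
        print(path, route(path, files))
```

So a request for '/abcd' ends up at 'site.wsgi' with '/abcd' as PATH_INFO, while '/blog/2008/05' is still handled by 'blog.wsgi'.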

In other words, if you wanted your blog to be hosted at the root of
the web site, instead of calling it 'blog.wsgi', give it the special
name 'site.wsgi' listed in the rewrite rule. Because precedence is
given to static files and other WSGI applications which explicitly
match the URL, it is still possible to overlay multiple things in the
same site.

You will note that the DirectoryIndex directive is not used. This is
intentional, as it doesn't allow additional path information to be
passed to anything it targets. Thus, do not go adding DirectoryIndex.

Also note that I have defined ErrorLog and CustomLog so that each
VirtualHost has its own log files, and have set LogLevel to 'info' so
that users get additional information in the error log about what
mod_wsgi is doing when processing their requests. Do note the separate
discussion on the list about some problems with ErrorLog in a
VirtualHost. For some configurations, or possibly where a site is known
by multiple names, messages still appear in the main Apache error log,
or may vanish altogether. Hopefully people will respond on that
separate issue so it can be solved and a fix provided. Only a few
people have reported this issue at this point.

Be warned that multiple WSGI applications are supported here by
running them in distinct Python sub interpreters within the one daemon
process. With the default mechanism, users do not get a say in which
sub interpreter their applications run. This can be an issue if you
want to run Trac with subversion repository support, because the
Python subversion wrappers only work properly if run in the first
interpreter created by Python. I will cover this issue in a future
installment.
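A user who wants to see which daemon process group and sub interpreter a given script ended up in could drop in a small diagnostic script such as the sketch below. It relies on the 'mod_wsgi.process_group' and 'mod_wsgi.application_group' keys that mod_wsgi places in the WSGI environ; as far as I recall, an empty application group means the first (main) interpreter, but treat that reading as an assumption to verify against the mod_wsgi documentation.

```python
def application(environ, start_response):
    # mod_wsgi adds these keys to the WSGI environ; .get() keeps the
    # script usable under other WSGI servers, where they are absent.
    process_group = environ.get('mod_wsgi.process_group', '(unknown)')
    application_group = environ.get('mod_wsgi.application_group', '(unknown)')
    body = ('process group: %r\napplication group: %r\n'
            % (process_group, application_group)).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```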

Other things to be covered in future installments are permissions, as
already mentioned, and also the ability to use HTTP Basic
Authentication in a WSGI application.

Hope this is getting more interesting. :-)

Graham

Graham Dumpleton

May 4, 2008, 6:20:25 PM
to mod...@googlegroups.com
2008/5/4 Graham Dumpleton <graham.d...@gmail.com>:

Step 3 (Addendum)

Do note however that when using 'site.wsgi', and the WSGI application
is executed for a request, the 'SCRIPT_NAME' variable indicating the
mount point of the application will be '/site.wsgi'. This means that
when a WSGI application constructs an absolute URL based on
'SCRIPT_NAME', it will include 'site.wsgi' in the URL rather than it
being hidden. As this would probably be undesirable, many web
frameworks provide an option to override the value of the mount point.
If such a configuration option isn't available, it is just as easy to
adjust the value of 'SCRIPT_NAME' in the 'site.wsgi' script file
itself.

def _application(environ, start_response):
    # The original application.
    ...

import posixpath

def application(environ, start_response):
    # Wrapper to set SCRIPT_NAME to actual mount point.
    environ['SCRIPT_NAME'] = ''
    return _application(environ, start_response)

This wrapper will ensure that 'site.wsgi' never appears in the URL as
long as it wasn't included in the first place and that access was
always via the root of the web site instead.

Graham

Damjan

May 5, 2008, 4:55:42 AM
to modwsgi

> Step 3 (Addendum)
>
> Do note however that when using 'site.wsgi', the WSGI application is
> executed for a request the 'SCRIPT_NAME' variable indicating what the
> mount point of the application was will be '/site.wsgi'. This will
> mean that when a WSGI application constructs an absolute URL based on
> 'SCRIPT_NAME', it will include 'site.wsgi' in the URL rather than it
> being hidden. As this would probably be undesirable, many web
> frameworks provide an option to override what the value for the mount
> point is. If such a configuration option isn't available, it is just
> as easy to adjust the value of 'SCRIPT_NAME' in the 'site.wsgi' script
> file itself.
>
> def _application(environ, start_response):
> # The original application.
> ...
>
> import posixpath
>
> def application(environ, start_response):
> # Wrapper to set SCRIPT_NAME to actual mount point.
> environ['SCRIPT_NAME'] = ''
> return _application(environ, start_response)
>
> This wrapper will ensure that 'site.wsgi' never appears in the URL as
> long as it wasn't included in the first place and that access was
> always via the root of the web site instead.

Is it an error that you only *clear* SCRIPT_NAME? I guess it should be
something like:
environ['SCRIPT_NAME'] = environ['SCRIPT_NAME'].rstrip('.wsgi')

On the other hand, that ".wsgi" suffix can be configured in the Apache
conf to be something else.

Graham Dumpleton

May 5, 2008, 6:47:44 AM
to mod...@googlegroups.com
2008/5/5 Damjan <gda...@gmail.com>:

Setting SCRIPT_NAME to be empty should be sufficient for this
particular case, where you want the application to appear to be mounted
at the root of the web site even though a specific script is handling
it. The redirect is internal to Apache, and we are just hiding from the
WSGI application the fact that it even occurred.

Other variables such as REQUEST_URI will still be correct, as they
reflect the original URL as supplied by the client and so do not need
to be modified. Although, from memory, REQUEST_URI is not guaranteed to
exist with WSGI anyway.

Using an echo script with the rewrite rule, but without doing the fixup
in the WSGI application, for the URL '/abcd' one would get:

PATH_INFO: '/abcd'
PATH_TRANSLATED: 'redirect:/echo.wsgi/abcd'
QUERY_STRING: ''
REDIRECT_URL: '/abcd'
REQUEST_URI: '/abcd'
SCRIPT_FILENAME: '/Users/grahamd/VirtualHost/echo.wsgi'
SCRIPT_NAME: '/echo.wsgi'

With the fixup in WSGI application script, this would be:

PATH_INFO: '/abcd'
PATH_TRANSLATED: 'redirect:/echo.wsgi/abcd'
QUERY_STRING: ''
REDIRECT_URL: '/abcd'
REQUEST_URI: '/abcd'
SCRIPT_FILENAME: '/Users/grahamd/VirtualHost/echo.wsgi'
SCRIPT_NAME: ''
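The echo script itself isn't shown here. Something along the following lines would produce listings in the same shape as the two above; it is a sketch, and the helper name format_environ and the exact set of keys are my own choice based on what the listings display.

```python
KEYS = ('PATH_INFO', 'PATH_TRANSLATED', 'QUERY_STRING', 'REDIRECT_URL',
        'REQUEST_URI', 'SCRIPT_FILENAME', 'SCRIPT_NAME')


def format_environ(environ, keys=KEYS):
    """Format selected request variables, one 'NAME: value' per line."""
    # Variables the server did not set are skipped rather than guessed.
    return ''.join('%s: %r\n' % (key, environ[key])
                   for key in keys if key in environ)


def application(environ, start_response):
    body = format_environ(environ).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

To see the effect of the fixup, the same script can be wrapped so that SCRIPT_NAME is adjusted before format_environ() is called.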

If, instead of doing the rewrite trick in the root directory of the
site, you were doing it at a subdirectory level with an appropriate
set of rewrite rules, e.g.:

<Directory /Users/grahamd/VirtualHost/subdir>
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.*)$ /subdir/echo.wsgi/$1 [QSA,PT,L]
</Directory>

then you would need to be a bit more thorough, in particular:

import posixpath

def application(environ, start_response):
    # Wrapper to set SCRIPT_NAME to actual mount point.
    environ['SCRIPT_NAME'] = posixpath.dirname(environ['SCRIPT_NAME'])
    return _application(environ, start_response)

and get:

PATH_INFO: '/abcd'
PATH_TRANSLATED: '/Users/grahamd/VirtualHost/abcd'
QUERY_STRING: ''
REDIRECT_URL: '/subdir/abcd'
REQUEST_URI: '/subdir/abcd'
SCRIPT_FILENAME: '/Users/grahamd/VirtualHost/subdir/echo.wsgi'
SCRIPT_NAME: '/subdir'

If you want to be tolerant of both cases, use:

import posixpath

def application(environ, start_response):
    # Wrapper to set SCRIPT_NAME to actual mount point.
    environ['SCRIPT_NAME'] = posixpath.dirname(environ['SCRIPT_NAME'])
    # dirname('/site.wsgi') is '/', but the root mount point must be ''.
    if environ['SCRIPT_NAME'] == '/':
        environ['SCRIPT_NAME'] = ''
    return _application(environ, start_response)
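For trying both cases outside Apache, the same logic can be packaged as a small piece of WSGI middleware and fed hand-built environ dictionaries. This is a sketch; the names hide_script_file and show_script_name are invented for the example.

```python
import posixpath


def hide_script_file(app):
    """Wrap a WSGI app so SCRIPT_NAME is the directory of the script file."""
    def wrapper(environ, start_response):
        mount = posixpath.dirname(environ.get('SCRIPT_NAME', ''))
        # dirname('/site.wsgi') is '/', but the root mount point is
        # written as the empty string in WSGI.
        environ['SCRIPT_NAME'] = '' if mount == '/' else mount
        return app(environ, start_response)
    return wrapper


def show_script_name(environ, start_response):
    # Stand-in application that just reports the SCRIPT_NAME it was given.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ['SCRIPT_NAME'].encode('utf-8')]


application = hide_script_file(show_script_name)
```

Called with SCRIPT_NAME set to '/site.wsgi' the wrapped application sees an empty SCRIPT_NAME, and with '/subdir/echo.wsgi' it sees '/subdir'.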

Now, because the rewrite rule uses:

RewriteCond %{REQUEST_FILENAME} !-f

if a URL matches a subdirectory, it will actually be pushed through to
the WSGI application. If Apache was configured to allow directory
listings, or DirectoryIndex, for subdirectories in some way, then those
wouldn't work. One would need a slightly more complicated rewrite to
allow that as well.

Anyway, remember we are trying to make a script usable as if it were
mounted at the root. Your:

environ['SCRIPT_NAME'] = environ['SCRIPT_NAME'].rstrip('.wsgi')

isn't doing that, as it still leaves the basename of the script file in
the variable, and thus doesn't buy us anything and may actually just
cause problems.
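A side note on that one-liner, separate from the point about the basename: str.rstrip() treats its argument as a set of characters rather than as a suffix, so it can remove more than intended. The sketch below shows the difference; strip_suffix is a hypothetical helper, and even with the suffix removed correctly the script's basename would still be left in SCRIPT_NAME.

```python
def strip_suffix(value, suffix='.wsgi'):
    """Remove `suffix` from the end of `value` only if it is really there."""
    if suffix and value.endswith(suffix):
        return value[:-len(suffix)]
    return value


# rstrip() strips any trailing run of the characters '.', 'w', 's', 'g', 'i'.
print('/wiki.wsgi'.rstrip('.wsgi'))   # /wik
print(strip_suffix('/wiki.wsgi'))     # /wiki
```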

Think about it all in terms of how the following is going to behave:

try:
    from urllib import quote          # Python 2
except ImportError:
    from urllib.parse import quote    # Python 3

url = environ['wsgi.url_scheme']+'://'

if environ.get('HTTP_HOST'):
    # HTTP_HOST already includes any non-default port.
    url += environ['HTTP_HOST']
else:
    url += environ['SERVER_NAME']

    if environ['wsgi.url_scheme'] == 'https':
        if environ['SERVER_PORT'] != '443':
            url += ':' + environ['SERVER_PORT']
    else:
        if environ['SERVER_PORT'] != '80':
            url += ':' + environ['SERVER_PORT']

url += quote(environ.get('SCRIPT_NAME',''))
url += quote(environ.get('PATH_INFO',''))
if environ.get('QUERY_STRING'):
    url += '?' + environ['QUERY_STRING']

This should end up producing a URL the same as the client provided.
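To make the effect of the fixup concrete, the same recipe can be wrapped in a function and run against two hand-built environ dictionaries, one with and one without the script file in SCRIPT_NAME. This is a sketch: the function name reconstruct_url and the sample values are mine, and the port check is nested under the SERVER_NAME branch on the assumption that HTTP_HOST already carries any non-default port.

```python
try:
    from urllib import quote          # Python 2
except ImportError:
    from urllib.parse import quote    # Python 3


def reconstruct_url(environ):
    """Rebuild the request URL from the WSGI environ."""
    url = environ['wsgi.url_scheme'] + '://'
    if environ.get('HTTP_HOST'):
        url += environ['HTTP_HOST']
    else:
        url += environ['SERVER_NAME']
        default_port = '443' if environ['wsgi.url_scheme'] == 'https' else '80'
        if environ['SERVER_PORT'] != default_port:
            url += ':' + environ['SERVER_PORT']
    url += quote(environ.get('SCRIPT_NAME', ''))
    url += quote(environ.get('PATH_INFO', ''))
    if environ.get('QUERY_STRING'):
        url += '?' + environ['QUERY_STRING']
    return url


base = {'wsgi.url_scheme': 'http', 'HTTP_HOST': 'www.site.com',
        'SERVER_NAME': 'www.site.com', 'SERVER_PORT': '80',
        'PATH_INFO': '/abcd', 'QUERY_STRING': ''}

# Without the fixup the script file leaks into the URL.
print(reconstruct_url(dict(base, SCRIPT_NAME='/site.wsgi')))  # http://www.site.com/site.wsgi/abcd
# With SCRIPT_NAME cleared the URL matches what the client requested.
print(reconstruct_url(dict(base, SCRIPT_NAME='')))            # http://www.site.com/abcd
```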

Now, it could well be that I am missing something here in what you are
saying, so maybe test it out yourself and come back with some actual
examples of how you think it isn't working properly and what I am
overlooking.

Graham

Damjan

May 6, 2008, 9:06:23 AM
to modwsgi
> Setting SCRIPT_NAME to be empty should be sufficient for this
> particular case where you are wanting it to appear to be mounting at
...
> Now, it could well be that I am missing something here in what you are
> saying, so maybe test it out yourself and come back with some actual
> examples of how you think it isn't working properly and what I am
> overlooking.

You are right. In the case of using RewriteRule, so that the WSGI app
is actually mapped to the root of the URL space, just clearing
SCRIPT_NAME will work.

(I think) I was referring to the case where you use app.wsgi and mask
the .wsgi extension with MultiViews (was it?) so that the app lives in
its own /app[.wsgi]/ space.
