One Django instance, hundreds of websites

Jari Pennanen

Jan 25, 2011, 9:56:42 AM
to Django developers
Hi!

I'm on a monumental task here, I've decided to get one Django instance
running hundreds of websites.

I've run into a couple of shortcomings with Django:

1. A global SITE_ID does not work, since different requests belong to
different sites. I've created middleware that adds request.site_id
based on the hostname of the request.
2. A user may be on multiple sites, but still should not be able to
log in to other sites! (Here is a slight problem, since the authentication
backend assumes that a username and password are the only things required.
That is not the case here: I need to take the site id from
request.site_id and check that the user is listed as a user of that site.)
3. MEDIA_URL could be set on a per-site basis. (STATIC_URL could be global.)

This one is not a problem because Django supports it already, but
it's good to know for anyone who wants to try this out:

4. The website manager must be able to create dynamic urlconfs (I'm now
implementing database-based urlconf loading). This is rather simple,
since I can inject the urlconf into request.urlconf (Django supports
this already).
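
For illustration, points 1 and 4 could be sketched roughly like this. It is a minimal sketch under stated assumptions, not the actual middleware from this post: SITES_BY_HOST, DEFAULT_SITE_ID, FakeRequest and the urlconf module names are all made up, and a real version would look the host up in the database (with caching) rather than in a dict.

```python
# Hypothetical host -> site mapping; in practice this would come from the database.
SITES_BY_HOST = {
    "example.com": {"site_id": 1, "urlconf": "sites.example_urls"},
    "example.org": {"site_id": 2, "urlconf": "sites.other_urls"},
}
DEFAULT_SITE_ID = 1  # stands in for settings.SITE_ID


class FakeRequest(object):
    """Minimal stand-in for django.http.HttpRequest (illustration only)."""

    def __init__(self, host):
        self.META = {"HTTP_HOST": host}


def normalize_host(raw_host):
    """Lower-case the Host header and drop any ':port' suffix."""
    return raw_host.lower().split(":", 1)[0]


class SiteFromHostMiddleware(object):
    """Annotate the request; global settings are never modified."""

    def process_request(self, request):
        host = normalize_host(request.META.get("HTTP_HOST", ""))
        site = SITES_BY_HOST.get(host)
        if site is None:
            # Unknown host: fall back to the default site, keep the default urlconf.
            request.site_id = DEFAULT_SITE_ID
            return None
        request.site_id = site["site_id"]
        request.urlconf = site["urlconf"]  # Django resolves URLs against this
        return None
```

Because everything is stored on the request object, two concurrent requests for different sites cannot disturb each other.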


Is anyone else doing a similar task? I currently have the FlatPage app
running perfectly with this; the only thing I needed to change in the
flatpage view was the following:
    ...
    # Get request's site id, fallback to settings.SITE_ID
    site_id = getattr(request, 'site_id', settings.SITE_ID)
    f = get_object_or_404(FlatPage, url__exact=url,
                          sites__id__exact=site_id)
    ...

I think this SITE_ID per request is a great idea, worth considering for
Django itself.

Russell Keith-Magee

Jan 25, 2011, 10:15:34 AM
to django-d...@googlegroups.com

This has been a topic of discussion quite recently -- see
http://code.djangoproject.com/ticket/15089 for details. Suffice to say
that this sort of problem is something we're aware of, and there are
people who have indicated an interest in looking at this in the 1.4
timeframe. If you're interested in this problem space, I encourage you
to get involved.

For the next couple of weeks the core developers will be a little busy
finalizing the 1.4 release; but if you take that time to put together a
solid proposal (and maybe some sample code), you will be in a good
position to make a big contribution when 1.4 development starts.

Yours,
Russ Magee %-)

Łukasz Rekucki

Jan 25, 2011, 10:25:21 AM
to django-d...@googlegroups.com
On 25 January 2011 16:15, Russell Keith-Magee <rus...@keith-magee.com> wrote:
>
> For the next couple of weeks the core developers will be a little busy
> finalizing the 1.4 release;

That is obviously supposed to be 1.3 ;)

--
Łukasz Rekucki

Russell Keith-Magee

Jan 25, 2011, 10:27:31 AM
to django-d...@googlegroups.com
2011/1/25 Łukasz Rekucki <lrek...@gmail.com>:

Erm... ahh... No... I'm writing from the FUTURE!!! The flying cars are
awesome :-)

Russ %-)

Graham Dumpleton

Jan 25, 2011, 4:45:24 PM
to django-d...@googlegroups.com
Since the 'futures' module is only a part of Python 3.2, that must mean you are using that and thus have ported Django to Python 3 already. When can we expect that then. ;-)


Graham

James Bennett

Jan 25, 2011, 5:35:07 PM
to django-d...@googlegroups.com
On Tue, Jan 25, 2011 at 3:45 PM, Graham Dumpleton
<graham.d...@gmail.com> wrote:
> Since the 'futures' module is only a part of Python 3.2, that must mean you
> are using that and thus have ported Django to Python 3 already. When can we
> expect that then. ;-)

So, here's what happened.

You all may remember that some time ago, as it was being brought
online for the first time, the Large Hadron Collider experienced...
difficulties. The exact nature of those difficulties was classified,
and a false story promulgated to cover it up. The LHC was offline for
over a year afterwards. A full explanation has been sent to Wikileaks,
but due to international pressure they have not yet released it. The
story can be told here for the first time.

As some of you may know, Russ is, in fact, *Dr.* Keith-Magee, holding
a Ph.D. in Artificial Intelligence. As the LHC team was moving to
bring the collider online, he used his -- in his words -- "mad
scientist" skills to access the LHC's command and control systems,
planting a moderately intelligent (able to pass the Turing Test with a
13-year-old) agent which subverted the collider to his nefarious
purposes, specifically the production of tachyons from extremely
high-speed collisions.

The test fire which caused extensive damage to the LHC also produced a
massive burst of tachyons, which -- as particles which travel
backwards through time, per Einstein's theories -- carried information
from the future. By careful observation of these tachyons, Russ was
able to glean information about the future of Django (transmitted
using an elegant encoding scheme based on quantum-mechanical
properties of the tachyons -- by observing one set of properties, he
could determine the ID of a fixed ticket, and by observing a different set
of properties he could determine the date on which it was fixed).

This information was then forwarded to the private Django committers'
list, and Russ himself has gone into hiding until the future date on
which he will encode the information to be transmitted back to the
past.

As a result, we don't currently have Django running on Python 3, but
we do know the details of when and how it will happen. A full write-up
of how we accomplished the porting, and a timeline of when we did and
will port, will appear in the near future on the Django Advent site
for 1.3. The primary holdup at this time is an investigation of
whether our foreknowledge of the future timeline (as evidenced by a
mysterious application which appears in far-future tickets, named
"django.contrib.lightcones") will in any way alter it.

Anyway. Now you know, and knowing is half the battle.

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

Graham Dumpleton

Jan 25, 2011, 5:57:52 PM
to django-d...@googlegroups.com


On Wednesday, January 26, 2011 2:27:31 AM UTC+11, Russell Keith-Magee wrote:
BTW, so that was your flying car in Perth which was spotted some time back.


Graham 

Rohan Jain

Jan 25, 2011, 11:03:23 PM
to django-d...@googlegroups.com
I am also trying to achieve something very similar to this, but I'm in a dilemma about how to proceed. I have written a post about it: http://www.rohanjain.in/blog/hosting-multiple-sites-with-same-django-project/. Is there any existing big project following a similar concept?

Derega

Jan 26, 2011, 2:21:06 AM
to Django developers
On Jan 25, 4:56 pm, Jari Pennanen <jari.penna...@gmail.com> wrote:
> I'm on a monumental task here, I've decided to get one Django instance
> running hundreds of websites.

I'm also on similar path here. I'm trying to build a system where
django instances running on several servers are serving hundreds of
websites. Each website has different urlconf, different templates,
different users and in some cases even different apps and features
enabled. But the codebase and settings are identical on all
instances.

I'm still in the process of defining all requirements of the system,
but when I get those sorted out I'll start looking for solutions. This
year will be interesting, and I'm eagerly waiting to see what happens in
Django; if I possibly can, I'll try to help out and participate.

--Ilkka

Xavier Ordoquy

Jan 26, 2011, 6:55:58 AM
to django-d...@googlegroups.com
Hi there,

I also started to have a look at something similar (i.e., with settings for each site).
The proposal made in the ticket wouldn't fit what you are trying to do.
The current idea is to have the site id depend on the request, but it assumes there is only one common settings file.

On the other hand, having one settings file per site where the only difference is the site id seems like a bit too much overhead.

Regards,
Xavier.

David Danier

Jan 26, 2011, 9:51:38 AM
to django-d...@googlegroups.com
> On the other hand, having one setting file per site where the only difference is the site id seems a bit too much overhead.

Why not use something like this:
from global_settings import *
SITE_ID = 235
#This also allows further changes like:
#INSTALLED_APPS = INSTALLED_APPS + (
# 'fooapp',
#)

I'm using something similar here, but the number of sites I need to run
with the same codebase is rather limited (under 10). Anyway, it works
well.

In addition, every site has its own Python module here, so changing the
settings is just a small step... it's even possible to have different
urlconfs or templates or ...

David


Xavier Ordoquy

Jan 26, 2011, 10:20:39 AM
to django-d...@googlegroups.com
Given the title, I would feel bad for the sysadmin if there were hundreds of settings files with just the site id in them ;)

As for the urlconf, it's already possible. django.core.urlresolvers has set_urlconf/get_urlconf, and the urlconf is set for the current thread by the BaseHandler.

Don't get me wrong, I started some work in order to have each site with its own settings.
However, this seems to differ from what was done in ticket 15089.

The only thing that troubles me is that one shouldn't be able to set INSTALLED_APPS per site, because I can't see how syncdb would cope with new apps.

Xavier.

Lorenzo Gil Sanchez

Jan 26, 2011, 10:10:06 AM
to django-d...@googlegroups.com
2011/1/25 Jari Pennanen <jari.p...@gmail.com>:

> Hi!
>
> I'm on a monumental task here, I've decided to get one Django instance
> running hundreds of websites.
>
> I've run in to couple of shortcomings with django:
>
> 1. Global SITE_ID does not work since some requests belong to
> different site. I've created middleware that adds request.site_id
> based on hostname of request.
> 2. User may be on multiple sites, but still should not be able to
> login to other sites! (Here is a slight problem since authentication
> backend assumes that username and password is only thing required.
> This is not the case, I need to verify the site id from
> request.site_id and check that user is listed in that site as user.)
> 3. MEDIA_URL could be per site basis. (STATIC_URL could be global)
>
> This one is not a problem because django supports it already, but
> anyone wondering to try out it's good to know:
>
> 4. Website manager must be able to create dynamical urlconfs (I'm now
> implementing database based urlconf loading), this is rather simple
> since I can inject the urlconf to request.urlconf (Django supports
> this already).
>
>
> Is there anyone else doing similar task?

I suggest having a look at www.reviewboard.org, since they solved
this problem a long time ago.


> I have currently FlatPage app
> running perfectly with this, only thing I needed to change from
> flatpage view was following:
>    ...
>    # Get request's site id, fallback to settings.SITE_ID
>    site_id = getattr(request, 'site_id', settings.SITE_ID)
>    f = get_object_or_404(FlatPage, url__exact=url,
> sites__id__exact=site_id)
>    ...
>
> I think this SITE_ID per request is a great idea, worth thinking for
> Django itself.
>

Jari Pennanen

Jan 26, 2011, 11:06:59 AM
to Django developers
In the project I plan to use this for, all sites could
share the same INSTALLED_APPS, but it would be truly awesome if full
settings were possible for each site.

On Jan 26, 5:10 pm, Lorenzo Gil Sanchez
<lorenzo.gil.sanc...@gmail.com> wrote:
> I suggest to have a look to www.reviewboard.org since they have solved
> this problem a long time ago.

Hey thanks!

Here is the relevant code (in order of discovery):
https://github.com/reviewboard/reviewboard/blob/master/reviewboard/settings.py
https://github.com/reviewboard/reviewboard/blob/master/reviewboard/admin/middleware.py
https://github.com/reviewboard/reviewboard/blob/master/reviewboard/admin/siteconfig.py
(see load_site_config)
https://github.com/djblets/djblets/blob/master/djblets/siteconfig/middleware.py
(some dynamic settings stuff)

Although I would love to see a per-request settings object, this is
not the way to go. I mean (judging quickly), it seems like their
implementation is not thread-safe: on each request it patches the
settings object with its own values. If one request is not fully served
before the next request comes in, the settings object might get mangled too
early.

On the other hand, if all code were to use get_settings(request=None)
instead of django.conf.settings, that could work. If we are not trying
to achieve thread safety, then one could probably start enough
worker processes so that each Django instance serves a single request at a
time.
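
To make the get_settings(request=None) idea concrete, here is a rough sketch. It is hypothetical throughout: Django has no such function, and GLOBAL_SETTINGS, SITE_OVERRIDES and SettingsView are names invented for this example. The point is only that each caller gets its own read-only view, so nothing shared is ever patched.

```python
# Stand-in for the values normally found in django.conf.settings.
GLOBAL_SETTINGS = {"SITE_ID": 1, "MEDIA_URL": "/media/", "DEBUG": False}

# Hypothetical per-site overrides keyed by site id.
SITE_OVERRIDES = {
    2: {"SITE_ID": 2, "MEDIA_URL": "/media/site2/"},
}


class SettingsView(object):
    """Read-only view: per-site overrides first, then the global defaults."""

    def __init__(self, overrides, defaults):
        self._overrides = overrides
        self._defaults = defaults

    def __getattr__(self, name):
        # Only called for names not found on the instance itself.
        if name in self._overrides:
            return self._overrides[name]
        try:
            return self._defaults[name]
        except KeyError:
            raise AttributeError(name)


def get_settings(request=None):
    """Return settings for the request's site, or the globals if there is none."""
    site_id = getattr(request, "site_id", None)
    return SettingsView(SITE_OVERRIDES.get(site_id, {}), GLOBAL_SETTINGS)
```

The catch, as said above, is that every app would have to call get_settings(request) instead of importing django.conf.settings.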

Maybe they don't use multithreading in reviewboard?

Jari Pennanen

Jan 26, 2011, 11:24:42 AM
to Django developers
On Jan 26, 5:20 pm, Xavier Ordoquy <xordo...@linovia.com> wrote:
> Given the title, I would feel bad for the sysadmin if there was hundreds of setting files with just the site id within ;)

Ha, a single file per site :) I would switch to that any day.

The current system in place is the following: ~60 (and growing) Joomla
websites, with symbolic links littered all over so one can "control all
sites" from one place. As icing on that cake, each site has its own database
tables; that's a freaking nightmare.

Over the next two years, I expect the number of sites to rise to about 500.
I have to blow the whistle long before that if I can't come up with a
better solution than this Joomla hack.

FeatherDark

Jan 26, 2011, 11:56:08 AM
to Django developers
Greetings huge django developer list,
I just wanted to mention, this method totally works for me, I call it
"Skinning"

In the templates folder I have a file called "base.html".
Inside that file is only 1 line:
{% extends request.META.HTTP_HOST|cut:':'|add:'.html'%}

The rest of that same folder contains a bunch of files:
www.mydomainname.com.html
mydomainname.com.html
myseconddomain.com.html
www.myseconddomain.com.html
www.yetanotherdomain.com.html


Each of those is basically a WYSIWYG-compatible HTML file; they can
obviously contain Django template code. Each one acts as "base.html":
any time someone hits the website, base.html looks up the host name
and serves the appropriate base template for all the other templates
to extend.
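
For readers who don't have the filter semantics in their head, the template name that the one-liner computes can be mirrored in plain Python. This is only an illustration (base_template_for is a made-up helper, not part of the setup described here): the cut filter removes every occurrence of its argument, and add concatenates when the values are not integers.

```python
def base_template_for(host):
    """Mirror of: request.META.HTTP_HOST|cut:':'|add:'.html'"""
    # cut:':' strips all colons; add:'.html' appends the extension.
    return host.replace(":", "") + ".html"


print(base_template_for("www.mydomainname.com"))  # www.mydomainname.com.html
print(base_template_for("localhost:8000"))        # localhost8000.html
```

Note that only the colon is removed: a Host header with a port keeps the port digits in the template name.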

I know several people have introduced other, 'more elegant' solutions.
This is my 1 line, 'yes it works' solution.
I hope this is helpful to at least one person, it definitely is going
in my book, my client adores the solution. If you have something
better, or even more elegant please do let me know. Also if you think
this solution is 'incorrect' for any reason I would really appreciate
the insight. It would be spectacular if this kind of 'feature' were a
clearer 'default' option.

[M]

p.s. i tried to email this in and it bounced, so if u get this twice,
my deepest apologies.

Jari Pennanen

Jan 26, 2011, 1:18:01 PM
to Django developers
On Jan 26, 6:56 pm, FeatherDark <msensei...@gmail.com> wrote:
> Greetings huge django developer list,
> I just wanted to mention, this method totally works for me, I call it
> "Skinning"
>
> In the templates folder I have a file called "base.html'
> Inside that file is only 1 line:
> {% extends request.META.HTTP_HOST|cut:':'|add:'.html'%}

request.META.HTTP_HOST comes from the client. "Trust but verify": you
are not verifying this, so it could pose a security risk. One could send
a request with a malicious Host header and make the site retrieve a
different template. This is not a serious issue, since you probably
don't have templates that would wreak havoc.

Why don't you create your own template context processor that adds the
verified HTTP_HOST to the template context? Then you could just do

{% extends MY_VERIFIED_HTTP_HOST %}

See:
http://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpRequest.META
http://docs.djangoproject.com/en/dev/ref/templates/api/#writing-your-own-context-processors
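
A context processor is just a callable that takes the request and returns a dict, so a verified variant could look roughly like the sketch below. The names KNOWN_HOST_TEMPLATES, FALLBACK_TEMPLATE and FakeRequest are assumptions for the example; a real project would list its own hosts and register the function in its context processor settings.

```python
# Hypothetical whitelist: only these hosts may select a base template.
KNOWN_HOST_TEMPLATES = {
    "mydomainname.com": "mydomainname.com.html",
    "www.mydomainname.com": "www.mydomainname.com.html",
    "myseconddomain.com": "myseconddomain.com.html",
}
FALLBACK_TEMPLATE = "default.html"


class FakeRequest(object):
    """Minimal stand-in for django.http.HttpRequest (illustration only)."""

    def __init__(self, host):
        self.META = {"HTTP_HOST": host}


def verified_host(request):
    """Context processor: expose a template name chosen from the whitelist."""
    raw = request.META.get("HTTP_HOST", "")
    host = raw.lower().split(":", 1)[0]  # ignore case and any ':port'
    template = KNOWN_HOST_TEMPLATES.get(host, FALLBACK_TEMPLATE)
    return {"MY_VERIFIED_HTTP_HOST": template}
```

An unknown or forged Host header then simply gets the fallback template instead of steering the template lookup.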

Tom Evans

Jan 27, 2011, 10:09:06 AM
to django-d...@googlegroups.com

request.META['HTTP_HOST'] is also the primary mechanism for
determining which website to serve when doing virtual hosting, i.e. if
you use Apache and your site is hosted in a structure like:

NameVirtualHost *:80
<VirtualHost *:80>
    ServerName www.foo.com
    ServerAlias *.foo.com *.bar.com *.quuz.com
    ....
</VirtualHost>

then that variable is already being verified.

Cheers

Tom

Jari Pennanen

Jan 27, 2011, 10:16:20 AM
to Django developers
I think I've found the necessary tools for making the Django login work
on a per-site basis:

1. Create your own login view that calls the auth backend with
authenticate(site_id, username, password)

2. Create your own auth backend that takes site_id, username and password
(and also checks permissions by site)

3. *) Create your own auth middleware that creates request.user
(Django's own authentication middleware and backend have a
shortcoming *)

4. Create models for per-site permissions (UserSite and GroupSite):
class UserSite(models.Model): user, site, is_superuser, is_active,
permissions

---------

* The shortcoming in Django's authentication middleware is that it relies on
this function, django.contrib.auth.get_user:
    def get_user(request):
        ...
        user = backend.get_user(user_id) or AnonymousUser()
        ...

If this were something like this instead:

    user = backend.get_user(user_id, request=request) or AnonymousUser()

then the backend could verify that the user_id is authenticated for this
request's site.

Jari Pennanen

Jan 27, 2011, 2:30:18 PM
to Django developers
Scratch my message above; here is the new, revised and working summary
for per-site login:

1. A user_logged_in signal callback that sets
request.session[SITE_ID_SESSION_KEY] = request.site_id
2. An AuthenticationForm whose clean() calls authenticate(site_id,
username, password)
3. MultiSitedAuthenticationMiddleware, which adds a request.user that
understands request.session[SITE_ID_SESSION_KEY] and
authenticate(site_id, username, password)
4. An auth backend, MultiSitedBackend, that understands
authenticate(site_id, username, password) and performs the UserSite permission
check.
5. Models for per-site permissions (UserSite and GroupSite):
class UserSite(models.Model): user, site, is_superuser, is_active,
permissions
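
The core of steps 1, 3 and 4 can be sketched without Django at all. Everything below is a simplified stand-in, not the code from this post: USERS and USER_SITES replace the User and UserSite tables, the session is a plain dict, the value of SITE_ID_SESSION_KEY is invented, and real code would verify the password with the user's check_password() rather than comparing strings.

```python
SITE_ID_SESSION_KEY = "_auth_site_id"  # assumed key name

# In-memory stand-ins for the User and UserSite tables.
USERS = {"alice": {"id": 1, "password": "s3cret"}}
USER_SITES = {
    (1, 10): {"is_active": True},   # alice is an active user of site 10
    (1, 20): {"is_active": False},  # ...and a deactivated user of site 20
}


class MultiSitedBackend(object):
    def authenticate(self, site_id, username, password):
        """Return the user only if the credentials AND the site membership match."""
        user = USERS.get(username)
        if user is None or user["password"] != password:
            return None
        membership = USER_SITES.get((user["id"], site_id))
        if membership is None or not membership["is_active"]:
            return None
        return user


def on_user_logged_in(session, site_id):
    """What the user_logged_in callback would do: remember the login site."""
    session[SITE_ID_SESSION_KEY] = site_id


def user_for_request(session, request_site_id, user):
    """What the middleware would do: ignore sessions created on another site."""
    if session.get(SITE_ID_SESSION_KEY) != request_site_id:
        return None
    return user
```

With this, a session created by logging in on site 10 yields no user when the same cookie is presented to site 20.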

This is pretty pluggable, with no patches to Django (yet) except one
concerning testing:
http://code.djangoproject.com/ticket/15179

Graham Dumpleton

Jan 27, 2011, 3:16:18 PM
to django-d...@googlegroups.com
Yes and no.

Apache uses it to resolve name-based virtual hosts, but, from memory, if it can't match it against a specific virtual host it routes the request to the first VirtualHost found in the Apache configuration for that port.

I have many times seen broken VirtualHost configurations which shouldn't work, but seem to, because the user only had one VirtualHost definition and so Apache was routing the request to it anyway.

If you were going to be rigorous, you would add a dummy VirtualHost first in the Apache configuration with 'Deny from all' in it, so that any attempt to access an unknown host would fall back to it and be forbidden.

Graham

Jjdelc

Jan 28, 2011, 12:54:54 AM
to Django developers
If all you need to change is the SITE_ID on the settings file, using
different files for each is not only a mess to handle, but also means
that you'll spend extra RAM for each instance running.

I solve this by using a middleware that changes the SITE_ID based on
the request's hostname:

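from urlparse import urlparse  # Python 2 module name

from django.conf import settings
from django.contrib.sites.models import Site
from django.core.cache import cache
from django.http import Http404

SITES_DICT = 'cached-sites-dict'

class MultiHostMiddleware(object):
    def process_request(self, request):
        # Load the domain -> site mapping from the cache, or rebuild it.
        if cache.has_key(SITES_DICT):
            sites = cache.get(SITES_DICT)
        else:
            sites = {}
            for site in Site.objects.all():
                sites[site.domain.lower()] = {
                    'id': site.id,
                }
            cache.set(SITES_DICT, sites)

        try:
            # Match the request's host name against the known domains.
            host = request.META["HTTP_HOST"].lower().replace('www.', '')
            domain = urlparse(host).path
            # Point the global SITE_ID at the matching site.
            settings.SITE_ID = sites[domain]['id']
        except KeyError:
            raise Http404()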

This way I only have one instance running 'hundreds' of websites.
With this approach you can create a OneToOne model SiteOptions to
store extra settings, like TEMPLATE_DIRS, STATIC_ROOT, or other site's
options like API keys and such. I have an app that has the fields for
the site I'm doing but it works fine.

If you need different urlconfs, you could also do it in the middleware
(since urls are resolved against request.urlconf which you can set
there), but I think that at that point you're talking about another
website so I'd use a different settings file for it.



James Hancock

Jan 28, 2011, 2:59:43 AM
to django-d...@googlegroups.com
I have one question about changing the site ID per request.
I assume that settings is imported from django.conf, so in the end this simply changes the same SITE_ID to fit whichever request Django is currently handling.

Does this ever become a problem? I am setting up around 250 sites, for example. Could SITE_ID end up in conflict because it is being changed in two places at the same time?

It is probably a dumb question, but I was wondering.

Cheers,
James Hancock

Graham Dumpleton

Jan 28, 2011, 3:38:18 AM
to django-d...@googlegroups.com


On Friday, January 28, 2011 6:59:43 PM UTC+11, James Hancock wrote:
> I have one question about changing the site ID per request.
> I assume that settings is imported from conf, and so in the end it is simply changing the same SITE_ID to fit the current request Django is handling.
>
> Does this ever become a problem? I am setting up around 250 sites for example. If the site_id had a conflict because it is trying to be changed in two places at the same time.

On-the-fly changes to settings like this on a per-request basis are not likely to work in a multithreaded hosting configuration. Thus, you are restricted to one thread per process and would need to use multiple processes to handle concurrent requests.

Graham
 

Jari Pennanen

Jan 28, 2011, 5:05:56 AM
to Django developers
Graham is correct.

I pointed this out in my reply about the reviewboard method, which works just
like what Jjdelc proposed, and that is the incorrect way.

So, simply put: changing the settings object in middleware is WRONG. If
one changes the settings object before or after a request, it will be in
an intermediate state whenever there is a concurrent request, which wreaks
havoc under multithreading.
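
The failure mode is easy to reproduce outside Django. The sketch below uses a plain shared object in place of django.conf.settings and two threading.Event objects to force the unlucky interleaving deterministically; the class and function names are made up for the demonstration.

```python
import threading


class Settings(object):
    SITE_ID = None


def run_demo():
    settings = Settings()  # one object shared by all threads, like django.conf.settings
    a_has_set = threading.Event()
    b_has_set = threading.Event()
    seen = {}

    def request_a():
        settings.SITE_ID = 1          # "middleware" for a request to site 1
        a_has_set.set()
        b_has_set.wait()              # meanwhile request B passes its middleware
        seen["a"] = settings.SITE_ID  # the "view" for site 1 reads B's value

    def request_b():
        a_has_set.wait()
        settings.SITE_ID = 2          # "middleware" for a request to site 2
        seen["b"] = settings.SITE_ID
        b_has_set.set()

    threads = [threading.Thread(target=request_a),
               threading.Thread(target=request_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


print(run_demo())  # request A ends up seeing SITE_ID 2
```

In production the interleaving is not forced, of course; it just happens occasionally under load, which makes it much harder to track down.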

There is also an additional problem with that: there are apps that read
settings in their __init__.py like this:
SETTING = getattr(settings, 'MYAPP_SETTING', 'sensible default')
These would not work afterwards either.
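
A tiny illustration of why: the module-level line runs once, at import time, so it keeps whatever value was current then. The Settings class here is a plain stand-in for django.conf.settings, and current_setting() is a hypothetical alternative that reads the value at call time.

```python
class Settings(object):
    MYAPP_SETTING = "value for site 1"


settings = Settings()

# What the app's __init__.py does: evaluated once, when the module is imported.
SETTING = getattr(settings, "MYAPP_SETTING", "sensible default")

# A later per-request change is invisible to that snapshot.
settings.MYAPP_SETTING = "value for site 2"


def current_setting():
    """Reading at call time sees the up-to-date value."""
    return getattr(settings, "MYAPP_SETTING", "sensible default")


print(SETTING)            # value for site 1
print(current_setting())  # value for site 2
```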

Though I'm not too worried about apps, since I need to write most of
the apps myself anyway: there are very few apps that support the
sites framework the right way.

Either way, my current method does not involve hundreds of files, just
database entries in Site, which are cached:
https://gist.github.com/795135
https://gist.github.com/795138

Waldemar Kornewald

Jan 29, 2011, 3:21:42 AM
to django-d...@googlegroups.com
Hi,
it's possible to manipulate the settings object in a thread-safe way. Here's our dynamic site middleware:
https://bitbucket.org/wkornewald/djangotoolbox/src/535feb981c50/djangotoolbox/sites/dynamicsite.py
https://bitbucket.org/wkornewald/djangotoolbox/src/535feb981c50/djangotoolbox/utils.py

As you can see, it makes SITE_ID a thread-local property which has a different value for every thread.
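
For readers who don't want to follow the links, the general idea of a thread-local property can be sketched as below. This is not the djangotoolbox implementation, just a minimal descriptor built on threading.local under the same name; the Settings class and demo() are made up for the example.

```python
import threading


def make_tls_property(default=None):
    """Return a descriptor whose value is stored separately for each thread."""

    class TLSProperty(object):
        def __init__(self):
            self._local = threading.local()

        def __get__(self, instance, owner=None):
            return getattr(self._local, "value", default)

        def __set__(self, instance, value):
            self._local.value = value

    return TLSProperty()


class Settings(object):
    SITE_ID = make_tls_property(default=1)


def demo():
    settings = Settings()
    settings.SITE_ID = 5                   # set in the main thread
    seen = {}

    def worker():
        seen["before"] = settings.SITE_ID  # the other thread still sees the default
        settings.SITE_ID = 7
        seen["after"] = settings.SITE_ID

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    seen["main"] = settings.SITE_ID        # untouched by the worker's assignment
    return seen


print(demo())
```

Each thread reads and writes its own slot, so a value set while handling one request is never visible to a request running in another thread.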

Hope this helps someone.

Bye,
Waldemar

--
http://www.allbuttonspressed.com/

Jari Pennanen

Jan 29, 2011, 7:55:55 AM
to Django developers
Certainly something new for me.

That does look rather cool. Essentially, if that works, one could
even save the request object to a thread "global" and it would be
accessible anywhere.

It would solve many problems, such as the shortcoming in Django's
authentication middleware, which does not pass the request object to the
auth backend's get_user(); that is the sole reason I had to write my *own*
authentication middleware for per-site login.

Another, unrelated thing I'm now wondering about is django.core.cache: is
it faster than my simple dict cache? That is, { 'example.com' : 5, ...}.
Should I change my caching to use django.core.cache? I'll have to
study this further.


Jari Pennanen

Jan 29, 2011, 8:35:08 AM
to Django developers
Sorry about the second post, but I'm so thrilled about this thread-local
approach! Thanks, Waldemar.

This changes everything, EVERYTHING.

I can just do:

settings.__class__.SITE_ID = make_tls_property()
settings.__class__.MEDIA_URL = make_tls_property()
settings.__class__.MEDIA_ROOT = make_tls_property()
# This does not have to be in the settings object, but for the sake of example:
settings.__class__.REQUEST = make_tls_property()

After which I create my own middleware that sets them for each request.
Then I can just access the values with settings.REQUEST,
settings.SITE_ID ...

The only thing I probably have to be aware of is that if this
make_tls_property is not done early enough, my apps must not rely on
this pattern:

SETTING = getattr(settings, 'MYAPP_SETTING', 'default')

especially if they use SITE_ID, MEDIA_URL or MEDIA_ROOT.


Jari Pennanen

Jan 29, 2011, 12:41:38 PM
to Django developers, wkorn...@gmail.com
Hi!

I suggest you look at this _patch_setattr I cooked up. I noticed
that it is also necessary to patch the __setattr__ of the settings
object in order to allow assignments to settings.SITE_ID again.

The following test would fail after installing a TLSProperty:

settings.SITE_ID = 42
assert settings.SITE_ID == 42

Such tests all failed after using make_tls_property; this is because of
the funny LazyObject Django uses.

So I came up with _patch_setattr, see gist:
https://gist.github.com/802020
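
Here is a stand-alone sketch of what I believe is going on, using simplified stand-ins rather than Django's real LazyObject or the code in the gist: LazyLike, PatchedLazyLike and PlainSettings are invented for the illustration. The proxy's __setattr__ forwards every assignment to the wrapped object, so the class-level property is never written, yet reads still hit the property first.

```python
import threading


class TLSProperty(object):
    """Data descriptor that keeps one value per thread."""

    def __init__(self, default=None):
        self._local = threading.local()
        self._default = default

    def __get__(self, instance, owner=None):
        return getattr(self._local, "value", self._default)

    def __set__(self, instance, value):
        self._local.value = value


class PlainSettings(object):
    SITE_ID = 1


class LazyLike(object):
    """Simplified proxy: forwards attribute access to a wrapped object."""

    def __init__(self, wrapped):
        self.__dict__["_wrapped"] = wrapped

    def __getattr__(self, name):
        return getattr(self._wrapped, name)

    def __setattr__(self, name, value):
        setattr(self._wrapped, name, value)  # never reaches class descriptors


class PatchedLazyLike(LazyLike):
    def __setattr__(self, name, value):
        # If the proxy class defines a data descriptor for this name, use it.
        for klass in type(self).__mro__:
            if hasattr(klass.__dict__.get(name), "__set__"):
                object.__setattr__(self, name, value)
                return
        setattr(self._wrapped, name, value)


LazyLike.SITE_ID = TLSProperty(default=1)
PatchedLazyLike.SITE_ID = TLSProperty(default=1)


def assign_and_read(proxy, value):
    proxy.SITE_ID = value
    return proxy.SITE_ID


print(assign_and_read(LazyLike(PlainSettings()), 42))         # 1: assignment was lost
print(assign_and_read(PatchedLazyLike(PlainSettings()), 42))  # 42
```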

Russell Keith-Magee

Jan 30, 2011, 12:20:45 AM
to django-d...@googlegroups.com
On Sat, Jan 29, 2011 at 8:55 PM, Jari Pennanen <jari.p...@gmail.com> wrote:
> Certainly something new for me.
>
> That does look like a rather cool. Essentially if that works one could
> save even the request object to thread "global" and it would be
> accessible anywhere.

... and this is one of the biggest reasons why Django doesn't
encourage the practice of using threadlocals.

If an engineer came to their supervisor with a problem and said "I'm
going to fix this problem with a global variable", they would be
soundly beaten by any supervisor worth their salt. Somehow, because
the name has been changed to "threadlocal", global variables have
suddenly become acceptable.

Every single problem associated with using global variables exists
with threadlocals -- and then a few more. They *can* be used
successfully. However, in almost every case, they can also be avoided
entirely with a good dose of rigorous engineering.

I *strongly* advise against the use of this technique.

Yours,
Russ Magee %-)

Jari Pennanen

Jan 30, 2011, 12:10:30 PM
to Django developers


On Jan 30, 7:20 am, Russell Keith-Magee <russ...@keith-magee.com>
wrote:
> On Sat, Jan 29, 2011 at 8:55 PM, Jari Pennanen <jari.penna...@gmail.com> wrote:
> If an engineer came to their supervisor with a problem and said "I'm
> going to fix this problem with a global variable", they would be
> soundly beaten by any supervisor worth their salt. Somehow, because
> the name has been changed to "threadlocal", global variables have
> suddenly become acceptable.
>
> Every single problem associated with using global variables exists
> with threadlocals -- and then a few more. They *can* be used
> successfully. However, in almost every case, they can also be avoided
> entirely with a good dose of rigorous engineering.

Without the globals:

What are the chances of getting into Django the load of patches that would
be required to implement the following:

get_request_site_id(request)
get_request_media_root(request)
get_request_media_url(request)

Even these would require big patches, and that's not going to happen;
it literally takes *years* to get such interfaces into Django and make
all the apps work with them, even though they are
simple changes.

On top of that there would have to be at least a second auth backend method
(if the backend is not altered completely, see below):

get_user_request(user_id, request=None)

and the backend get_user() call in django.contrib.auth.get_user() would be
replaced with get_user_request().

-----
Altering the auth backend role.

In fact I think this module level django.contrib.auth.get_user(),
login() is currently wrong way to do things in the first place IMO, it
cuts out the authentication backend all together. It would be simpler
to imagine this login / get_user as following:

- Saving the user to request.session (currently handled by the module
level login(), and backend.authenticate())
- Loading user from request.session (requests after login) (currently
handled by the module level get_user(), and backend.get_user())

There is no simple way to define your own behavior in an auth backend
for saving the user to the session and loading it back. That seems odd
to me; it sounds like exactly the thing I would want to define myself.

If there were a way to define how the user is saved and loaded, one
could do cool stuff like:

1. a per-site login system (saving site_id to the session when saving
the user and fetching it from the session when loading)
2. caching user permissions in the session at login (no need to hit the
database for permissions after login!)

One could define the saving and loading of user to session in
authentication backend with simple methods like:

authbackend.save(session, user) -> bool
authbackend.load(session) -> user object

Then:

- django.contrib.auth.login would become the caller of
backend.save(session, user); of course login() could still use the
backend_session_key and user_id there
- django.contrib.auth.get_user would become the caller of
backend.load(session)

Jari Pennanen

Jan 30, 2011, 12:53:46 PM
to Django developers
With globals:

No patches to Django required. Flatpages, media URLs and media roots
can be customized per request and work without a single problem, mostly
because apps read settings as settings.SITE_ID etc. and not as
getattr(settings, 'SITE_ID').

If Django is ever patched to work without this thread-local trick, I
can very simply change my code to use the interfaces I described.
*Using* the interfaces I proposed (instead of globals) is simple, but
patching Django to use them is a slow process. I'm willing to work on
it, but I don't think there is enough momentum to get in, for instance,
the:

get_request_media_root(request)
get_request_media_url(request)

I proposed.

Jari Pennanen

Jan 30, 2011, 1:03:27 PM
to Django developers
I have an error above:

authbackend.save(session, user) -> bool
authbackend.load(session) -> user object

should be:

authbackend.save(request, user) -> bool
authbackend.load(request) -> user object

since what I need to do is get the site id from the request and save it
to the session.
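
A rough sketch of how such a save()/load() pair might look for a per-site login. This is only an illustration of the proposal, not an existing Django API: the PerSiteBackend class, the in-memory USERS table and the session key names are all made up, and the request is a simple stand-in carrying site_id and a dict-like session.

```python
from types import SimpleNamespace

# Hypothetical user table: user id -> (user object, ids of sites the user belongs to).
USERS = {
    7: (SimpleNamespace(id=7, username="alice"), {1, 2}),
}


class PerSiteBackend:
    """Sketch of the proposed save()/load() pair; not part of Django."""

    def save(self, request, user):
        """Store the user and the current site in the session; True on success."""
        _, allowed_sites = USERS[user.id]
        if request.site_id not in allowed_sites:
            return False  # user is not listed on this site, refuse login
        request.session["user_id"] = user.id
        request.session["site_id"] = request.site_id
        return True

    def load(self, request):
        """Return the logged-in user, or None if the session is for another site."""
        session = request.session
        if session.get("site_id") != request.site_id:
            return None
        entry = USERS.get(session.get("user_id"))
        return entry[0] if entry else None


backend = PerSiteBackend()
alice = USERS[7][0]
shared_session = {}  # the same session cookie presented to two different sites
on_site_1 = SimpleNamespace(site_id=1, session=shared_session)
on_site_2 = SimpleNamespace(site_id=2, session=shared_session)
on_site_3 = SimpleNamespace(site_id=3, session={})
```

After backend.save(on_site_1, alice), backend.load(on_site_1) returns alice, while backend.load(on_site_2) returns None even though alice belongs to site 2, because the session was created for site 1.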

Daniel Moisset

Jan 30, 2011, 1:39:14 PM
to django-d...@googlegroups.com
On Sun, Jan 30, 2011 at 2:20 AM, Russell Keith-Magee
<rus...@keith-magee.com> wrote:
>
> Every single problem associated with using global variables exists
> with threadlocals -- and then a few more. They *can* be used
> successfully. However, in almost every case, they can also be avoided
> entirely with a good dose of rigorous engineering.
>
> I *strongly* advise against the use of this technique.
>

I understand (and completely support) your objection, especially when
someone says «one could save even the request object to thread
"global" and it would be accessible anywhere.» (which would make code
using requests, i.e. a lot of it, harder to reuse and to test)

But the core problem being solved here, moving SITE_ID to a
threadlocal, doesn't look like bad engineering to me. In fact it is
moving something that is already global (everything in settings.py is)
to a location which is *less* global (thread-wise instead of
process-wise). So this is an increase in locality, if I got it
right...

In which way is this worse than the current state of a global
variable? (and in general, do you have some reference to explain
«Every single problem associated with using global variables exists
with threadlocals -- and then a few more»? I tend to think that a
thread local is generally better than a global, even if it tends to be
worse than good API design/argument passing)
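
To illustrate what "thread-wise instead of process-wise" means, here is a small sketch using only the standard library. The names GLOBAL_SITE_ID, current_site_id and handle_request are invented for the example and have nothing to do with Django's actual code.

```python
import threading

_local = threading.local()   # each thread sees its own attributes on this object
GLOBAL_SITE_ID = 1           # process-wide default, playing the role of settings.SITE_ID


def current_site_id():
    """Return this thread's site id, falling back to the process-wide default."""
    return getattr(_local, "site_id", GLOBAL_SITE_ID)


def handle_request(site_id, results):
    """Pretend to serve one request: set the site for this thread only."""
    _local.site_id = site_id
    results[site_id] = current_site_id()


results = {}
threads = [
    threading.Thread(target=handle_request, args=(n, results)) for n in (10, 20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker thread reads back only the value it set, and the main thread, which never set one, still sees the default.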

Regards,
D.

Jari Pennanen

Jan 30, 2011, 1:58:19 PM
to Django developers
Notice that I never suggested that *Django* implement the thread-local
hack; it just allows me to continue. The thread-local hack is just
that, a hack: it hides the real problem for now, since Django does not
support the stuff I need it to.

The settings object should be considered mainly read-only; if the stuff
saved there is not read-only, then it is probably in the wrong place.

I've put my *non* thread local proposal here: http://ciantic.github.com/multisited/README.html
I would like comments on that from Russ or someone who has influence
on Django core.

lwc...@gmail.com

Jan 30, 2011, 4:09:47 PM
to django-d...@googlegroups.com
I believe this ticket: http://code.djangoproject.com/ticket/14628

which was created during this chat session

http://www.revsys.com/officehours/2010/nov/05/#question5

is also relevant to the issue at hand.

An interesting bit of that chat is:
jacobkm
nicoechaniz: one hint is that although the documentation says that you shouldn't modify settings at runtime, in fact many of them *can* be changed at runtime without problems. Unfortunately there's a bit of a trial-and-error in figuring out which.
Ticket #14628 describes the situation.


The proposals in Ticket #15089 (and the related thread) approach the multi-tenancy problem by providing the means to determine the current site dynamically, but this, IMHO, does not fully solve the issue.

In our case (it's described in the chat-log) we have 1500 sites which would benefit from having separate settings files as there are configuration details which change from one to the other. These are not only core and contrib but also external apps, like django-filebrowser for example, for which we have different upload limits per client; we use some 10 external apps or more.

What we have been working on makes use of the threadlocals approach and a proxy object for settings values. This way, whenever some code is trying to get the value for a setting, the actual value is determined from the settings corresponding to the site being requested.
All of this has worked very well to a certain extent, but we have had a very hard time figuring out which settings can actually be overridden at runtime and which can't.
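
For readers unfamiliar with the idea, a settings proxy backed by a thread-local might look roughly like the sketch below. This is a simplified illustration, not our actual code: SettingsProxy, activate, deactivate, SITE_OVERRIDES and the hostnames are invented names, and base_settings stands in for the real settings module.

```python
import threading
from types import SimpleNamespace

_active = threading.local()  # holds the overrides for the request on this thread

# Stand-in for the project-wide settings module.
base_settings = SimpleNamespace(SITE_ID=1, MEDIA_URL="/media/", DEBUG=False)

# Hypothetical per-site overrides, keyed by hostname.
SITE_OVERRIDES = {
    "a.example.com": {"SITE_ID": 2, "MEDIA_URL": "/media/a/"},
}


class SettingsProxy:
    """Answer attribute lookups from the active site's overrides, else the base."""

    def __init__(self, base):
        self._base = base

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself.
        overrides = getattr(_active, "overrides", {})
        if name in overrides:
            return overrides[name]
        return getattr(self._base, name)


def activate(hostname):
    """Select the overrides for this thread, e.g. at the start of a request."""
    _active.overrides = SITE_OVERRIDES.get(hostname, {})


def deactivate():
    """Clear the overrides, e.g. at the end of a request."""
    _active.overrides = {}


settings = SettingsProxy(base_settings)
```

Code that reads settings.MEDIA_URL at request time gets the per-site value; code that copied the value at import time does not, which is one reason some settings cannot be overridden this way.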

I believe that to achieve true multi-tenancy it should be made very clear when a setting is meant to stay unchanged and when it's OK to change it at runtime. The above mentioned ticket proposes this distinction for django settings but a complete resolution should also involve a strategy to allow third party app developers to easily make this same distinction in their own code. Eventually, apps could state if they are multi-tenancy compatible (all settings can be changed at runtime).

Maybe an easy way to accomplish this separation would be to have dynamic and static settings live in different files, which would make it self-explanatory. It would be up to the developer of each app to understand which of its settings can actually be changed at runtime and which can't.


What we have implemented ATM for our specific problem is a "man in the middle" (using twisted) which spawns server processes based on the requested site, keeps only a number of them alive and discards those that have been idle for a while. This is only useful in a situation like ours, where most of the sites get very few hits per day, but it's been working just fine so far. The code (needs some love and generalization) is available at: http://bitbucket.org/san/luisito




James Hancock

Jan 31, 2011, 1:30:35 AM
to django-d...@googlegroups.com
This post is getting pretty long. But I had a simple Django fix that would make it work a lot easier for me, and might help others. (I say this because of how I implemented it, I am working with about 60 different sites and it is a pretty simple arrangement)

Imagine you were able to set a site_id per request rather than relying on the settings SITE_ID. Django would then check for a request's site_id first and check the settings one second.

It is really simple, but it would fix a lot of my problems and avoid having to switch the settings SITE_ID around per request. Setting the request's site_id with middleware is straightforward enough, as are further customizations to the request, such as changing the URLs.

This approach would:
  • Avoid threading issues and global variables
  • Avoid adding new functions to work with (saves time and pain: documentation, testing, releases and so forth)
  • Not break things that are tied into the Sites Framework (sitemaps, comments, etc.)
It also feels a little more DRY to me.

Let me know if I have assumed something I shouldn't have. I don't know much about the current implementation and use of SITE_ID in the backend.
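
A minimal sketch of the middleware side of this, assuming the old-style process_request hook and the request's get_host() method. The HOST_TO_SITE_ID mapping, the class name, the site_id_for helper and the FakeRequest stand-in are invented for the example; a real version would probably look hosts up in the Site table.

```python
from types import SimpleNamespace

# Stand-in for django.conf.settings so the sketch runs without Django.
settings = SimpleNamespace(SITE_ID=1)

# Hypothetical mapping from hostname to site id.
HOST_TO_SITE_ID = {"shop.example.com": 2, "blog.example.com": 3}


class RequestSiteMiddleware:
    """Attach site_id to the request when the hostname is known."""

    def process_request(self, request):
        host = request.get_host().split(":")[0].lower()  # drop any port
        site_id = HOST_TO_SITE_ID.get(host)
        if site_id is not None:
            request.site_id = site_id
        return None  # continue normal request processing


def site_id_for(request):
    """Check the request's site_id first, then fall back to settings.SITE_ID."""
    return getattr(request, "site_id", settings.SITE_ID)


class FakeRequest:
    """Tiny stand-in for an HTTP request, just enough for the example."""

    def __init__(self, host):
        self._host = host

    def get_host(self):
        return self._host
```

Unknown hosts simply fall through to settings.SITE_ID, so existing single-site projects would behave as before.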

Cheers,
James Hancock

Xavier Ordoquy

Jan 31, 2011, 1:49:17 AM
to django-d...@googlegroups.com

On 31 Jan 2011, at 07:30, James Hancock wrote:

> This post is getting pretty long. But I had a simple Django fix that would make it work a lot easier for me, and might help others. (I say this because of how I implemented it, I am working with about 60 different sites and it is a pretty simple arrangement)
>
> Imagine you were able to set a site_id per request rather than relying on the settings SITE_ID. Django would then check for a request's site_id first and check the settings one second.
>
> It is really simple, but it would fix a lot of my problems and avoid having to switch the settings SITE_ID around per request. Setting the request's site_id with middleware is straightforward enough, as are further customizations to the request, such as changing the URLs.
>
> This approach would:
>   • Avoid threading issues and global variables
>   • Avoid adding new functions to work with (saves time and pain: documentation, testing, releases and so forth)
>   • Not break things that are tied into the Sites Framework (sitemaps, comments, etc.)
> It also feels a little more DRY to me.
>
> Let me know if I have assumed something I shouldn't have. I don't know much about the current implementation and use of SITE_ID in the backend.
>
> Cheers,
> James Hancock

The thread is pretty long because there are actually 2 threads in one:
 - one for simply changing the site_id per request
 - one for changing all the settings per request

This being said, your solution sounds pretty simple but would probably require all the applications to be rewritten.
Your own applications may not be much of an issue, but others might be.

Regards,
Xavier.

Jari Pennanen

Jan 31, 2011, 4:18:42 AM
to Django developers
On Jan 31, 8:30 am, James Hancock <jlhanc...@gmail.com> wrote:
> This post is getting pretty long. But I had a simple Django fix that would
> make it work a lot easier for me, and might help others. (I say this because
> of how I implemented it, I am working with about 60 different sites and it
> is a pretty simple arrangement)
>
> Imagine you were able to set a site_id per request rather than relying on
> the settings SITE_ID. Django would then check for a request's site_id
> *first* and then *second* check the settings one.

That's in my proposal, under the implementation "2. Using middleware":
http://ciantic.github.com/multisited/README.html

As for documenting which settings can be changed at runtime, that
sounds like madness. Are there any settings like that? I can't think of
any: all of the relevant settings will suffer threading issues as soon
as they are changed in middleware (unless the thread-local trick is
used, and that is not advised). There is no reason to change settings
attributes at runtime except in tests.

Carl Meyer

Jan 31, 2011, 1:27:51 PM
to Django developers
On Jan 31, 1:49 am, Xavier Ordoquy <xordo...@linovia.com> wrote:
> The thread is pretty long because there are also 2 threads in one:
>  - one for simply changing the site_id per request
>  - one for changing the all setting per request

Exactly!

For the record, as far as I'm concerned #15089 is limited in scope to
the first issue: making contrib.sites provide an API for getting a Site
object, which projects can configure so that the returned Site object
is based on the request, without having to resort to threadlocals.
This may not satisfy everyone's definition of "true multitenancy," but
it covers a lot more use cases than contrib.sites does now.

For those whose definition of "true multitenancy" requires being able
to modify arbitrary settings at runtime per-request, I see only two
realistic (thread-safe) options:

1) Using threadlocals, as discussed above.
2) Fixing Django's settings to be an instance of some kind of
configured "app" object (a la Flask, except we can't call it an app
because we already use that name for something different) rather than
a process global. I don't know that anyone disagrees this would be
better in principle, but I haven't seen any proposals yet for how to
do it in a backwards-compatible way (though I'm not sure that means
it's impossible). If it can't be done backwards-compatibly, that puts
it in the Django 2.0 timeframe (i.e. in the foggy mists of some
unknown future time).

Carl

Jari Pennanen

Jan 31, 2011, 3:38:34 PM
to Django developers

On Jan 31, 8:27 pm, Carl Meyer <carl.j.me...@gmail.com> wrote:
> On Jan 31, 1:49 am, Xavier Ordoquy <xordo...@linovia.com> wrote:
>
> > The thread is pretty long because there are also 2 threads in one:
> >  - one for simply changing the site_id per request
> >  - one for changing the all setting per request
>
> Exactly!

I have not argued for arbitrary settings, only SITE_ID, MEDIA_URL and
MEDIA_ROOT. These alone should be enough to build sites like Weebly etc.

In fact I directly oppose multitenancy for e.g. INSTALLED_APPS -- it
can be solved later; even these three settings will take a huge effort
to get supported.

> 1) Using threadlocals, as discussed above.

Yes, thread locals are a great hack for SITE_ID, MEDIA_URL and
MEDIA_ROOT in the meantime; I have not studied other settings.