Design Decision: Dynamic settings.SITE_ID (ref. #4438)

Max Battcher

May 31, 2007, 1:00:06 AM
to django-d...@googlegroups.com
I recently created and started testing a simple middleware, that I
thought may even be worthy of django.contrib.sites. In brief, here's
the question being asked:

Is modifying settings, and in particular settings.SITE_ID, allowable?
Is it workable? (ie, what breaks if that assumption does not hold?)

Here's the impetus:

myblogname.example.com can look friendlier and be easier to remember
than example.com/myblogname. Many modern sites place user-controlled
content under relatively dynamic subdomains (ie, new users might sign
up any moment). It is also certainly advocated by many Rails
philosophers.

For most sites of this type it doesn't make sense to be required to
build a VirtualHost file and a Settings file for each and every
User/Blog/Project/Family Dog/Whatever. In those cases it makes sense
to use wildcard (*) subdomain VirtualHost and a single settings file
with a bunch of shared applications. I think the
django.contrib.sites.models.Site seems perfect for the task of
determining which Sites might be available at a given moment, and is
an already existing way to configure applications to handle such.

To make this as easy and as nearly transparent to existing Site-based
code as possible, I proposed a simple process_request middleware that
sets settings.SITE_ID to the id of the Site whose domain matches
request.META['HTTP_HOST']. I attached the code for a simple one that
I have started testing to Ticket #4438. [1]
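
A minimal sketch of the kind of middleware described above, written with
plain-Python stand-ins so it runs without Django. FakeSettings,
FakeRequest, SITES_BY_DOMAIN and DynamicSiteMiddleware are hypothetical
names for illustration; this is not the code attached to #4438.

```python
class FakeSettings:
    """Stand-in for django.conf.settings (hypothetical, for illustration)."""
    SITE_ID = 1


# Stand-in for the Site table: domain -> primary key (made-up data).
SITES_BY_DOMAIN = {
    "example.com": 1,
    "myblogname.example.com": 2,
}

settings = FakeSettings()


class FakeRequest:
    """Stand-in for HttpRequest; only META is modelled."""

    def __init__(self, host):
        self.META = {"HTTP_HOST": host}


class DynamicSiteMiddleware:
    """Set settings.SITE_ID from the Host header on every request."""

    def process_request(self, request):
        # Strip an optional port ("example.com:8000" -> "example.com").
        host = request.META.get("HTTP_HOST", "").split(":")[0].lower()
        site_id = SITES_BY_DOMAIN.get(host)
        if site_id is not None:
            # This mutates process-wide state, which is the contested part.
            settings.SITE_ID = site_id
        return None  # let request processing continue
```

Note that in this sketch a request for an unknown host leaves SITE_ID at
whatever the previous request set, which is one way shared mutable
settings can surprise you.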

Here's where the debate begins... I would like to see something as
simple as adding a middleware to support dynamic Sites rather than a
single static Site for a settings file. But, can you modify the
settings from a middleware? Should you?

I don't think this sort of thing belongs in the view because it keeps
someone from using unmodified/unwrapped generic views (which can often
be a sign of a loss of DRY, right?) and because it makes a huge
distinction between a website with a static, unchanging SITE_ID and a
settings file that might host multiple SITE_IDs. It also doesn't
belong in a Manager class because a Manager shouldn't need to deal
with HttpRequest and if it did, it can't be used for filtering in a
urls.py file if it requires a request object... I don't see a better
alternative.

[1] http://code.djangoproject.com/ticket/4438

What do y'all think?

--
--Max Battcher--
http://www.worldmaker.net/

Malcolm Tredinnick

May 31, 2007, 1:57:03 AM
to django-d...@googlegroups.com
On Thu, 2007-05-31 at 01:00 -0400, Max Battcher wrote:
> I recently created and started testing a simple middleware, that I
> thought may even be worthy of django.contrib.sites. In brief, here's
> the question being asked:
>
> Is modifying settings, and in particular settings.SITE_ID, allowable?
> Is it workable? (ie, what breaks if that assumption does not hold?)

Every single piece of code that caches anything based on having read
something from settings would then have to query settings every time
(and recompute whatever it cached). There is code that relies on the
current behaviour and the assumption that settings will never change
once you access them makes this a useful pattern. Having to change this
to "some settings will change" means there is always some flipping back
and forth checking which set of assumptions you are operating under when
developing. Feels error prone. I'm pretty stupid, so I'm a big fan of
things behaving as expected with consistency.

I personally also quite like the idea that settings are set entirely by
the user of an application -- in the normal use-case of a settings file
and a webserver -- and are not going to be messed around by code. This
idea isn't axiomatic because of settings.configure(), but if you're
using that it's an unusual situation and typical user configuration has
changed in purpose slightly (and deliberately).

An alternative approach to a solution for a problem that requires this
question (can I change a setting?) is to work out whether it makes sense
to move the "feature" away from being a setting (since it's no longer a
user-configured setting, but the user setting can still act as a hint).

[.. prologue elided ...]


> I would like to see something as
> simple as adding a middleware to support dynamic Sites

That's the problem that might be interesting to solve! All the stuff
about settings changes flows from your particular solution. Don't
misunderstand me here: I'm not trying to dismiss your solution -- I
just want to distinguish between real barriers on the way to the goal
and barriers imposed by other decisions.

Two alternatives spring to mind. Neither of these are fully thought-out
yet, but let's see how they sound:

(1) We introduce a formal thread-local settings feature as well. Things
that might change on a thread-to-thread basis (which corresponds to
per-request in web-based operation and is a no-op, essentially, in
scripts, etc) are accessed through this and it falls back to normal
settings for things that aren't present. So normal usage of sites would
not require changes and your sort of situation could put the SITE_ID key
into this storage.

We would need to decide which current settings could be accessed through
this module, since all accessors would have to be changed. I've been
thinking about this idea on and off for a bit, because it would also
clean up some other pieces of code that do thread-local stuff (i18n
settings, internally, being one example).

(2) We come up with a more dynamic site object. This is a little
harder than it first seemed it might be. The problem isn't contrib.sites
itself -- there are two changes necessary in there with regards to
getting the current site and if we can find a way to specify what
"current" is, changing those methods is easy. The problem is all the
other places that access the sites module by querying the Site model
directly. I guess replacing "settings.SITE_ID" with a new
contrib.sites.get_current_site_id() method is a simple change there.

Many cases can actually be fixed by using CurrentSiteManager in a couple
of models (maybe not as the default manager), but a few are harder, so
making settings.SITE_ID dynamic, via a function, might be easier.

Not quite sure how to tell the sites package what the current site ident
is for the current thread (yet).
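
A rough sketch of alternative (1), assuming Python's threading.local as
the storage. StaticSettings and ThreadLocalSettings are hypothetical
names, not an existing Django API; the point is only to show per-thread
overrides falling back to the ordinary settings.

```python
import threading


class StaticSettings:
    """Stand-in for the ordinary, never-changing settings (hypothetical)."""
    SITE_ID = 1
    USE_I18N = True


class ThreadLocalSettings:
    """Per-thread overrides that fall back to the static settings."""

    def __init__(self, fallback):
        # Bypass our own __setattr__ for the two internal attributes.
        object.__setattr__(self, "_fallback", fallback)
        object.__setattr__(self, "_local", threading.local())

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for setting names.
        try:
            return getattr(self._local, name)
        except AttributeError:
            return getattr(self._fallback, name)

    def __setattr__(self, name, value):
        # Writes land in the current thread's storage only.
        setattr(self._local, name, value)
```

With this shape, a thread that never sets SITE_ID still reads the static
value, so code that does not care about dynamic sites would not need to
change.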

Conclusion
----------

I'm mostly in favour of making this type of functionality possible
(feeping creaturism is the main argument against, but I'm always
negative on new features for that reason).

I realise your original question was motivated by not wishing to make
any changes to the code at all and instead *just completely violate one
of our most sacred, honorable and historic assumptions* (alright, you
may not have viewed it that way).

I'd like to solve the problem in a slightly more intrusive way, though,
just because I like viewing the settings module as a static thing. Which
conflicts massively with my "I don't like random backwards incompat
changes" inner child, but that's something I have to work out.

Don't take my approach as anything close to gospel, though. I'm just
first in line with a response. There will be other opinions.

Regards,
Malcolm

Max Battcher

May 31, 2007, 2:38:58 AM
to django-d...@googlegroups.com
On 5/31/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> Every single piece of code that caches anything based on having read
> something from settings would then have to query settings every time
> (and recompute whatever it cached). There is code that relies on the
> current behaviour and the assumption that settings will never change
> once you access them makes this a useful pattern. Having to change this
> to "some settings will change" means there is always some flipping back
> and forth checking which set of assumptions you are operating under when
> developing. Feels error prone. I'm pretty stupid, so I'm a big fan of
> things behaving as expected with consistency.

I'm starting to see this. In my testing there is an inconsistency
from time to time in the results and/or the caching.

> An alternative approach to a solution for a problem that requires this
> question (can I change a setting?) is to work out whether it makes sense
> to move the "feature" away from being a setting (since it's no longer a
> user-configured setting, but the user setting can still act as a hint).

Agreed. Hence the reason I try to make sure to provide the full story
as I see it so that a little Root Cause Analysis can be performed.

> > I would like to see something as
> > simple as adding a middleware to support dynamic Sites
>
> That's the problem that might be interesting to solve! All the stuff
> about settings changes flows from your particular solution. Don't
> misunderstand me here: I'm not trying to dismiss on your solution -- I
> just want to distinguish between real barriers on the way to the goal
> and barriers imposed by other decisions.

I'm not married to my solution, and opening this discussion was
precisely how I hoped to find a much better one. I realize
that I'm not an everyday Django coder and I've very rarely used
Django's source as anything other than a reference when I have a
question as a consumer of the framework, so I certainly realize that
my first instincts might not be the best.

> Two alternatives spring to mind. Neither of these are fully thought-out
> yet, but let's see how they sound:
>
> (1) We introduce a formal thread-local settings feature as well.

> (2) We come up a with a more dynamic site object.

(1) sounds more generally useful. If you've been thinking about it
for some time and it looks like it might solve/alleviate some other
things along the line it might be the better approach. (2) does
seem like a lot of work for a single (contrib) application, but having
fewer checks against SITE_ID could be a nice benefit (reducing some
over-reliance on django.conf.settings).
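
For (2), the call-site change might look roughly like this. It is a
sketch only: get_current_site_id() is the name floated in the quoted
message, while set_current_site_id(), STATIC_SITE_ID and the ARTICLES
data are made up for illustration and do not exist in Django.

```python
import threading

STATIC_SITE_ID = 1  # stand-in for the value in the settings file

_current = threading.local()


def set_current_site_id(site_id):
    """Hypothetical hook a middleware could call once per request."""
    _current.site_id = site_id


def get_current_site_id():
    """Return the per-thread site id, or the static setting as a hint."""
    return getattr(_current, "site_id", STATIC_SITE_ID)


# Made-up rows standing in for a model with a foreign key to Site.
ARTICLES = [
    {"title": "Welcome", "site_id": 1},
    {"title": "My first post", "site_id": 2},
]


def articles_on_current_site():
    # Before: filtered on settings.SITE_ID. After: asks the accessor.
    site_id = get_current_site_id()
    return [a["title"] for a in ARTICLES if a["site_id"] == site_id]
```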

> I realise your original question was motivated by not wishing to make
> any changes to the code at all and instead *just completely violate one
> of our most sacred, honorable and historic assumptions* (alright, you
> may not have viewed that way).

I just thought I might have found a simple minimal solution in a few
lines of code. I don't mind realizing that my solution created more
side problems than it was worth. It was a learning experience... I
had assumed that the functionality didn't exist because few had
thought about it and fewer used the sites application rather than roll
their own thing. Now I know that I was wrong and the problem is a bit
bigger than I first thought it was.

I personally think that Dynamic Sites support of one form or another
should be provided sometime by 1.0, just because I don't think
django.contrib.sites is complete without it. It's the sort of
functionality a new person might assume it contains... It took me a
while to get used to the idea that sites didn't bother to check the
address at all and simply took SITE_ID at face value.

> I'd like to solve the problem in a slightly more intrusive way, though,
> just because I like viewing the settings module as a static thing. Which
> conflicts massively with my "I don't like random backwards incompat
> changes" inner child, but that's something I have to work out.

I have no problem seeing it solved the intrusive way if that's
what makes the most sense for the problem set. Hopefully something
like this can even be done without breaking too many backward
compatibility eggs...

James...@gmail.com

Jun 6, 2007, 5:20:37 PM
to Django developers
Malcolm:

We have currently implemented a middleware hack to alter
settings.SITE_ID.

The problem is that our project serves more than 10 domain
names and we are aiming at about 1200 requests/second. When the load
gets high, we don't want to hesitate to throw in another App Server to
handle the traffic.

I am wondering how this exactly affects the caching? Shouldn't the
caching be getting and setting based off of settings.SITE_ID? Your
post scares me and makes me want to dig through all the Django Caching
code to see how exactly it will break. Maybe we will be the first to
patch the breakage as it doesn't seem like it should break to me.

Any clarification would be helpful.

Jimmy

Malcolm Tredinnick

Jun 7, 2007, 7:45:44 PM
to django-d...@googlegroups.com

Remember that the word caching just means "storing for later use". It
doesn't always mean HTTP caching, which is what I suspect you are
talking about.

There are many places in the code where a settings value is read and
then some action taken and data stored somewhere based on that setting.
For example, the list of installed apps (which is a setting) is used to
populate the global app cache (once!) in django/db/models/loading.py.
The USE_I18N setting is used to determine which translation functions to
install -- we only do that once. Those are two I can think of off the
top of my head; there are other instances, though.
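
A toy illustration of that "read once, store the result" pattern and how
it goes stale. This is not Django's actual app-cache or i18n code;
FakeSettings, SITE_NAMES and current_site_name() are hypothetical
stand-ins.

```python
class FakeSettings:
    """Stand-in for django.conf.settings (hypothetical)."""
    SITE_ID = 1


settings = FakeSettings()

# Made-up lookup data keyed by site id.
SITE_NAMES = {1: "example.com", 2: "myblogname.example.com"}

_site_name_cache = None  # filled on first use, never invalidated


def current_site_name():
    """Compute once from settings, then reuse the stored value."""
    global _site_name_cache
    if _site_name_cache is None:
        _site_name_cache = SITE_NAMES[settings.SITE_ID]
    return _site_name_cache


first = current_site_name()
settings.SITE_ID = 2          # a middleware changes the setting later
second = current_site_name()  # still the value computed for SITE_ID 1
```

Here `second` equals `first` even though SITE_ID changed, because the
stored value was computed under the assumption that settings never move.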

In short, I was using "cached" in the generic Comp Sci sense, not in the
HTTP sense. Sorry for any confusion.

Regards,
Malcolm


Marty Alchin

Jun 7, 2007, 8:00:24 PM
to django-d...@googlegroups.com
On 6/7/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> Those are two I can think of off the
> top of my head; there are other instances, though.

For dbsettings at least, I expect a dynamic settings.SITE_ID would be
even more damaging. It loads settings from the database once during
startup and only touches the database again when updating them. So all
"sites" would end up with the same dbsettings, even though they
shouldn't.

When updating them, it would assign them to the appropriate site,
which would then be used throughout that instance. Unfortunately, this
would be reset the next time the server starts, so you'd have to
manually set all your settings on every site every time you start the
server, in order to get them to work properly.

The only way around this for dbsettings would be to maintain a dictionary
mapping each SITE_ID to its dbsettings cache. I don't enjoy that
thought, especially since it would be a rare case that someone would
want to do it.

-Gul

Malcolm Tredinnick

Jun 7, 2007, 8:30:47 PM
to django-d...@googlegroups.com
On Thu, 2007-06-07 at 20:00 -0400, Marty Alchin wrote:
> On 6/7/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> > Those are two I can think of off the
> > top of my head; there are other instances, though.
>
> For dbsettings at least, I expect a dynamic settings.SITE_ID would be
> even more damaging. It loads settings from the database once during
> startup and only touches the database again when updating them. So all
> "sites" would end up with the same dbsettings, even though they
> shouldn't.

No dbsettings depend on SITE_ID, as far as I can see.

I think you are just confusing multiple issues here. This thread is not
at all about making all settings dynamic based on SITE_ID. It was about
making the functionality currently provided by SITE_ID dynamic (just the
value of SITE_ID -- or its new equivalent).

Regards,
Malcolm


Marty Alchin

Jun 7, 2007, 8:42:57 PM
to django-d...@googlegroups.com
On 6/7/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> No dbsettings depend on SITE_ID, as far as I can see.

The most recent version takes the current Site into account, using a
ForeignKey to contrib.sites.models.Site. It then uses
Site.objects.get_current() to get the Site for its queries, and as far
as I can tell from its code, get_current() uses settings.SITE_ID to
retrieve that object.

I do realize this wouldn't have any impact on the other Django-proper
settings. But even just changing this one setting would (likely) have
a substantial impact on dbsettings. But, since I haven't actually
tried it, I may well be missing something.

-Gul

Malcolm Tredinnick

Jun 7, 2007, 8:50:58 PM
to django-d...@googlegroups.com
On Thu, 2007-06-07 at 20:42 -0400, Marty Alchin wrote:
> On 6/7/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> > No dbsettings depend on SITE_ID, as far as I can see.
>
> The most recent version takes the current Site into account, using a
> ForeignKey to contrib.sites.models.Site. It then uses
> Site.objects.get_current() to get the Site for its queries, and as far
> as I can tell from its code, get_current() uses settings.SITE_ID to
> retrieve that object.

Oh, sorry, you mean dbsettings, the third-party app, not as an
abbreviation and typo for database settings (internally in Django). My
error.

>
> I do realize this wouldn't have any impact on the other Django-proper
> settings. But even just changing this one setting would (likely) have
> a substantial impact on dbsettings. But, since I haven't actually
> tried it, I may well be missing something.

Read the rest of the thread to see how we were discussing doing this.

I think it (dynamic site handling) is something that might have to
happen in one way or another at some point, so it might be worth
thinking about how you can handle it one day maybe. A hash table mapping
site value to the collection of settings, for example. If we ever make
SITE_ID's equivalent dynamic, it would have to be very low-cost to get
the current value anyway, so the extra hash de-reference wouldn't kill
performance in that case. Is that impossible?

Regards,
Malcolm

Marty Alchin

Jun 7, 2007, 10:11:16 PM
to django-d...@googlegroups.com
On 6/7/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> Oh, sorry, you mean dbsettings, the third-party app, not as an
> abbreviation and typo for database settings (internally in Django). My
> error.

Ah! Yes, that's understandable. I've been calling it django-values for
so long, the name "dbsettings" seems strange to me, too. We'll all get
used to it sooner or later.

> Read the rest of the thread to see how we were discussing doing this.

Wow, I see your point. I had given it a cursory look, but hadn't
really read through it all.

> I think it (dynamic site handling) is something that might have to
> happen in one way or another at some point, so it might be worth
> thinking about how you can handle it one day maybe. A hash table mapping
> site value to the collection of settings, for example. If we ever make
> SITE_ID's equivalent dynamic, it would have to be very low-cost to get
> the current value anyway, so the extra hash de-reference wouldn't kill
> performance in that case. Is that impossible?

It's absolutely possible, and it wouldn't really add that much extra
to it. As you say, it'd just be an extra hash lookup, after all. As
long as there's a standard way to go about it, I'm on board with
making sure dbsettings lines up with it.
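
Roughly what that extra hash lookup could look like, as a sketch.
PerSiteCache and its loader/get_site_id arguments are hypothetical and
not part of dbsettings; the only assumption is that some cheap "current
site" callable exists.

```python
class PerSiteCache:
    """Keep one settings cache per site id instead of one global cache."""

    def __init__(self, loader, get_site_id):
        self._loader = loader            # loads all values for one site
        self._get_site_id = get_site_id  # cheap "current site" lookup
        self._by_site = {}

    def get(self, key):
        site_id = self._get_site_id()
        if site_id not in self._by_site:
            # Load lazily, once per site, on first access.
            self._by_site[site_id] = self._loader(site_id)
        return self._by_site[site_id][key]
```

Each site's values are loaded once and then served from the dictionary,
so the per-request cost is one call to get_site_id plus one dict lookup.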

-Gul

tzel...@gmail.com

Jul 30, 2007, 11:47:34 PM
to Django developers
I just made a ticket (http://code.djangoproject.com/ticket/5022) that
*might* resolve some issues for certain subdomain situations. I would
love to hear any feedback, good or bad.

Thanks!
-Tom
