proposal for lazy foreignkeys


Carl Meyer

Sep 25, 2010, 1:47:32 PM
to Django developers
Hi all,

I've seen some level of interest in the idea of a lazy foreign key
(one whose target table is determined by project configuration in some
way, not hardcoded by the app/model in which it lives). The idea was
most recently brought up again in Eric Florenzano's keynote at
DjangoCon. I have some ideas regarding possible API for this, and
would be glad for feedback.

First, a couple of motivating use cases:

1. Reusable apps overuse GenericForeignKey. GFKs are inefficient and
smell bad. They're good to have around when you really need to link to
any one of a possibly-growing set of models. But currently reusable
apps often use them anytime they want to link to "some domain model
but we don't know which one" - even if in practice in most cases it
will be only one! A lazy foreign key would be a better solution.

2. Standardization with flexibility: i.e. possible-future replacement
of contrib.auth.User. To be clear, I am not at this point proposing
any changes at all to contrib.auth. But in some future possible
contrib.auth refactoring, a lazy foreign key could provide a way for
reusable apps to point to a common User model, without Django having
to provide a concrete implementation of that model.

The concept:

We introduce the "virtual" model, which is an abstract model with the
following additional characteristics:

- It can be the target of a ForeignKey.
- It may only have one direct concrete subclass, and if it is the
target of any ForeignKey it must have exactly one.

At runtime, any ForeignKeys pointing to a virtual model are resolved
to actually point to the concrete subclass of that virtual model.
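
To make the resolution rule concrete, here is a minimal plain-Python
sketch of the bookkeeping. It is not a patch against Django: VirtualBase,
resolve_fk_target and the example classes are hypothetical names, and a
real implementation would presumably live in the model metaclass behind
the proposed "virtual = True" Meta option.

```python
class VirtualBase:
    """Plain-Python stand-in for an abstract model with Meta.virtual = True."""

    _concrete = None  # set when the single direct concrete subclass is defined

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Direct subclasses of VirtualBase are the virtual models themselves.
        if VirtualBase in cls.__bases__:
            cls._concrete = None
            return
        # Otherwise, register cls with every virtual model it directly inherits.
        for base in cls.__bases__:
            if VirtualBase in base.__bases__:
                if base._concrete is not None:
                    raise TypeError(
                        f"{base.__name__} already has concrete subclass "
                        f"{base._concrete.__name__}"
                    )
                base._concrete = cls


def resolve_fk_target(target):
    """Return the class a ForeignKey to `target` would really point at."""
    if VirtualBase in target.__bases__:
        if target._concrete is None:
            raise LookupError(f"{target.__name__} has no concrete subclass")
        return target._concrete
    return target


class VotableObject(VirtualBase):
    """Empty virtual model shipped by a hypothetical voting app."""


class Article(VotableObject):
    """The project's domain model, the one thing that can receive votes."""


print(resolve_fk_target(VotableObject).__name__)
```

Declaring a second direct subclass of VotableObject raises TypeError in
this sketch, which mirrors the "only one direct concrete subclass" rule.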

Like an abstract model, a virtual model may include fields, methods,
etc. These can be considered the specification of an interface: any
model with a ForeignKey to this virtual model can expect the concrete
model to satisfy that interface. This is particularly helpful for a
contrib.auth-type use case: reusable apps don't only need a User model
to point FKs at, they also often need at least some minimal set of
fields/properties they can rely on being present.

It's not required for the virtual model to have any fields, of course:
in some use cases (voting, tagging) the reusable app doesn't need to
know anything at all about its target model. A hypothetical voting app
could simply provide an empty "VotableObject" virtual model, which
would be inherited by the domain model which can receive votes. Since
Django already supports multiple inheritance for abstract models, this
is a minimal and non-restrictive requirement for the domain model
author.

Advantages of this proposal:

1. No new settings.
2. In terms of new code API, almost nothing: a new "virtual = True"
Meta keyword.
3. Very little conceptual overhead; reuses existing constructs as much
as possible.

I plan to put together a patch to show working code for this, but I'd
be glad for any feedback at this point, especially if there are
obvious conceptual problems I'm overlooking. Thanks!

Carl

Alex Gaynor

Sep 25, 2010, 1:50:09 PM
to django-d...@googlegroups.com

ISTM this would solve the "auth.User" issue, but doesn't help reusable
apps at large: one can trivially imagine a project that wants voting
(or tagging ;), or commenting, or ...) on more than one model.

In any event, my brain needs to digest (and I need lunch),
Alex

--
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

Carl Meyer

Sep 25, 2010, 1:56:53 PM
to Django developers
On Sep 25, 1:50 pm, Alex Gaynor <alex.gay...@gmail.com> wrote:
> ISTM this would solve the "auth.User" issue, but doesn't help reusable
> apps at large: one can trivially imagine a project that wants voting
> (or tagging ;), or commenting, or ...) on more than one model.

Of course! This isn't a silver bullet for every single use case. For
reusable apps that need to support attachment to multiple models,
either a GFK would still be used, or even better, some solution for
multiple-instances-of-an-app (I believe this year's GSoC app-cache
refactor moved us somewhat closer to this possibility, but doesn't
actually support it yet).

But I do think there's a substantial class of reusable apps currently
using GFKs where a lazy foreignkey would serve better. For instance,
if this proposal went through, I'd personally author a reusable voting
app that only supports voting on one domain model, but gives you a more
efficient, lower-hassle, GFK-free database schema in exchange. I think
a lot of projects would find that useful, and for now I could simply
point people who want voting-on-multiple-models to alternative apps
using GFKs.

Carl

flo...@gmail.com

Sep 25, 2010, 3:23:06 PM
to Django developers
On Sep 25, 10:47 am, Carl Meyer <carl.j.me...@gmail.com> wrote:
> The concept:
>
> We introduce the "virtual" model, which is an abstract model with the
> following additional characteristics:

I'm a fan of this implementation strategy; it's a much better solution
than the setting approach IMO.

Thanks,
Eric Florenzano

Russell Keith-Magee

Sep 26, 2010, 2:10:47 AM
to django-d...@googlegroups.com

On first inspection, absent an implementation, I think this is an
interesting approach.

My biggest technical concern is the same as Alex's -- that it doesn't
address the 'FK to multiple models' problem. While I agree with your
'no silver bullet' response to Alex, I also don't want to end up with
two (or more) completely different ways of solving the same problem.
At the very least, I'd like to have some certainty that the solution
for the single-concrete-class problem will be conceptually similar to
the solution for the multiple-concrete-class problem.

I also have two technical concerns about how the virtual keyword will
work in practice.

Firstly, take the likely migration path for contrib.auth. We would
introduce an AbstractUser that encompasses the 'basic' concept of a
user, and is marked as virtual. But we still need to ship a concrete
User that provides the current implementation. At which point, we now
have our single allowed concrete instantiation, and users can't define
their own User class.

I can see this being a common pattern; if you want your app to work
out of the box, you will provide a bare-bones concrete implementation,
which will then block anyone else from providing a concrete
implementation. This necessitates introducing either the ability to
hide a model, or a different way of registering models so that
unneeded concrete types aren't instantiated.

Of course, the simple solution here would be to split the concrete
model out into a different app, so that you optionally include
auth.User if you actually want it.

Secondly, this approach requires that content objects that are to be
the target of these relationships must share a virtual base class.
This makes for great pure-OO, but it sucks from the point of view of
duck typing.

For example, consider the case of tagging -- if you want to put a Tag
in a relationship with some content object, then you need to make that
content object inherit from a virtual "TaggableObject" base class.
This means that you need to have control of the base class so that you
can install that mixin. If you're using a reusable app from a third
party to provide your content object (e.g., a Blog model from a
blogging app), you don't have access to the model definition.

The rest of this email is thinking out loud, and probably has a whole
bunch of sharp edges.

It seems to me that what we need isn't a 'virtual' keyword on the
content class, but a virtual representation on the referrer class,
plus a way of registering instances of concrete subclasses.

Here's how it might work:

* Allow a model to have a ForeignKey to an abstract base class; but
in doing so, you make the model with the FK a virtual model.

* Introduce a "VirtualForeignKey" that can point at *any* content
object. Again, having a VirtualForeignKey on a model makes the entire
model virtual.

* In configuration code (I'm a little hazy on exactly where would be
best), provide a way to instantiate virtual models as concrete models.

So, since blog entries can be owned, we would need to register:
MyBlogEntry = make_concrete("MyBlogEntry", model=BlogEntry, owner=MyUser)

So - when we want to allow blogs to be tagged, we might register:
BlogTag = make_concrete("BlogTag", model=Tag, content=MyBlogEntry)

Conceptually, a model could even have multiple virtual extension points:
BlogTag = make_concrete("BlogTag", model=Tag, content=MyBlogEntry,
                        image=PrettyPicture)

As a result of these changes, code like:

Blog.objects.all()

wouldn't work out of the box, because Blog is a virtual class.
However, you could use the app cache:

MyBlogEntry = get_model('blog','MyBlogEntry')
MyBlogEntry.objects.all()

We could also include a shortcut on model classes to find their
concrete instances:

Blog.concrete(owner=MyUser).objects.all()

In practice, this would mean that reusable apps that had virtualized
components will need to be careful to ensure that the concrete model
is appropriately realized, and that the concrete model is then passed
around the app as required. For example, Django's admin (in a
virtualized User world) would need to make sure that all the internal
models with FKs to User are rendered concrete.

This also fits in nicely with the app-cache refactor, because the
'app' object provides a convenient place to specify models and
perform the concreting process.

This would also be backwards compatible, because FKs to abstract base
classes are forbidden at the moment. This new behaviour would only be
required for models that introduced virtualizing foreign keys.

Again - this isn't completely thought through, and I have written even
less code than you have :-), but I'm putting it out on the stoop to
see if the cat licks any of it up.

Yours,
Russ Magee %-)

Hanne Moa

Sep 26, 2010, 10:06:13 AM
to django-d...@googlegroups.com
On 25 September 2010 19:47, Carl Meyer <carl.j...@gmail.com> wrote:
> 1. Reusable apps overuse GenericForeignKey. GFKs are inefficient and
> smell bad.

I seem to gradually be going away from GenericForeignKey and using
"Glue"-models instead. App1, model1, is connected to App2, model2 via
App3, Glue-model3, which has foreign-keys to model1 and model2, and
sometimes a little extra. App1 and App2 need not know about each other
at all that way.
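
At the schema level the glue pattern amounts to an ordinary join table
owned by a third app. The sketch below uses the standard-library
sqlite3 module instead of Django models so that it runs on its own; the
table names, the column names and the "weight" extra are made up for
illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- App1 and App2 know nothing about each other.
    CREATE TABLE app1_article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE app2_tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- App3 owns the glue table: one real foreign key to each side,
    -- plus "a little extra" (here, a hypothetical weight column).
    CREATE TABLE app3_articletag (
        id INTEGER PRIMARY KEY,
        article_id INTEGER NOT NULL REFERENCES app1_article(id),
        tag_id INTEGER NOT NULL REFERENCES app2_tag(id),
        weight INTEGER NOT NULL DEFAULT 1,
        UNIQUE (article_id, tag_id)
    );
""")
conn.execute("INSERT INTO app1_article (id, title) VALUES (1, 'Lazy foreign keys')")
conn.execute("INSERT INTO app2_tag (id, name) VALUES (1, 'django')")
conn.execute(
    "INSERT INTO app3_articletag (article_id, tag_id, weight) VALUES (1, 1, 3)"
)

# Both joins go through plain integer foreign keys, with no content-type lookup.
rows = conn.execute("""
    SELECT a.title, t.name, g.weight
    FROM app3_articletag AS g
    JOIN app1_article AS a ON a.id = g.article_id
    JOIN app2_tag AS t ON t.id = g.tag_id
""").fetchall()
print(rows)
```

Connecting a second model to the same tags would mean adding another
glue table rather than changing App1 or App2.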

I'm considering making my own tagging module this way, as tagging *is*
one of the reusable apps that is useful to connect to more than one
model in a project. Ditto for comments.


HM

Klaas van Schelven

Sep 26, 2010, 5:45:07 AM
to Django developers
Hi all,

My 2 cents.

I think the fact that you cannot have Foreign Keys link to specific
models is a specific instance of a larger problem. It's currently not
possible to make an app which has models with any kind of extended
(subclassed) behavior. Say I create a tagging app. Now I might want
to:

* Create a foreign key to a specific taggable model (the problem at
hand)
* Add weights to tags
* Add owners to tags
* ...

I think having this ability to extend models and pass the extended
versions back into the reusable app is the key missing feature.

Klaas

Klaas van Schelven

Sep 26, 2010, 5:48:50 AM
to Django developers
Some more cases that would be important (not just ForeignKey):

* inheriting from the VirtualModel
* any kind of relationship to it (e.g. ManyToMany)
* ModelForms


Patryk Zawadzki

Sep 27, 2010, 5:36:09 AM
to django-d...@googlegroups.com
On Sun, Sep 26, 2010 at 8:10 AM, Russell Keith-Magee
<rus...@keith-magee.com> wrote:
> My biggest technical concern is the same as Alex's -- that it doesn't
> address the 'FK to multiple models' problem. While I agree with your
> 'no silver bullet' response to Alex, I also don't want to end up with
> two (or more) completely different ways of solving the same problem.
> At the very least, I'd like to have some certainty that the solution
> for single concrete class problem will be conceptually similar to the
> multiple concrete class problem.

At the risk of being ignored once again, I dare to link to a working
solution that does not need any changes to the framework itself (other
than perhaps including the factory class):

http://gist.github.com/584106

--
Patryk Zawadzki

Luke Plant

Sep 27, 2010, 11:46:03 AM
to django-d...@googlegroups.com
On Mon, 2010-09-27 at 11:36 +0200, Patryk Zawadzki wrote:

>
> At the risk of being ignored once again, I dare to link to a working
> solution that does not need any changes to the framework itself (other
> than perhaps including the factory class):
>
> http://gist.github.com/584106

This looks rather good to me. It may have been ignored before because
it has no comments and some things are not immediately obvious. For
example, you are basically proposing that the concrete models are passed
into view functions via URLconf, and from there are passed into any
functions which need them, and so they would never actually need to be
imported by the app that defines the abstract model.

I for one would be much happier to not add any more machinery via Meta
options. With some cleanup, and some documentation of this pattern, and
possibly a better name, I think the AbstractMixin class you propose
could be a good candidate for inclusion in core.

Some notes:
1) it seems like line 15 in abstract.py should say 'abstract':'False',
not 'True' - did I miss something?

2) there would need to be some way of merging the concrete class's own
Meta options with the abstract class's Meta options

3) why do we need the _classcache? Is the key used specific enough -
what if two different apps both create 'MyCategory' based on
'CategoryFactory', using them in different situations?

Thanks,

Luke

--
"Christ Jesus came in to the world to save sinners" (1 Timothy 1:15)

Luke Plant || http://lukeplant.me.uk/

Luke Plant

Sep 27, 2010, 11:52:23 AM
to django-d...@googlegroups.com
On Mon, 2010-09-27 at 16:46 +0100, Luke Plant wrote:

> Some notes:
> 1) it seems like line 15 in abstract.py should say 'abstract':'False',
> not 'True' - did I miss something?

Cancel that one, I was missing the fact that we are still inheriting
from the generated class, not doing:

MyCategory = CategoryFactory.construct()

Patryk Zawadzki

Sep 27, 2010, 11:56:07 AM
to django-d...@googlegroups.com
On Mon, Sep 27, 2010 at 5:46 PM, Luke Plant <L.Pla...@cantab.net> wrote:
> On Mon, 2010-09-27 at 11:36 +0200, Patryk Zawadzki wrote:
>> At the risk of being ignored once again, I dare to link to a working
>> solution that does not need any changes to the framework itself (other
>> than perhaps including the factory class):
>>
>> http://gist.github.com/584106
> This looks rather good to me.  It may have been ignored before because
> it has no comments and some things are not immediately obvious.  For
> example, you are basically proposing that the concrete models are passed
> into view functions via URLconf, and from there are passed into any
> functions which need them, and so they would never actually need to be
> imported by the app that defines the abstract model.
>
> I for one would be much happier to not add any more machinery via Meta
> options. With some cleanup, and some documentation of this pattern, and
> possibly a better name, I think the AbstractMixin class you propose
> could be a good candidate for inclusion in core.
>
> Some notes:
> 1) it seems like line 15 in abstract.py should say 'abstract':'False',
> not 'True' - did I miss something?

It is there because I inherit from the .construct() result rather than
taking it directly as a solid class. This helps with debugging, as all
the tracebacks show the proper module for the model.

> 2) there would need to be some way of merging the concrete class's own
> Meta options with the abstract class's Meta options

True.

> 3) why do we need the _classcache?  Is the key used specific enough -
> what if two different apps both create 'MyCategory' based on
> 'CategoryFactory', using them in different situations?

I did it so there was no way to create two different mechanics for the
same model accidentally.
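
For readers without the gist open: its code is not reproduced in this
thread, so the following is only a plain-Python sketch of the pattern as
described in the messages above (a factory with a construct() method, a
generated abstract base that the project then inherits from, and a
_classcache). The cache key, the is_abstract flag and the "item"
relation are assumptions made for illustration, not the gist's actual
code.

```python
class AbstractMixin:
    """Hypothetical factory base: subclasses describe a model whose
    relations are filled in later by construct()."""

    _classcache = {}

    @classmethod
    def construct(cls, **bindings):
        # Key on the factory plus the bound targets, so repeating the same
        # request returns the same generated base class (assumed key).
        key = (cls, tuple(sorted(bindings.items(), key=lambda item: item[0])))
        if key not in cls._classcache:
            attrs = dict(bindings)
            # Stands in for the generated Meta with 'abstract': True.
            attrs["is_abstract"] = True
            cls._classcache[key] = type(cls.__name__ + "Base", (cls,), attrs)
        return cls._classcache[key]


class CategoryFactory(AbstractMixin):
    """Reusable app side: the category behaviour, with no target bound."""

    def describe(self):
        return f"category of {self.item.__name__}"


class Product:
    """Project side: the model that categories should point at."""


# The project inherits from the generated base instead of using it
# directly, so the concrete class is defined in the project's own module.
class MyCategory(CategoryFactory.construct(item=Product)):
    is_abstract = False


print(MyCategory().describe())
```

With a key like this one, two apps that bind different targets get
different generated bases, which is one possible answer to the question
about key specificity.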

--
Patryk Zawadzki
