Model Inheritance

PB

Jan 25, 2009, 12:51:42 PM
to Django developers
Hi,

I've been experimenting with model inheritance and have become pretty
dissatisfied with the way it is implemented in Django. It seems that
this part of the ORM has not received as much TLC and thought as the
rest of the framework.

Some gripes:

- There is no way to find out whether an object has a Subclass model
short of testing it. This means you have to write an exception
oriented piece of code, which is pretty ugly. Python builtins like
getattr and hasattr are not supported.

- Classical Inheritance is not possible with the current
implementation.
(I wrote about this at http://peterbraden.co.uk/article/django-inheritance)

- It is not possible to overwrite fields in sub classes - should you
wish to do so the ORM will throw an exception. This means the
inheritance is no more than a glorified 1-1 mapping between tables.

I understand that the way this has been implemented means interfacing
with the actual database is a simpler task, but it makes writing true
inheritance code over the top increasingly difficult. I'm interested
as to why this choice was made in the design. Also, do people think
there is room for a third type of inheritance which is closer to the
pythonic inheritance model, and would work alongside the existing
methods? As it is I find myself frequently having to write a wrapper
for the ORM to support the things I'm trying to do.

Have I missed something - is it like the many times where I find there
is a simple way of doing something after much time struggling?

I've read the thread where inheritance is proposed
(http://groups.google.com/group/django-developers/browse_thread/thread/1bc01ee32cfb7374/84b64c625cd1b402?lnk=gst&q=model+inheritance#84b64c625cd1b402)
and it seems that many of the caveats in the thread are still in play.

(And apologies if this sounds overly negative - I love Django!)

Regards,

Peter

Karen Tracey

Jan 25, 2009, 1:43:24 PM
to django-d...@googlegroups.com
On Sun, Jan 25, 2009 at 12:51 PM, PB <PeterB...@googlemail.com> wrote:

> Hi,
>
> I've been experimenting with model inheritance and have become pretty
> dissatisfied with the way it is implemented in Django.

[snip gripes]

This general reaction has been registered before. You might want to search the users list for likely keywords to find some threads that explain the rationale for why inheritance was done the way it currently is in Django. With a quick search I found this:

http://groups.google.com/group/django-users/browse_thread/thread/52f72cffebb705e/b76c9d8c89a5574f

which has a couple of good responses from Malcolm.

I don't believe the choices that were made here were done casually or with an eye towards easiest implementation or anything like that.  Rather they support some things (e.g. extending 3rd-party models via inheritance) that would be difficult/impossible if a more "Pythonic" inheritance model had been chosen, don't introduce performance drains for common cases, etc. 

Perhaps there is a way to support more of what you're looking for in addition to what is there now.  If you come up with an approach you think is clean and generally useful, you can always propose it.

Karen

PB

Jan 25, 2009, 1:51:08 PM
to Django developers
Ah, thanks for the link - I'd done a little searching but I guess I
missed that one.

Malcolm Tredinnick

Jan 27, 2009, 1:37:48 AM
to django-d...@googlegroups.com
On Sun, 2009-01-25 at 09:51 -0800, PB wrote:
> Hi,
>
> I've been experimenting with model inheritance and have become pretty
> dissatisfied with the way it is implemented in Django. It seems that
> this part of the ORM has not received as much TLC and thought as the
> rest of the framework.

Slight underestimation of the effort that went into the feature,
there. :-)

But no offense taken. It's actually nicer when people focus on the
external effects and when I see that it's so close to exactly what
people want that they think it must only be missing a little bit. That
means we've pretty much succeeded. Best thing to hear as an implementor
is "that was easy" or "mostly worked as expected".

As Karen's already noted with one link (and there have been many such
threads on django-users since the feature was introduced), the design
was fairly carefully thought out and we (well, mostly me) have laid out
the reasoning behind it a lot. In addition to the django-dev thread you
found, this one is where a lot of the details were resolved (there might
have been one other a little later, but memory is fuzzy):

http://groups.google.com/group/django-developers/browse_frm/thread/7d40ad373ebfa912/a20fabc661b7035d?lnk=gst&q=model+inheritance+CORBA#a20fabc661b7035d

You'll notice there that a lot of the things you're after are either
contrary to other requirements (so one or other has to lose out), or
quite possible with a bit of extra code. Whether the extra code should
go into Django or not is an open issue (meh ... let's see what the code
looks like first). But nothing you're after is per se impossible, and
that was also a design decision. So in that sense, Django fully supports
it. It is designed to allow those extensions. We just haven't
implemented such methods in core, since they can be done as external
extras and there are a few different approaches.

Remember, Django is something you build on top of. An aid, not a crutch.

Model inheritance on top of relational data storages is already a hack
(a.k.a "the Vietnam of computer science"). Something that looks like
pure-Python as well as being actually possible is going to add a lot of
constraints that aren't necessarily acceptable and, frankly, aren't that
necessary (in terms of allowing something that isn't possible now --
syntactic sugar aside). The compromise isn't due to ease of
implementation. That's one consideration, but not the sole one, nor the
primary one. It's often due to reality.

> - There is no way to find out whether an object has a Subclass model
> short of testing it. This means you have to write an exception
> oriented piece of code, which is pretty ugly. Python builtins like
> getattr and hasattr are not supported.

That requires an auxiliary data store of the most derived child class,
if you're not going to change parent models (one of those requirement
things we fleshed out). Keeping parallel data stores in sync is a royal
pain, so was intentionally not part of the core design. Providing such a
parallel store and methods/functions to use it would make a good
third-party project.
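
A minimal sketch of such a parallel store, using plain sqlite3 rather than Django itself. The `child_type` table and the helper names are hypothetical, chosen only for illustration; the point is that the parent table stays untouched while every write path has to keep the side table in sync by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Child table: one-to-one link back to the parent row.
    CREATE TABLE restaurant (
        place_ptr_id INTEGER PRIMARY KEY REFERENCES place(id),
        serves_pizza INTEGER NOT NULL
    );
    -- Hypothetical side table: records the most derived type per parent
    -- row, so the parent table itself is never altered.
    CREATE TABLE child_type (
        place_id INTEGER PRIMARY KEY REFERENCES place(id),
        type_name TEXT NOT NULL
    );
""")

def add_place(name):
    """Insert a plain parent row and return its id."""
    cur = conn.execute("INSERT INTO place (name) VALUES (?)", (name,))
    return cur.lastrowid

def add_restaurant(name, serves_pizza):
    """Insert parent row, child row, and the matching side-table entry."""
    place_id = add_place(name)
    conn.execute(
        "INSERT INTO restaurant VALUES (?, ?)", (place_id, int(serves_pizza))
    )
    # The side table must be updated explicitly on every write path.
    conn.execute(
        "INSERT INTO child_type VALUES (?, ?)", (place_id, "restaurant")
    )
    return place_id

def most_derived_type(place_id):
    """Look up the most derived type with a single-row query, no joins."""
    row = conn.execute(
        "SELECT type_name FROM child_type WHERE place_id = ?", (place_id,)
    ).fetchone()
    return row[0] if row else "place"

park_id = add_place("City Park")
diner_id = add_restaurant("Joe's Diner", True)
```

Here `most_derived_type(diner_id)` answers from the side table alone, but any code that inserts into `restaurant` without also writing `child_type` silently breaks the lookup, which is the synchronisation cost described above.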

> - Classical Inheritance is not possible with the current
> implementation.
> (I wrote about this at http://peterbraden.co.uk/article/django-inheritance)

That requires implementing the previous point. So the reasons and
solution are the same.

Also be aware that what you're after here is also inconsistent when
viewed at a Python level. You're asking for a queryset of Place objects
(in your blog post example) and then complaining when it gives you back
a bunch of Place objects. Implicitly, you're trading one slight
inconsistency for another. Nothing wrong with that, but at least keep it
in mind when you're thinking along those lines. *Neither* behaviour is
completely natural from every viewpoint.

> - It is not possible to overwrite fields in sub classes - should you
> wish to do so the ORM will throw an exception. This means the
> inheritance is no more than a glorified 1-1 mapping between tables.

There are so many things that go wrong when you start doing this that
this is an area where we made a trade-off in order to be pragmatic. After
all, you don't really *need* to override the fields on the base model
for the language to work. Sure, sometimes it's handy, but it's not a
requirement. If Python, itself, didn't allow it, life would go on and it
would still be just as functional as a language. Nice to have; not a
showstopper.

For persistent storage cases, it's worse: those fields must be filled in
correctly and be able to be accessed (as soon as you go down that path,
serialisation and other initialisation cases rear their heads, for
example). It's difficult to ensure that people do fill them in properly
in all cases and correctly report the problems when they don't.
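
One way the initialisation problem can show up, sketched in plain Python with no Django involved. The classes and the `fields` tuple are hypothetical stand-ins: generic loading code assumes every declared field can be assigned, and a subclass that replaces a stored field with a read-only property breaks that assumption.

```python
class Place:
    # Columns that generic code (loaders, deserialisers) expects to fill in.
    fields = ("name", "address")

    def __init__(self, **values):
        for field in self.fields:
            setattr(self, field, values[field])

class Restaurant(Place):
    @property
    def name(self):
        # Replaces the stored field with a computed, read-only value.
        return "Restaurant at " + self.address

def try_load(cls, row):
    """Build an instance from a row dict; return the error if it fails."""
    try:
        return cls(**row)
    except AttributeError as exc:
        return exc

row = {"name": "Joe's", "address": "1 Main St"}
loaded_place = try_load(Place, row)
loaded_restaurant = try_load(Restaurant, row)
```

`loaded_place` is a normal instance, while `loaded_restaurant` is the `AttributeError` raised when the loader tries to assign to the overridden `name`. A real ORM would have to detect and report this kind of mismatch in every code path that fills in fields.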

So, for now, we say you can't do that. Come up with a nice patch and
about a thousand tests to implement it and the code will be reviewed
with interest. However, at some point, declining to add massive complexity
(and I don't say that lightly) to the internal handling to support
something that can be avoided isn't a disaster.

> I understand that the way this has been implemented means interfacing
> with the actual database is a simpler task,

With the added bonus of also being possible to subclass any model. :-)

> but it makes writing true
> inheritance code over the top increasingly difficult. I'm interested
> as to why this choice was made in the design. Also, do people think
> there is room for a third type of inheritance which is closer to the
> pythonic inheritance model, and would work alongside the existing
> methods?

If what you're talking about as "classical inheritance" -- the problem
being that "classical" has multiple meanings, both as
historical/traditional and even as a technical term of
"class-ical" (hyphenated for emphasis), as opposed to, say, prototypical
-- is what you mean by the third type, then I've addressed that above.
It's a presentation issue, once you have introduced some extra data in a
separate table to prevent the queries from requiring O(# of descendants)
table joins.
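
To make the join cost concrete, here is a rough sqlite3 sketch of what discovering each row's type looks like without any extra data. The child table names are hypothetical; the query needs one LEFT JOIN per descendant table.

```python
import sqlite3

# Hypothetical descendants of the parent "place" table.
CHILD_TABLES = ["restaurant", "museum", "hotel"]

def build_probe_query(child_tables):
    """Build a query that LEFT JOINs every child table to find each row's type."""
    cols = ", ".join(
        f"{t}.place_ptr_id IS NOT NULL AS is_{t}" for t in child_tables
    )
    joins = " ".join(
        f"LEFT JOIN {t} ON {t}.place_ptr_id = place.id" for t in child_tables
    )
    return f"SELECT place.id, {cols} FROM place {joins} ORDER BY place.id"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT)")
for table in CHILD_TABLES:
    conn.execute(
        f"CREATE TABLE {table} "
        "(place_ptr_id INTEGER PRIMARY KEY REFERENCES place(id))"
    )
conn.executemany(
    "INSERT INTO place VALUES (?, ?)", [(1, "City Park"), (2, "Corner Bistro")]
)
conn.execute("INSERT INTO restaurant VALUES (2)")

query = build_probe_query(CHILD_TABLES)
rows = conn.execute(query).fetchall()
```

Each row comes back as the parent id followed by one flag per child table, so the number of joins grows with the number of descendants. A separate type table replaces all of those joins with a single lookup.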

There is a third type of inheritance that will likely be in 1.1, time
permitting, which is allowing subclassing that doesn't drag along the
ORM subclassing with it. So pure-Python subclassing. That allows adding
Python methods and custom managers to existing tables. That falls out as
a neat side-effect of allowing models over database views.
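
A toy illustration of that idea, written as a tiny hand-rolled mapper over sqlite3 and not as Django's API (Django 1.1 shipped the feature as proxy models). The class and method names are hypothetical; the subclass adds behaviour and a different default ordering while reading exactly the same table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO place (name) VALUES (?)", [("Zoo",), ("Aquarium",)]
)

class Place:
    table = "place"  # the only table involved

    def __init__(self, id, name):
        self.id, self.name = id, name

    @classmethod
    def all(cls):
        """Load every row of the table as an instance of `cls`."""
        rows = conn.execute(f"SELECT id, name FROM {cls.table} ORDER BY id")
        return [cls(*row) for row in rows]

class ShoutingPlace(Place):
    """Pure-Python subclass: same table, no extra storage, extra behaviour."""

    def display_name(self):
        return self.name.upper()

    @classmethod
    def all(cls):
        # Stand-in for a custom manager: same rows, sorted by name instead.
        return sorted(super().all(), key=lambda p: p.name)

names = [p.display_name() for p in ShoutingPlace.all()]
```

Because `ShoutingPlace` never declares a table of its own, no join or extra row is needed; the inheritance exists only at the Python level.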

I doubt that it would be a good idea to accept some kind of fundamental
data-storage layer change (e.g. adding a type-of-most-descended-child
column to parent tables), but I also doubt that's necessary (I, along
with other people, like Russell, have put a few hundred hours of thought
and effort, at least, into this sort of thing. I have a pretty good
intuition about what's possible, difficult, necessary, etc, I believe).

> As it is I find myself frequently having to write a wrapper
> for the ORM to support the things I'm trying to do.

If you write the wrapper sufficiently generally, surely you only have to
write it once?!
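
As a rough sketch of what a write-once wrapper could look like: the helper below hides the exception-oriented lookup behind a single function. Everything here is a plain-Python stand-in so the example runs without Django. In a real project the exception would be `django.core.exceptions.ObjectDoesNotExist` and `Place` would be an actual model; the stand-in class only mimics a child accessor that raises when no child row exists.

```python
class ObjectDoesNotExist(Exception):
    """Stand-in for django.core.exceptions.ObjectDoesNotExist."""

class Place:
    """Stand-in parent; `children` maps accessor name to child object."""

    def __init__(self, name, **children):
        self.name = name
        self._children = children

    def __getattr__(self, attr):
        # Called only when normal lookup fails; mimics a missing child row.
        # Private names are rejected early so a missing _children cannot recurse.
        if attr.startswith("_"):
            raise AttributeError(attr)
        try:
            return self._children[attr]
        except KeyError:
            raise ObjectDoesNotExist(attr) from None

def child_or_none(parent, accessor):
    """Return the child object behind `accessor`, or None if there is none."""
    try:
        return getattr(parent, accessor)
    except ObjectDoesNotExist:
        return None

park = Place("City Park")
diner = Place("Joe's Diner", restaurant={"serves_pizza": True})
```

Calling code can then write `child_or_none(place, "restaurant")` and test the result against `None`, keeping the try/except in one place.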

Regards,
Malcolm

PB

Jan 27, 2009, 9:12:02 AM
to Django developers
Malcolm, Karen,

Thank you both for the feedback - it seems I'm coming from a very
common position, and I think it mainly stems from my misunderstanding
of the aims of the ORM, whose trade-offs you have helped
explain.

>Slight underestimation of the effort that went into the feature,
>there. :-)

>But no offense taken. It's actually nicer when people focus on the
>external effects and when I see that it's so close to exactly what
>people want that they think it must only be missing a little bit. That
>means we've pretty much succeeded. Best thing to hear as an implementor
>is "that was easy" or "mostly worked as expected".

Take it as a high compliment - the fact that I am shocked when
something doesn't work as expected reflects highly on the framework :)


>That requires an auxiliary data store of the most derived child class,
>if you're not going to change parent models (one of those requirement
>things we fleshed out). Keeping parallel data stores in sync is a royal
>pain, so was intentionally not part of the core design. Providing such a
>parallel store and methods/functions to use it would make a good
>third-party project.

Agreed - I'm going to think about this, I think there is probably a
very elegant way to add this to the current ORM, but it definitely
requires a bit of thought.

>Also be aware that what you're after here is also inconsistent when
>viewed at a Python level. You're asking for a queryset of Place objects
>(in your blog post example) and then complaining when it gives you back
>a bunch of Place objects. Implicitly, you're trading one slight
>inconsistency for another. Nothing wrong with that, but at least keep it
>in mind when you're thinking along those lines. *Neither* behaviour is
>completely natural from every viewpoint.

I see what you're saying, but I feel that what I suggest is
the *intuitive* way - sure, you are asking for a list of places, but a
restaurant *is a* place, albeit an extended version. Even Java allows
you to cast the abstract object returned from a collection to the more
specific type; Python's strength is that you don't need any type
casting. Having inheritance by composition is an entirely different
pattern, and in this situation I believe it is an obtuse choice. The
reasons you give provide a good motivation for it, but I think calling
it inheritance and using the same syntax as inheritance is a little
misleading.

>There are so many things that go wrong when you start doing this that
>this is an area where we made a trade-off in order to be pragmatic. After
>all, you don't really *need* to override the fields on the base model
>for the language to work. Sure, sometimes it's handy, but it's not a
>requirement. If Python, itself, didn't allow it, life would go on and it
>would still be just as functional as a language. Nice to have; not a
>showstopper

Another point I have a few issues with. Sure it is a trade-off, and
for pragmatic reasons, but I'm still not entirely sure if it's
justified. Is it not possible to simply define the responsibility of a
subclass as modifying its own members, and not any overridden members of
its superclasses? It should at least be possible to override field
attributes with ones that are not (functions or properties for
example) as in this situation the superclass's fields are simply left
untouched.

Sure it's a convenience, but without it the effects of the workaround
start to trickle down to levels where they should not be. I don't want
to have to think about whether I want a.foo to be a field or a
property in the template layer - the only place where the difference
is important is in the model layer where the difference is semantic.

Again I am arguing against it without a proposal for a solution, so
this is another area I'm going to go and think about.

>If you write the wrapper sufficiently generally, surely you only have to
>write it once?!

But the motivation for a framework is to avoid rewriting code, and as
a home for all the generic stuff no? :)

Thank you both for your responses - definitely a lot more to think
about than I had anticipated...

Regards,

Peter

Malcolm Tredinnick

Jan 28, 2009, 12:23:50 AM
to django-d...@googlegroups.com
On Tue, 2009-01-27 at 06:12 -0800, PB wrote:
[...]

> But the motivation for a framework is to avoid rewriting code, and as
> a home for all the generic stuff no? :)

That's correct: no. :-)

It's not a home for *all* generic code, since that would lead to a five
million line framework. It's to provide a base for people to build on.

Rewriting code is an orthogonal concept. All code should avoid forcing
*re*writes. But not the first time write.

That isn't specific or a judgement on this particular issue. Rather, a
general design consideration to keep in mind.

Regards,
Malcolm

