> Hi,
> I've been experimenting with model inheritance and have become pretty
> dissatisfied with the way it is implemented in Django.
Slight underestimation of the effort that went into the feature,
there. :-)
But no offense taken. It's actually nicer when people focus on the
external effects and when I see that it's so close to exactly what
people want that they think it must only be missing a little bit. That
means we've pretty much succeeded. Best thing to hear as an implementor
is "that was easy" or "mostly worked as expected".
As Karen's already noted with one link (and there have been many such
threads on django-users since the feature was introduced), the design
was fairly carefully thought out and we (well, mostly me) have laid out
the reasoning behind it a lot. In addition to the django-dev thread you
found, this one is where a lot of the details were resolved (there might
have been one other a little later, but memory is fuzzy):
You'll notice there that a lot of the things you're after are either
contrary to other requirements (so one or other has to lose out), or
quite possible with a bit of extra code. Whether the extra code should
go into Django or not is an open issue (meh ... let's see what the code
looks like first). But nothing you're after is per se impossible, and
that was also a design decision. So in that sense, Django fully supports
it. It is designed to allow those extensions. We just haven't
implemented such methods in core, since they can be done as external
extras and there are a few different approaches.
Remember, Django is something you build on top of. An aid, not a crutch.
Model inheritance on top of relational data storage is already a hack
(a.k.a. "the Vietnam of computer science"). Something that looks like
pure-Python as well as being actually possible is going to add a lot of
constraints that aren't necessarily acceptable and, frankly, aren't that
necessary (in terms of allowing something that isn't possible now --
syntactic sugar aside). The compromise isn't due to ease of
implementation. That's one consideration, but not the sole one, nor the
primary one. It's often due to reality.
> - There is no way to find out whether an object has a Subclass model
> short of testing it. This means you have to write an exception
> oriented piece of code, which is pretty ugly. Python builtins like
> getattr and hasattr are not supported.
That requires an auxiliary data store recording the most derived child
class, if you're not going to change parent models (one of those
requirements we fleshed out). Keeping parallel data stores in sync is a
royal pain, so it was intentionally not part of the core design.
Providing such a parallel store and methods/functions to use it would
make a good third-party project.
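
To make the idea of a parallel store concrete, here is a minimal
sketch using only Python's sqlite3 module rather than Django. The
table and function names (place_child_type, most_derived_table, and so
on) are made up for illustration; this is one possible layout under
those assumptions, not something Django provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Child table: shares its primary key with the parent row.
    CREATE TABLE restaurant (
        place_id INTEGER PRIMARY KEY REFERENCES place(id),
        serves_pizza INTEGER NOT NULL
    );
    -- Hypothetical side store; the parent table itself is left unchanged.
    CREATE TABLE place_child_type (
        place_id INTEGER PRIMARY KEY REFERENCES place(id),
        child_table TEXT NOT NULL
    );
""")


def add_place(name):
    """Insert a plain parent row and return its id."""
    return conn.execute(
        "INSERT INTO place (name) VALUES (?)", (name,)
    ).lastrowid


def add_restaurant(name, serves_pizza):
    """Insert parent and child rows, then record the type in the side store."""
    place_id = add_place(name)
    conn.execute(
        "INSERT INTO restaurant (place_id, serves_pizza) VALUES (?, ?)",
        (place_id, int(serves_pizza)),
    )
    conn.execute(
        "INSERT INTO place_child_type (place_id, child_table) VALUES (?, ?)",
        (place_id, "restaurant"),
    )
    return place_id


def most_derived_table(place_id):
    """Return the recorded child table name, or 'place' if none is recorded."""
    row = conn.execute(
        "SELECT child_table FROM place_child_type WHERE place_id = ?",
        (place_id,),
    ).fetchone()
    return row[0] if row else "place"


park_id = add_place("City Park")
pizzeria_id = add_restaurant("Luigi's", True)
```

Every code path that creates or deletes a child row has to update
place_child_type as well, which is exactly the syncing burden
mentioned above.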
> - Classical Inheritance is not possible with the current
> implementation.
> (I wrote about this at http://peterbraden.co.uk/article/django-inheritance)
That requires implementing the previous point. So the reasons and
solution are the same.
Also be aware that what you're after here is also inconsistent when
viewed at a Python level. You're asking for a queryset of Place objects
(in your blog post example) and then complaining when it gives you back
a bunch of Place objects. Implicitly, you're trading one slight
inconsistency for another. Nothing wrong with that, but at least keep it
in mind when you're thinking along those lines. *Neither* behaviour is
completely natural from every viewpoint.
> - It is not possible to overwrite fields in sub classes - should you
> wish to do so the ORM will throw an exception. This means the
> inheritance is no more than a glorified 1-1 mapping between tables.
There are so many things that go wrong when you start doing this that
this is an area where we made a trade-off in order to be pragmatic. After
all, you don't really *need* to override the fields on the base model
for the language to work. Sure, sometimes it's handy, but it's not a
requirement. If Python, itself, didn't allow it, life would go on and it
would still be just as functional as a language. Nice to have; not a
showstopper.
For persistent storage cases, it's worse: those fields must be filled in
correctly and be able to be accessed (as soon as you go down that path,
serialisation and other initialisation cases rear their heads, for
example). It's difficult to ensure that people do fill them in properly
in all cases and correctly report the problems when they don't.
So, for now, we say you can't do that. Come up with a nice patch and
about a thousand tests to implement it and the code will be reviewed
with interest. However, at some point, adding massive complexity (and I
don't say that lightly) to the internal handling isn't worth it: leaving
out support for something that can be avoided isn't a disaster.
> I understand that the way this has been implemented means interfacing
> with the actual database is a simpler task,
With the added bonus of also being possible to subclass any model. :-)
> but it makes writing true
> inheritance code over the top increasingly difficult. I'm interested
> as to why this choice was made in the design. Also, do people think
> there is room for a third type of inheritance which is closer to the
> pythonic inheritance model, and would work alongside the existing
> methods?
If what you're talking about as "classical inheritance" -- the problem
being that "classical" has multiple meanings, both as
historical/traditional and even as a technical term of
"class-ical" (hyphenated for emphasis), as opposed to, say, prototypical
-- is what you mean by the third type, then I've addressed that, above.
It's a presentation issue, once you have introduced some extra data in a
separate table to prevent the queries from requiring O(# of descendants)
table joins.
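
As a rough illustration of the join cost, the sketch below (plain
sqlite3, not Django) probes every child table with a LEFT JOIN to work
out which one holds a row for each parent. The table names and the
helper functions are hypothetical, and it only handles a single level
of children.

```python
import sqlite3

# Hypothetical descendant tables of "place"; each shares the parent's key.
CHILD_TABLES = ["restaurant", "museum", "hotel"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
for table in CHILD_TABLES:
    conn.execute(
        f"CREATE TABLE {table} (place_id INTEGER PRIMARY KEY REFERENCES place(id))"
    )

conn.executemany(
    "INSERT INTO place (id, name) VALUES (?, ?)",
    [(1, "City Park"), (2, "Luigi's")],
)
conn.execute("INSERT INTO restaurant (place_id) VALUES (2)")


def probe_query(child_tables):
    """Build a query with one LEFT JOIN per child table."""
    columns = ", ".join(
        f"{t}.place_id IS NOT NULL AS is_{t}" for t in child_tables
    )
    joins = " ".join(
        f"LEFT JOIN {t} ON {t}.place_id = place.id" for t in child_tables
    )
    return f"SELECT place.id, {columns} FROM place {joins} ORDER BY place.id"


def child_types(child_tables):
    """Map each place id to the child table holding its row, or 'place'."""
    result = {}
    for row in conn.execute(probe_query(child_tables)):
        place_id, flags = row[0], row[1:]
        matches = [t for t, flag in zip(child_tables, flags) if flag]
        result[place_id] = matches[0] if matches else "place"
    return result


types = child_types(CHILD_TABLES)
join_count = probe_query(CHILD_TABLES).count("LEFT JOIN")
```

The number of joins grows with the number of descendant tables, whereas
a lookup in a separate type table stays a single query no matter how
many subclasses exist.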
There is a third type of inheritance that will likely be in 1.1, time
permitting, which is allowing subclassing that doesn't drag along the
ORM subclassing with it. So pure-Python subclassing. That allows adding
Python methods and custom managers to existing tables. That falls out as
a neat side-effect of allowing models over database views.
I doubt that it would be a good idea to accept some kind of fundamental
data-storage layer change (e.g. adding a type-of-most-descended-child
column to parent tables), but I also doubt that's necessary. (I, along
with other people, like Russell, have put a few hundred hours of thought
and effort, at least, into this sort of thing. I believe I have a pretty
good intuition about what's possible, difficult, necessary, etc.)
> As it is I find myself frequently having to write a wrapper
> for the ORM to support the things I'm trying to do.
If you write the wrapper sufficiently generally, surely you only have to
write it once?!
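
For instance, the exception-oriented probing can live in one helper.
The following is a framework-free sketch: Place, Restaurant,
ChildMissing, child_accessors and downcast are all stand-ins invented
for illustration (not Django classes or APIs), mimicking a parent
object whose child accessor raises when no child row exists.

```python
class ChildMissing(Exception):
    """Stand-in for the error raised when a parent has no such child."""


class Place:
    # Hypothetical list of child accessors to try, most specific first.
    child_accessors = ("restaurant",)

    def __init__(self, name):
        self.name = name
        self._children = {}

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails.
        children = self.__dict__.get("_children", {})
        if attr in children:
            return children[attr]
        if attr in self.child_accessors:
            raise ChildMissing(attr)
        raise AttributeError(attr)


class Restaurant:
    def __init__(self, place, serves_pizza):
        self.place = place
        self.serves_pizza = serves_pizza


def downcast(obj):
    """Return the first child object that exists, or obj itself if none does."""
    for accessor in obj.child_accessors:
        try:
            return getattr(obj, accessor)
        except ChildMissing:
            continue
    return obj


park = Place("City Park")
luigis = Place("Luigi's")
luigis_restaurant = Restaurant(luigis, serves_pizza=True)
luigis._children["restaurant"] = luigis_restaurant
```

The try/except is written once inside downcast(); calling code just
asks for the most specific object and never sees the exception.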
Regards,
Malcolm
That's correct: no. :-)
It's not a home for *all* generic code, since that would lead to a five
million line framework. It's to provide a base for people to build on.
Rewriting code is an orthogonal concept. All code should avoid forcing
*re*writes. But not the first-time write.
That isn't specific to, or a judgement on, this particular issue. Rather,
it's a general design consideration to keep in mind.
Regards,
Malcolm