Integrating polymorphic queries in the framework

Luis Masuelli

Jun 5, 2014, 6:49:23 PM
to django-d...@googlegroups.com
What about integrating polymorphic features in the ORM? It's like having the features of django-polymorphic but in the core.

The polymorphism could be achieved by:
    1. Having contenttypes installed (this is a common pattern).
    2. Specifying a root (first ancestor) model class like:

    class MyParentModel(models.Model):
        ...

        class Meta:
            polymorphic = True
            discriminant = "somefield"  # Could default to 'content_type' if not specified. This field could be created.

A polymorphic query could then look like:

    objects = MyParentModel.objects.filter(foo=bar, baz=clorch, ...).polymorphic().more().calls().ifneeded()

Such a method could complain if the contenttypes application is not installed. It could be built on select_related() with multiple arguments (collected by traversing the model hierarchy, perhaps ignoring proxies).
Alternatively, this could be a utility in the contenttypes app instead of in core:

    objects = contenttypes.utils.polymorphic(MyParentModel.objects.filter(foo=bar, baz=clorch, ...)).more().calls().ifneeded()

Sorry if this was posted before, but it's my first time here and I have always wondered why Django does not have this feature in core.

Russell Keith-Magee

Jun 5, 2014, 9:03:14 PM
to Django Developers
If you set your time machine to go back 6 years, you'll find the original discussions about model inheritance (implemented by Malcolm Tredinnick, and I did a whole bunch of design/implementation review):


In those discussions, we considered introducing a CORBA-style narrow() function (which is what you're talking about here), and decided against it due to the need to add and maintain extra columns, which aren't required for many applications.

That said - the decision was at least partially in the interests of landing *something*. We've had 6 years to digest that design, and a bunch of internal API cleanups in the process. Personally, I'm not fundamentally opposed to revisiting this issue - I've had a bunch of places where a narrow() call would have been useful. However, I *would* want it to be an opt-in feature of the model API. 

But I'd also warn - this isn't a small undertaking. This is going to be a big patch, and you're going to need a champion on the core team if you want to make serious progress in getting this into core.

Yours,
Russ Magee %-)

Luis Masuelli

Jun 6, 2014, 11:00:46 AM
to django-d...@googlegroups.com
I don't know if it's such a big patch. A polymorphic call could work like this:

1. Check whether the class is polymorphic, either by itself or by inheritance:

Traverse the inheritance directed graph (remember, it's not a tree anymore) starting from the current class. The traversal would stop at models.Model without counting it; abstract and proxy models would not be counted either. It should stop at the first parent that has a Meta definition like the one I described before.
Having multiple (concrete) ancestors that define such a Meta would raise an exception.
Defining such a Meta when an ancestor has already defined it would raise the same exception.
If no such Meta is found in the graph, nothing happens and the whole process is skipped (i.e. a normal query would be executed on the current model). Alternatively, it could raise an exception, or the behaviour could be left to a setting (CONTENTTYPE_POLYMORPHIC_EXCEPTION = True  # you get the idea).

2. If the Meta was found, the polymorphic query can be performed. Traverse the models downwards from the initial model class (i.e. the one the polymorphic query was called on), enumerating concrete, non-proxy models. Building on your idea, a narrow=(Model1, Model2, ...) argument could be specified to include only those models in the list and ignore the rest (stopping once all of those model classes have been found). As a complement, an exclude=(Model1, Model2) argument could be specified instead to skip those classes (just as proxy and abstract classes are skipped in the enumeration).

3. Perform a select_related or prefetch_related query with the enumerated classes. This would join or prefetch only the classes of interest (narrow=/exclude=) instead of doing the largest possible join.
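Steps 1 and 2 above can be sketched in plain Python, without Django. Everything here is a hypothetical stand-in: `Model`, the class-level `polymorphic`, `abstract` and `proxy` flags (which would live in `Meta` in the proposal), `PolymorphismError`, and both function names are assumptions made for illustration, not existing Django API. The enumeration returns only the descendants of the starting class, on the assumption that those are what would be joined.

```python
class Model:
    """Hypothetical stand-in for django.db.models.Model."""


class PolymorphismError(Exception):
    """Raised when more than one ancestor opts in to polymorphism."""


def is_concrete(cls):
    # Read the flags from the class's own namespace so they are not inherited.
    own = cls.__dict__
    return not own.get("abstract", False) and not own.get("proxy", False)


def find_polymorphic_root(model):
    """Step 1: return the single concrete ancestor that opts in, or None."""
    roots = [
        cls
        for cls in model.__mro__
        if cls is not Model
        and issubclass(cls, Model)
        and is_concrete(cls)
        and cls.__dict__.get("polymorphic", False)
    ]
    if len(roots) > 1:
        names = ", ".join(cls.__name__ for cls in roots)
        raise PolymorphismError("multiple polymorphic roots: " + names)
    return roots[0] if roots else None


def enumerate_subclasses(model, narrow=None, exclude=()):
    """Step 2: collect concrete, non-proxy descendants of `model`."""
    found, seen = [], {model}
    stack = list(model.__subclasses__())
    while stack:
        cls = stack.pop()
        if cls in seen:  # the hierarchy is a graph, so a class may be reached twice
            continue
        seen.add(cls)
        stack.extend(cls.__subclasses__())
        if not is_concrete(cls) or cls in exclude:
            continue
        if narrow is not None and cls not in narrow:
            continue
        found.append(cls)
    return found


class Place(Model):
    polymorphic = True


class Restaurant(Place):
    pass


class Bar(Place):
    pass


class ItalianRestaurant(Restaurant):
    pass


class RestaurantProxy(Restaurant):
    proxy = True


root = find_polymorphic_root(ItalianRestaurant)
only_bars = enumerate_subclasses(root, narrow=(Bar,))
```

The resulting list is what step 3 would turn into select_related or prefetch_related arguments. Unlike the proposal, this sketch does not stop early once every class in `narrow` has been found.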

An alternative: to define a polymorphic model, the parent model could be a brand new contenttypes.models.PolymorphicModel (itself abstract=True), and the nearest concrete ancestors must define those Meta attributes (or an exception should be raised). Perhaps this abstract model could have a metaclass that installs a new Manager as the 'objects' attribute, whose querysets implement this polymorphic() method.

Such a model could have the widely mentioned get_real_instance() (as in django-polymorphic) or narrow(), as mentioned in the post. The implementation could vary: it starts by reading the current object's content type from the discriminator and getting the model class for it. If the narrow style is implemented, issubclass() could be called to determine whether the object belongs to any of the given classes, returning the real instance if so (this avoids lazily loading a class that was not specified). A get_real_instance() implementation could lazily load the class if it was not specified.
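The difference between the two accessors can be sketched as follows. This is a plain-Python illustration under stated assumptions, not django-polymorphic's implementation: `Base`, its `registry` dict (standing in for a ContentType lookup), the `discriminator` attribute, and `_load_as()` are all hypothetical, and the "load" simply builds a new object instead of querying a child table.

```python
class Base:
    """Hypothetical polymorphic root; a real one would be a Django model."""

    registry = {}  # discriminator value -> class, standing in for ContentType

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Base.registry[cls.__name__.lower()] = cls

    def __init__(self, pk, discriminator):
        self.pk = pk
        self.discriminator = discriminator

    def real_class(self):
        # Resolve the discriminator to a class; unknown values fall back to Base.
        return Base.registry.get(self.discriminator, Base)

    def _load_as(self, cls):
        # A real implementation would fetch the child row here.
        if cls is type(self):
            return self
        return cls(self.pk, self.discriminator)

    def narrow(self, *classes):
        """Return the real instance only if it belongs to one of `classes`."""
        real = self.real_class()
        if classes and not issubclass(real, classes):
            return self  # not one of the requested classes: skip the load
        return self._load_as(real)

    def get_real_instance(self):
        """Always resolve to the real class, loading it if necessary."""
        return self._load_as(self.real_class())


class Restaurant(Base):
    pass


class Bar(Base):
    pass


row = Base(pk=1, discriminator="bar")
still_base = row.narrow(Restaurant)  # "bar" is not a Restaurant, so no load
as_bar = row.narrow(Bar)             # resolved to a Bar instance
```

The design choice is that narrow() never pays for a load the caller did not ask for, while get_real_instance() always resolves.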

As I see it, an implementation like the one I described (which is not new at all) would let you choose the level of polymorphism you want and accept only the tradeoffs you opt into.

Remember that with this proposal (rant-styled, but improved by reading the post you pointed me to), anyone could choose whether to use polymorphism in their queries by explicitly calling such a method on the query. Not calling it would behave like a normal query.

Curtis Maloney

Jun 6, 2014, 9:52:51 PM
to django-d...@googlegroups.com
Can I draw your attention to Sebastian Vetter's investigation of the relative scalability of different model polymorphism approaches: https://github.com/elbaschid/talks/tree/master/2013-12-12_MelbDjango_MTI.Performance

Basically, the "select_related then resolve" approach [used in django-model-utils InheritanceManager] doesn't scale nearly as well as django-polymorphic's more pre-fetch-related "1+N queries" approach.
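As a rough illustration of the "1+N queries" idea, here is a plain-Python sketch. It assumes that N means the number of distinct concrete types in the result set, and it is not taken from django-polymorphic's code: `resolve_in_batches`, `fetch_many`, and the `(pk, type_name)` row shape are hypothetical names chosen for this example. One query loads the parent rows, then one batched lookup per type replaces a single join across every child table.

```python
from collections import defaultdict


def resolve_in_batches(parent_rows, fetch_many):
    """Resolve parent rows to child objects with one fetch per distinct type.

    parent_rows: list of (pk, type_name) pairs from the single parent query.
    fetch_many:  callable (type_name, pks) -> iterable of dicts with a "pk" key.
    """
    # Group primary keys by their discriminator value.
    pks_by_type = defaultdict(list)
    for pk, type_name in parent_rows:
        pks_by_type[type_name].append(pk)

    # One batched lookup per type (the "N" in "1+N").
    children = {}
    for type_name, pks in pks_by_type.items():
        for child in fetch_many(type_name, pks):
            children[child["pk"]] = child

    # Return the children in the order of the original parent query.
    return [children[pk] for pk, _ in parent_rows]


# Fake fetcher that records each call, standing in for a database query.
calls = []


def fake_fetch(type_name, pks):
    calls.append((type_name, list(pks)))
    return [{"pk": pk, "type": type_name} for pk in pks]


rows = [(1, "bar"), (2, "restaurant"), (3, "bar")]
resolved = resolve_in_batches(rows, fake_fetch)
```

With three rows of two types, the fake fetcher is called twice, so the number of follow-up queries grows with the number of types rather than with the number of rows or the width of the join.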

--
Curtis


Curtis Maloney

Jun 6, 2014, 9:55:41 PM
to django-d...@googlegroups.com
Hosted version of the talk: http://www.roadside-developer.com/talks/2013-12-12_MelbDjango_MTI.Performance/#/

[Thanks, Brenton!]

--
Curtis
