annotate() and generic relations

168 views
Skip to first unread message

Shai Berger

unread,
Dec 16, 2014, 8:12:38 PM12/16/14
to django-d...@googlegroups.com
Hi everyone,

First, there was ticket #10461[1] -- a complaint that annotate doesn't work
well with generic relations. Then, it was documented[2] as not working. But
then, people continued looking at the ticket, and in the end, it was declared
as fixed and a test was added[3]. But deeper checking[4] finds that the
annotation does not do the intuitive thing -- at least when aggregating over
the related objects themselves (rather than fields on them), the field it uses
to aggregate on is the relation field (object_id, by default) rather than the
pk. This is very odd behavior, as the relation field is guaranteed to be equal
on all child records of a given record -- and equal to the pk of the parent
record; thus, the only meaningful aggregation is Count().

However, our docs still just claim[5] that aggregations and generic relations
do not work together. Which, given the findings, seems to be a good
description. So, what should we do?

-- Error out on any attempt to aggregate over generic relations. This appears
to be the most correct behavior, but will break working code (either Count
annotations or code which creates annotations but doesn't actually use them).

-- Fix annotations to use the right field? Is it possible? Do specific fields
[i.e. Sum('generic_relation__id')] work?
(I'll try myself, if nobody beats me to it, but it's rather late for me now)

-- Is it possible that annotations work and aggregations don't?

Thanks,
Shai.


[1] https://code.djangoproject.com/ticket/10461
[2] https://github.com/django/django/commit/d987b378ce
[3] https://github.com/django/django/commit/76da053641
[4] https://github.com/django/django/commit/e8223b889a#commitcomment-8989832
[5] https://docs.djangoproject.com/en/dev/ref/contrib/contenttypes/#reverse-
generic-relations

Josh Smeaton

unread,
Dec 16, 2014, 8:41:11 PM12/16/14
to django-d...@googlegroups.com
I admit I don't fully understand how generic relations work (I try not to use them in my projects), but I would think we should be able to make annotations work properly with them. Now that aggregates are based internally on F() objects, we should be able to special case F() for generic relations if necessary. I'll keep this on my radar, and if no one else gets to it, I'll look into it further after I've finished up with built-in functions. Should another ticket be opened to clearly document the problem, or do you think the existing ticket has enough information?

Josh

Anssi Kääriäinen

unread,
Dec 17, 2014, 2:11:42 AM12/17/14
to django-d...@googlegroups.com
On Wed, 2014-12-17 at 03:12 +0200, Shai Berger wrote:
> <SNIP>
> But deeper checking[4] finds that the
> annotation does not do the intuitive thing -- at least when aggregating over
> the related objects themselves (rather than fields on them), the field it uses
> to aggregate on is the relation field (object_id, by default) rather than the
> pk. This is very odd behavior, as the relation field is guaranteed to be equal
> on all child records of a given record -- and equal to the pk of the parent
> record; thus, the only meaningful aggregation is Count().

I investigated this a bit, and the problem is deeper in generic
relations handling. Even filtering on generic relations use the related
field's value, not the related model's primary key value. We should fix
that, too.

Luckily the impact of the annotation case isn't that big. Even if the
target would have been the related model's primary key there aren't that
many sensible queries to run against the PK.

The impact of changing the way .filter() works might be a bit bigger,
but I think we will need to do this as a bug fix. Case where queries
"worked" as in they produced results but the results were incorrect are
nasty to fix. Somebody is always relying on the incorrect results.

What was fixed (in 1.6 if I recall correctly) was aggregation over
generic relations when the target field isn't the generic relation
itself, but some other field on the related model.

- Anssi

Anssi Kääriäinen

unread,
Dec 17, 2014, 3:06:18 AM12/17/14
to django-d...@googlegroups.com
On Wed, 2014-12-17 at 09:21 +0200, Anssi Kääriäinen wrote:
> I investigated this a bit, and the problem is deeper in generic
> relations handling. Even filtering on generic relations use the related
> field's value, not the related model's primary key value. We should fix
> that, too.
>
> Luckily the impact of the annotation case isn't that big. Even if the
> target would have been the related model's primary key there aren't that
> many sensible queries to run against the PK.
>
> The impact of changing the way .filter() works might be a bit bigger,
> but I think we will need to do this as a bug fix. Case where queries
> "worked" as in they produced results but the results were incorrect are
> nasty to fix. Somebody is always relying on the incorrect results.

It turns out this is easy to fix. There is a patch for this at
https://github.com/django/django/pull/3743.

The generic relations still work a bit differently to that of reverse
foreign key - where reverse foreign keys can be filtered
with .filter(reverse_relation=some_related_object), that doesn't work
with reverse relations, instead you have to explicitly provide the
primary key value by doing .filter(generic_rel=some_related_object.pk).

- Anssi

Shai Berger

unread,
Dec 17, 2014, 11:20:34 PM12/17/14
to django-d...@googlegroups.com
Hi Anssi,

Thanks for that quick fix.

On Wednesday 17 December 2014 10:16:27 Anssi Kääriäinen wrote:
>
> It turns out this is easy to fix. There is a patch for this at
> https://github.com/django/django/pull/3743.
>
(All: In case you're interested but lazy -- this was committed)

> The generic relations still work a bit differently to that of reverse
> foreign key - where reverse foreign keys can be filtered
> with .filter(reverse_relation=some_related_object), that doesn't work
> with reverse relations, instead you have to explicitly provide the
> primary key value by doing .filter(generic_rel=some_related_object.pk).

So, I'm not sure if we should document this behavior or try to fix it as well.

I opened #24019 for fixing the outdated "GenericRelation+aggregation don't
work" documentation, and added that as a point for consideration as well.

Thanks again,
Shai.

Anssi Kääriäinen

unread,
Dec 18, 2014, 3:06:25 AM12/18/14
to django-d...@googlegroups.com
On Thursday, December 18, 2014 6:20:34 AM UTC+2, Shai Berger wrote:
> The generic relations still work a bit differently to that of reverse
> foreign key - where reverse foreign keys can be filtered
> with .filter(reverse_relation=some_related_object), that doesn't work
> with reverse relations, instead you have to explicitly provide the
> primary key value by doing .filter(generic_rel=some_related_object.pk).

 
So, I'm not sure if we should document this behavior or try to fix it as well.


I think we should fix this, but this isn't a high-priority item, at least not to me.

This should be easier to fix in 1.9 when we can drop the backwards compat for pre 1.6 lookups from the ORM.

 - Anssi
Reply all
Reply to author
Forward
0 new messages