--
Ticket URL: <https://code.djangoproject.com/ticket/18785>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* stage: Unreviewed => Accepted
* needs_tests: => 0
* needs_docs: => 0
Comment:
I am accepting this as a regression in 1.4. To get this fixed in 1.4 will
need a simple patch. I have hopes that 1.5 will contain a more thorough
setup_joins() refactoring from #10790, but as said, that will not help
this ticket as the changes are way too large to backpatch.
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:1>
Comment (by famousactress@…):
Awesome! Thanks much for the support on this ticket and #10790. It's
definitely appreciated. I'm still not well enough versed in the ORM code
to be much help putting a patch together in short order, but I'd be more
than willing to write more/better unit tests that verify the
regression/fix... at this point I'm plenty familiar with the queries
regression tests :)
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:2>
Comment (by akaariai):
I think you will need two thing here:
- better pruning logic in trim_joins(). You will need to have better
information about which joins are direct foreign keys and thus trimmable.
- you will likely need to do similar improvement to Query.combine() as
was done here:
https://github.com/akaariai/django/blob/d301010fb92d6588d75fb3f9ecfe95ca6042cfed/django/db/models/sql/query.py#L480,
that is, do not skip unused aliases, just unref them after addition.
I think you should try to avoid changing existing return value of
setup_joins() or adding new mandatory parameters to trim_joins(). Breaking
these APIs in a minor release doesn't look nice, even if they are private
APIs.
Alternatively you might also want to check if you could solve #15316 in a
way that doesn't introduce the regression. It seems that handling
nonnull_check in the way it is done currently will simply not work.
If you can test #10790 and see if it solves this regression for you, that
would be very welcome.
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:3>
Comment (by akaariai):
I have a pretty good understanding of what is going on. The trim joins
logic should go as follows: if you have joins to forward direction (you
are following a foreign key), and the value you want is the thing the
foreign key points to (usually the PK of the foreign model), then you can
trim the join. This will always be valid, as the value of the foreign key
_must_ be the same it is one the referenced model, and there can only be
one value on the other side (the other side must be unique). If the
foreign key is nullable and contains a null, the join wouldn't produce
anything anyways.
On the other hand if you are following a foreign key to the reverse
direction (usually from some model's PK to the foreign key in the
referenced model), then you can _never_ trim the join. The reason for this
is that you can not know if there is anything on the other side, that is,
even if you know the value the foreign key must have, you can't know if
there is anything on the other side. As an example:
{{{
table a table b
id 1 a_id 1
id 2
}}}
Now, if you make the assumption that if you know the value of id in table
a, then you know the value of a_id in table b then you are wrong, as the
id=2 value doesn't have any matching row on the other side.
Current Django code makes the assumption you can trim joins in the reverse
case. This was blocked for the !OneToOneKey case in #15316 by the
nonnull_check, but I don't believe that to be the correct fix. Just block
all trimming in the reverse case and you have your solution. The question
is how do you pass the direction information from setup_joins to
trim_joins with minimal (internal) API breakages. You might want to add a
completely new variable to the Query class ("join_directions" or something
like that), which tracks the join directions for each alias generated and
nothing else. This way you need only internal changes to setup_joins,
trim_joins, `Query.__init__` and !Query.clone(). This solution isn't too
beautiful, but I don't expect it to bite us. In any case it seems obvious
code changes in this are will not be backportable, as the API is going to
change a lot in 1.5.
I might be overcautious above. Another solution is to just add the
direction to alias_map items and be done with it. This will break 3rd
party code that uses alias_map. As this is totally private API one could
claim that they had it coming all along... Still, it seems we can avoid
this break somewhat easily, so why be evil...
You can get the directions from the patch in #10790, names_to_path().
If you still keep the left outer joins in place in trim_joins(), I believe
you don't need any changes to !Query.combine(). In #10790 the left joins
are trimmed, too, as I can't see any reason why they are non-trimmable.
The reverse join trimming issue isn't commonly seen, as it is somewhat
rare to have reverse filters which are trimmable to begin with. The most
common case is !OneToOneKey, or doing something like
`A.objects.filter(b__a_id=1)`, but that usually doesn't make sense,
instead you would just do A.objects.filter(id=1).
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:4>
* cc: trbs@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:5>
Comment (by famousactress@…):
This regression from 1.4 still appears broken in the last 1.4 and 1.5
builds. Are there any plans to fix it? Will improvements in 1.6 resolve
this issue? Our project is orphaned on 1.3 because this change causes a
huge query spike and abysmal performance.
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:6>
Comment (by famousactress@…):
I should point out though, the problem looks fixed in 1.6. I confirmed
that our project unit test is now failing in an awesome direction using
the newest bits.. the join count for the little test I wrote to remind us
to look out for this error is 2 joins in v1.3, 4 joins in 1.4 & 1.5, and
1 join in 1.6. Huge thanks to Anssi who's work on
https://code.djangoproject.com/ticket/10790 I believe is responsible for
the major improvements. This ought to mean that basically everyone with a
non-trivial Django project just got a really significant performance
boost. If a 1.4 or 1.5 patch is too hairy, I suppose we can jump past them
to 1.6 when it's stable.
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:7>
* status: new => closed
* resolution: => fixed
Comment:
In [changeset:"84909377f2f2eea07ab8bf398dafc69e2d736b50"]:
{{{
#!CommitTicketReference repository=""
revision="84909377f2f2eea07ab8bf398dafc69e2d736b50"
Fixed #18785 -- Added Test join trimming regression
The regression was caused by patch to ticket #15316 and was fixed by a
patch to #10790.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:8>
Comment (by akaariai):
Backpatching of all the ORM changes to 1.5 can't be done. Way too much
risk, I am sure there are still regressions hiding in there. If a very
targeted patch for this issue is written then that might be a candidate to
1.5 as a regression fix.
--
Ticket URL: <https://code.djangoproject.com/ticket/18785#comment:9>