ManyToManyFields and LEFT OUTER JOIN

40 views
Skip to first unread message

David Cramer

unread,
Nov 28, 2007, 2:00:05 PM11/28/07
to Django developers
I was debugging a query earlier and noticed that ManyToMany filter
calls perform a left outer join.

The only use case of this I could come up with is
myfield__isnull=True. Are there any other reasons this is needed?

If not I'd like to propose changing this to use INNER JOIN (on the
exception of the isnull call).

Malcolm Tredinnick

unread,
Nov 28, 2007, 2:50:19 PM11/28/07
to django-d...@googlegroups.com

Via the magic of time machines, this has already been done on
queryset-refactor branch. You also need it with certain "or"
combinations of queries.

Malcolm


David Cramer

unread,
Nov 28, 2007, 4:04:28 PM11/28/07
to Django developers
Anything you can think of off hand that's going to hurt by just
changing it to INNER JOIN right now if we dont use the isnull support.

I'm also wondering if it may cause many rows to return (i can't
remember how left joins work, as I don't use them much) vs a single
row, but we use it with GROUP BY statements either way.

On Nov 28, 11:50 am, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:

Malcolm Tredinnick

unread,
Nov 28, 2007, 5:14:12 PM11/28/07
to django-d...@googlegroups.com

On Wed, 2007-11-28 at 13:04 -0800, David Cramer wrote:
> Anything you can think of off hand that's going to hurt by just
> changing it to INNER JOIN right now if we dont use the isnull support.

I think you'll be safe. The only difference is that certain rows won't
appear when the right-hand side of the match doesn't exist. If you're
comparing to explicit (non-null) values, it should be fine. All the
cases I can think of where outer joins are required are not working
reliably on trunk anyway, so filtered relations are the only case you'll
be relying on.

Malcolm


Litnimax

unread,
Nov 29, 2007, 9:34:42 AM11/29/07
to Django developers
Hello Malcolm.
What is your estimation on merging your queryset-refactor branch?
Now INNER JOIN isnull lookup parameter does not work for related
models because INNER JOIN produce cartesian product between the
specified tables (that is, each and every row in the first table is
joined to each and every row in the second table) so there is no NULL
records in the right table.
That means we cannot use it to find rows in a table that have no
counterpart in another table like is done below:

SELECT left_tbl.*
FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id = right_tbl.id
WHERE right_tbl.id IS NULL;

I looked though queryset-refactor branch.. There is so many changes so
I doubt that we can have it in trunk in a reasonable time.

How can we now solve the task of "how to find rows in a table that
have no counterpart in another table" ?

Looking forward to your reply.
Regards,
Max.

Malcolm Tredinnick

unread,
Nov 29, 2007, 9:45:13 AM11/29/07
to django-d...@googlegroups.com

On Thu, 2007-11-29 at 06:34 -0800, Litnimax wrote:
> Hello Malcolm.
> What is your estimation on merging your queryset-refactor branch?

When it is ready. I"m putting as much time into it as I have spare (and
then some), although I have to keep stopping to work on higher priority
tickets, so it's not 100% of my available time.

> Now INNER JOIN isnull lookup parameter does not work for related
> models because INNER JOIN produce cartesian product between the
> specified tables (that is, each and every row in the first table is
> joined to each and every row in the second table) so there is no NULL
> records in the right table.

Yes, I understand how inner joins work.

> That means we cannot use it to find rows in a table that have no
> counterpart in another table like is done below:
>
> SELECT left_tbl.*
> FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id = right_tbl.id
> WHERE right_tbl.id IS NULL;
>
> I looked though queryset-refactor branch.. There is so many changes so
> I doubt that we can have it in trunk in a reasonable time.

You can have your doubts if you like. I don't think they're correct.
Indeed "when it is ready" is the very definition of "reasonable time".

> How can we now solve the task of "how to find rows in a table that
> have no counterpart in another table" ?

This is already fixed in the queryset-refactor branch, as demonstrated
here:
http://code.djangoproject.com/browser/django/branches/queryset-refactor/tests/regressiontests/queries/models.py#L157

Regards,
Malcolm

Litnimax

unread,
Nov 29, 2007, 11:17:05 AM11/29/07
to Django developers

> You can have your doubts if you like. I don't think they're correct.
> Indeed "when it is ready" is the very definition of "reasonable time".
Nice reply :-)
>
> > How can we now solve the task of "how to find rows in a table that
> > have no counterpart in another table" ?
>
> This is already fixed in the queryset-refactor branch, as demonstrated
> here:http://code.djangoproject.com/browser/django/branches/queryset-refact...

I see your test is for foreign keys. Could you please also test it for
generic relations?
This is one of the most powerful django features making much more
easier and clearer developing web 2.0 apps (sorry for this term).


Regards,
Max.
Reply all
Reply to author
Forward
0 new messages