Removing ORM's union restrictions

129 views
Skip to first unread message

th...@vandien.net

unread,
Dec 8, 2017, 5:12:52 PM12/8/17
to Django developers (Contributions to Django itself)
The relatively new QuerySet.union operator has quite a few limitations:

[...] only LIMIT, OFFSET, COUNT(*), ORDER BY, and specifying columns (i.e. slicing, count(), order_by(), and values()/values_list()) are allowed on the resulting QuerySet. Further, databases place restrictions on what operations are allowed in the combined queries. For example, most databases don’t allow LIMIT or OFFSET in the combined queries.

I think most of these could be circumvented by having the combined queries wrapped as subqueries, e.g.:

SELECT * FROM (
    SELECT f1, f2 FROM t1 ORDER BY f2 OFFSET 10
)
UNION
SELECT * FROM (
    SELECT f1, f2 FROM t2 ORDER BY f1 DESC LIMIT 3
)

As far as I know, all relevant databases support this workaround and there isn't any performance penalty to it. Am I missing something?

— Thijs

Adam Johnson

unread,
Dec 9, 2017, 6:23:11 AM12/9/17
to django-d...@googlegroups.com
Afraid I can't say as to whether you're missing anything, but I can say that MySQL/MariaDB support LIMIT/OFFSET it in the combined queries, so if this were implemented it would be nice to not add the SELECT * wrappers for the MySQL backend - there does seem to be a performance impact there, as the explain plan shows an extra derived table:

chainz@localhost [6]> explain (select c from t limit 1 offset 1) union all (select c from t2 limit 2);
+------+-------------+-------+------+---------------+------+---------+------+------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+-------+
|    1 | PRIMARY     | t     | ALL  | NULL          | NULL | NULL    | NULL |    3 |       |
|    2 | UNION       | t2    | ALL  | NULL          | NULL | NULL    | NULL |    3 |       |
+------+-------------+-------+------+---------------+------+---------+------+------+-------+
2 rows in set (0.00 sec)

chainz@localhost [7]> explain (select * from (select c from t limit 1 offset 1) u0) union all (select c from t2 limit 2);
+------+-------------+------------+------+---------------+------+---------+------+------+-------+
| id   | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra |
+------+-------------+------------+------+---------------+------+---------+------+------+-------+
|    1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    2 |       |
|    2 | DERIVED     | t          | ALL  | NULL          | NULL | NULL    | NULL |    3 |       |
|    3 | UNION       | t2         | ALL  | NULL          | NULL | NULL    | NULL |    3 |       |
+------+-------------+------------+------+---------------+------+---------+------+------+-------+
3 rows in set (0.00 sec)

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/1e029596-3fd7-4796-988a-5aa51c522269%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Adam

th...@vandien.net

unread,
Dec 11, 2017, 5:19:12 PM12/11/17
to Django developers (Contributions to Django itself)
In case there's an ORDER BY, I suppose you would need a wrapper. It's what got me here in the first place; I was surprised how much difficulty this user was having when attempting something rather basic: https://stackoverflow.com/q/47716183/1163893. To come back to your point, if the potential implementation could be smart about whether a wrapper is needed, I don't object to that.

— Thijs
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.



--
Adam

Florian Apolloner

unread,
Dec 13, 2017, 5:56:41 PM12/13/17
to Django developers (Contributions to Django itself)


On Monday, December 11, 2017 at 11:19:12 PM UTC+1, Thijs van Dien wrote:
In case there's an ORDER BY, I suppose you would need a wrapper. It's what got me here in the first place; I was surprised how much difficulty this user was having when attempting something rather basic: https://stackoverflow.com/q/47716183/1163893. To come back to your point, if the potential implementation could be smart about whether a wrapper is needed, I don't object to that.

Well, in the case of this specific issue, the error in question is caused by the choice of the database. It would have worked just fine in Postgresql :D As for removing the restrictions: By all means, patches welcome -- I'll happily review them. When I initially wrote union support I had a very specific usecase and a deadline (1.11 freeze) to get them in…

Cheers,
Florian
Reply all
Reply to author
Forward
0 new messages