On Jul 5, 2012, at 4:57 PM, Sergey Shepelev wrote:
Problem: sa.union_all(*qs).order_by('y') will generate invalid SQL like this:
SELECT ... ORDER BY x LIMIT 10 UNION ALL SELECT ... ORDER BY x LIMIT 10 ORDER BY y
Order in inner queries could be required for LIMIT or DISTINCT ON (field).
Solution: sqlalchemy should put all inner queries that contain ORDER BY or LIMIT clauses into parenthesis, like this:
SELECT ... UNION ALL (SELECT ... ORDER BY x LIMIT 10) ORDER BY y
Test:
def test_with_order(self):
q = sa.select([2]).order_by('1')
union = sa.union(q, q).limit(3)
db.postgresql.execute(union)
self.assertEqual(str(union), "(SELECT 2 ORDER BY 1) UNION (SELECT 2 ORDER BY 1) LIMIT 3")
This is tricky because the SQL expression language doesn't like to make guesses like that, that is, decides about the structure of the query based on things inside of select() constructs. The ORM does that kind of thing more acceptably, but it isn't covering this case either at the moment.
The above SQL is not accepted on some backends - that is, it fails on SQLite at least. Odd structures like that tend to not work on several of the more quirky backends, common culprits are Oracle, Firebird, SQL Server, older MySQL versions.
So to make this "magic" here is to make the select() construct behave differently on the PG backend, or maybe just not on the SQLite backend, but this is tricky to decide without knowing the true compatibility here and what's going to break for whom if we make a change like that - the SQL expression language tends to want you to manually apply structural conventions right now (such as, if you say select().select(), there's no "alias" generated automatically, things like that).
I would have expected alias() to work here, but it doesn't. so it's on deck to work on why that is at some point.
The point is right now you can force the parens using self_group(), and I'd be more comfortable if you stuck with that for this particular case right now, though I may change my mind about handling this case in the future:
q = select([2]).limit(5).order_by('1').self_group()
union = union(q, q).limit(3)
print union
i put up
http://www.sqlalchemy.org/trac/ticket/2528, where the goal is to test the compatibility of this kind of query across the most common backends and see what the best approach might be. I'd probably go as far as alias() rending parenthesis, though (that is, self_group() will likely remain exactly what you want here).