Using values() to force selection of certain columns in a certain order
proved useful unioning querysets with union() for the aforementioned user.
The positioning of columns added with annotate() is not controllable with
values() and has the potential to disrupt union() unless this fact is
known and the ordering done in a certain way to accommodate it.
I'm reporting this mainly for posterity but also as a highlight that
perhaps this should be mentioned in the documentation. I'm sure there are
reasons why the annotations are appended to the select but if someone
feels that this is changeable then it would be a bonus outcome.
--
Ticket URL: <https://code.djangoproject.com/ticket/28553>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* type: Uncategorized => Cleanup/optimization
* stage: Unreviewed => Accepted
Comment:
(The ticket component should change to "Documentation" if there aren't any
code changes to make here. I'm not sure.)
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:1>
Comment (by Sergey Fedoseev):
Probably duplicate of #28900.
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:2>
Comment (by Flávio Juvenal):
I've stumbled upon a case in production where this limitation prevents me
for making a useful query. I've been able to create a test to reproduce
this problem.
It works with sqlite but fails with postgres.
Add this test to {{{tests/queries/test_qs_combinators.py}}}:
{{{
#!python
from django.db.models import F, IntegerField, TextField, Value
def test_union_with_two_annotated_values_on_different_models(self):
qs1 = Number.objects.annotate(
text_annotation=Value('Foo', TextField())
).values('text_annotation', 'num')
qs2 = ReservedName.objects.annotate(
int_annotation=Value(1, IntegerField()),
).values('name', 'int_annotation')
self.assertEqual(qs1.union(qs2).count(), 10)
}}}
In current master (`78f8b80f9b215e50618375adce4c97795dabbb84`), running
{{{./runtests.py --parallel=1 --settings=tests.test_postgres
queries.test_qs_combinators.QuerySetSetOperationTests.test_union_with_two_annotated_values_on_different_models}}}
fails:
{{{
Testing against Django installed in 'django/django'
Creating test database for alias 'default'...
Creating test database for alias 'other'...
System check identified no issues (1 silenced).
E
======================================================================
ERROR: test_union_with_two_annotated_values_on_different_models
(queries.test_qs_combinators.QuerySetSetOperationTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "django/django/db/backends/utils.py", line 85, in _execute
return self.cursor.execute(sql, params)
psycopg2.ProgrammingError: UNION types integer and character varying
cannot be matched
LINE 1: ..._annotation" FROM "queries_number") UNION (SELECT "queries_r...
^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "django/tests/queries/test_qs_combinators.py", line 140, in
test_union_with_two_annotated_values_on_different_models
self.assertEqual(qs1.union(qs2).count(), 10)
File "django/django/db/models/query.py", line 382, in count
return self.query.get_count(using=self.db)
File "django/django/db/models/sql/query.py", line 494, in get_count
number = obj.get_aggregation(using, ['__count'])['__count']
File "django/django/db/models/sql/query.py", line 479, in
get_aggregation
result = compiler.execute_sql(SINGLE)
File "django/django/db/models/sql/compiler.py", line 1054, in
execute_sql
cursor.execute(sql, params)
File "django/django/db/backends/utils.py", line 68, in execute
return self._execute_with_wrappers(sql, params, many=False,
executor=self._execute)
File "django/django/db/backends/utils.py", line 77, in
_execute_with_wrappers
return executor(sql, params, many, context)
File "django/django/db/backends/utils.py", line 85, in _execute
return self.cursor.execute(sql, params)
File "django/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "django/django/db/backends/utils.py", line 85, in _execute
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: UNION types integer and character
varying cannot be matched
LINE 1: ..._annotation" FROM "queries_number") UNION (SELECT "queries_r...
^
----------------------------------------------------------------------
Ran 1 test in 0.007s
FAILED (errors=1)
Destroying test database for alias 'default'...
Destroying test database for alias 'other'...
}}}
My {{{tests/test_postgres.py}}} is:
{{{
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql_psycopg2',
'NAME': 'django-test',
'HOST': '127.0.0.1',
},
'other': {
'ENGINE': 'django.db.backends.postgresql_psycopg2',
'NAME': 'django-test-other',
}
}
SECRET_KEY = "django_tests_secret_key"
# Use a fast hasher to speed up tests.
PASSWORD_HASHERS = [
'django.contrib.auth.hashers.MD5PasswordHasher',
]
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:3>
* cc: Ole Laursen (added)
* type: Cleanup/optimization => Bug
Comment:
There's a ticket with a test case in #30211.
That ticket was reported as a bug, and I think this ticket should be a bug
too, so I'm changing the classification for now (apologies if that's
inappropriate). I think it's a bug because the documentation in my reading
implies that as long as the columns match, union will work. So it really
comes as a bit of a surprise that Django overrides the order in
values_list().
In my particular case, I'm using union() to combine two different tables
that I need to get out in sorted order, and I was trying to use annotate()
+ values_list() to add a NULL filler to one table as it lacks a column
from the other.
Also, I suppose the ORM could possibly also be a bit more efficient if it
could return values_list() tuples directly from the select instead of
having to rearrange them?
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:4>
Comment (by Kal Sze):
Actually there could be a different problem as well. We were lucky in that
the ORM generated SQL where the column types do not match. But what
happens when the column types match? I'm afraid you would get a query that
doesn't throw an exception but is in fact subtly broken, especially if the
values of the differnet columns happen to be similar, in which case it
might take a long time for app developers to realize that the query is
broken.
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:5>
Comment (by Matthias Kestenholz):
I just hit this as well. The solution (workaround) seems to be using `F()`
and `Value()` freely and consistently for all querysets even if it doesn't
look necessary on the surface.
See
https://github.com/matthiask/feincms3-forms/commit/c112a7d613e991780f383393fd05f1c84c81a279
(It's a bit surprising that `values_list` doesn't produce a SQL query with
the exact same ordering of values, but after thinking about it some more
I'm not sure if that's really a bug or just a sharp edge of the current
implementation.)
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:6>
Comment (by Matthias Kestenholz):
I submitted a failing unit test for this issue which demonstrates the
problem: https://github.com/django/django/pull/16577
This seems hard to fix. I don't expect to do any work on this in the next
few months since I have found a workaround for my use case. Hopefully the
test is useful to someone.
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:7>
* cc: David Wobrock (added)
* owner: nobody => David Wobrock
* has_patch: 0 => 1
* status: new => assigned
Comment:
[https://github.com/django/django/pull/16649 PR]
The same issue can probably occur with `extra()` fields, but the PR
doesn't fix that as
[https://docs.djangoproject.com/en/4.1/ref/models/querysets/#extra we no
longer fix bugs for this method].
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:8>
* needs_better_patch: 0 => 1
* needs_tests: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:9>
* needs_better_patch: 1 => 0
* needs_tests: 1 => 0
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:10>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"d6b6e5d0fd4e6b6d0183b4cf6e4bd4f9afc7bf67" d6b6e5d0]:
{{{
#!CommitTicketReference repository=""
revision="d6b6e5d0fd4e6b6d0183b4cf6e4bd4f9afc7bf67"
Fixed #28553 -- Fixed annotation mismatch with
QuerySet.values()/values_list() on compound queries.
Co-authored-by: Matthias Kestenholz <m...@feinheit.ch>
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:11>
* cc: Simon Charette (added)
Comment:
Small note that the cases reported in comment:3 is not fixed, the merged
patch only focuses on the case where only annotations are referenced in
`values`/`values_list` which is a subset of the problem reported here as
mixing field references, extra, and annotations demonstrate.
The crux of the issue is that `SQLCompiler.get_select` ignores
`Query.values_select` and
[https://github.com/django/django/blob/cffcf0ef17b2dfd744d3bc64080229c1b500508f/django/db/models/sql/compiler.py#L247-L276
always generate selected columns as follow]
1. Start with `Query.extra_select`
2. Then `Query.select`
3. End with `Query.annotation_select`
The patch here only makes sure that the order of `annotation_select` is
preserved. What should be done instead is adjust `get_select` to be
`Query.values_select` aware as pointed out on the MR.
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:12>
Comment (by Mariusz Felisiak):
Replying to [comment:12 Simon Charette]:
> Small note that the cases reported in comment:3 is not fixed, ...
Yes, but this case is described in #28900. Am I right?
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:13>
Comment (by Simon Charette):
#28900 seems to be an even more complex case where none of the combined
queries use explicit `values` but the result of the query combination
does.
In the comment:3 example both queries use `values` but happen to mix field
references and annotations which is not covered by the test included in
d6b6e5d0fd4e6b6d0183b4cf6e4bd4f9afc7bf67.
I'm bringing this up because most if not all of the changes made to
`sql.Query` for change the type of `annotation_mask` are unnecessary to
solve #28553 entirely.
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:14>
Comment (by Mariusz Felisiak):
Replying to [comment:14 Simon Charette]:
> I'm bringing this up because most if not all of the changes made to
`sql.Query` for change the type of `annotation_mask` are unnecessary to
solve #28553 entirely.
I will be happy to review any adjustments in this area, +1.
--
Ticket URL: <https://code.djangoproject.com/ticket/28553#comment:15>