`psycopg2.ProgrammingError: column "..." must appear in the GROUP BY
clause or be used in an aggregate function...`
This is caused by the fix to ticket #19259.
Queries with an aggregation now only list the PK field in the `GROUP BY`
clause, but this doesn't work for views because they cannot have primary
keys in PostgreSQL.
Now Django doesn't officially support views, or models without a primary
key, but it did work before so it's somewhat of a regression.
I ran into this problem because I'm working on a legacy database and have
models that are backed by a view using the `db_table` Meta option.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
I've been able to work around this by defining my own `DatabaseWrapper`
and `DatabaseFeatures` (which inherits from the ones in
`django.db.backends.postgresql`) and setting
`allows_group_by_selected_pks` to `False`. However, this disables this
feature for all models not just the ones backed by a view, so it's not
ideal.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:1>
Comment (by timgraham):
Do you have a solution in mind?
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:2>
Comment (by jaap3):
Unfortunately I don't have a solution that would just make things work.
One of the underlying issues here is that Django has never officially
supported database views in the first place so I don't know how much
effort should be put into making this work again. The fact that this used
to work in the past was pure coincidence anyway.
Come to think of it, this problem might not just limited to database
views. I believe any (unmanged) model that hasn't set a `PRIMARY KEY`
constraint at the database level could trigger this issue.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:3>
* cc: shaib (added)
Comment:
Possible workaround (no time to test ATM): Before the `annotate()` call,
add `only()` naming all the fields. Or maybe even drop the PK, if that
makes sense.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:4>
Comment (by jaap3):
Using only seems to have no effect on the GROUP BY clause.
I am not sure what you mean by "drop the PK", Django doesn't allow models
without a field marked as `primary_key` and will add one if you don't
specify one.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:5>
* cc: charettes (added)
* version: 1.10 => master
* stage: Unreviewed => Accepted
Comment:
> Before the annotate() call, add only() naming all the fields. Or maybe
even drop the PK, if that makes sense.
The trick `only()` won't work as `pk` is always implicitly selected
(`only('field')` == `only('field', 'pk')`).
The only solution I can think of is disabling the optimization for
unmanaged models but the real issue here is really that unmanaged models
are not meant to be used to deal with database views. The fact that Django
enforces the presence of a primary key and views are not allowed to have
one really highlights the conceptual clash.
As Django doesn't provide any way to interact with views and even suggests
`managed`
[https://docs.djangoproject.com/en/1.10/ref/models/options/#django.db.models.Options.managed
can be used for this purpose] I think we should simply disable the
optimization for this specific case.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:6>
Comment (by jaap3):
I don't known if disabling the optimisation for all unmanaged models is
the right choice. It seems that this is a fairly esoteric edge case. The
fact that this behaviour was changed in 1.9 and almost nobody has noticed
until now speaks for itself (I can only find one other issue that looks
similar: #26758).
I'm sure that most unmanaged models are normal database tables with a
primary key constraint. It would be a shame to have to disable something
that works fine for the vast majority of use cases just because of this
one edge case.
Instead, would it be feasible to be able to opt-out of the optimisation by
setting a flag on the model's Meta class? So instead of checking if
`managed` is `False`, Django could check if `allows_group_by_selected_pks`
is set to `False` for the model and then choose not to perform the related
optimisation.
I am actively maintaining this application and would have no qualms about
adding a flag to the few models affected by this issue. If anyone else
runs into this the must also be actively maintaining theirs so they can do
the same thing.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:7>
Comment (by jarshwah):
Can you try adding a field as a primary key on the view? Ideally each row
would have a unique key in a view (I know this isn't always the case), but
if you just **tell** Django that your unmanaged model has a primary key,
it'll try to group on that.
For what it's worth I've always been a big user of unmanaged models over
views - so I'm keen to see this resolved and not abandoned.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:8>
* cc: jarshwah (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:9>
Comment (by jaap3):
As far as Django is concerned these views do have a primary key (there's a
id field, it even has a unique index). The problem is that at the database
level this field isn't, and cannot, have a primary key constraint. Without
a **real** primary key Postgres cannot group by primary key only, since
they don't exist.
This is what happens if you try to add a PK to a (materialized) view:
{{{
=> ALTER MATERIALIZED VIEW my_view ADD CONSTRAINT my_view_pkey PRIMARY KEY
(id);
ERROR: "my_view" is not a table
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:10>
Comment (by jarshwah):
Yes, you're right. Sorry for the noise! I mistakenly thought it was Django
throwing an error.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:11>
Comment (by charettes):
I don't think a meta option for this special case is worth it but a flag
to prevent Django from implicitly adding an auto-incrementing primary key
when none is explicitly specified could be useful in multiple cases
(migrations comes to mind here) and leveraged in this one.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:12>
* cc: olivier.tabone@… (added)
* owner: nobody => Olivier Tabone
* status: new => assigned
* needs_tests: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:13>
* owner: Olivier Tabone => (none)
* status: assigned => new
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:14>
Comment (by Simon Charette):
Fixed in daf2bd3efe53cbfc1c9fd00222b8315708023792. #28107 was a duplicate
of this ticket.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:15>
* status: new => closed
* resolution: => fixed
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:16>
Comment (by Simon Charette):
I just submitted [https://groups.google.com/forum/#!topic/django-
developers/lx3ZSq-W9X4 a post on the mailing list to gather feedback]. It
includes an alternative solution to make turning the optimization on for
unmanaged models backed by tables easier.
--
Ticket URL: <https://code.djangoproject.com/ticket/27241#comment:17>