the new SELECT DISTINCT query in 1.3 rc1, and how to turn it off

88 views
Skip to first unread message

Viktor Kojouharov

unread,
Mar 4, 2011, 10:24:53 AM3/4/11
to django-d...@googlegroups.com
Hello,

I'm testing my software with the new rc1 release of django 1.3, and I came onto a particularly nasty problem.
I have a model which uses a Postgresql 'point' type, for which I've defined a field as:

I also have another model, which references this one with a foreign key. When saving an instance from this other model, django throws the following exception:
The exception is due to the new 'DISTINCT' part of the SELECT query. Because of the point field, there is no way to select 'unique' entries, because Postresql cannot compare points.
So the question is, is there any way to turn off this distinct query, and just force django to assume that all entries are unique?

Karen Tracey

unread,
Mar 5, 2011, 12:29:47 AM3/5/11
to django-d...@googlegroups.com
Looks like the change that added distinct to this query is this one:

http://code.djangoproject.com/changeset/15607

It's probably best if you open a ticket in trac (http://code.djangoproject.com/newticket) for this. I can't think offhand how to solve both the problem that changeset fixed and the one you are encountering....

Karen

akaariai

unread,
Mar 5, 2011, 3:30:58 AM3/5/11
to Django developers


On Mar 5, 7:29 am, Karen Tracey <kmtra...@gmail.com> wrote:
> It's probably best if you open a ticket in trac (http://code.djangoproject.com/newticket) for this. I can't think offhand how
> to solve both the problem that changeset fixed and the one you are
> encountering....
>

If Django ORM would be able to perform a query: select distinct
on(primary_key) id, val1, ... from table order by primary_key this
would solve the problem. But making the ORM capable to do that will
probably take some time...

- Anssi

Viktor Kojouharov

unread,
Mar 5, 2011, 6:02:30 AM3/5/11
to django-d...@googlegroups.com
I've created a ticket for this

http://code.djangoproject.com/ticket/15559

Also, if someone else is stumbling on this problem, I went around it by providing a custom QuerySet class for my method, which has an overridden distinct method, that returns itself.

Florian Apolloner

unread,
Mar 8, 2011, 8:01:58 AM3/8/11
to Django developers
Hi,

On Mar 5, 9:30 am, akaariai <akaar...@cc.hut.fi> wrote:
> on(primary_key) id, val1, ... from table order by primary_key this
> would solve the problem.

Is "DISTINCT ON" part of the SQL standard at all?

Cheers, Florian

Łukasz Rekucki

unread,
Mar 8, 2011, 8:22:00 AM3/8/11
to django-d...@googlegroups.com

Quoting PostreSQL docs:

> The DISTINCT ON clause is not part of the SQL standard and is sometimes considered bad style because of the potentially indeterminate nature of its results. With judicious use of GROUP BY and subqueries in FROM the construct can be avoided, but it is often the most convenient alternative.

It's also supported by Oracle, AFAIK.

--
Łukasz Rekucki

Łukasz Rekucki

unread,
Mar 8, 2011, 8:28:14 AM3/8/11
to django-d...@googlegroups.com
2011/3/8 Łukasz Rekucki <lrek...@gmail.com>:

I forgot to mention there's a ticket to support it in ORM:
http://code.djangoproject.com/ticket/14139.
--
Łukasz Rekucki

Russell Keith-Magee

unread,
Mar 8, 2011, 8:52:28 AM3/8/11
to django-d...@googlegroups.com

I can't see an obvious fix here either.

Given the timeframe, we may need to roll back this fix, live with the
older bug, and look at it again in the 1.4 timeframe.

Yours,
Russ Magee %-)

Ian Kelly

unread,
Mar 8, 2011, 12:29:16 PM3/8/11
to django-d...@googlegroups.com
2011/3/8 Łukasz Rekucki <lrek...@gmail.com>:

> It's also supported by Oracle, AFAIK.

It is not, although it can be emulated using an analytic query. I
tried adding this to the patch in #6422 some time ago, but I found
that the structure of an analytic query was going to be rather
complicated to shoe-horn that into the query compilation code. That
was before the SQL compiler was factored out of the query code, and I
haven't gotten around to trying again since.

Cheers,
Ian

Michael Radziej

unread,
Mar 9, 2011, 3:20:37 AM3/9/11
to django-d...@googlegroups.com
Hi,

I'm currently not familiar with the current code base--still catching up--but wouldn't it make sense to select
distinct on the primary key?


Kind regards

Michael

Łukasz Rekucki

unread,
Mar 9, 2011, 4:41:03 AM3/9/11
to django-d...@googlegroups.com
On 8 March 2011 18:29, Ian Kelly <ian.g...@gmail.com> wrote:
> 2011/3/8 Łukasz Rekucki <lrek...@gmail.com>:
>> It's also supported by Oracle, AFAIK.
>
> It is not, although it can be emulated using an analytic query.

Should have done more searching, thanks for correcting me.

>I tried adding this to the patch in #6422 some time ago, but I found
> that the structure of an analytic query was going to be rather
> complicated to shoe-horn that into the query compilation code.  That
> was before the SQL compiler was factored out of the query code, and I
> haven't gotten around to trying again since.

It's probably messy in general case, but maybe we can try emulating
this in this particular case. Using an ordinary DISTINCT in a subquery
seems to solve the issue:

base_query =
self.rel.to._default_manager.using(db).complex_filter(self.rel.limit_choices_to).values('id').distinct()
queryset = self.rel.to._default_manager.using(db).filter(pk=base_query)


--
Łukasz Rekucki

Ian Kelly

unread,
Mar 9, 2011, 12:04:38 PM3/9/11
to django-d...@googlegroups.com
2011/3/9 Łukasz Rekucki <lrek...@gmail.com>:

> It's probably messy in general case, but maybe we can try emulating
> this in this particular case. Using an ordinary DISTINCT in a subquery
> seems to solve the issue:
>
>    base_query =
> self.rel.to._default_manager.using(db).complex_filter(self.rel.limit_choices_to).values('id').distinct()
>    queryset = self.rel.to._default_manager.using(db).filter(pk=base_query)

Yes, that looks like it should work. I expect it's probably less
efficient, but we may just have to live with that.

Reply all
Reply to author
Forward
0 new messages