[Django] #29794: Duplicate object returned using filter


Django

Sep 26, 2018, 6:39:06 AM
to django-...@googlegroups.com
#29794: Duplicate object returned using filter
-------------------------------------+-------------------------------------
Reporter: xeor | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: 2.1
Severity: Normal | Keywords: duplicate, vacuum
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Note: I am unable to reproduce this issue. It showed up in my production
environment, but only with one specific database entry. I think it's a bug, but I
need help reproducing it.
Note 2: The problem went away after a database vacuum, which I ran
manually after writing up this whole issue. I still think the behavior is weird, and
maybe someone has something useful to say. At the least, I'm keeping the issue for
searchability for the next person who hits this. Maybe there is a way for
Django to detect cases like this?

Using a simple model for explanation:

{{{
from django.db import models


class LowercaseCharField(models.CharField):
    def get_prep_value(self, value):
        # Lowercase the value before it is sent to the database.
        return str(value).lower()


class Server(models.Model):
    objects = ServerBaseManager.from_queryset(ServerQuerySet)()
    name = LowercaseCharField(max_length=255)
    domain = models.ForeignKey(Domain, db_index=True,
                               default=default_domain, on_delete=models.SET_DEFAULT)

    class Meta:
        unique_together = (
            ('name', 'domain')
        )
}}}

ServerBaseManager and ServerQuerySet do not override any methods; they only
add functionality.
The domain ForeignKey is nothing magical either.

For one specific name, let's call it '''server1''', something
strange is happening.
There are 2 entries with the name '''server1''' in the db, one with
'''domain.pk=1''' and one with '''domain.pk=2'''.

Here is the strange part:

{{{#!python
In [1]: [i.pk for i in Server.objects.filter(name='server1')]
Out[1]: [1, 2]

In [2]: Server.objects.filter(name='server1')[0].pk
Out[2]: 2

In [3]: Server.objects.filter(name='server1')[1].pk
Out[3]: 2

In [4]: Server.objects.filter(name='server1').order_by('pk')[0].pk
Out[4]: 1

In [5]: Server.objects.filter(name='server1').order_by('pk')[1].pk
Out[5]: 2

In [6]: Server.objects.filter(name='server1').order_by('-name')[0].pk
Out[6]: 2

In [7]: Server.objects.filter(name='server1').order_by('-name')[1].pk
Out[7]: 2

In [8]: Server.objects.filter(name='server1').order_by('name')[0].pk
Out[8]: 2

In [9]: Server.objects.filter(name='server1').order_by('name')[1].pk
Out[9]: 2

In [10]: Server.objects.filter(name='server1').values('pk')
Out[10]: <ServerQuerySet [{'pk': 1}, {'pk': 2}]>

In [11]: Server.objects.filter(pk__in=Server.objects.filter(name='server1'))[0].pk
Out[11]: 1

In [12]: Server.objects.filter(pk__in=Server.objects.filter(name='server1'))[1].pk
Out[12]: 2
}}}

As you can see, the queryset returns the same object whether I access it using
'''queryset[0]''' or '''queryset[1]''', but in a lot of other cases it
works as it should.

* '''queryset.query''' shows nothing magical, just a simple SELECT
query.
* I have plenty of duplicate names, but this only happens to server1. That said,
this is hard to test in bulk, since the duplicate problem doesn't show up when
I try to automate the testing.

versions:
* python: 3.6.5
* postgres 9.4
* django: 2.1.1

--
Ticket URL: <https://code.djangoproject.com/ticket/29794>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Sep 26, 2018, 8:00:55 AM
to django-...@googlegroups.com
#29794: Duplicate object returned using filter
-------------------------------------+-------------------------------------
Reporter: Lars Solberg | Owner: nobody
Type: Bug | Status: closed
Component: Database layer (models, ORM) | Version: 2.1
Severity: Normal | Resolution: invalid
Keywords: duplicate, vacuum | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Claude Paroz):

* status: new => closed
* resolution: => invalid


Comment:

Sorry, but I don't see anything shocking in what you showed us, even if
it's a bit surprising.

For example:
{{{
In [6]: Server.objects.filter(name='server1').order_by('-name')[0].pk
Out[6]: 2

In [7]: Server.objects.filter(name='server1').order_by('-name')[1].pk
Out[7]: 2
}}}

You are doing two queries with undefined ordering (as `name` is
identical). So it's totally possible that the first query returns pk [1,
2], while the second query returns [2, 1].

It would be shocking if you obtained a similar result from one single
queryset:
{{{
qs = Server.objects.filter(name='server1').order_by('-name')
qs[0].pk => 2
qs[1].pk => 2
}}}
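
To illustrate the undefined-ordering point, here is a minimal sketch using the standard-library `sqlite3` module rather than Django or PostgreSQL. The table layout, the column names, and the `nth_pk` helper are assumptions made for this example; the idea is only that each `[n]` lookup on a fresh queryset is roughly a separate `LIMIT 1 OFFSET n` query, and that adding the primary key as a tiebreaker gives the rows a total order.

```python
import sqlite3

# In-memory stand-in for the Server table; the column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE server (id INTEGER PRIMARY KEY, name TEXT, domain_id INTEGER)"
)
conn.executemany(
    "INSERT INTO server (id, name, domain_id) VALUES (?, ?, ?)",
    [(1, "server1", 1), (2, "server1", 2)],
)


def nth_pk(order_by, n):
    """Roughly what qs.order_by(...)[n].pk asks for: one row at offset n."""
    sql = (
        "SELECT id FROM server WHERE name = ? "
        f"ORDER BY {order_by} LIMIT 1 OFFSET ?"
    )
    return conn.execute(sql, ("server1", n)).fetchone()[0]


# Ties on name are broken by id, so the two separate queries cannot
# both land on the same row.
first = nth_pk("name, id", 0)
second = nth_pk("name, id", 1)
print(first, second)  # prints "1 2"
```

In ORM terms this would correspond to something like `Server.objects.filter(name='server1').order_by('name', 'pk')`. With `ORDER BY name` alone, the database is free to return the tied rows in either order on each query.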

--
Ticket URL: <https://code.djangoproject.com/ticket/29794#comment:1>

Django

Sep 27, 2018, 12:56:49 AM
to django-...@googlegroups.com
#29794: Duplicate object returned using filter

Comment (by Lars Solberg):

Sorry, some of the examples were stupid. But I did do the single-queryset check as
well, before the VACUUM fixed the problem.
The result was exactly what you described.

Scrolling through the terminal history, I have this:

{{{
In [1]: qs = Server.objects.filter(name='server1')

In [2]: qs[0]
Out[2]: <Server: server1>

In [3]: qs[0].pk
Out[3]: 2

In [4]: qs[1].pk
Out[4]: 2
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/29794#comment:2>

Django

Sep 27, 2018, 9:02:39 AM
to django-...@googlegroups.com
#29794: Duplicate object returned using filter

Comment (by Claude Paroz):

Oh, weird. Unfortunately, I'm afraid this will be impossible to solve unless
we have some way to reproduce the problem.
Feel free to add anything new you find about this in the future. I
don't currently see how Django could be at fault.

--
Ticket URL: <https://code.djangoproject.com/ticket/29794#comment:3>

Django

Sep 27, 2018, 9:07:40 AM
to django-...@googlegroups.com
#29794: Duplicate object returned using filter

Comment (by Lars Solberg):

Absolutely. This solved itself after a VACUUM. However, auto-vacuum
was enabled, so I guess postgres can be blamed somewhat?
I ran the vacuum too quickly, so I can't reproduce and debug the problem
anymore.
If it happens again, I'll dig deeper.

I thought something like [i.pk for i in
Server.objects.filter(name='server1')] would generate the same SQL queries
as qs[0], qs[1].
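
On that last point, the two access patterns do not hit the database the same way: iterating a queryset runs one query and caches the rows, while indexing a queryset that has not been evaluated yet issues a separate limited query per index. The sketch below is a hypothetical stand-in written for this thread (`LazyRows`, `fetch`, and `ROWS` are made-up names, and this is not Django's implementation); it only counts simulated round trips to show the difference.

```python
class LazyRows:
    """Hypothetical stand-in for a lazy queryset; not Django's code."""

    def __init__(self, fetch):
        self._fetch = fetch    # callable(offset, limit) -> list of rows
        self._cache = None     # filled on the first full iteration
        self.queries = 0       # number of simulated database round trips

    def __iter__(self):
        if self._cache is None:
            self.queries += 1
            self._cache = self._fetch(0, None)
        return iter(self._cache)

    def __getitem__(self, n):
        if self._cache is not None:
            return self._cache[n]        # served from the cache, no query
        self.queries += 1
        return self._fetch(n, 1)[0]      # fresh "LIMIT 1 OFFSET n" query


ROWS = [1, 2]  # primary keys of the two 'server1' rows


def fetch(offset, limit):
    # Simulated table access: a slice plays the role of OFFSET/LIMIT.
    end = None if limit is None else offset + limit
    return ROWS[offset:end]


# Indexing without evaluating first: one round trip per index.
qs = LazyRows(fetch)
qs[0]
qs[1]
indexed_queries = qs.queries

# Iterating first: one round trip, then indexing reads the cache.
qs2 = LazyRows(fetch)
pks = [pk for pk in qs2]
qs2[0]
qs2[1]
iterated_queries = qs2.queries

print(indexed_queries, iterated_queries)  # prints "2 1"
```

So the list comprehension sees both rows from a single result set, whereas `qs[0]` and `qs[1]` on a fresh queryset are two independent queries whose results depend on whatever order the database picks each time.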

--
Ticket URL: <https://code.djangoproject.com/ticket/29794#comment:4>
