Different objects returned when filtering with a ValuesListQuerySet

Marc Kirkwood

Jul 25, 2014, 11:36:52 AM
to django...@googlegroups.com
When filtering a queryset with a ValuesListQuerySet instance in Django 1.6, the iterator seems to return different objects from those shown in the instance's apparent list of values.

An illustration of a head-scratching debug session is shown below:

(Pdb) latest
[4, 1]
(Pdb) keys.filter(pk__in=latest).values_list('pk')
[(3,), (1,)]
(Pdb) keys.filter(pk__in=[4, 1]).values_list('pk')
[(4,), (1,)]

Can anyone explain this behaviour to me? When I fully convert it with list(latest), normal service resumes.
I have a suspicion that it's because of the Postgres-aware order_by() and distinct() chain, in the queryset that latest was produced from.

Thanks,

Marc.


Russell Keith-Magee

Jul 25, 2014, 7:35:05 PM
to Django Users
It's difficult to say for certain without knowing details of all the moving parts, but I'm going to guess that the query that is producing latest isn't as consistent as you think it is.

In your debug session, statement 2 isn't "retrieve results that match statement 1", it's "retrieve results that match this subquery". That is, "latest" is being re-evaluated as a subquery in the second statement. If there's any order sensitivity in the query generating "latest", you're not guaranteed to get the same results. 

However, if you call list(latest), it ceases to be a subquery - it takes the cached results from the first statement and turns it into an IN [values] clause - effectively the same as statement 3.
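
To make the subquery-versus-list distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the Django ORM. The keys table, its owner and version columns, and the LATEST_SQL query are hypothetical stand-ins for whatever produced latest. The UPDATE in the middle is also only an assumption, used to simulate any source of instability in that query (in the thread, the suspected cause is the order_by()/distinct() chain, not changing data).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE keys (id INTEGER PRIMARY KEY, owner TEXT, version INTEGER)"
)
conn.executemany(
    "INSERT INTO keys (id, owner, version) VALUES (?, ?, ?)",
    [(1, "alice", 1), (3, "bob", 1), (4, "bob", 2)],
)

# Hypothetical stand-in for the query behind `latest`:
# the highest-version key for each owner.
LATEST_SQL = """
    SELECT id FROM keys AS k
    WHERE version = (SELECT MAX(version) FROM keys WHERE owner = k.owner)
"""

# Statement 1: evaluate the query once and keep the values.
# This plays the role of list(latest).
snapshot = [row[0] for row in conn.execute(LATEST_SQL + " ORDER BY id DESC")]

# Simulate instability: the same query now picks a different row for bob.
conn.execute("UPDATE keys SET version = 3 WHERE id = 3")

# Statement 3 (and list(latest)): a literal IN (4, 1) list built from the
# values captured earlier.
placeholders = ", ".join("?" for _ in snapshot)
from_list = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM keys WHERE id IN ({placeholders}) ORDER BY id DESC",
        snapshot,
    )
]

# Statement 2: the query is embedded as a subquery and re-run by the database.
from_subquery = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM keys WHERE id IN ({LATEST_SQL}) ORDER BY id DESC"
    )
]

print(snapshot, from_list, from_subquery)
```

The literal list keeps returning the rows that were displayed earlier, while the subquery form reflects whatever the inner query yields at the moment the outer query runs.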

If there are lots of order_by and distinct clauses, that might also affect exactly how the subquery is rolling out; it would be worth printing keys.filter(pk__in=latest).query to see exactly what is being executed.

Yours,
Russ Magee %-)

Marc Kirkwood

Jul 28, 2014, 4:32:56 AM
to django...@googlegroups.com
Yeah, that's what I was thinking it could be. Cheers, Russ.