Given models:
{{{#!python
from django.db import models


# Create your models here.
class ParentModel(models.Model):
    name = models.CharField(max_length=64)


class ChildModel(models.Model):
    name = models.CharField(max_length=64)
    parent = models.ForeignKey('ParentModel', related_name='children')
}}}
Pickling the following queries causes unwanted evaluation and produces
quite large pickled results:
{{{#!python
import pickle

for x in range(1, 10000):
    ParentModel(name='Parent {}'.format(x)).save()
ChildModel(name='Child 1', parent=ParentModel.objects.all()[0]).save()

parents_1 = ParentModel.objects.all().values_list('pk', flat=True)
children_1 = ChildModel.objects.filter(parent__in=parents_1)
pickled_stuff_1 = pickle.dumps(children_1.query)

parents_2 = ParentModel.objects.all()
children_2 = ChildModel.objects.filter(parent__in=parents_2)
pickled_stuff_2 = pickle.dumps(children_2.query)

# First len is about 74 kilobytes, second len is about 2 megabytes.
print(len(pickled_stuff_1), len(pickled_stuff_2))
}}}
Both pickled queries are relatively big. Inspecting the latter pickle
shows that it actually contains all instances from the fully evaluated
queryset. The same behavior exists in 1.8 and 1.10 as well.
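One possible way to do this kind of inspection, independent of Django, is to count the opcodes in a pickle with the standard `pickletools` module. This is only a sketch: `Row` and `opcode_counts` are hypothetical names used for illustration, and it assumes plain class instances pickled at protocol 2, where each instance should appear as one `NEWOBJ` opcode.

```python
import pickle
import pickletools
from collections import Counter


class Row(object):
    """Hypothetical stand-in for a model instance; not a Django class."""

    def __init__(self, pk):
        self.pk = pk


def opcode_counts(payload):
    """Return a Counter of opcode names found in a pickle byte string."""
    return Counter(op.name for op, arg, pos in pickletools.genops(payload))


# Protocol 2 is pinned so the opcode names are predictable.
payload = pickle.dumps([Row(pk) for pk in range(50)], protocol=2)
counts = opcode_counts(payload)
print(counts["NEWOBJ"], len(payload))
```

A pickled query that carries thousands of `NEWOBJ` opcodes is a strong hint that evaluated instances, rather than just the query description, ended up in the payload.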
--
Ticket URL: <https://code.djangoproject.com/ticket/27159>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* component: Uncategorized => Database layer (models, ORM)
* needs_tests: => 0
* needs_docs: => 0
* type: Uncategorized => Bug
* stage: Unreviewed => Accepted
Comment:
I didn't attempt to reproduce but I trust the reporter. If you could
transform the reproduction information into a test case for
`queryset_pickle`, that's always helpful.
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:1>
Comment (by jtiai):
I added a crude initial test to {{{queryset_pickle}}} at
https://github.com/jtiai/django/tree/issue_27159
Interestingly enough, if you check the pickled output it's clearly visible
that it contains all instances of the Parent model.
If you compare the query output as a string, both are equal.
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:2>
Comment (by jtiai):
After doing some debugging, it looks like pickle goes to {{{WhereNode}}},
which in turn goes to the {{{RelatedIn}}} lookup, which goes to
{{{QuerySet.__getstate__}}}, which by design populates the full cache.
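The chain described above can be imitated without Django. The following is a minimal sketch under stated assumptions: `LazyRows` and `Condition` are hypothetical stand-ins (not Django classes) for a lazy queryset and a where-clause node that holds it. It only shows that pickling an outer object recurses into the inner object's `__getstate__`, which then forces evaluation.

```python
import pickle


class LazyRows(object):
    """Hypothetical stand-in for a lazy queryset; not a Django class."""

    def __init__(self, count):
        self.count = count
        self._result_cache = None  # filled only when rows are fetched

    def _fetch_all(self):
        # Simulate running the query and caching every row.
        if self._result_cache is None:
            self._result_cache = ["row %d" % i for i in range(self.count)]

    def __getstate__(self):
        # Imitates the behavior described above: pickling forces evaluation.
        self._fetch_all()
        return self.__dict__.copy()


class Condition(object):
    """Hypothetical stand-in for a where-clause node holding the inner rows."""

    def __init__(self, rhs):
        self.rhs = rhs


inner = LazyRows(10000)
condition = Condition(inner)
was_lazy_before = inner._result_cache is None  # nothing fetched yet

payload = pickle.dumps(condition)  # recurses into inner.__getstate__()

is_evaluated_after = inner._result_cache is not None
print(was_lazy_before, is_evaluated_after, len(payload))
```

In this sketch the outer object never asks for the rows itself; the evaluation happens purely as a side effect of pickle visiting the nested object.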
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:3>
* owner: nobody => DavidFozo
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:4>
* Attachment "ticket_27159.diff" added.
Comment (by DavidFozo):
I deleted `self._fetch_all()` from `QuerySet.__getstate__` in
`django/db/models/query.py`, so now it doesn't populate the cache before
pickling. Is this a valid solution? All tests are passing.
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:5>
* has_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:6>
* needs_better_patch: 0 => 1
Comment:
I left some comments for improvement on the PR.
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:7>
* needs_better_patch: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:8>
* owner: DavidFozo => jtiai
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:9>
* needs_better_patch: 0 => 1
Comment:
Left some comments on the PR.
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:10>
* needs_better_patch: 1 => 0
Comment:
Made all suggested changes and improved test cases.
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:11>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"7a2c27112d1f804f75191e9bf45a96a89318a684" 7a2c271]:
{{{
#!CommitTicketReference repository=""
revision="7a2c27112d1f804f75191e9bf45a96a89318a684"
Fixed #27159 -- Prevented pickling a query with an __in=inner_qs lookup
from evaluating inner_qs.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/27159#comment:12>