[Django] #27538: Value of JSONField is being re-encoded to string even though being already encoded


Django

Nov 25, 2016, 9:12:01 AM
to django-...@googlegroups.com
#27538: Value of JSONField is being re-encoded to string even though being already
encoded
--------------------------------------------+---------------------------
Reporter: Petar Aleksic | Owner: (none)
Type: Uncategorized | Status: new
Component: contrib.postgres | Version: 1.10
Severity: Release blocker | Keywords: JSONField
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
--------------------------------------------+---------------------------
I assume the problem is that the value of the JSONField is re-encoded on
every fetch or update, even though it has already been encoded and the
field's value hasn't changed. This eventually causes exponential growth
of backslashes in the value stored in the database, leading to an
InternalError: invalid memory alloc request size.

Here is how to reproduce the bug in the shell.
Let's say there is a model named MyModel with a JSONField defined as
follows:

{{{
#!python
json_field = JSONField(
    blank=True,
    null=True,
    default=dict
)
}}}

Now, after importing the necessary models, let's perform some reads and
writes in the shell:
{{{
#!python
my_model = MyModel.objects.get(id=1)
my_model.json_field = {"foo":"bar"}
my_model.save()
my_model.json_field
}}}
The last command prints out: {'foo': 'bar'} which is perfectly fine.
Let's now fetch the model again and print the value of the json field:
{{{
#!python
my_model = MyModel.objects.get(id=1)
my_model.json_field
}}}
This prints out '{"foo": "bar"}', so we see that the dict has been
converted to a string (probably somewhere with json.dumps). If we run
my_model.save() without changing the value of json_field (a real-world
scenario: we might have changed other fields and then called save) and
fetch it again, the value of json_field will be doubly encoded, although
it was already a valid JSON string:
{{{
#!python
my_model.save()
my_model = MyModel.objects.get(id=1)
my_model.json_field
}}}
The last command now prints out {{{ '"{\\"foo\\": \\"bar\\"}"' }}}. The
string has evidently been re-encoded, causing some characters to be escaped.
The next iteration of these steps results in:
{{{ '"\\"{\\\\\\"foo\\\\\\": \\\\\\"bar\\\\\\"}\\""' }}}

If we repeat these actions multiple times, the value grows very quickly
because every pass escapes the existing backslashes with more backslashes.
In only a few iterations I managed to have pg_dump (data only) create a
1GB output file.
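
The growth described above can be reproduced outside Django with the
standard json module alone. This is only a sketch of the suspected
mechanism (json.dumps applied to its own output), not a trace of what
Django or psycopg2 actually do; the helper name reencode is made up for
illustration.

```python
import json


def reencode(value, rounds):
    """Return the successive results of feeding json.dumps its own output."""
    results = []
    current = value
    for _ in range(rounds):
        # After the first round `current` is a str, so each further call
        # wraps it in quotes and escapes every existing quote and backslash.
        current = json.dumps(current)
        results.append(current)
    return results


steps = reencode({"foo": "bar"}, 4)
for step in steps:
    # Print the length next to the repr to make the growth visible.
    print(len(step), repr(step))
```

The second and third entries have the same shape as the values shown in
the shell session above, and every further round is longer than the
previous one.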

I am not sure whether the cause of this bug lies in Django's
implementation of JSONField or in psycopg2's handling of the PostgreSQL
JSON type, but somewhere on the fetch from the database the value of the
field is encoded to a string with json.dumps, and this is repeated on
every fetch even though the value is already a valid JSON string. That it
happens on fetch is only my assumption; the re-encoding, and thus the
re-escaping, might instead take place on update (my_model.save()).
The same happens if we never change the value of json_field. If it
was initially an empty JSON object {}, after only a few iterations it
grows to {{{ '"\\"\\\\\\"{}\\\\\\"\\""' }}}

--
Ticket URL: <https://code.djangoproject.com/ticket/27538>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Nov 25, 2016, 2:59:28 PM
to django-...@googlegroups.com
#27538: Value of JSONField is being re-encoded to string even though being already
encoded
----------------------------------+--------------------------------------

Reporter: Petar Aleksic | Owner: (none)
Type: Bug | Status: closed
Component: contrib.postgres | Version: 1.10
Severity: Normal | Resolution: worksforme

Keywords: JSONField | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+--------------------------------------
Changes (by Tim Graham):

* status: new => closed
* type: Uncategorized => Bug
* resolution: => worksforme
* severity: Release blocker => Normal


Comment:

I wrote the attached test which passes for me. Could you please provide a
test that fails for you?

--
Ticket URL: <https://code.djangoproject.com/ticket/27538#comment:1>

Django

Nov 25, 2016, 2:59:41 PM
to django-...@googlegroups.com
#27538: Value of JSONField is being re-encoded to string even though being already
encoded
----------------------------------+--------------------------------------

Reporter: Petar Aleksic | Owner: (none)
Type: Bug | Status: closed
Component: contrib.postgres | Version: 1.10
Severity: Normal | Resolution: worksforme

Keywords: JSONField | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+--------------------------------------
Changes (by Tim Graham):

* Attachment "27538-test.diff" added.

Django

Dec 1, 2016, 5:15:33 AM
to django-...@googlegroups.com
#27538: Value of JSONField is being re-encoded to string even though being already
encoded
----------------------------------+--------------------------------------

Reporter: Petar Aleksic | Owner: (none)
Type: Bug | Status: closed
Component: contrib.postgres | Version: 1.10
Severity: Normal | Resolution: worksforme

Keywords: JSONField | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+--------------------------------------

Comment (by Petar Aleksic):

I didn't manage to reproduce the bug with the test or in a fresh Django
project.

--
Ticket URL: <https://code.djangoproject.com/ticket/27538#comment:2>

Django

Mar 19, 2017, 2:50:40 PM
to django-...@googlegroups.com
#27538: Value of JSONField is being re-encoded to string even though being already
encoded
----------------------------------+--------------------------------------

Reporter: Petar Aleksic | Owner: (none)
Type: Bug | Status: new
Component: contrib.postgres | Version: 1.10
Severity: Normal | Resolution:

Keywords: JSONField | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+--------------------------------------
Changes (by Waken Meng):

* status: closed => new
* resolution: worksforme =>


Comment:

I have the same problem: the JSONField value is JSON-encoded again on
every save.


{{{
from django.contrib.postgres.fields import JSONField
from django.db import models

class Foo(models.Model):
    photos = JSONField(max_length=300)

>>> f = Foo()
>>> f.photos = []
>>> f.save()
>>> f.photos
u'[]'

>>> f.save()
>>> f.refresh_from_db()
>>> f.photos
u'"[]"'

>>> f.save()
>>> f.refresh_from_db()
>>> f.photos
u'"\\"[]\\""'

}}}
The above was run in the manage.py shell; every time I save the instance,
the JSONField value is JSON-encoded again.

env:
postgres 9.6.1
python 2.7.12

django 1.10.5
psycopg2 2.6.2
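
For rows that have already been encoded several times, something like
the following might help recover the original value. It is a sketch
only: unwrap_json and its max_depth parameter are hypothetical names,
it assumes Python 3 (str rather than the unicode type of the 2.7 session
above), and it cannot tell an accidentally re-encoded value from a
string that was stored on purpose.

```python
import json


def unwrap_json(value, max_depth=50):
    """Decode `value` repeatedly until it is no longer a JSON-encoded string."""
    for _ in range(max_depth):
        if not isinstance(value, str):
            # A list, dict, number, etc. means decoding has finished.
            break
        try:
            value = json.loads(value)
        except ValueError:
            # Not valid JSON: treat it as a plain string and stop.
            break
    return value


print(unwrap_json('"\\"[]\\""'))
```

Note that a deliberately stored string such as "123" would be decoded to
the integer 123, so the result should be checked before it is written
back.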

--
Ticket URL: <https://code.djangoproject.com/ticket/27538#comment:3>

Django

Mar 19, 2017, 6:11:36 PM
to django-...@googlegroups.com
#27538: Value of JSONField is being re-encoded to string even though being already
encoded
----------------------------------+--------------------------------------

Reporter: Petar Aleksic | Owner: (none)
Type: Bug | Status: closed
Component: contrib.postgres | Version: 1.10
Severity: Normal | Resolution: duplicate

Keywords: JSONField | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+--------------------------------------
Changes (by Tim Graham):

* status: new => closed
* resolution: => duplicate


Comment:

This behavior doesn't reproduce in Django's test suite. Are you using
`django-jsonfield`? See ticket:27675#comment:8.
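
One way to check which implementation a model is really using is to look
at the class behind the field. The helper below is a hypothetical
sketch, not part of Django; it only reads the standard __module__ and
__qualname__ attributes of the field's class.

```python
def field_origin(field):
    """Return the dotted path of the class that implements `field`."""
    cls = type(field)
    return "{}.{}".format(cls.__module__, cls.__qualname__)


# In a Django shell the field instance can be looked up through the model's
# _meta API, for example (Foo and "photos" are the names from the comment
# above):
#
#     field_origin(Foo._meta.get_field("photos"))
```

A path that does not start with django.contrib.postgres would suggest
that a third-party field class is in use.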

--
Ticket URL: <https://code.djangoproject.com/ticket/27538#comment:4>
