Retrieving json stored in TextField as json

597 views
Skip to first unread message

Marc Aymerich

unread,
Oct 9, 2013, 10:01:23 AM10/9/13
to django-users
Hi,
I'm storing large volumes of json data in a TextFields, Then I have a
view that converts this text data into JSON, but this silly operation
requires a considerable amount of resources for a large dataset.

It would be nice if I'm able to retrieve native JSON directly from the database.

As you can see value is "almost" json, the only problem is that is
wrapped around quotes :)

>>> TimeSerie.objects.filter(type='cpu').values('value')
[{'value': '{"scheduled":2,"total":864,"15min":0.38,"1min":0.3,"5min":0.48}'},
{'value': '{"scheduled":2,"total":859,"15min":0.34,"1min":0.23,"5min":0.32}'},
{'value': '{"scheduled":2,"total":849,"15min":0.33,"1min":0.51,"5min":0.32}'},
{'value': '{"scheduled":2,"total":814,"15min":0.3,"1min":0.2,"5min":0.25}'},
{'value': '{"scheduled":2,"total":817,"15min":0.25,"1min":0.14,"5min":0.17}'},
{'value': '{"scheduled":2,"total":815,"15min":0.22,"1min":0.18,"5min":0.14}'}]

Any suggestion in how I can speed this up by avoiding to convert each
element to json by iterating on the query results?

I'm using psotgresql 9.3 with native json support, not sure if it is
relevant or not :)

--
Marc

Rafael Durán Castañeda

unread,
Oct 9, 2013, 7:16:50 PM10/9/13
to django...@googlegroups.com
I'm using django-json-field [1] for solving the same problem.

[1] https://github.com/derek-schaefer/django-json-field

HTH
> --
> You received this message because you are subscribed to the Google Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
> To post to this group, send email to django...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-users.
> To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/CA%2BDCN_vTbC0-yOt7cuToYLZSOD%2B%3DZCubLOjgkpu-MyB1VxbGMA%40mail.gmail.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Marc Aymerich

unread,
Oct 10, 2013, 5:08:18 AM10/10/13
to django-users
Hi Rafael, thanks !
I forgot to mention that I was already using django-jsonfield
https://github.com/bradjasper/django-jsonfield
and now I've tried with django-json-field but with the same unfortunate result:

TimeSerie.objects.filter(type='cpu').values('value')
[{'value': '{"scheduled":2,"total":864,"15min":0.38,"1min":0.3,"5min":0.48}'},
{'value': '{"scheduled":2,"total":859,"15min":0.34,"1min":0.23,"5min":0.32}'},
{'value': '{"scheduled":2,"total":849,"15min":0.33,"1min":0.51,"5min":0.32}'}]

did I missed something?
> To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/646FABCC-B4DB-4EE7-BC54-B0C5A2C7C25F%40gmail.com.
> For more options, visit https://groups.google.com/groups/opt_out.



--
Marc

Daniel Roseman

unread,
Oct 10, 2013, 12:02:34 PM10/10/13
to django...@googlegroups.com
On Thursday, 10 October 2013 10:08:18 UTC+1, Marc Aymerich wrote:
Hi Rafael, thanks !
I forgot to mention that I was already using django-jsonfield
https://github.com/bradjasper/django-jsonfield
and now I've tried with django-json-field but with the same unfortunate result:

TimeSerie.objects.filter(type='cpu').values('value')
[{'value': '{"scheduled":2,"total":864,"15min":0.38,"1min":0.3,"5min":0.48}'},
{'value': '{"scheduled":2,"total":859,"15min":0.34,"1min":0.23,"5min":0.32}'},
{'value': '{"scheduled":2,"total":849,"15min":0.33,"1min":0.51,"5min":0.32}'}]

did I missed something?

 
You need to show how you are storing this value in the first place.
--
DR.

Marc Aymerich

unread,
Oct 10, 2013, 5:06:08 PM10/10/13
to django-users
yep, here we go

class TimeSerie(models.Model):
type = models.CharField(max_length=64)
value = JSONField()
date = models.DateTimeField(auto_now_add=True)


# This is how the value is stored
value = json.loads(run(self.cmd, display=False).stdout)
TimeSerie.objects.create(type=self.name, value=value)


# And this is how it looks like using ValuesQuerySet vs accessing the
object value

>>> TimeSerie.objects.filter(pk=4769).values('value')
[{'value': '{"used":5358516,"cached":3191544,"free":2729444,"real-used":2044016,"shared":0,"real-free":6043944,"total":8087960,"buffers":122956}'}]

>>> TimeSerie.objects.filter(pk=4769)[0].values
{u'used': 5358516, u'cached': 3191544, u'free': 2729444, u'real-used':
2044016, u'shared': 0, u'real-free': 6043944, u'total': 8087960,
u'buffers': 122956}

One is an string and the other is a dict.

And here using raw sql
=# select value from monitor_timeserie where id=4769;
value
--------------------------------------------------------------------------------------------------------------------------------------
{"used":5358516,"cached":3191544,"free":2729444,"real-used":2044016,"shared":0,"real-free":6043944,"total":8087960,"buffers":122956}
(1 row)


it's a json field acctually
=# \d+ monitor_timeserie;
Table
"public.monitor_timeserie"
Column | Type |
Modifiers | Storage | Stats target |
Description
--------+--------------------------+----------------------------------------------------------------+----------+--------------+-------------
id | integer | not null default
nextval('monitor_timeserie_id_seq'::regclass) | plain |
|
type | character varying(64) | not null
| extended | |
value | json | not null
| extended | |
date | timestamp with time zone | not null
| plain | |
Indexes:
"monitor_timeserie_pkey" PRIMARY KEY, btree (id)
"monitor_timeserie_date_4395efb9" btree (date, type)
Has OIDs: no


--
Marc
Reply all
Reply to author
Forward
0 new messages