Play Json parsing performance anomaly


mda...@secondmarket.com

Apr 14, 2012, 9:19:20 PM4/14/12
to play-framework
Hello, I've run into a significant problem with the parsing
performance of play.api.libs.json.Json.parse when parsing large
arrays. I haven't debugged the root cause yet, but timing Jerkson's
Json.parse against Play's Json.parse on an array of 100,000 elements yields:

Jerkson: 40-70 ms, with repeated calls being cached (8 ms)
Play: over 3 minutes, with no caching on repeated calls

Parsing more "normal" json structures seems to perform acceptably
(I've only timed a few scenarios though). Is this a known issue? If I
have time I'll try to find the cause of this anomaly.


Pascal Voitot Dev

Apr 15, 2012, 2:09:45 PM4/15/12
to play-fr...@googlegroups.com
Isn't it because Jackson is doing some caching or optimization somehow?
Play JSON builds the full JSON AST in memory, so that may be why it takes so long for such a big array...
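Pascal's point can be illustrated with a toy model. The types below are simplified, hypothetical stand-ins for Play's JsValue hierarchy, not its actual code; the idea is just that parsing an n-element array allocates one AST node per element before anything is returned.

```scala
// Simplified stand-in for Play's JSON AST (hypothetical; not Play's actual code).
sealed trait JsValue
case class JsNumber(value: BigDecimal) extends JsValue
case class JsArray(values: Seq[JsValue]) extends JsValue

object AstDemo {
  // "Parsing" an array of n numbers builds the whole tree in memory:
  // one JsNumber node per element, wrapped in a single JsArray.
  def parseNumbers(tokens: Seq[Int]): JsArray =
    JsArray(tokens.map(t => JsNumber(BigDecimal(t))))

  def main(args: Array[String]): Unit = {
    val ast = parseNumbers(1 to 100000)
    println(ast.values.size) // all 100,000 nodes are live at once
  }
}
```

A streaming parser like Jackson's, by contrast, can hand tokens to the caller without materializing the whole tree, which alone accounts for some (though, as it turns out below, not all) of the gap.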

Pascal



--
You received this message because you are subscribed to the Google Groups "play-framework" group.
To post to this group, send email to play-fr...@googlegroups.com.
To unsubscribe from this group, send email to play-framewor...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/play-framework?hl=en.


Ryan Tanner

Aug 15, 2013, 2:29:13 PM8/15/13
to play-fr...@googlegroups.com
I hate to bump such an old message, but has anyone else run into this? I just tried a test using Json.parse on a JSON array of 65k strings, and the parse took nearly two minutes. This could potentially be a big problem for us.

Ryan Tanner

Aug 15, 2013, 3:12:38 PM8/15/13
to play-fr...@googlegroups.com
Just to go into a bit more detail:

It appears that Play's internal JsValueDeserializer uses append for each element of an array.

I checked, and master uses the same approach.

Scala's List has linear-time appends, which is likely the source of the performance slowdown on very large arrays.  Wouldn't it make more sense to do one of the following:

1) Use a Vector during deserialization and convert to a List/Seq once complete.
2) Switch to prepend instead of append and then reverse the collection once complete.
3) Use a mutable data structure during construction and then convert to an immutable structure once complete.
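The three options can be sketched in standalone Scala (independent of Play's actual deserializer code; the append variant is included only to show the quadratic pattern being replaced):

```scala
import scala.collection.mutable.ListBuffer

object BuildStrategies {
  // What the deserializer effectively does today: List append (:+) is O(n),
  // so building an n-element result this way is O(n^2) overall.
  def viaListAppend(elems: Seq[Int]): List[Int] =
    elems.foldLeft(List.empty[Int])((acc, e) => acc :+ e)

  // Option 1: accumulate in a Vector (effectively constant-time append),
  // then convert once at the end.
  def viaVector(elems: Seq[Int]): List[Int] =
    elems.foldLeft(Vector.empty[Int])((acc, e) => acc :+ e).toList

  // Option 2: prepend (O(1) on List), then reverse once at the end.
  def viaPrependReverse(elems: Seq[Int]): List[Int] =
    elems.foldLeft(List.empty[Int])((acc, e) => e :: acc).reverse

  // Option 3: mutable builder during construction, immutable result at the end.
  def viaBuffer(elems: Seq[Int]): List[Int] = {
    val buf = ListBuffer.empty[Int]
    elems.foreach(buf += _)
    buf.toList
  }
}
```

All four functions return the same list; options 1-3 do O(n) total work, while the append version degrades quadratically as the array grows.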

Given my understanding of how JsValueDeserializer works (I may very well be wrong), it looks like large JSON strings will produce tons of intermediate objects, which the GC then has to clean up.

Any thoughts on this?

Guillaume Bort

Aug 16, 2013, 5:22:05 AM8/16/13
to play-fr...@googlegroups.com

Yes, you are probably right. You should open a pull request for that.


Ryan Tanner

Aug 16, 2013, 11:15:53 AM8/16/13
to play-fr...@googlegroups.com
Thanks, I'll get to work on that.

My thought is that a Vector is the appropriate structure to use during deserialization with a conversion to List at the end.  Opinions?

Ryan Tanner

Aug 16, 2013, 3:37:54 PM8/16/13
to play-fr...@googlegroups.com