|Play Json parsing performance anomaly||mda...@secondmarket.com||4/14/12 6:19 PM|
Hello, I've run into a significant problem with the parsing
performance of play.api.libs.json.Json.parse when parsing large
arrays. I haven't debugged the root cause yet, but timing Jerkson's
Json.parse against Play's Json.parse on an array of 100,000 elements yields:
Jerkson: 40-70 ms, with repeated calls being cached (8 ms)
Play: over 3 minutes, with no caching on repeated calls
Parsing more "normal" json structures seems to perform acceptably
(I've only timed a few scenarios though). Is this a known issue? If I
have time I'll try to find the cause of this anomaly.
|Re: [play-framework] Play Json parsing performance anomaly||Pascal||4/15/12 11:09 AM|
Isn't it because Jackson is doing some caching or optimization somehow?
Play JSON builds the full JSON AST in memory, so that may be why it takes so long for such a big array...
|Re: Play Json parsing performance anomaly||Ryan Tanner||8/15/13 11:29 AM|
I hate to bump such an old message but has anyone else run into this? I just tried a test using Json.parse with 65k strings in a JSON array and the parse took nearly two minutes. This could potentially be a big problem for us.
|Re: Play Json parsing performance anomaly||Ryan Tanner||8/15/13 12:12 PM|
Just to go into a bit more detail:
It appears that Play's internal JsValueDeserializer is using append for each element of an array.
I checked and master uses the same method.
Scala's List has linear-time performance on appends, so building an n-element array one append at a time takes quadratic time overall. This is likely the source of the slowdown on very large arrays. Wouldn't it make more sense to either:
1) Use a Vector during deserialization and convert to a List/Seq once complete.
2) Switch to prepend instead of append and then reverse the collection once complete.
3) Use a mutable data structure during construction and then convert to an immutable structure once complete.
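A minimal sketch of the three options next to the append-based baseline, assuming plain Int elements rather than Play's JsValue. The object and method names here are made up for illustration and are not taken from Play's source:

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical helper object; none of these names exist in Play.
object ArrayBuildStrategies {

  // Baseline: append to an immutable List. Each `:+` copies the whole
  // list, so building n elements costs O(n^2) overall.
  def appendToList(items: Seq[Int]): List[Int] =
    items.foldLeft(List.empty[Int])((acc, x) => acc :+ x)

  // Option 1: accumulate in a Vector (effectively constant-time append),
  // then convert to a List once at the end.
  def viaVector(items: Seq[Int]): List[Int] =
    items.foldLeft(Vector.empty[Int])((acc, x) => acc :+ x).toList

  // Option 2: prepend (constant time on List), then reverse once.
  def viaPrependReverse(items: Seq[Int]): List[Int] =
    items.foldLeft(List.empty[Int])((acc, x) => x :: acc).reverse

  // Option 3: fill a mutable buffer, then convert to an immutable List.
  def viaBuffer(items: Seq[Int]): List[Int] = {
    val buf = ListBuffer.empty[Int]
    items.foreach(x => buf += x)
    buf.toList
  }
}
```

All four produce the same List in the same order; only the cost of getting there differs.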
Given my understanding of how JsValueDeserializer works (I may very well be wrong) it looks like on large JSON strings you're going to wind up with tons of intermediate objects being created which the GC will have to clean up.
Any thoughts on this?
|Re: [play-framework] Re: Play Json parsing performance anomaly||Guillaume Bort||8/16/13 2:22 AM|
Yes you are probably right. You should open a pull request for that.
|Re: [play-framework] Re: Play Json parsing performance anomaly||Ryan Tanner||8/16/13 8:15 AM|
Thanks, I'll get to work on that.
My thought is that a Vector is the appropriate structure to use during deserialization with a conversion to List at the end. Opinions?
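For anyone who wants to weigh in with numbers, here is a rough timing sketch, assuming Int elements and a single unwarmed run (so treat the figures as indicative only). AppendTiming and its methods are hypothetical names for illustration, not part of Play or of the performance tests mentioned in this thread:

```scala
// Hypothetical micro-benchmark; not a rigorous measurement (no JIT warm-up).
object AppendTiming {

  // Runs `body` once and returns its result with the elapsed milliseconds.
  def timeMillis[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  // Appends n elements to an immutable List one at a time.
  def buildList(n: Int): List[Int] =
    (0 until n).foldLeft(List.empty[Int])((acc, x) => acc :+ x)

  // Appends n elements to a Vector, converting to a List at the end.
  def buildVector(n: Int): List[Int] =
    (0 until n).foldLeft(Vector.empty[Int])((acc, x) => acc :+ x).toList

  def main(args: Array[String]): Unit = {
    val n = 20000 // assumed size; raise it to make the gap more visible
    val (fromList, listMs) = timeMillis(buildList(n))
    val (fromVector, vectorMs) = timeMillis(buildVector(n))
    println("List append:            " + listMs + " ms")
    println("Vector append + toList: " + vectorMs + " ms")
    println("Same result:            " + (fromList == fromVector))
  }
}
```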
|Re: [play-framework] Re: Play Json parsing performance anomaly||Ryan Tanner||8/16/13 12:37 PM|
I've opened an issue on GitHub:
I forked the repo, changes are here:
And performance tests are here: