--
You received this message because you are subscribed to the Google Groups "jackson-user" group.
First of all, for most use cases you should see some performance improvement. The only case where the difference should be modest or non-existent is when all your data consists of String values, in Maps-of-Maps style; in that case the amount of information and the resulting size are very close in both formats, so there is not much room for improvement.
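As a rough illustration of the point about String values, here is a hand-written sketch (not Jackson code; the class and method names are hypothetical) that computes how many bytes one string value takes as a JSON string and as a CBOR text string. It assumes the string needs no JSON escaping and ignores structural separators such as commas and colons.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jackson: hand-computes encoded sizes of one string value.
final class StringSizeSketch {

    /** Bytes for a JSON string: two quotes plus the UTF-8 payload (assumes nothing needs escaping). */
    static int jsonStringSize(String s) {
        return 2 + s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Bytes for a CBOR text string: a 1 to 5 byte header plus the same UTF-8 payload. */
    static int cborStringSize(String s) {
        int n = s.getBytes(StandardCharsets.UTF_8).length;
        int header;
        if (n < 24) header = 1;          // length fits in the initial byte
        else if (n < 256) header = 2;    // initial byte + 1-byte length
        else if (n < 65536) header = 3;  // initial byte + 2-byte length
        else header = 5;                 // initial byte + 4-byte length (enough for any Java array)
        return header + n;
    }

    public static void main(String[] args) {
        String value = "a typical string field value";
        System.out.println("JSON: " + jsonStringSize(value) + " bytes");
        System.out.println("CBOR: " + cborStringSize(value) + " bytes");
    }
}
```

Either way the UTF-8 payload dominates, and the per-value framing differs by a byte or two at most, which is why string-heavy content leaves a binary format little to gain.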
This assumption of vast improvement is bolstered by lots of folklore on the web claiming extraordinary improvements, either without backing them up or by comparing against poorly implemented textual format encoders/decoders. This is somewhat understandable: it is easier to write a decently performing binary codec than a textual one, and on some platforms the default JSON codec is inefficient, so a well-written binary codec could well be much faster there.
As to CBOR specifically: since CBOR includes the same information as JSON (and Smile) and is self-describing, it is not quite as efficient as schema-requiring formats like Protobuf or Avro. Being self-describing is beneficial for usage, but it means that the size reduction is more modest, and so is the performance improvement.
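To make the self-describing trade-off concrete, the sketch below hand-encodes one tiny record in a CBOR-style layout (key names travel with the data) and in a Protobuf-style layout (field numbers from an external schema replace the names). This is an illustration only: the record shape, the field numbers 1 and 2, and the class name are assumptions, and the helpers handle only strings shorter than 24 bytes and ids from 0 to 23 so that every length and value fits in one byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not Jackson code: one record {name, id} in two layouts.
final class SelfDescribingSketch {

    private static void check(String name, int id) {
        if (name.getBytes(StandardCharsets.UTF_8).length >= 24 || id < 0 || id >= 24) {
            throw new IllegalArgumentException("sketch only handles short strings and ids 0..23");
        }
    }

    private static void writeCborText(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        out.write(0x60 + utf8.length);        // text string header, length < 24
        out.write(utf8, 0, utf8.length);
    }

    /** CBOR-style: map header, then each key as text followed by its value. */
    static byte[] selfDescribing(String name, int id) {
        check(name, id);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0xA2);                      // map with 2 entries
        writeCborText(out, "name");
        writeCborText(out, name);
        writeCborText(out, "id");
        out.write(id);                        // unsigned integer 0..23 is a single byte
        return out.toByteArray();
    }

    /** Protobuf-style: assumed schema says field 1 = name (string), field 2 = id (varint). */
    static byte[] schemaBased(String name, int id) {
        check(name, id);
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((1 << 3) | 2);              // field 1, wire type 2 (length-delimited)
        out.write(utf8.length);               // length as a one-byte varint
        out.write(utf8, 0, utf8.length);
        out.write((2 << 3) | 0);              // field 2, wire type 0 (varint)
        out.write(id);                        // value as a one-byte varint
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("self-describing: " + selfDescribing("Ann", 7).length + " bytes");
        System.out.println("schema-based:    " + schemaBased("Ann", 7).length + " bytes");
    }
}
```

The difference is the key text: the self-describing form repeats "name" and "id" in every record, while the schema-based form needs the schema on both ends to make sense of the bytes.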
The test setup I use for comparing Jackson codecs shows, just as an example, a 20% improvement for reading and 20-30% for writing, and a bit higher improvements when using Afterburner. The performance improvement using Smile format is slightly better for this case, although for larger documents Smile should perform quite a bit better.
So what is happening in your case? Maybe sharing some of the test code would help. There are probably 3 common cases that could occur:
1. You are not using the CBOR codec at all, but JSON in both cases. You can rule this out by checking the length of the encoded document; lengths should always differ.
2. Your content consists mostly or completely of String values (as per above) and is processed as untyped data (JsonNode or Map). If so, performance really might be very similar.
3. Usage itself is accidentally inefficient, like not reusing ObjectMapper, in which case overhead not related to the data format makes up most of the time used.
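For the first case, a quick sanity check on the encoded bytes could look like the sketch below. It is not a Jackson API: the class and method names are hypothetical, and it works on byte arrays however they were produced. Besides the length comparison suggested above, it adds one heuristic of its own: a JSON object or array starts with `{` or `[`, whereas a CBOR map or array starts with a byte in the 0xA0-0xBF or 0x80-0x9F range.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Jackson API: sanity-checks two encodings of the same value.
final class EncodingCheckSketch {

    /** True if the first non-whitespace byte opens a JSON object or array. */
    static boolean looksLikeJsonText(byte[] doc) {
        for (byte b : doc) {
            if (b == ' ' || b == '\t' || b == '\n' || b == '\r') continue;
            return b == '{' || b == '[';
        }
        return false; // empty or whitespace-only input
    }

    /** True if the "binary" output is suspicious: same length as the JSON, or it starts like JSON text. */
    static boolean probablyStillJson(byte[] jsonBytes, byte[] supposedlyCborBytes) {
        return jsonBytes.length == supposedlyCborBytes.length
                || looksLikeJsonText(supposedlyCborBytes);
    }

    public static void main(String[] args) {
        byte[] json = "{\"a\":1}".getBytes(StandardCharsets.UTF_8);
        // CBOR for the same map: 0xA1 (map of 1 pair), 0x61 'a' (text of length 1), 0x01 (integer 1)
        byte[] cbor = {(byte) 0xA1, 0x61, 'a', 0x01};
        System.out.println(probablyStillJson(json, cbor)); // false
        System.out.println(probablyStillJson(json, json)); // true
    }
}
```

The first-byte test is only a heuristic for documents whose root is an object or array; the length comparison remains the simpler check.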
On Sunday, September 20, 2015 at 11:58:54 AM UTC-7, Tatu Saloranta wrote:
> First of all, for most use cases you should see some performance improvement. The only case where the difference should be modest or non-existent is when all your data consists of String values, in Maps-of-Maps style; in that case the amount of information and the resulting size are very close in both formats, so there is not much room for improvement.
It might be that we fall into this situation. Most of our fields are strings.
> This assumption of vast improvement is bolstered by lots of folklore on the web claiming extraordinary improvements, either without backing them up or by comparing against poorly implemented textual format encoders/decoders. This is somewhat understandable: it is easier to write a decently performing binary codec than a textual one, and on some platforms the default JSON codec is inefficient, so a well-written binary codec could well be much faster there.
Well, I admit this was just an assumption of mine, based on my experience that binary protocols can have dramatic improvements over textual protocols.
In a prior life I helped invent RSS and my company pushed about 100TB a month in protocol data (Spinn3r). In the past we pushed protocol buffers but have migrated to JSON/Jackson (which we really like btw).
> As to CBOR specifically: since CBOR includes the same information as JSON (and Smile) and is self-describing, it is not quite as efficient as schema-requiring formats like Protobuf or Avro. Being self-describing is beneficial for usage, but it means that the size reduction is more modest, and so is the performance improvement.
Yes. I was thinking about going back to protobuf for storing our data. But since we serve JSON anyway, it might be more efficient to just store it as a UTF-8 blob and then serve the blob.
> The test setup I use for comparing Jackson codecs shows, just as an example, a 20% improvement for reading and 20-30% for writing, and a bit higher improvements when using Afterburner. The performance improvement using Smile format is slightly better for this case, although for larger documents Smile should perform quite a bit better.
> So what is happening in your case? Maybe sharing some of the test code would help. There are probably 3 common cases that could occur:
> 1. You are not using the CBOR codec at all, but JSON in both cases. You can rule this out by checking the length of the encoded document; lengths should always differ.
Ah. I ruled it out by running it under a profiler and it does show CBOR...
> 2. Your content consists mostly or completely of String values (as per above) and is processed as untyped data (JsonNode or Map). If so, performance really might be very similar.
> 3. Usage itself is accidentally inefficient, like not reusing ObjectMapper, in which case overhead not related to the data format makes up most of the time used.
Yes. I profiled the code and didn't see anything like this. No obvious hotspots other than the usual CBOR or JSON variable handling.
Thanks for the feedback. I'll probably look more into the internals this week to see if I can squeeze any more performance or redesign the stack a bit more...
Ok. Also make sure to use the latest (2.6.2) version, if possible. There have been some improvements to the CBOR codec in 2.6; before that it wasn't quite as fully optimized as the Smile and JSON codecs are.
> Well I admit this was just an assumption of mine based on my experience that binary protocols can have dramatic improvements over textual protocols.
Right, they can. I am just frustrated at unqualified comments made in many articles -- there are cases where the improvement is significant (for example, passing floating-point numeric values), and others where it is less so.
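The floating-point case can be sketched by hand as follows. This is an illustration with hypothetical names, not Jackson code; it assumes the textual encoder prints the same digits as `Double.toString`, and that the binary side writes a double-precision CBOR float (type byte 0xFB followed by the 8 IEEE 754 bytes in big-endian order).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jackson: compares text and binary forms of one double.
final class DoubleSizeSketch {

    /** Bytes of the decimal text for the value (assumes the encoder prints what Double.toString does). */
    static int textSize(double d) {
        return Double.toString(d).getBytes(StandardCharsets.US_ASCII).length;
    }

    /** CBOR double-precision float: one type byte (0xFB) plus the 8 raw IEEE 754 bytes. */
    static byte[] cborDouble(double d) {
        return ByteBuffer.allocate(1 + Double.BYTES).put((byte) 0xFB).putDouble(d).array();
    }

    /** Decoding copies the 8 bytes back; no decimal parsing is involved. */
    static double readCborDouble(byte[] encoded) {
        return ByteBuffer.wrap(encoded).getDouble(1);
    }

    public static void main(String[] args) {
        double value = Math.PI;
        System.out.println("text:   " + textSize(value) + " bytes");
        System.out.println("binary: " + cborDouble(value).length + " bytes");
        System.out.println("round trip exact: " + (readCborDouble(cborDouble(value)) == value));
    }
}
```

The binary form is fixed-size and skips the conversion between binary and decimal digits in both directions, which is where the textual format pays for floating-point values. Short values such as 0.5 are actually smaller as text, so the size benefit depends on the data.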
One other thing, just in case you do want to use protobuf in places: Jackson now has a protobuf backend as well, as of 2.6. I actually like protobuf in many ways, as a binary protocol. It definitely makes different trade-offs than JSON/CBOR/Smile, but I like its simplicity, and it makes for a good format for some use cases.
PS. I try to be on the #jackson IRC channel on freenode, if you want to bounce ideas -- the mailing list is useful, but sometimes chat is good for rapid exchange of ideas.