Performance comparison of Thrift, JSON and Protocol Buffers


Adewale Oshineye

Mar 2, 2009, 5:14:50 AM3/2/09
to prot...@googlegroups.com
This article has some surprising results from its performance
comparison of Thrift, Protocol Buffers and JSON:
http://bouncybouncy.net/ramblings/posts/thrift_and_protocol_buffers/

Jon Skeet <skeet@pobox.com>

Mar 2, 2009, 6:46:03 AM3/2/09
to Protocol Buffers
On Mar 2, 10:14 am, Adewale Oshineye <adew...@gmail.com> wrote:
> This article has some surprising results from its performance
> comparison of Thrift, Protocol Buffers and JSON:
> http://bouncybouncy.net/ramblings/posts/thrift_and_protocol_buffers/

More specifically, it's comparing the performance of the Python
implementations of all of those. That only really says that our Python
implementation is relatively slow. I think the numbers for C++/Java
are somewhat better :)
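Jon's point that this measures a particular implementation is easy to test for oneself. A minimal sketch of such a micro-benchmark, using the stdlib json and pickle modules as stand-ins (the record fields below are invented for illustration; the linked post benchmarks generated protobuf classes, which aren't reproduced here):

```python
import json
import pickle
import timeit

# Records shaped roughly like the DnsRecord messages in the linked post;
# these field names are guesses, not the post's actual schema.
records = [{"key": "example.com", "value": "10.0.0.%d" % i, "ttl": 300}
           for i in range(1000)]

def bench(label, dumps, loads):
    blob = dumps(records)
    ser = timeit.timeit(lambda: dumps(records), number=10)
    de = timeit.timeit(lambda: loads(blob), number=10)
    print("%-8s %7d bytes  ser %.4fs  de %.4fs" % (label, len(blob), ser, de))

bench("json", lambda o: json.dumps(o).encode("utf-8"), json.loads)
bench("pickle", pickle.dumps, pickle.loads)
```

Swapping in another serializer is a one-line change to the `bench` call, which is what makes cross-implementation comparisons like the linked post easy to redo on one's own hardware.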

Jon

Stephan Richter

Mar 2, 2009, 10:52:09 AM3/2/09
to prot...@googlegroups.com, Adewale Oshineye

The outcome looks about right. In the latest version of Python, even
simplejson has C extensions. A one-order-of-magnitude difference between a
pure Python and a C implementation is about right, if not too small. I would
have expected a difference of 20-50 times.

Besides the fact that this post is far too short on details - e.g. Python
version, OS, hardware - I would look at it as great motivation to get the
C extensions for PB done quickly. ;-)

Regards,
Stephan
--
Stephan Richter
Web Software Design, Development and Training
Google me. "Zope Stephan Richter"

Justin Azoff

Mar 2, 2009, 11:48:12 AM3/2/09
to Protocol Buffers
On Mar 2, 10:52 am, Stephan Richter <stephan.rich...@gmail.com> wrote:
> <snip>

Hi all :-)

I actually posted a follow up:
http://bouncybouncy.net/ramblings/posts/more_on_json_vs_thrift_and_protocol_buffers/
It turned out I didn't have the simplejson C extension installed...
With that installed the speed difference was much greater.
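(For anyone checking their own install: both simplejson and the stdlib json module expose their C accelerators the same way, so a quick sketch like this tells you whether you're on the pure-Python fallback:)

```python
import json

def has_c_speedups(mod=json):
    # Both json and simplejson set these names to None when the compiled
    # accelerator (_json / simplejson._speedups) is unavailable.
    return (getattr(mod.encoder, "c_make_encoder", None) is not None
            and getattr(mod.scanner, "c_make_scanner", None) is not None)

print("C speedups active:", has_c_speedups())
```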

The tests were run on Python 2.4 on a 3.2 GHz P4.

Aside from the speed, I really like Protocol Buffers; the API and docs
are very well done :-)

- Justin

Stephan Richter

Mar 2, 2009, 1:04:54 PM3/2/09
to prot...@googlegroups.com, Justin Azoff
On Monday 02 March 2009, Justin Azoff wrote:
> I actually posted a follow up:
> http://bouncybouncy.net/ramblings/posts/more_on_json_vs_thrift_and_protocol_buffers/
> It turned out I didn't have the simplejson C extension
> installed... With that installed the speed difference was much greater.
>
> The tests were run on Python 2.4 on a 3.2 GHz P4.
>
> Aside from the speed I really like Protocol Buffers, the API and docs
> are very well done :-)

Okay, that is more along the lines of what I expected: 30-50 times faster. It
should serve as even more motivation for a PB C implementation now. :-)

BTW, you are just confirming some of the performance issues that we have seen
as well.

Kenton Varda

Mar 2, 2009, 1:14:03 PM3/2/09
to stephan...@gmail.com, Petar Petrov, prot...@googlegroups.com, Justin Azoff
[+petar]

Petar, want to share what you have so far on the C-extension stuff?  Maybe someone with more time available would like to help with it.

Dave Bailey

Mar 3, 2009, 2:37:00 PM3/3/09
to Protocol Buffers
Justin,

Thanks for writing this up; I think it's a nice "real world" example.

I ran an equivalent test (using your same .proto files) in Perl to
compare JSON::XS, protobuf-perlxs, and Storable. I did this on an
x86_64 quad-core Xeon (2.5 GHz) and found:

1) Your original dns.proto (with strings), serializing and
deserializing a DnsResponse with 5000 random DnsRecord elements:

0.019414 seconds to serialize as 658358 bytes with JSON
0.009672 seconds to deserialize 658358 bytes with JSON
0.030239 seconds to serialize as 415044 bytes with protobuf-perlxs
0.011978 seconds to deserialize 415044 bytes with protobuf-perlxs
0.029631 seconds to serialize as 693291 bytes with Storable
0.009553 seconds to deserialize 693291 bytes with Storable

2) Your modified dns.proto (sip/dip/sport/dport), serializing and
deserializing a DnsResponse with 10000 random DnsRecord elements:

0.003501 seconds to serialize as 300322 bytes with JSON
0.005016 seconds to deserialize 300322 bytes with JSON
0.009567 seconds to serialize as 85838 bytes with protobuf-perlxs
0.004225 seconds to deserialize 85838 bytes with protobuf-perlxs
0.014848 seconds to serialize as 340886 bytes with Storable
0.004669 seconds to deserialize 340886 bytes with Storable

I timed the actual serialization part only, and excluded the time to
generate the Perl data structure that is serialized (since that has
nothing to do with the serialization per se). With protobuf-perlxs,
deserialization is comparable in performance to JSON::XS and Storable
in Perl. Serialization is slower, but not extraordinarily slow. It
seems to be within a factor of 2 or 3. From what I know of Python, a
Python/C++ protobuf port (e.g. "protopy") would generate code that
exhibits similar performance characteristics. We have had really good
success with Perl/XS in terms of performance. I did not do a
comparison with Thrift/Perl, but I'm guessing we compare favorably.

-dave

On Mar 2, 8:48 am, Justin Azoff <justin.az...@gmail.com> wrote:
> <snip>

Jon Skeet <skeet@pobox.com>

Mar 3, 2009, 5:25:50 PM3/3/09
to Protocol Buffers
On Mar 3, 7:37 pm, Dave Bailey <d...@daveb.net> wrote:
> Thanks for writing this up; I think it's a nice "real world" example.
>
> I ran an equivalent test (using your same .proto files) in Perl to
> compare JSON::XS, protobuf-perlxs, and Storable.  I did this on an
> x86_64 quad-core Xeon (2.5 GHz) and found:
>
> 1) Your original dns.proto (with strings), serializing and
> deserializing a DnsResponse with 5000 random DnsRecord elements:

<snip>

Could I ask you to keep hold of the .proto files and generated files?
I'm hoping to commit the Java version of my C# benchmark code on
Thursday... it would be nice to have more data.

Jon

Alain M.

Mar 4, 2009, 9:20:37 PM3/4/09
to ProtBuf List
Hi,

I was reading this comparison yesterday and was worried about PB
performance... But today I studied a little more about JSON and I would
like to share this:

JSON is not at all comparable with ProtoBuf; it is much, much simpler. It
is just a way of putting variables in a pack.

ProtoBuf is a much more complex system that can send *entire structures*,
with machinery that lets programs of *different versions* talk
to each other, *backward- and forward-compatible*.
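Alain's compatibility point falls straight out of protobuf's tagged wire format. A minimal pure-Python sketch (hand-rolled for illustration, not the real library) of an "old" reader safely skipping a field it doesn't know about:

```python
def write_varint(n):
    # Protocol Buffers base-128 varint: 7 bits per byte, low group first.
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)

def read_varint(buf, i):
    shift = n = 0
    while True:
        b = buf[i]; i += 1
        n |= (b & 0x7F) << shift
        if not b & 0x80:
            return n, i
        shift += 7

# A "new" writer's message: field 1 = ttl (varint), field 2 = name (bytes).
msg = (write_varint((1 << 3) | 0) + write_varint(300)
       + write_varint((2 << 3) | 2) + write_varint(11) + b"example.com")

def parse(buf, known):
    # An "old" reader that only knows some field numbers: unknown fields
    # are still skippable by wire type, which is what makes version skew safe.
    out, i = {}, 0
    while i < len(buf):
        key, i = read_varint(buf, i)
        num, wtype = key >> 3, key & 7
        if wtype == 0:                      # varint
            val, i = read_varint(buf, i)
        elif wtype == 2:                    # length-delimited
            ln, i = read_varint(buf, i)
            val, i = buf[i:i + ln], i + ln
        else:
            raise ValueError("wire type %d not handled in this sketch" % wtype)
        if num in known:
            out[known[num]] = val
    return out

print(parse(msg, {1: "ttl"}))   # old reader knows only field 1: {'ttl': 300}
```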

So I believe that comparing them is like oranges versus apples...

I would appreciate more comments on this ;)

Thanks,
Alain

Dave Bailey wrote:

David Anderson

Mar 4, 2009, 10:24:04 PM3/4/09
to Alain M., ProtBuf List
On Thu, Mar 5, 2009 at 3:20 AM, Alain M. <ala...@pobox.com> wrote:
>
> Hi,
>
> <snip>
>
> I would appreciate more comments on this ;)

I think the major point to take away from the comparison is: use the
correct tool for your needs. If you need backward/forward
compatibility, heterogeneous versions of software interacting, and some
structural validation (just structure, not the higher-level
semantics of fields), PB/Thrift is what you need. If you don't
care about the above points, by all means use JSON (and don't forget
to get your web server to gzip traffic).
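David's gzip aside is worth quantifying: repetitive JSON (the same key names over and over) compresses very well. A quick sketch with invented records:

```python
import gzip
import json

# Hypothetical records with the repeated key names typical of JSON payloads.
records = [{"host": "ns%d.example.com" % i, "type": "A", "ttl": 300}
           for i in range(5000)]

raw = json.dumps(records).encode("utf-8")
packed = gzip.compress(raw)

print(len(raw), "bytes raw ->", len(packed), "bytes gzipped")
assert json.loads(gzip.decompress(packed)) == records  # lossless round trip
```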

- Dave

Justin Azoff

Mar 5, 2009, 7:34:08 PM3/5/09
to Protocol Buffers
On Mar 4, 10:24 pm, David Anderson <d...@natulte.net> wrote:
> I think the major point to take away from the comparison is: use the
> correct tool for your needs. If you need backward/forward
> compatibility, heterogeneous versions of software interacting and some
> structural validation (just structure, not talking about the higher
> level semantics of fields), PB/Thrift is what you need. If you don't
> care about the above points, by all means use json (and don't forget
> to get your web server to gzip traffic).
>
> - Dave

I definitely agree! I have also been looking at this from
another angle:
Right now JSON is faster than protobuf (at least in Python), but
protobuf produces smaller output. Protobuf will only get faster, but
JSON cannot get any smaller. Looking forward, protobuf definitely
has an advantage.
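Justin's "JSON cannot get any smaller" point is easiest to see for numeric fields: JSON spells integers out as decimal text, while protobuf's varint packs 7 payload bits per byte. A sketch with the varint length computed by hand:

```python
import json

def varint_len(n):
    # Bytes needed for a protobuf base-128 varint (non-negative n):
    # one byte per 7 bits of payload.
    count = 1
    while n >= 0x80:
        n >>= 7
        count += 1
    return count

for n in (7, 300, 1234567890):
    print(n, "->", len(json.dumps(n)), "bytes as JSON text,",
          varint_len(n), "as a varint")
# e.g. 1234567890 takes 10 bytes as decimal text but 5 as a varint
```

(For very small integers the two are the same size; the gap opens up as values grow, and field names cost protobuf only a one-byte tag instead of a quoted string.)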

--
- Justin

Dave Bailey

Mar 6, 2009, 2:37:02 AM3/6/09
to Protocol Buffers
FYI, I reran my comparison benchmark using optimize_for = SPEED, and
got the following results:

1) dns.proto with key/value/first/last/type/ttl (mostly strings),
5,000 elements in DnsRecord:

0.019223 seconds to serialize as 658124 bytes with JSON::XS
0.0092 seconds to deserialize 658124 bytes with JSON::XS
0.018292 seconds to serialize as 414859 bytes with protobuf
0.006274 seconds to deserialize 414859 bytes with protobuf
0.028614 seconds to serialize as 692824 bytes with Storable
0.009033 seconds to deserialize 692824 bytes with Storable

2) dns.proto with sip/dip/sport/dport, 10,000 elements in DnsRecord:

0.003612 seconds to serialize as 300330 bytes with JSON::XS
0.004833 seconds to deserialize 300330 bytes with JSON::XS
0.002075 seconds to serialize as 85841 bytes with protobuf
0.000549 seconds to deserialize 85841 bytes with protobuf
0.013752 seconds to serialize as 340907 bytes with Storable
0.004676 seconds to deserialize 340907 bytes with Storable

So, I guess PB isn't kidding around when they say optimize_for = SPEED.
Straight across the board, it is faster than JSON::XS or Storable. It looks
like for packing and unpacking messages with a lot of varint data,
protobuf blows the doors off of the other Perl serialization
mechanisms, but even for string-heavy messages, it packs at least as
fast as the others, and unpacks significantly faster (probably due to
the smaller message size when the message has a lot of small
strings).

In summary, I don't think there is any faster way to serialize
structured data from Perl (as long as you're willing to write
the .proto files and use protobuf-perlxs to compile them into Perl/XS
extension modules, of course).
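For reference, the flag Dave toggled lives at the top of the .proto file; a hedged sketch (the thread names the fields sip/dip/sport/dport, but the types and field numbers here are guesses):

```proto
option optimize_for = SPEED;  // emit unrolled per-field parse/serialize
                              // code instead of reflection-based code

message DnsRecord {
  optional uint32 sip = 1;    // source/destination IP and port; types
  optional uint32 dip = 2;    // are guesses, not from the thread
  optional uint32 sport = 3;
  optional uint32 dport = 4;
}

message DnsResponse {
  repeated DnsRecord record = 1;
}
```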

-dave

TimYang

Apr 16, 2009, 10:45:24 PM4/16/09
to Protocol Buffers
I've run two similar tests in Java, comparing Thrift and Protocol
Buffers, and here are the results.

Without optimize_for = SPEED

Thrift Loop : 10,000,000
Get object : 14,394msec
Serdes thrift : 37,671msec
Objs per second: 265,456
Total bytes : 1,130,000,000

ProtoBuf Loop : 10,000,000
Get object : 8,170msec
Serdes protobuf: 33,054msec
Objs per second: 302,535
Total bytes : 829,997,866

With optimize_for = SPEED

ProtoBuf Loop : 10,000,000
Get object : 15,130msec
Serdes protobuf: 68,600msec
Objs per second: 145,772
Total bytes : 829,996,683

Thrift Loop : 10,000,000
Get object : 12,651msec
Serdes thrift : 36,904msec
Objs per second: 270,973
Total bytes : 1,130,000,000

For details of the tests, see
Round 1: http://timyang.net/programming/thrift-protocol-buffers-performance-java/
Round 2: http://timyang.net/programming/thrift-protocol-buffers-performance-2/

Alkis Evlogimenos ('Αλκης Ευλογημένος)

Apr 17, 2009, 10:25:01 AM4/17/09
to TimYang, Protocol Buffers
Are the with/without optimize_for = SPEED results flipped? As posted, they suggest that protobuf with optimize_for = SPEED is slower than without.
--

Alkis

TimYang

Apr 17, 2009, 11:23:24 PM4/17/09
to Protocol Buffers
Alkis is quite right, sorry for the typo.

What I meant is:
Without optimize_for = SPEED

ProtoBuf Loop : 10,000,000
Get object : 15,130msec
Serdes protobuf: 68,600msec
Objs per second: 145,772
Total bytes : 829,996,683

Thrift Loop : 10,000,000
Get object : 12,651msec
Serdes thrift : 36,904msec
Objs per second: 270,973
Total bytes : 1,130,000,000


With optimize_for = SPEED

ProtoBuf Loop : 10,000,000
Get object : 8,170msec
Serdes protobuf: 33,054msec
Objs per second: 302,535
Total bytes : 829,997,866

Thrift Loop : 10,000,000
Get object : 14,394msec
Serdes thrift : 37,671msec
Objs per second: 265,456
Total bytes : 1,130,000,000


Jon Skeet <skeet@pobox.com>

Apr 19, 2009, 4:16:48 PM4/19/09
to Protocol Buffers
On Apr 18, 4:23 am, TimYang <iso1...@gmail.com> wrote:
> Alkis is quite right, sorry for the typo.

Which JIT were you using, by the way? I found that using the -server
option made the Java ProtoBuf code run more than twice as quickly. Of
course, it could be that the Thrift code would get the same boost...

Jon

TimYang

Apr 20, 2009, 6:34:23 AM4/20/09
to Protocol Buffers
I'm using Sun's Java 1.6.0 on 64-bit CentOS 5.2.
On 64-bit Linux, -server is the default option.
