In my experience, protocol buffers are more than fast enough to keep
up with disk speeds. That is, when reading uncached data
from the disk at 100 MB/s, protocol buffers can decode it at that
speed. Now, if your data is cached, and your application is not doing
much with the data, then I would expect protocol buffers to take 100%
of the CPU time, since the disk read doesn't take CPU, and your
application isn't doing much.
In other words: in a more "real" application, I would expect protocol
buffers to take only a very small portion of your application's time.
> Again I expected that decoding strings would be almost all the time
> (although decoding here still seems slower than in C in my
> experience). I am trying to figure out why mergeFrom method for this
> message is taking 6 sec (own time).
Decoding strings in Java is much slower because it actually decodes the
UTF-8-encoded data into UTF-16 strings in memory. The C++ version
just leaves the data in UTF-8. If this is a performance issue for your
application, you may wish to consider using the bytes protocol buffer
type rather than strings. This is less convenient, and means you can
"screw up" by accidentally sending invalid data, but is faster.
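To make the tradeoff concrete, here is a minimal JDK-only sketch of the
difference (no protobuf dependency; the comments describe what the
generated Java code does for `string` vs. `bytes` fields, which is the
assumption being illustrated):

```java
import java.nio.charset.StandardCharsets;

public class BytesVsString {
    public static void main(String[] args) {
        byte[] utf8 = "hello, protocol buffers".getBytes(StandardCharsets.UTF_8);

        // A "string" field: the Java runtime eagerly transcodes the
        // UTF-8 bytes into a UTF-16 String on parse -- one pass over
        // the data plus a fresh char[] allocation per field:
        String decoded = new String(utf8, StandardCharsets.UTF_8);

        // A "bytes" field: the raw UTF-8 stays as-is; you pay for
        // decoding only for the fields you actually look at, when
        // you look at them:
        byte[] raw = utf8;  // no transcoding work here

        System.out.println(decoded.length());  // 23
        System.out.println(raw.length);        // 23
    }
}
```

The catch, as noted above, is that with `bytes` nothing validates that
the payload is actually well-formed UTF-8.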
> There are around 15 SubMessages.
This is basically the problem right here. Each time you parse one of
these messages, it ends up allocating a new object for each of these
sub messages, and a new object for each string inside them. This is
pretty slow.
As I said above: I suspect that in a "real" application, this won't be
a problem. However, it would be faster to get rid of all the sub
messages (assuming that you don't actually need them for some other
reason).
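Back-of-the-envelope, here is roughly where the allocations go. The
field counts below are made-up (two string fields per sub-message) just
to illustrate the arithmetic:

```java
public class AllocationSketch {
    public static void main(String[] args) {
        int subMessages = 15;     // as in the message being discussed
        int stringsPerSub = 2;    // hypothetical field count

        // Nested schema: parsing one message allocates the top-level
        // object, one object per sub-message, and one String per
        // decoded string field.
        int nestedAllocs = 1 + subMessages + subMessages * stringsPerSub;

        // Flattened schema: the same string fields hoisted to the top
        // level parse into a single message object plus the Strings.
        int flatAllocs = 1 + subMessages * stringsPerSub;

        System.out.println(nestedAllocs); // 46
        System.out.println(flatAllocs);   // 31
    }
}
```

The Strings dominate either way, but flattening removes 15 short-lived
objects per parse, which adds up across millions of messages.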
Finally, I'll take a moment to promote my patch that improves Java
message *encoding* performance, by optimizing string encoding. It is
available at the following URL. Unfortunately, there is no similar
approach to improving the decoding performance.
http://codereview.appspot.com/949044/
Evan
--
Evan Jones
http://evanjones.ca/
This is true, provided that everyone uses the same encoding without
any bugs, and canonicalizes Unicode in the same way
(http://unicode.org/reports/tr15). In general, this is tricky, and I
would suggest using the built-in
string type. However, if you have a very specific need, and the
decoding is a bottleneck, this should work.
> Also do you think that if I
> encode/decode using utf-16 it would be faster? Clearly it is not as
> compressed.
I would think it should be, but I haven't done any performance
measurements, so I can't confirm 100% that this is the case.
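One thing that is easy to check without a benchmark is the size cost.
A quick JDK-only sketch (UTF-16BE is used so no byte-order mark is
written):

```java
import java.nio.charset.StandardCharsets;

public class Utf16Size {
    public static void main(String[] args) {
        String s = "protocol buffers";
        // For ASCII-heavy text, UTF-16 is about twice the size of
        // UTF-8, so any decoding speedup trades directly against
        // extra I/O, network, and cache pressure.
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);    // 16
        System.out.println(s.getBytes(StandardCharsets.UTF_16BE).length); // 32
    }
}
```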