How big is protocol buffers?

Noel Fegan

Aug 12, 2009, 4:58:25 PM
to Protocol Buffers
Hi,

This might be a "how long is a piece of string"-type question, but I
was wondering if there's any way for me to know how much Protocol
Buffers might add to my app's memory footprint when I have a large
number of messages. I see there's a LITE option, which sounds like the
mode I would need to be working in. I'm assuming this affects the
code-generation aspects of PB, i.e. the classes it generates for me. Is
there a runtime engine for PB?

The context of my question is that I am currently working on a
project that uses SOAP as an IPC mechanism. I know; the decision to
use SOAP predates my involvement. Our interface is quite large,
e.g. 7 main services each exposing, say, 10 functions, and the "client
end" is also a SOAP server for async callbacks, so there's an API in
both directions, so to speak. The server end is a Java app. Between
SOAP, Jetty, the generated classes, etc., this IPC layer of the server
end of our app represents about 1/3 of the runtime memory footprint of
this server process. For us, whatever we replace SOAP with would have
to have a much smaller memory footprint, rather than necessarily being
much faster.

I know Protocol Buffers doesn't include an RPC implementation and
merely provides stubs, and I could imagine us producing a fairly
lightweight RPC implementation. I'm more concerned that we could
go down this road and find that using Protocol Buffers also
produces a large memory footprint, simply because our API is large. I'm
not sure if this gives enough context to help anyone answer my
question, but I would be interested even in understanding a little
more about how I might go about working this out for myself. Any
pointers would be appreciated.

Thanks.
Noel

Kenton Varda

Aug 12, 2009, 5:56:46 PM
to Noel Fegan, Protocol Buffers
Well there's good news and bad news.  The bad news is that using protocol buffers has been known to lead to large code footprints.  The good news is that it should be fairly easy to estimate how much it will bloat *your* binary, without having to actually implement the whole system.

Protocol buffers involves both generated code and a runtime library.  Both will be factors in your code footprint.  You have several options for trading off between speed, code size, and features, as follows:

* With optimize_for = SPEED (the default), generated code is large (how large depends on the number of types and fields in your protocol), and the runtime library is a 313kB jar file, but you get insanely high throughput and all features including reflection.

* With optimize_for = CODE_SIZE, you use the same (large) runtime library and get all the same features, but the generated code is much smaller as it falls back to shared reflection-based implementations of many methods.  In C++ the generated code is about 40% of the regular size; I don't have precise numbers for Java.  Reflection-based operations tend to be 4x-10x slower than generated code (probably similar in performance to XML).

* With optimize_for = LITE_RUNTIME (new in v2.2.0), you get fast (but large) generated code, but it depends on a much smaller runtime library.  libprotobuf-lite.jar is only 55kB.  The downside is that features like descriptors and reflection are not available (which is why you can't optimize both for code size and lite runtime at the same time -- the code size optimizations rely on features that aren't in the lite runtime).  Note that service definitions also aren't supported in lite mode, but they actually don't provide all that much functionality (since we don't include an RPC implementation).

Both C++ and Java support these options.

What I'd suggest you do is write out the .proto file(s) for your interface, then compile them in each mode and see how big the jars end up being.  This should not take very long and should answer your question.
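
A minimal sketch of that size comparison in Java, assuming the generated code for each mode has already been compiled and packed into its own jar. The class name `JarSizeReport` is made up for illustration, and the jar paths are whatever you pass on the command line; the mode itself is selected in the .proto file with a line such as `option optimize_for = LITE_RUNTIME;`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper: prints the on-disk size of each jar named on the command line.
class JarSizeReport {

    // Builds one report line of the form "<file name>: <size> kB".
    public static String formatLine(Path jar) throws IOException {
        long bytes = Files.size(jar); // size of the file on disk, in bytes
        return String.format(Locale.ROOT, "%s: %.1f kB", jar.getFileName(), bytes / 1024.0);
    }

    public static void main(String[] args) throws IOException {
        // Pass one jar per optimize_for mode, e.g. the SPEED, CODE_SIZE and LITE_RUNTIME builds.
        for (String arg : args) {
            System.out.println(formatLine(Paths.get(arg)));
        }
    }
}
```

Run it with one jar path per mode as arguments. Remember to add the size of the matching runtime jar (full or lite) to each figure, since the generated code is only part of the footprint.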

I'm a little surprised that you seem more worried about the footprint of the server than the client.  The usual experience at Google is that we're happy to let server programs balloon into gigantic monsters, sometimes hundreds of MB in size, but we want to keep client programs small because users have to download them.

Noel Fegan

Aug 12, 2009, 6:27:29 PM
to Protocol Buffers
We have a UI app (client) that talks to a locally running process
(localhost), which I am calling the server. So we're using SOAP for
IPC: it's not "remote", but it is an invocation of out-of-process
functionality. As such, both processes are running on the client
machine. We have a number of client apps that may be running at the
same time for this user, so we encapsulated the common functionality
in a single app that these clients use under the hood. Also, the
background process connects downstream to individual network services,
consolidating those connections and multiplexing each service on
behalf of the clients.

300KB doesn't sound too bad. Our current Adapter layer (the IPC layer
of our server app) comes in at around 25MB-30MB (Java heap space).
Note: that's the complete SOAP stack: Web Service, SOAP engine, JAXB.
Apart from the SOAP-generated classes, our own code in this layer
merely maps equivalent "core" classes to SOAP-generated types and
methods. We don't figure our code is adding much, so we're assuming
most of this is down to some part of SOAP.

Anyway, thanks for the quick response. I might start by trying to
produce a proto file for a portion of our interface, and see how that
compares.

You don't happen to have a WSDL to proto file converter by any
chance? ;-)

Kenton Varda

Aug 12, 2009, 6:43:11 PM
to Noel Fegan, Protocol Buffers
Ah, I was just talking about the size of the .jar files here, which I had thought was what you were worried about.  The runtime memory use could be much larger; I'm not sure.  In the non-lite runtime, descriptors are constructed at startup, and those take up space.  The lite library does not allocate any permanent objects.  Either way, though, I'd expect the descriptors to take very little memory.  You call your interface "large" but to me it sounds small.  :)

But that said, I suspect that the runtime memory usage of your existing system has more to do with the RPC implementation than with the message objects.  Since you're proposing writing a new RPC implementation to use with protobufs, it's hard to say how much memory it could end up using.
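
One rough way to put a number on runtime memory is to compare used heap before and after the code of interest runs. The sketch below uses only the standard `Runtime` API; the class name `HeapDelta` and the byte-array placeholder are made up for illustration, and in a real test the placeholder would be replaced by whatever loads the generated message classes or starts the RPC layer. `System.gc()` is only a hint to the JVM, so treat the result as an estimate.

```java
// Hypothetical helper: estimates how much heap a piece of work retains.
class HeapDelta {

    // Returns the currently used heap in bytes, after asking the JVM to collect garbage.
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a request only; the JVM may ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();

        // Placeholder workload (assumption): swap in the code you actually want to measure.
        byte[][] retained = new byte[16][1024 * 1024];

        long after = usedHeapBytes();
        System.out.printf("approx. heap retained: %d kB%n", (after - before) / 1024);

        // Use 'retained' after the second reading so it stays reachable during the measurement.
        System.out.println("blocks held: " + retained.length);
    }
}
```

A heap profiler gives a more detailed breakdown, but a before/after reading like this is often enough to tell whether the message classes or the RPC layer dominates.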

Kenton Varda

Aug 12, 2009, 6:45:14 PM
to Noel Fegan, Protocol Buffers
On Wed, Aug 12, 2009 at 3:27 PM, Noel Fegan <nfe...@gmail.com> wrote:
> You don't happen to have a WSDL to proto file converter by any
> chance? ;-)

No, but I bet you could write an XSLT to do it pretty easily.