Serialization performance comparison with Boost.serialization


Yingfeng

Mar 30, 2009, 12:06:59 AM
to Protocol Buffers
Hi,
We are looking for a fast mechanism for serialization/deserialization.
Here is our comparison between pb and boost:
We hope to serialize/deserialize data in std containers, such as:

std::vector<std::string>

Here is the data: 10,000,000 strings are stored in the vector.

as to boost:
Serialization: 3.8 s
Deserialization: 6.89 s

as to protocol buffers:
Serialization: 4.59 s
Deserialization: 0.47 s

It seems pb performs much better than boost in deserialization;
however, it is even slower than boost in serialization. Could its
serialization be improved to be as fast as its deserialization?

Kenton Varda

Mar 30, 2009, 12:14:41 AM
to Yingfeng, Protocol Buffers
What does your .proto file look like?  And the code that uses it?

Yingfeng Zhang

Mar 30, 2009, 12:32:39 AM
to Kenton Varda, Protocol Buffers
Test files are attached

Best
vector.tar.bz2

Kenton Varda

Mar 30, 2009, 2:07:58 PM
to Yingfeng Zhang, Protocol Buffers
Several points:

* Some of your test cases seem to be parsing from or serializing to files.  This may be measuring file I/O performance more than it is measuring the respective serialization libraries.  Even though you are using clock() to measure time, simply setting up file I/O operations involves syscalls and copying that could take some CPU time to execute.  Try parsing from and serializing to in-memory buffers instead.  For protocol buffers you should use ParseFromArray() and SerializeToArray() for maximum performance -- not sure if boost has equivalents.

* Your test generates different random data for the boost test vs. the protobuf test.  For an accurate comparison, you really should use identical data.

* Finally, your test isn't a very interesting test case for protocol buffers.  Parsing and serializing a lot of strings is going to be dominated by the performance of memcpy().  You might notice that the actual serialization step in your program takes much less time than even just populating the message object.  It might be more interesting to try serializing a message involving many different fields of different types.


I think the reason parsing ends up being much slower than serialization for you is because it spends most of the time in malloc(), allocating strings.  There are a few things you can do about this:

1) Reuse the same message object every time you parse.  It will then reuse the same memory instead of allocating new memory.

2) Make sure you are not using a reference-counting string implementation.  They are, ironically, very slow, due to the need for atomic operations.

3) Use Google's tcmalloc in place of your system's malloc.  It is probably a lot faster.

Yingfeng Zhang

Mar 30, 2009, 10:09:23 PM
to Kenton Varda, Protocol Buffers
Thanks for the feedback.
I agree with your points. I use vector<string> because it had already been used on our existing platform.

A newer test comparing vector<int> gives the following result:

It takes 1.9 seconds for boost to serialize a vector<int> of size 10000000!

It takes 4.71 seconds for boost to deserialize a vector<int> of size 10000000!

It takes 0.47 seconds for protocol-buffer to serialize a vector<int> of size 10000000!

It takes 0.45 seconds for protocol-buffer to deserialize a vector<int> of size 10000000!



Best

Kenton Varda

Mar 30, 2009, 10:46:11 PM
to Yingfeng Zhang, Protocol Buffers
That's more like it.  :)

Yingfeng Zhang

Mar 30, 2009, 10:57:15 PM
to Kenton Varda, Protocol Buffers
If we change to boost's binary archive, here is the result; it seems much faster:

It takes 0.08 seconds for boost to serialize vector<int> of size 10000000 !

It takes 0.05 seconds for boost to deserialize vector<int> of size 10000000 !


Best

Kenton Varda

Mar 30, 2009, 11:37:07 PM
to Yingfeng Zhang, Protocol Buffers
What do you mean "change boost binary"?

Parsing ~500MB in 0.05 seconds sounds dubious to me.  That's 10GB/s throughput.

Yingfeng Zhang

Mar 31, 2009, 12:07:26 AM
to Kenton Varda, Protocol Buffers
boost supports two kinds of serialization mechanisms: text and binary

Alek Storm

Mar 31, 2009, 12:24:02 AM
to Yingfeng Zhang, Kenton Varda, Protocol Buffers
I think Yingfeng is referring to the archive formats described here: http://www.boost.org/doc/libs/1_38_0/libs/serialization/doc/archives.html#archive_models.  The binary format, however, appears to be non-portable, so it doesn't seem to serve the same purpose as Protocol Buffers, and should be faster anyway, since it encodes directly to native types.

--
Alek Storm

Kenton Varda

Mar 31, 2009, 1:13:55 PM
to Alek Storm, Yingfeng Zhang, Protocol Buffers
OK.  But I believe Yingfeng's results were impossibly fast, unless the code has changed since I saw it.  His data set is a vector of 10,000,000 strings, each with a random size in the range [3,100].  That comes out to 515,000,000 bytes (491MB) of string data.  If we totally ignore overhead of the vector, malloc costs, etc., just reading that much data in 0.05 seconds means reading about 10GB/s, which is pretty close to the theoretical maximum throughput of the highest-end PC RAM available today.

So either I missed something, Yingfeng's code has changed, or boost's "binary" mode isn't really encoding the entire data set.

Yingfeng Zhang

Apr 1, 2009, 3:09:01 AM
to Kenton Varda, Alek Storm, Protocol Buffers
I think Boost has made an optimization for data like vector<int>; it performs almost the same as a direct memcpy.
However, if we serialize a slightly more complicated data structure, such as vector<pair<int, int>> or vector<MyData>, where MyData refers to

struct MyData
{
    int a;
    int b;
    int c;
    int d;
};

then it does not perform as well as before.