Performance Improvement with Protobuf and UPB in Python and C++

Evan Lu

Sep 6, 2024, 7:08:09 PM

to Protocol Buffers

Hi everyone,

We're currently using Protobuf 3.14 with C++ and Python in our project and are looking to improve serialization/deserialization performance. I recently tried Protobuf 3.24 and noticed performance improvements in Python, likely due to the use of UPB.

I have a couple of questions:

1. Does UPB also provide performance improvements for C++?
2. If so, in which version of Protobuf was UPB introduced for C++?

Thanks in advance for your help!

Best regards,
Evan

Tony Liao

Sep 9, 2024, 4:52:01 PM

to Protocol Buffers

Hi Evan,

The protobuf C++ implementation is separate from UPB and has its own set of optimizations driven by Google's own workloads.

I can say that for Google's C++ workloads, we have not observed better performance with UPB. For this reason, we are still shipping with the native implementation for C++.

-Tony

Evan Lu

Sep 9, 2024, 5:23:37 PM

to Protocol Buffers

Hi Tony,

Thank you for the clarification!

Do you have any recommendations or best practices for improving serialization/deserialization performance specifically for Protobuf in C++? I'd appreciate any insights or optimizations you or anyone can suggest.

Best regards,
Evan

Tony Liao

Sep 10, 2024, 5:04:59 PM

to Protocol Buffers

Hi Evan,

Performance depends heavily on the nature of your workload, so it's hard to give advice without knowing more detail. Here are some recommendations that we tend to give:

- Using arenas can reduce the cost of memory allocation. This can be especially impactful during parsing (deserializing).
  - https://protobuf.dev/reference/cpp/arenas/
- Protobuf reflection is slow -- try to avoid it in performance sensitive code.
- string fields are UTF-8 validated. If you don't need UTF-8 validation, prefer to use bytes instead.
- Deeply nested messages can result in a lot of pointer indirection. In some cases, you might be able to come up with a wire-compatible message format that is shallower. However, this may come at the cost of flexibility.
- Serialization performance is O(number of fields). If you have a large message with 1000s or 10000s of fields, serializing it can be noticeably slower than serializing a message with a handful of fields.
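
To make the arena point concrete, here is a minimal sketch of the general idea behind arena allocation. It is a toy bump allocator written only with the standard library; it is not protobuf's Arena and says nothing about how protobuf implements it. The names ToyArena and Point are hypothetical. Many small objects are carved out of a few large blocks, so the number of heap allocations grows with the number of blocks rather than the number of objects, and everything is released together.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a small message-like object.
struct Point {
  int x;
  int y;
};

// Toy bump allocator. NOT protobuf's Arena; an illustration of the concept only.
class ToyArena {
 public:
  explicit ToyArena(std::size_t block_size = 4096) : block_size_(block_size) {}

  // Constructs a T inside the current block instead of making a heap call per
  // object. Limited to trivially destructible types, because this toy never
  // runs destructors: freeing the blocks is the only cleanup.
  template <typename T, typename... Args>
  T* Create(Args&&... args) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "ToyArena never runs destructors");
    void* p = Allocate(sizeof(T), alignof(T));
    return new (p) T{std::forward<Args>(args)...};
  }

  // Number of heap allocations made so far (one per block).
  std::size_t block_count() const { return blocks_.size(); }

 private:
  void* Allocate(std::size_t size, std::size_t align) {
    // Toy limitations: an object must fit in one block and must not be
    // over-aligned.
    assert(size <= block_size_ && align <= alignof(std::max_align_t));
    // Round the current offset up to the required alignment.
    std::size_t offset = (used_ + align - 1) & ~(align - 1);
    if (blocks_.empty() || offset + size > block_size_) {
      // Current block is missing or full: grab one new large block.
      blocks_.push_back(std::make_unique<unsigned char[]>(block_size_));
      offset = 0;
    }
    used_ = offset + size;
    return blocks_.back().get() + offset;
  }

  std::size_t block_size_;
  std::size_t used_ = 0;  // bytes used in the newest block
  std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};
```

With the real library you would use google::protobuf::Arena as described at the link above; whether it helps your workload is something only a benchmark can tell you.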

We also have best practices published here: https://protobuf.dev/programming-guides/dos-donts/. It's not focused on performance though.

And please keep in mind that we are continually evolving the protobuf implementation, so anything we're saying here isn't a substitute for running your own benchmarks. :)
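
A benchmark of your own can start very small. The sketch below is one possible timing helper using only the standard library; the name AverageNanosPerCall is hypothetical and it is no replacement for a proper benchmarking framework. It reports the average wall-clock time per call of whatever callable you pass in.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs fn() `iterations` times and returns the average
// time per call in nanoseconds, measured with a monotonic clock.
template <typename Fn>
double AverageNanosPerCall(Fn&& fn, std::size_t iterations) {
  assert(iterations > 0);
  using Clock = std::chrono::steady_clock;
  const auto start = Clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    fn();
  }
  const auto elapsed = Clock::now() - start;
  return std::chrono::duration<double, std::nano>(elapsed).count() /
         static_cast<double>(iterations);
}
```

In practice the callable would wrap the operation you care about, for example a lambda that calls SerializeToString or ParseFromString on one of your own messages with representative data. Make sure the result is actually used so the compiler cannot optimize the work away, and compare several runs rather than trusting a single number.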