proposed optimization to google.protobuf 2.4.1 (C++)

Showing 1-3 of 3 messages
proposed optimization to google.protobuf 2.4.1 (C++) Eyal Farago 10/18/11 7:24 AM
I originally sent this to Kenton, but isince he's no longer maintaining GPB he suggested I'd post this here.

As part of working on a feature that uses protobuf for serialization, I profiled a process that spent about 50% of its time parsing a protobuf message, this message has a large packed repeating field of integers (300-400 items), this caused ‘ReadPackedPrimitive’ to dominate the time, I took a glance at it and it seemed that it calls ‘BytesUntilLimit()’ proportionally to the number of items in the repeated field, this method evaluates a condition and several aritmathic operations which seems harmless but become annoying when executed repeatedly.

Looking at the implementation of ‘ReadPackedPrimitive’ I noticed that most of the time the woking buffer of the coded stream is large enough to read the next item (a varint in this case), so I came out with an implementation that examines the size of the buffer (by calling ‘CodedInputStream.GetDirectBufferPointerInline’) and resort to ‘BytesUntilLimit()’ only when the working buffer is empty (which is pretty rare).



    ::google::protobuf::io::CodedInputStream &inp,

    ::google::protobuf::RepeatedField< ::google::protobuf::int32 > &trg



    ::google::protobuf::uint32 length;

    if( !inp.ReadVarint32( &length ) ) return false;

    //stay on the safe side, though as far as I know the encoder never serializes an empty array.

    if( 0 == length )   return true;

    ::google::protobuf::io::CodedInputStream::Limit limit = inp.PushLimit( length );

    const ::google::protobuf::uint8 *buff( NULL );

    int buff_length( 0 );


        if( !::google::protobuf::internal::WireFormatLite::ReadPrimitive< ::google::protobuf::int32, ::google::protobuf::internal::WireFormatLite::T        {

            return false;


        inp.GetDirectBufferPointerInline( reinterpret_cast< const void** >( &buff ), &buff_length );

    } while( ( 0 != buff_length ) || ( 0 != inp.BytesUntilLimit() ) );

    inp.PopLimit( limit );

    return true;



I didn’t implement this in the library itself (as this would require me to recompile, re-deploy, and convince my team leader that it’s worth the mess of changing our third parties repository, not to mention the hussle in the next protobuf upgrade), instead I’ve modified a generated file and came up with the following implementation for packed repeated integer field (that can be easily generalized to replace ‘ReadPackedPrimitive’).

This modification resulted with a 10-15% on a process that spent half its time parsing the proto, and 20% when the process was ‘crippled’ to do protobuf parsing alone.

I’d like to hear the maintainer's opinion on this, and be more than happy to contribute a patch if you accept this change. 



Re: proposed optimization to google.protobuf 2.4.1 (C++) Jeremiah Jordan 10/19/11 2:58 PM
You should probably create an issue and attach your patch to it.
Re: [protobuf] Re: proposed optimization to google.protobuf 2.4.1 (C++) Eyal Farago 10/20/11 12:10 PM
thanks, will do sunday morning when the (Jewish) holidays finally kicks out...


On Wed, Oct 19, 2011 at 11:58 PM, Jeremiah Jordan <> wrote:
You should probably create an issue and attach your patch to it.

You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To view this discussion on the web visit

To post to this group, send email to
To unsubscribe from this group, send email to
For more options, visit this group at