Size differences between implementations: nanopb vs c++ vs python

fabiang...@gmail.com

Sep 24, 2015, 12:45:17 PM

to nanopb

Hey everyone,

We plan to use protobuf in our project to pass sensor data from microcontrollers to a Linux-based system, so we want to take advantage of the variety of supported languages. I have now implemented examples in all three of them that do exactly the same thing. The message structure looks like this:

Sensors{
required Time time =1;
repeated Imu imu= 2;
optional Temperature temperature= 3;
.
.
.(all others optional)
}

Time{
optional Ticks ticks =1;
optional Timestamp utc =2;
.
.
.
}

Ticks{
required uint32 ticks =1;
}

Timestamp{
optional uint64 secs =1;
optional uint64 nsecs =2;
}

In the example I just get the UTC timestamp and set the uint64 values accordingly. Then I serialize the message and decode it again. For all languages (Python, nanopb, C++) the decoded message matches the encoded one, but the size of the encoded stream varies a lot.

C++: ByteSize() returns 15 to 16 bytes depending on the run.
NanoPB: bytes_written returns 300 to 308 bytes depending on the run.
Python: len() returns 15 to 16 bytes
sys.getsizeof() returns 52 to 53 bytes

Can somebody explain to me the discrepancy in size between C++ and nanopb?
It is quite important to us since we do not want to transmit too much overhead via the serial port.

Thanks for your time.

Fabian

Petteri Aimonen

Sep 24, 2015, 12:59:30 PM

to nan...@googlegroups.com

Hi,

> C++: ByteSize() returns 15 to 16 bytes depending on the run.
> NanoPB: bytes_written returns 300 to 308 bytes depending on the run.
>

> Can somebody explain me the discrepancy in size between c++ and nanopb?

There shouldn't be such a large difference. Typically nanopb and the C++
implementation should give exactly the same output, byte-for-byte. There
is even a test case checking that.
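
As a rough cross-check of the 15 to 16 bytes reported for C++ and Python, the size can be worked out by hand from the protobuf wire format. The sketch below assumes that only Sensors.time, Time.utc and the two uint64 fields of the timestamp message are set, and that all field numbers are below 16 (one tag byte each). The function names are made up for illustration and are not part of nanopb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a protobuf varint: 7 payload bits per byte. */
static size_t varint_size(uint64_t value)
{
    size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        n++;
    }
    return n;
}

/* Timestamp { secs = 1; nsecs = 2; }: one tag byte plus a varint per field.
 * Assumes both fields are marked as present. */
static size_t timestamp_size(uint64_t secs, uint64_t nsecs)
{
    assert(nsecs < 1000000000ULL); /* nanoseconds within one second */
    return 1 + varint_size(secs) + 1 + varint_size(nsecs);
}

/* Length-delimited submessage field: tag byte + length varint + payload. */
static size_t submsg_field_size(size_t payload)
{
    return 1 + varint_size(payload) + payload;
}

/* Sensors { time { utc { secs, nsecs } } } with nothing else set. */
static size_t sensors_size(uint64_t secs, uint64_t nsecs)
{
    size_t utc = timestamp_size(secs, nsecs);
    size_t time = submsg_field_size(utc);  /* Time.utc */
    return submsg_field_size(time);        /* Sensors.time */
}
```

With a Unix time from September 2015 (for example 1443113117), secs needs 5 varint bytes. nsecs needs 4 bytes below 2^28 (about 268 million) and 5 bytes above it, so sensors_size() gives 15 or 16 depending on when the timestamp is taken. That would explain the run-to-run variation; anything near 300 bytes has to come from additional fields being encoded.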

--

This is just a guess, but when you create a nanopb structure, do you
remember to initialize it?

E.g.

Sensors sensormsg = Sensors_init_zero;

or if you prefer:

memset(&sensormsg, 0, sizeof(sensormsg));

Otherwise all of your optional fields will be filled with garbage
values, and thus they will take space on encoding.

C++ and Python do this automatically because those languages have
constructors. C does not, so you have to remember to initialize
manually.
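
To illustrate the effect without pulling in nanopb itself, here is a toy sketch. DemoSensors is a hypothetical stand-in for a generated struct with has_ flags, and demo_encoded_size() is a simplified size calculation, not the real pb_encode(). Since reading truly uninitialized memory is undefined behaviour, the "garbage" is simulated by filling the struct with 0x01 bytes, which makes every has_ flag read as true.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a nanopb-generated struct: every optional
 * field has a has_ flag that the encoder checks. */
typedef struct {
    bool has_temperature;
    uint32_t temperature;
    bool has_pressure;
    uint32_t pressure;
} DemoSensors;

/* Bytes needed for a protobuf varint: 7 payload bits per byte. */
static size_t varint_size(uint64_t value)
{
    size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        n++;
    }
    return n;
}

/* Toy size calculation: one tag byte plus a varint for each field
 * whose has_ flag is set. */
static size_t demo_encoded_size(const DemoSensors *msg)
{
    size_t n = 0;
    if (msg->has_temperature)
        n += 1 + varint_size(msg->temperature);
    if (msg->has_pressure)
        n += 1 + varint_size(msg->pressure);
    return n;
}

/* Fill the struct with fill_byte (0 = proper init, 0x01 = simulated
 * garbage), then set only the temperature field. */
static size_t size_after_fill(int fill_byte)
{
    DemoSensors msg;
    memset(&msg, fill_byte, sizeof(msg));
    msg.has_temperature = true;
    msg.temperature = 25;
    return demo_encoded_size(&msg);
}
```

Here size_after_fill(0) gives 2 bytes, while size_after_fill(0x01) gives 7, because the untouched pressure field is treated as present and its leftover value gets encoded too. With many optional fields and submessages, the same effect can easily add up to hundreds of bytes.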

--

If that wasn't the problem, can you prepare a simple C program that
demonstrates the problem and that I can run?

If that is not possible, even just the .proto file and the binary output
from your program would help.

--
Petteri

fabiang...@gmail.com

Sep 24, 2015, 1:13:09 PM

to nanopb, j...@kapsi.fi

> This is just a guess, but when you create a nanopb structure, do you
> remember to initialize it?
>
> E.g.
>
> Sensors sensormsg = Sensors_init_zero;
>
> or if you prefer:
>
> memset(&sensormsg, 0, sizeof(sensormsg));
>
> Otherwise all of your optional fields will be filled with garbage
> values, and thus they will take space on encoding.
>
> C++ and Python do this automatically because those languages have
> constructors. C does not, so you have to remember to initialize
> manually.

Well that did the trick. Thanks a lot.
Maybe you could also add the initialization call to the simple example.
I understand now, of course, why it is not necessary there, but I think it would help users not to miss it with more complex structures. Or at least add a comment about it.

But thanks a lot for the fast help.

Fabian

Petteri Aimonen

Sep 24, 2015, 1:33:32 PM

to nanopb groups

Hi,

> Maybe you can add the initializing call also to the simple example.

Ah, thanks!

I've tried to have initialization in all examples but somehow missed
this one. Added it now.

--
Petteri
