protobuf::Any Message vs MessageLite interface

Arpit Baldeva

Oct 4, 2016, 7:25:33 PM
to Protocol Buffers
Hi,

I was wondering whether there is any reason that the Any type is implemented in terms of the Message interface rather than MessageLite. It seems to me that all this class needs in order to work correctly is GetTypeName() on the MessageLite interface.

My motivation is, of course, to be able to use the Any type with the lite runtime.

--Arpit

Adam Cozzette

Oct 7, 2016, 12:54:18 PM
to Arpit Baldeva, Protocol Buffers
Hi Arpit,

I think you're right that the Any type could technically be updated to support storing lite protos. The original motivation for requiring non-lite protos was that we need to have the type name, which we currently get from the descriptor, though as you mentioned it appears that we can also get it from MessageLite::GetTypeName().

Here's the thing, though: we are actually thinking about removing the MessageLite::GetTypeName method at some point in the future. Lite protos are intended primarily for mobile and so in that environment it's important to keep the binary as small as possible and also avoid including symbols in the code that's distributed, whereas currently that method requires us to generate code that includes the message type names. So I think it would be best to avoid creating another dependency on that GetTypeName() method when it might be going away in the future.

As a workaround you can still store lite protos inside an Any, but you just have to do it by manually setting the fields on Any without the help of the PackFrom() and UnpackTo() methods. This might require a little bit of extra boilerplate code but it shouldn't be too bad.
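
The manual approach might look roughly like the sketch below. It is only an illustration under stated assumptions: AnyFields is a stand-in for google::protobuf::Any (which has the two fields type_url and value), Greeting is a hypothetical message with a fake serializer, and PackManually/UnpackManually are made-up helper names. The caller supplies the full type name, so nothing here depends on GetTypeName().

```cpp
#include <cassert>
#include <string>

// Stand-in for google::protobuf::Any, which has exactly these two fields
// (string type_url = 1; bytes value = 2). With the real class you would call
// set_type_url() / set_value() instead of assigning members.
struct AnyFields {
    std::string type_url;
    std::string value;
};

// Hypothetical lite message. A real generated class inherits
// SerializeAsString() and ParseFromString() from MessageLite; this fake
// version just copies the text so the sketch compiles on its own.
struct Greeting {
    std::string text;
    std::string SerializeAsString() const { return text; }
    bool ParseFromString(const std::string& data) { text = data; return true; }
};

// Fills in both Any fields by hand. The type URL uses the default
// "type.googleapis.com/" prefix followed by the full type name.
template <typename Msg>
void PackManually(const Msg& msg, const std::string& full_name, AnyFields* out) {
    out->type_url = "type.googleapis.com/" + full_name;
    out->value = msg.SerializeAsString();
}

// Parses the payload only if the stored type URL matches the expected type.
template <typename Msg>
bool UnpackManually(const AnyFields& in, const std::string& full_name, Msg* msg) {
    if (in.type_url != "type.googleapis.com/" + full_name) return false;
    return msg->ParseFromString(in.value);
}
```

Passing the type name as a string literal keeps the helper independent of any runtime type-name lookup, at the cost of having to keep that literal in sync with the .proto file.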

Adam

Tim Kientzle

Oct 7, 2016, 1:15:56 PM
to Adam Cozzette, Arpit Baldeva, Protocol Buffers

On Oct 7, 2016, at 9:54 AM, 'Adam Cozzette' via Protocol Buffers <prot...@googlegroups.com> wrote:

> Here's the thing, though: we are actually thinking about removing the MessageLite::GetTypeName method at some point in the future. Lite protos are intended primarily for mobile and so in that environment it's important to keep the binary as small as possible and also avoid including symbols in the code that's distributed, whereas currently that method requires us to generate code that includes the message type names. So I think it would be best to avoid creating another dependency on that GetTypeName() method when it might be going away in the future.

This seems to suggest that you don’t intend to support Any for mobile?

Tim

Adam Cozzette

Oct 7, 2016, 1:56:49 PM
to Tim Kientzle, Arpit Baldeva, Protocol Buffers
Good question, I'm not much of a mobile developer so hopefully someone with more experience there can chime in. But my thought would be that if you're on mobile and need the performance of lite protos then you would probably want to just read and write Any fields manually instead of using the helper methods that require non-lite. If the extra boilerplate turns out to be excessive, we could look into ways of updating the Any API to make it a little easier to use with lite protos.

Arpit Baldeva

Oct 7, 2016, 5:16:40 PM
to Protocol Buffers, tkie...@apple.com, abal...@gmail.com
Thanks for the info.

I feel that without the pack/unpack/Is methods, the utility of Any will diminish. For example, the RPC status proto (https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto) uses a repeated Any field. It would not be possible to write code like that described at https://developers.google.com/protocol-buffers/docs/proto3#any because you won't know whether it is safe to convert the value to a given message. After posting, I also came across this issue, which currently marks the request as a bug: https://github.com/google/protobuf/issues/1974
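
For reference, the kind of check that an Is-style method performs can be approximated by hand: the part of type_url after the last '/' is compared with the expected full type name. The following is a minimal sketch of that idea, not the library's implementation; TypeUrlNames is a hypothetical helper name, and it assumes the type URL must contain at least one '/'.

```cpp
#include <cassert>
#include <string>

// Returns true if the segment of type_url after its last '/' equals
// full_name. A URL without any '/' is rejected (assumption of this sketch).
bool TypeUrlNames(const std::string& type_url, const std::string& full_name) {
    const std::string::size_type slash = type_url.rfind('/');
    if (slash == std::string::npos) return false;
    // Compare everything after the slash with the expected name.
    return type_url.compare(slash + 1, std::string::npos, full_name) == 0;
}
```

With such a helper, code iterating over a repeated Any field could test each entry's type_url before parsing its value, provided the expected type names are known as string constants.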

Regarding the future of GetTypeName: although it has overhead, I feel it could be useful outside of Any support as well. I don't have a concrete use case in mind, though, as I am just starting with protobuf. This brings up another question that somebody may already have data on. There are two options for reducing code bloat: LITE_RUNTIME and CODE_SIZE. As I understand it, LITE_RUNTIME reduces bloat by removing descriptor- and reflection-related code (thereby reducing the library size), while CODE_SIZE generates less code per message but brings descriptors and reflection back in as shared code. The recommendation is to choose CODE_SIZE when the number of messages is large enough to outweigh the overhead of the full library. Are there any benchmarks for the size of the library (full and lite), the per-message overhead saved by CODE_SIZE, and the performance drop with CODE_SIZE?

--Arpit

Adam Cozzette

Oct 10, 2016, 3:00:24 PM
to Arpit Baldeva, Protocol Buffers, Tim Kientzle, Gerben Stavenga
On Fri, Oct 7, 2016 at 2:16 PM, Arpit Baldeva <abal...@gmail.com> wrote:
> Thanks for the info.
>
> I feel that without the pack/unpack/Is methods, the utility of Any will diminish. For example, the RPC status proto (https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto) uses a repeated Any field. It would not be possible to write code like that described at https://developers.google.com/protocol-buffers/docs/proto3#any because you won't know whether it is safe to convert the value to a given message. After posting, I also came across this issue, which currently marks the request as a bug: https://github.com/google/protobuf/issues/1974

What you're saying makes sense; we might want to consider just updating Any to have first-class support for MessageLite. In C++ this would be straightforward, but in Java, for example, we would need to think carefully about how to do it, because Java lite doesn't currently have the message names available at runtime.

> Regarding the future of GetTypeName: although it has overhead, I feel it could be useful outside of Any support as well. I don't have a concrete use case in mind, though, as I am just starting with protobuf. This brings up another question that somebody may already have data on. There are two options for reducing code bloat: LITE_RUNTIME and CODE_SIZE. As I understand it, LITE_RUNTIME reduces bloat by removing descriptor- and reflection-related code (thereby reducing the library size), while CODE_SIZE generates less code per message but brings descriptors and reflection back in as shared code. The recommendation is to choose CODE_SIZE when the number of messages is large enough to outweigh the overhead of the full library. Are there any benchmarks for the size of the library (full and lite), the per-message overhead saved by CODE_SIZE, and the performance drop with CODE_SIZE?

Here's one way to break it down.

SPEED:
- Fixed overhead of full runtime (e.g. the Message class)
- Per-message overhead of generated parsing/serialization code
- Per-message overhead of generated descriptors

LITE_RUNTIME:
- Fixed overhead of lite runtime (e.g. includes MessageLite but not Message)
- Per-message overhead of generated parsing/serialization code

CODE_SIZE:
- Fixed overhead of full runtime (e.g. the Message class)
- Per-message overhead of generated descriptors

SPEED and LITE_RUNTIME should be about the same speed because they both benefit from the fast generated code for parsing and serialization, while CODE_SIZE is much slower because it relies on reflection instead of generated code. My impression is that CODE_SIZE is not really a good choice unless you have an unusual situation where you have a large number of protos and are really tight on code size. A basic rule of thumb would be to use the default (SPEED) on servers and LITE_RUNTIME on mobile.

I'm not sure offhand of the actual numbers for how binary size and speed differ between the three choices--Gerben (CC'd), do you happen to know some numbers for this question?

Mohamed Koubaa

Nov 29, 2016, 2:20:59 PM
to Adam Cozzette, Arpit Baldeva, Protocol Buffers, Tim Kientzle, Gerben Stavenga
Hello,

I am sorry to bring back an old thread, but the outcome is not clear.  Is there either an intent or any ongoing work to support Any types with the lite runtime?

Best Regards,
Mohamed Koubaa
Software Developer
ANSYS Inc

Adam Cozzette

Nov 29, 2016, 8:22:58 PM
to Mohamed Koubaa, Arpit Baldeva, Protocol Buffers, Tim Kientzle, Gerben Stavenga
Right now there doesn't seem to be a consensus around adding built-in support for Any in the lite runtime, so I suspect that the status quo will probably remain for now. If you would like to use Any with the lite runtime, I think it's probably best to just manually serialize and parse to and from your Any fields, since that will work even if it involves a little extra boilerplate.

Mohamed Koubaa

Dec 1, 2016, 1:36:33 PM
to Adam Cozzette, Arpit Baldeva, Protocol Buffers, Tim Kientzle, Gerben Stavenga
Hello,

FWIW, this is the boilerplate I use for my protobuf 3.0.0 project. It depends on GetTypeName(), whose future is uncertain in the lite runtime. It appears to work in one of my tests, but I am not sure whether I am missing something subtle. I'm using SerializeWithCachedSizesToArray because I learned that it is faster for large messages: it does not compute the size twice.

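#include <string>
#include <gtest/gtest.h>  // EXPECT_EQ / EXPECT_TRUE
#include <google/protobuf/any.pb.h>
#include <google/protobuf/message_lite.h>

static void PackInto(google::protobuf::Any* target, const google::protobuf::MessageLite& msg)
{
    // ByteSize() computes and caches the sizes that SerializeWithCachedSizesToArray() relies on.
    int msg_size = msg.ByteSize();

    // Note: Any::PackFrom() writes "type.googleapis.com/<full type name>"; the bare name
    // set here is only recognized by the matching UnpackFrom() below.
    target->set_type_url(msg.GetTypeName());

    // Serialize straight into the value field: avoids computing the size twice and a temporary buffer.
    std::string* value = target->mutable_value();
    value->resize(msg_size);
    msg.SerializeWithCachedSizesToArray(reinterpret_cast<google::protobuf::uint8*>(&(*value)[0]));
}

static void UnpackFrom(const google::protobuf::Any& source, google::protobuf::MessageLite* msg)
{
    EXPECT_EQ(source.type_url(), msg->GetTypeName()); // Could be converted to an assert or CHECK-style macro in a non-test project
    // Check the parse result instead of silently ignoring a malformed payload.
    EXPECT_TRUE(msg->ParseFromArray(source.value().data(), static_cast<int>(source.value().size())));
}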

Thanks,
Mohamed Koubaa
Software Developer
ANSYS Inc

Mohamed Koubaa

Dec 1, 2016, 1:48:30 PM
to Adam Cozzette, Arpit Baldeva, Protocol Buffers, Tim Kientzle, Gerben Stavenga
Hello,

This works as long as my test project links against the full protobuf runtime.  Attempting to link against the lite runtime produces the following unresolved symbols:

error LNK2001: unresolved external symbol "public: __cdecl google::protobuf::Any::Any(void)" (??0Any@protobuf@google@@QEAA@XZ)

error LNK2001: unresolved external symbol "void __cdecl google::protobuf::protobuf_AddDesc_google_2fprotobuf_2fany_2eproto(void)" (?protobuf_AddDesc_google_2fprotobuf_2fany_2eproto@protobuf@google@@YAXXZ)

error LNK2001: unresolved external symbol "public: virtual bool __cdecl google::protobuf::Any::MergePartialFromCodedStream(class google::protobuf::io::CodedInputStream *)" (?MergePartialFromCodedStream@Any@protobuf@google@@UEAA_NPEAVCodedInputStream@io@23@@Z)

error LNK2001: unresolved external symbol "public: virtual int __cdecl google::protobuf::Any::ByteSize(void)const " (?ByteSize@Any@protobuf@google@@UEBAHXZ)

error LNK2001: unresolved external symbol "public: void __cdecl google::protobuf::Any::MergeFrom(class google::protobuf::Any const &)" (?MergeFrom@Any@protobuf@google@@QEAAXAEBV123@@Z)

error LNK2001: unresolved external symbol "public: static class google::protobuf::Any const & __cdecl google::protobuf::Any::default_instance(void)" (?default_instance@Any@protobuf@google@@SAAEBV123@XZ)


Thanks,
Mohamed Koubaa
Software Developer
ANSYS Inc

Adam Cozzette

Dec 6, 2016, 11:53:38 AM
to Mohamed Koubaa, Arpit Baldeva, Protocol Buffers, Tim Kientzle, Gerben Stavenga
Hi Mohamed, sorry for the delay in getting back to you. I had forgotten about this, but there is a known issue (https://github.com/google/protobuf/issues/1889) that protobuf lite does not contain the well-known protos. Even if you tweak the build to include them, the code generated from them will still rely on symbols that are not present in the lite runtime.

I think there are two possible workarounds for now:
1) You could just link against the full runtime but continue using optimize_for = LITE_RUNTIME in your .proto files. This way you would still have the code size benefit of having generated descriptors omitted from your binary, but you would have extra library code, so you would have to look at the size of your binary and see if this makes sense in your case.
2) You could essentially copy any.proto to another directory in your project and give it a different package name. Then you could use this private version of Any and it would be wire-compatible with google.protobuf.Any.
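
To illustrate why option 2 stays wire-compatible: the binary encoding carries only field numbers and wire types, never the package or message name. The sketch below hand-encodes a message shaped like Any (string type_url = 1; bytes value = 2) without using the protobuf library. The helper names are hypothetical and this is not how generated code is structured; it only shows which bytes end up on the wire.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Appends v as a base-128 varint (7 bits per byte, high bit = "more follows").
void AppendVarint(std::uint64_t v, std::string* out) {
    while (v >= 0x80) {
        out->push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out->push_back(static_cast<char>(v));
}

// Appends one length-delimited field: tag = (field_number << 3) | 2,
// then the payload length as a varint, then the payload bytes.
void AppendLengthDelimited(int field_number, const std::string& payload, std::string* out) {
    AppendVarint((static_cast<std::uint64_t>(field_number) << 3) | 2, out);
    AppendVarint(payload.size(), out);
    out->append(payload);
}

// Encodes a message with `string type_url = 1; bytes value = 2;`.
// Empty fields are skipped, as proto3 does for default values.
std::string EncodeAnyLike(const std::string& type_url, const std::string& value) {
    std::string out;
    if (!type_url.empty()) AppendLengthDelimited(1, type_url, &out);
    if (!value.empty()) AppendLengthDelimited(2, value, &out);
    return out;
}
```

Nothing in the output depends on whether the message was declared in package google.protobuf or in a private package, which is why a renamed copy of any.proto with the same two fields can be read by peers that use the real google.protobuf.Any.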

Mohamed Koubaa

Dec 7, 2016, 2:18:40 PM
to Adam Cozzette, Arpit Baldeva, Protocol Buffers, Tim Kientzle, Gerben Stavenga
Hi Adam,

Nice to see there is an issue I could follow.  FYI I'm going with (1).

To add a data point, we're interested in the lite runtime not for mobile but to avoid exacerbating long link times when adding protobuf to an (already) large project.

Thanks,
Mohamed Koubaa
Software Developer
ANSYS Inc