Yes. The Wireshark dissector can print the numeric tag/value pairs for
a message without the .proto file.
The Python script only generates the C/C++ glue code required to
integrate the Protocol Buffers C++ parsing code into Wireshark. The full
C++ Protocol Buffers functionality is available to the dissector.
However, I was able to obtain only the first level of numeric
tag/value pairs by calling TextFormat::PrintToString() on a message
with no fields.
For example, the Wireshark output for the AddressBook example is
below:
1: "\n\005Dilip\020\001\032\035dilip.an...@gmail.com"
1: "\n\004Mary\020\002\032\016...@email.com"
Is there some other TextFormat::Print() function that can print the
tag/value pairs for embedded messages? Is this even feasible without
having the .proto file?
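To make the output above easier to read: each field in the wire format starts with a tag whose low three bits are the wire type and whose remaining bits are the field number. In the escaped string, "\n" is byte 0x0A, "\020" is 0x10, and "\032" is 0x1A. Here is a minimal sketch of splitting such a one-byte tag. The names Tag and SplitTag are hypothetical helpers for illustration, not libprotobuf APIs.

```cpp
#include <cstdint>

// Hypothetical helper type, not part of libprotobuf.
struct Tag {
  uint32_t field_number;
  uint32_t wire_type;  // 0 = varint, 2 = length-delimited, etc.
};

// Splits a decoded tag value into field number and wire type.
Tag SplitTag(uint32_t tag) {
  return Tag{tag >> 3, tag & 0x7};
}
```

Applied to the bytes above, 0x0A is field 1 (length-delimited), 0x10 is field 2 (varint), and 0x1A is field 3 (length-delimited), so the first-level string does look like an embedded message with three fields.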
The heuristic seems to work well. I had to make the following minor
modifications to TextFormat::PrintUnknownFields() to implement the
heuristic. Is there some existing function that already implements
this heuristic? If not, is it possible to add one to the codebase? It
appears to be very useful for scenarios where one doesn't have the
.proto file, and it requires only the minor code modifications listed
below:
<code>
for (int j = 0; j < field.length_delimited_size(); j++) {
  generator.Print(field_number);
  EmptyMessage embedded_msg;
  // The empty_message.pb.h and empty_message.pb.cc files generated by
  // protoc are included in Makefile.am and thus added to libprotobuf.
  // #include <google/protobuf/empty_message.pb.h> is used earlier.
  string field_str = field.length_delimited(j);
  if (embedded_msg.ParseFromArray(field_str.data(), field_str.size())) {
    // The new action: the bytes parse as a message, so print them nested.
    generator.Print(":\n");
    generator.Indent();
    Print(embedded_msg.GetDescriptor(),
          embedded_msg.GetReflection(), generator);
    generator.Outdent();
    generator.Print("\n");
  } else {
    // The original action: print the bytes as an escaped string.
    generator.Print(": \"");
    generator.Print(CEscape(field.length_delimited(j)));
    generator.Print("\"\n");
  }
}
</code>
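For anyone who wants to try the heuristic without rebuilding libprotobuf, here is a rough self-contained sketch of the same idea: try to parse each length-delimited field as a nested message, and fall back to an escaped string if that fails. It is only an illustration under my own assumptions, not the patch itself. ReadVarint, ReadFixed, Escape and DumpUnknown are hypothetical helpers, groups are rejected rather than handled, and empty payloads are printed as strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helpers for illustration; none of these are libprotobuf APIs.

// Reads a base-128 varint at data[pos] and advances pos.
// Returns false if the input ends early or the varint exceeds 10 bytes.
bool ReadVarint(const std::string& data, size_t& pos, uint64_t& value) {
  value = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (pos >= data.size()) return false;
    uint8_t b = static_cast<uint8_t>(data[pos++]);
    value |= static_cast<uint64_t>(b & 0x7F) << shift;
    if ((b & 0x80) == 0) return true;
  }
  return false;
}

// Reads a little-endian fixed-width integer (4 or 8 bytes) and advances pos.
bool ReadFixed(const std::string& data, size_t& pos, size_t width,
               uint64_t& value) {
  if (data.size() - pos < width) return false;
  value = 0;
  for (size_t i = 0; i < width; ++i) {
    value |= static_cast<uint64_t>(static_cast<uint8_t>(data[pos + i]))
             << (8 * i);
  }
  pos += width;
  return true;
}

// Escapes quotes, backslashes and non-printable bytes as octal sequences.
std::string Escape(const std::string& s) {
  std::string result;
  char buf[5];
  for (unsigned char c : s) {
    if (c >= 0x20 && c < 0x7F && c != '"' && c != '\\') {
      result += static_cast<char>(c);
    } else {
      std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(c));
      result += buf;
    }
  }
  return result;
}

// Appends a text dump of `data` to `out` and returns true, or returns false
// (leaving `out` untouched) if `data` is not a well-formed message.
bool DumpUnknown(const std::string& data, int indent, std::string& out) {
  std::string result;
  const std::string pad(indent * 2, ' ');
  size_t pos = 0;
  while (pos < data.size()) {
    uint64_t tag = 0, value = 0;
    if (!ReadVarint(data, pos, tag)) return false;
    const uint64_t field = tag >> 3;
    if (field == 0) return false;  // Field number 0 is never valid.
    result += pad + std::to_string(field);
    switch (static_cast<int>(tag & 7)) {
      case 0:  // varint
        if (!ReadVarint(data, pos, value)) return false;
        result += ": " + std::to_string(value) + "\n";
        break;
      case 1:  // 64-bit fixed
        if (!ReadFixed(data, pos, 8, value)) return false;
        result += ": " + std::to_string(value) + "\n";
        break;
      case 5:  // 32-bit fixed
        if (!ReadFixed(data, pos, 4, value)) return false;
        result += ": " + std::to_string(value) + "\n";
        break;
      case 2: {  // length-delimited: apply the heuristic
        if (!ReadVarint(data, pos, value)) return false;
        if (value > data.size() - pos) return false;
        const std::string payload =
            data.substr(pos, static_cast<size_t>(value));
        pos += payload.size();
        std::string nested;
        if (!payload.empty() && DumpUnknown(payload, indent + 1, nested)) {
          result += ":\n" + nested;  // Parsed cleanly: print as a message.
        } else {
          result += ": \"" + Escape(payload) + "\"\n";  // Print as a string.
        }
        break;
      }
      default:  // Groups and unknown wire types are rejected in this sketch.
        return false;
    }
  }
  out += result;
  return true;
}
```

Note that this sketch requires the whole payload to be consumed before it treats a field as a nested message. It remains a heuristic: a string or bytes field whose contents happen to form a valid encoding will still be printed as a message.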
--
_________________________________________
Dilip Antony Joseph
Graduate Student
Computer Science Division,
University of California, Berkeley
http://www.cs.berkeley.edu/~dilip
I could fix this problem by adding an input->ConsumedEntireMessage()
check in WireFormat::SkipMessage() [code at end of email]. I couldn't
find documentation for the return value semantics of SkipMessage().
Is this an acceptable change? Am I missing some other way to use
WireFormat to parse a message into an UnknownFieldSet?
I will send the TextFormat patch as soon as the above issue is resolved.
Regards
Dilip
<code>
bool WireFormat::SkipMessage(io::CodedInputStream* input,
                             UnknownFieldSet* unknown_fields) {
  while (true) {
    uint32 tag = input->ReadTag();
    if (tag == 0) {
      // End of input. This is a valid place to end, so return true.
      return true;
    }
    WireType wire_type = GetTagWireType(tag);
    if (wire_type == WIRETYPE_END_GROUP) {
      // Must be the end of the message.
      if (!input->ConsumedEntireMessage()) return false;  // ADDED by Dilip
      return true;
    }
    if (!SkipField(input, tag, unknown_fields)) return false;
  }
}
</code>
Thanks for the info.
Here is the patch that uses the heuristic you suggested to display
numeric key-value pairs for embedded fields, when the .proto is not
available. I have added Print() functions which take in just the
message bytes and length, similar to the existing ones that take in
Message/Reflection/Descriptor objects.
I have submitted the online individual contribution license form.
Regards
Dilip
Thanks, Dilip. Can you upload this to http://codereview.appspot.com/ and send it to me as a code review? At Google we require that every change be reviewed by another engineer before submission.