Text-oriented dump/restore utility for Protocol Buffers


jimthoma...@yahoo.com

Jul 8, 2008, 6:32:30 PM
to Protocol Buffers
I experimented with PB today and see real potential for using it for
configuration storage in an embedded appliance. PB appears to be much
simpler and lighter-weight than an SQL-based solution, and the proto
file maps naturally onto C structs, including nested structs. Very
nice.

Are there any plans for a general purpose tool that can parse a proto
file and associated PB binary file, and dump the data in human
readable formatted text? Perhaps such that the dumped text can be run
through a restore tool to recreate the PB binary?

These sorts of dump/restore tools give you the warm fuzzies that you
have control of your high-value data during migration or backup, without
having to rely on an opaque binary file that is hard to repair if it
gets corrupted. You could save and read the text file as needed,
'fix up' the text in an editor (observing all structural
constraints), and regenerate the PB binary from the manually modified
text file.

Each PB user could write his own text-oriented dump/restore utility
using data-aware logic, but that can be time-consuming and error-prone
(mainly errors of omission). If a general-purpose dump/restore tool is
practical, it would be very useful. Granted, it may not be
practical given the endless ways a proto file can be
composed -- but heck, the SQL people do it. :-)

Thanks for the excellent open source contribution.

Jim

Kenton Varda

Jul 8, 2008, 7:16:37 PM
to jimthoma...@yahoo.com, Protocol Buffers
On Tue, Jul 8, 2008 at 3:32 PM, <jimthoma...@yahoo.com> wrote:
Are there any plans for a general purpose tool that can parse a proto
file and associated PB binary file, and dump the data in human
readable formatted text?  Perhaps such that the dumped text can be run
through a restore tool to recreate the PB binary?

All the right components of this already exist -- check out the TextFormat class, which can print and parse protocol buffers in a human-readable format (the Python one only does printing at the moment but we're working on that).  This class is used to implement the DebugString(), toString(), and __str__() methods of the C++, Java, and Python message interfaces, respectively.

I actually intended to add some flags to the protocol compiler so that you can use it to convert between text and binary formats, but I didn't get a chance to do this before release.  Will probably come soon, though.
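
The dump/restore round trip described above can be sketched in Python. This is a minimal illustration, not the thread authors' code: it assumes a current release of the `protobuf` Python package (where `text_format` supports parsing as well as printing), and it uses `DescriptorProto`, a message type that ships with the library, only so that no generated code is needed. The names `ApplianceConfig` and `hostname` are made up for the example.

```python
# Assumes the `protobuf` Python package is installed (pip install protobuf).
from google.protobuf import descriptor_pb2, text_format

# DescriptorProto ships with the library, so no .proto compilation is needed.
original = descriptor_pb2.DescriptorProto()
original.name = "ApplianceConfig"   # hypothetical value for illustration
field = original.field.add()
field.name = "hostname"             # hypothetical value for illustration
field.number = 1

binary = original.SerializeToString()         # compact binary wire format
text = text_format.MessageToString(original)  # human-readable dump

# "Restore": parse the (possibly hand-edited) text back into a message.
restored = text_format.Parse(text, descriptor_pb2.DescriptorProto())

print(text)
```

The text form can be edited by hand before the `Parse` step, as long as the edits still match the message's schema.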

jimthoma...@yahoo.com

Jul 9, 2008, 1:52:04 PM
to Protocol Buffers
Glad to hear this is already doable via TextFormat, and that a bundled
tool might be on the roadmap.

I am thinking of using PB cross-built for an ARM9 embedded target, so
ideally the pb2text/text2pb conversion utility would have a small
footprint suitable for deployment on a flash-based filesystem with
limited space.

I would use the protocol compiler with flags, but that sounds like it
may be a bit heavy for the target system. Is there any thought of a
smaller, standalone utility?

Thanks.

Jim

Kenton Varda

Jul 9, 2008, 2:15:26 PM
to jimthoma...@yahoo.com, Protocol Buffers
On Wed, Jul 9, 2008 at 10:52 AM, <jimthoma...@yahoo.com> wrote:
I would use the protocol compiler with flags, but that sounds like it
may be a bit heavy for the target system.  Is there any thought of a
smaller, standalone utility?

You could very easily write such a utility based on the existing TextFormat class.
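
A standalone converter along these lines might look like the sketch below. It is written in Python for brevity and assumes a current `protobuf` Python package; on a constrained target the same structure would more likely be built on the C++ `TextFormat` class. The function names `pb2text` and `text2pb` are hypothetical (borrowed from the wording earlier in the thread), and the caller is assumed to pass in the generated message class for the file being converted.

```python
# Assumes the `protobuf` Python package; pb2text/text2pb are hypothetical names.
from google.protobuf import text_format


def pb2text(message_class, binary_path, text_path):
    """Dump a binary-encoded message file as human-readable text."""
    message = message_class()
    with open(binary_path, "rb") as f:
        message.ParseFromString(f.read())
    with open(text_path, "w", encoding="utf-8") as f:
        f.write(text_format.MessageToString(message))


def text2pb(message_class, text_path, binary_path):
    """Rebuild the binary file from (possibly hand-edited) text."""
    message = message_class()
    with open(text_path, encoding="utf-8") as f:
        text_format.Parse(f.read(), message)
    with open(binary_path, "wb") as f:
        f.write(message.SerializeToString())
```

Because the message class is a parameter, the same two functions work for any generated message type; only the caller needs to know the schema.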

Helder Suzuki

Aug 8, 2008, 12:08:34 AM
to Protocol Buffers
Hi Kenton,

I was trying to find out how to parse the text format into a PB in Python, and I
noticed it's not implemented... I was just going to start working on
it!
It's good to know it's already being worked on. Are there any plans
for a release?

Thanks, Helder


Kenton Varda

Aug 8, 2008, 12:49:59 PM
to Helder Suzuki, Petar Petrov, Protocol Buffers
On Thu, Aug 7, 2008 at 9:08 PM, Helder Suzuki <helder...@gmail.com> wrote:
Hi Kenton,

I was trying to find out how to parse the text format into a PB in Python, and I
noticed it's not implemented... I was just going to start working on
it!
It's good to know it's already being worked on. Are there any plans
for a release?

I think this is on Petar's todo list, but I'm sure he'd love it if you did it for him.  :)

Helder Suzuki

Aug 12, 2008, 7:10:27 AM
to Kenton Varda, Petar Petrov, Protocol Buffers
Awesome!
I'll send the code for review soon...
Thanks, Helder

Helder Suzuki

Aug 16, 2008, 10:31:45 PM
to Kenton Varda, Petar Petrov, Protocol Buffers
Hi Kenton and Petar,

I won't be able to work on the Python text format parser this weekend
(I'm flying back to Brazil!), so I'm sending what I've done so far so
that you can start the code review.
There isn't much of a specification for the text format, so I'm using
Kenton's C++ implementation + test cases as the spec :-) A lot of the code
and test cases are basically a translation from C++ to Python, so it
should be easier to review.
The test cases pass on my Mac. I haven't tested on Windows,
but they should work there too.

Thanks,
Helder

tokenizer_diff.txt
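
The attached diff is not reproduced in this thread. For readers unfamiliar with what a text format tokenizer does, here is a hypothetical minimal sketch using only the Python standard library. It is not Helder's code and does not follow the C++ tokenizer's grammar: the token classes are an assumed, simplified subset (no hex literals, no special handling of a standalone minus sign), and the field names in the sample are made up.

```python
import re

# Token classes for a simplified subset of the text format (an assumption,
# not the official grammar).
_TOKEN_RE = re.compile(r"""
    (?P<skip>\s+|\#[^\n]*)                                # whitespace, comments
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)                     # field and enum names
  | (?P<number>-?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?) # ints and floats
  | (?P<string>"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*')   # quoted strings
  | (?P<symbol>[{}<>\[\]:,;])                             # punctuation
""", re.VERBOSE)


def tokenize(text):
    """Yield (kind, lexeme) pairs; raise ValueError on unexpected input."""
    pos = 0
    while pos < len(text):
        match = _TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup != "skip":  # drop whitespace and comments
            yield match.lastgroup, match.group()


sample = '''
# appliance settings (hypothetical fields)
hostname: "box-1"
network { port: 8080 }
'''
tokens = list(tokenize(sample))
```

A parser would then consume these tokens and set fields on a message, which is where the schema from the proto file comes in.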

Kenton Varda

Aug 18, 2008, 4:22:31 PM
to Helder Suzuki, Petar Petrov, Protocol Buffers
Thanks, Helder.  Can you send this to Petar and me using codereview.appspot.com?