Protocol Buffers

Keith Rarick

Jul 12, 2008, 7:49:23 PM

to beansta...@googlegroups.com

For those of you who haven't seen this:

http://google-opensource.blogspot.com/2008/07/protocol-buffers-googles-data.html
http://code.google.com/p/protobuf/
http://code.google.com/apis/protocolbuffers/docs/overview.html

I'm thinking of completely ditching the ad hoc ASCII protocol and
using protocol buffers for beanstalkd 2.0. That's a ways off, so we
have plenty of time to discuss this possibility.

It currently has C++, Java, and Python libraries. There are no C,
Erlang, Ruby, or Perl libraries yet, but I assume most of these will
appear before we need them.

Pros of switching:
* speed of parsing/encoding,
* flexibility when communicating with older/newer protocol versions,
* simpler code.

Cons of switching:
* unavailability will hurt languages that aren't super-mainstream,
* can't use telnet to talk to beanstalkd any more.

If no one decides to write Erlang bindings for protocol buffers, then
it'll be mighty hard to have a beanstalk client in Erlang. Same goes
for any other language. I want the barrier to be very low for new
languages talking to beanstalkd. So this is practically a
deal-breaker.

Telnet is also almost a deal-breaker. I use telnet ALL THE TIME when
working with beanstalkd, and not just in debugging. I would absolutely
need some sort of debugging tool to let me talk to a
protocol-buffers-using server interactively.

Hopefully these problems will be solved somehow, but there are likely
more points on both sides that I'm missing.

What say you?

kr

Dustin

Jul 12, 2008, 11:33:53 PM

to beanstalk-talk

I did the binary protocol implementation for memcached (plus a
production-ready Java client and an at least demonstrable Python
client). We had a few of the same goals, but things like protocol
buffers and Thrift weren't available when that was happening.

We plan on keeping both protocols around for a while. *Currently*,
they both operate on the same port and there's autonegotiation to
figure out which protocol to use. I thought that was kind of a dumb
idea, but it was easier to just implement it than it was to argue with
everyone about why I was requiring another port.
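
A minimal sketch of how same-port detection of this kind could work, not the actual memcached code. It relies on one fact about the memcached binary protocol: every request starts with the magic byte 0x80, which never begins an ASCII command. The function name detect_protocol is hypothetical.

```python
# First byte of every memcached binary-protocol request packet.
BINARY_REQUEST_MAGIC = 0x80


def detect_protocol(first_bytes: bytes) -> str:
    """Guess which protocol a client speaks from the first byte it sends.

    Hypothetical helper for illustration; a real server would peek at the
    socket buffer before handing the connection to a parser.
    """
    if not first_bytes:
        raise ValueError("need at least one byte to detect the protocol")
    if first_bytes[0] == BINARY_REQUEST_MAGIC:
        return "binary"
    # ASCII commands start with a printable letter such as 'g' or 's'.
    return "ascii"
```

The same trick would only carry over to beanstalkd if the first byte of the new protocol could never start a valid text command.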

Writing clients is fun. I suppose I do use telnet a lot with
beanstalk and memcached, but I also write little scripts for
things that are too hard to do over telnet (such as flipping
through x0,000 jobs to remove duplicates while burying jobs that match
a certain pattern).
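
A rough sketch of the decision logic such a script might use, assuming the caller has already reserved the jobs and collected them as (job_id, body) pairs. The function name plan_actions and the default bury priority are made up for illustration. The delete and bury command lines follow the existing beanstalkd text protocol.

```python
import re


def plan_actions(jobs, bury_pattern, bury_priority=1024):
    """Return the beanstalkd commands needed to clean up a batch of jobs.

    jobs: iterable of (job_id, body) pairs, with body as a str.
    Jobs whose body matches bury_pattern are buried; later jobs whose body
    repeats one already seen are deleted; everything else is left alone.
    """
    pattern = re.compile(bury_pattern)
    seen_bodies = set()
    commands = []
    for job_id, body in jobs:
        if pattern.search(body):
            commands.append(f"bury {job_id} {bury_priority}\r\n")
        elif body in seen_bodies:
            commands.append(f"delete {job_id}\r\n")
        else:
            seen_bodies.add(body)
    return commands
```

Each returned string can be written to the beanstalkd socket as-is. What to do with the jobs that are kept (release them, or hold them until the pass is finished) is left to the caller.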

On Jul 12, 4:49 pm, "Keith Rarick" <k...@causes.com> wrote:
> For those of you who haven't seen this:
>

> http://google-opensource.blogspot.com/2008/07/protocol-buffers-google...
> http://code.google.com/p/protobuf/
> http://code.google.com/apis/protocolbuffers/docs/overview.html

Tim Fletcher

Jul 13, 2008, 7:29:50 AM

to beanstalk-talk

> If no one decides to write Erlang bindings for protocol buffers, then
> it'll be mighty hard to have a beanstalk client in Erlang.

Not an issue. I've got the wire protocol decoding working in Erlang
(will try and tidy up the code and make it available soon).
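
For anyone curious how small the wire-format decoding is, here is a sketch in Python (not the Erlang code mentioned above). It assumes only the documented encoding: each field starts with a varint key equal to (field_number << 3) | wire_type, and varints store 7 bits per byte, least-significant group first, with the high bit marking continuation. The function names are hypothetical, and the deprecated group wire types are not handled.

```python
def decode_varint(buf, pos=0):
    """Decode one base-128 varint at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not (byte & 0x80):             # high bit clear: last byte
            return result, pos
        shift += 7


def decode_fields(buf):
    """Return a list of (field_number, wire_type, value), with no schema."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                # varint
            value, pos = decode_varint(buf, pos)
        elif wire_type in (1, 5):         # fixed 64-bit / fixed 32-bit
            size = 8 if wire_type == 1 else 4
            value, pos = buf[pos:pos + size], pos + size
        elif wire_type == 2:              # length-delimited (strings, bytes)
            size, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + size], pos + size
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
        fields.append((field_number, wire_type, value))
    return fields
```

Fixed-width and length-delimited values are returned as raw bytes, since interpreting them needs the message definition.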

Perl and Ruby bindings seem to be in the works as well:

http://groups.google.com/group/protobuf-perl
http://groups.google.com/group/protobuf/browse_thread/thread/e02a20fb1034dca6

Tim Fletcher

Jul 14, 2008, 12:02:00 PM

to beanstalk-talk

> Not an issue. I've got the wire protocol decoding working in Erlang
> (will try and tidy up the code and make it available soon).

http://github.com/tim/erlang-protobuf/tree/master

Erich

Jul 14, 2008, 1:37:29 PM

to beanstalk-talk

On Jul 12, 6:49 pm, "Keith Rarick" <k...@causes.com> wrote:

> Pros of switching:
> * speed of parsing/encoding,
> * flexibility when communicating with older/newer protocol versions,
> * simpler code.
>

* fairly well understood and standard communication layer (at least by
  the time beanstalkd would move to it),
* could make both of the following easy:
  + beanstalk as a plugin to another system/framework/etc.,
  + beanstalk plugins (e.g. logging, persistence, memcache),
* persistence would be easier in general anyway.

> Cons of switching:
> * unavailability will hurt languages that aren't super-mainstream,
> * can't use telnet to talk to beanstalkd any more.
>

* introduces a new dependency. As it stands, the only real
  requirements for beanstalk are:
  1. libevent
  2. interpreter for your client of choice (python, ruby, erlang, et
     al.)
  3. yaml decoder for your lang, if you care about the stats and
     similar introspection tools. (I managed to get the python client
     sort of working with a fake yaml parser just to not break the
     import.)

* relies on a different framework's semantics (I don't know how big
  this con is or how real it is; I haven't had the time to really
  investigate gpb in depth yet)
* complicates debugging/testing. Personally, I've tcpdumped runs more
  than a couple of times to watch what was happening.

>
> Telnet is also almost a deal-breaker. I use telnet ALL THE TIME when
> working with beanstalkd, and not just in debugging. I would absolutely
> need some sort of debugging tool to let me talk to a
> protocol-buffers-using server interactively.
>

I like the telnet-ability of beanstalkd, but I'm sure something could
be made rather quickly in python/ruby/$FAV_SCRIPT_LANG to pretend
you're telnetting. Just a thought...

> What say you?

Overall, I am -0 on this. The slight oppositions I have are:
1. I prefer simple text based protocols
2. Will it actually be faster?

On the other hand, I am all for standardizing payload on protocol
buffers, as well as the various bits of info. When I say
standardizing payload, I mean that it would be the recommended payload
format, not strictly required. This way client writers could add some
tools to make the whole thing work slightly nicer (automatic object
serialization, etc.).

Regards,
Erich
