Re: RPC using google protobuffers

Johan Euphrosine

Jul 19, 2008, 6:31:38 AM

to Frederik M.J.V., Protocol Buffers

Forwarding to the list for public feedback and discussion:

What do you think of the idea of defining the RPC protocol in a
".proto" file?

And having each (protobuf) message prefixed by its length in the wire
protocol?

That way one can use an already existing protocol like NetString or
Int32StringPrefixed to read messages one by one, and decode each
message using protobuf to dispatch the call (or the answer).
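
To make the length-prefix idea concrete, here is a minimal sketch of framing and unframing. It assumes a 4-byte unsigned prefix in network byte order; the prefix size, byte order, and the names `frame` and `unframe` are illustrative assumptions, not part of any agreed specification.

```python
import struct

# Assumption: 4-byte unsigned length prefix, network byte order.
PREFIX = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix one serialized message with its length."""
    return PREFIX.pack(len(payload)) + payload


def unframe(buffer: bytes):
    """Split a receive buffer into complete messages plus leftover bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= PREFIX.size:
        (length,) = PREFIX.unpack_from(buffer, offset)
        end = offset + PREFIX.size + length
        if end > len(buffer):
            break  # the last message has not fully arrived yet
        messages.append(buffer[offset + PREFIX.size:end])
        offset = end
    return messages, buffer[offset:]


# Two complete messages followed by a truncated third one.
stream = frame(b"first") + frame(b"second") + frame(b"partial")[:6]
complete, rest = unframe(stream)
```

The payloads would be serialized protobuf messages in practice; plain byte strings stand in for them here so the sketch has no dependencies.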

On Fri, Jul 18, 2008 at 5:17 PM, Johan Euphrosine <pro...@aminche.com> wrote:
> Sure, I'll take a look at implementing it next week.
>
> I'm looking forward to using the same 'format codes' you did,
> instead of doing some protobuf nesting I don't feel very comfortable with:
> http://bazaar.launchpad.net/~proppy/txprotobuf/master/annotate/23?file_id=txprotobuf.proto-20080718121617-wa53tea90747nfwb-1
>
> On 7/18/08, Frederik M.J.V. <fre...@iskrembilen.com> wrote:
>> I am creating an RPC protocol using Google protocol buffers, and I have made an
>> implementation for Java. I see that you have made an implementation for Python
>> and wondered if we could cooperate on the protocol to make the RPC
>> implementations work with each other.
>>
>> I have documented the protocol I use at
>> http://iskrembilen.com/freqmod/web/Protorpcdoc.html , but am open to
>> suggestions and improvements. My Java implementation is available from
>> http://kdemod.iskrembilen.com/git/gitweb.cgi?p=freqmod/protorpc/.git;a=summary
>>
>> --
>>
>>
>> Frederik M.J.V
>>
>
>
> --
> bou ^
>

--
bou ^

fre...@gmail.com

Jul 19, 2008, 8:46:56 AM

to Protocol Buffers

I think that using protocol buffers might be an alternative to the
format codes. It would certainly be more flexible. I have made a draft
specification based on my original specification and put it on
http://protorpc.likbilen.com/Protorpcdocprotobuf.html

The only downside I can see with this is that it might add slightly
more overhead.

Marc Gravell

Jul 19, 2008, 10:08:03 AM

to Protocol Buffers

Is that the right link? Isn't that the format-code spec? Or did I
misunderstand?

Personally I like the idea of using .proto to describe the wire format...
it would be even better if (separate discussion) the different methods
had tags (in addition to names), as then the entire thing could be
treated as a regular .proto fragment - except we need to know the
length at the start. That is part of why I was suggesting "bytes" for
payloads, since that uses the string wire-type and is thus length
prefixed (allowing the server to refuse a request that is too big, for
DDoS reasons).
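
A rough sketch of that size check: a "bytes" field is encoded as a varint key (field number and wire type 2), then a varint length, then the payload, so a server can read the first two and refuse before touching the payload. The field number 9, the 1 MiB limit, and the helper names are assumptions made only for this illustration.

```python
MAX_PAYLOAD = 1 << 20  # hypothetical limit: 1 MiB
LENGTH_DELIMITED = 2   # wire type shared by string, bytes and embedded messages


def read_varint(data: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def payload_length(header: bytes) -> int:
    """Read the key and length of a length-delimited field, enforcing the limit."""
    key, pos = read_varint(header, 0)
    if key & 0x07 != LENGTH_DELIMITED:
        raise ValueError("field is not length-delimited")
    length, _ = read_varint(header, pos)
    if length > MAX_PAYLOAD:
        raise ValueError("request too large: %d bytes" % length)
    return length


# Hypothetical field number 9 with wire type 2 gives key byte (9 << 3) | 2 = 0x4A.
small = bytes([0x4A, 0x05]) + b"hello"
huge = bytes([0x4A, 0x80, 0x80, 0x80, 0x01])  # declares 2 MiB; payload never read
```

`payload_length(small)` returns the declared size, while `payload_length(huge)` raises before any payload bytes have to be buffered.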

One thought, though: it might make good sense to make provision (from
the outset) for metadata exchange; meaning that there is a standard
message that says "give me your .proto", i.e. it returns the binary
compiled form of the .proto definition that describes the types used
and the services supported. The .proto equivalent of a wsdl, if you
like...

It would obviously be optional whether a server supports mex, but it
might make tooling/interop a lot easier if you can simply point your
developer tool at your known endpoint. Arguably another option is to
simply hand callers your .proto (either text or compiled). Just a
thought...

Marc

fre...@gmail.com

Jul 19, 2008, 10:25:31 AM

to Protocol Buffers

http://protorpc.likbilen.com/Protorpcdocprotobuf.html is based on
protobuffers,
http://protorpc.likbilen.com/Protorpcdoc.html is the old spec.

You could add some more IDs to the enum to support service exchange
and validation (validating that the service on the other side is based
on the same service that the stub uses), but I can't see any point in
sending the whole service, as it has to be compiled etc. (at least for
Java and C++).

I did not quite understand what you meant by tags, but I don't want the
RPC protocol (and its protos) to depend on which messages
are passed.

Marc Gravell

Jul 19, 2008, 11:15:13 AM

to Protocol Buffers

> but i can't see any point in
> sending the whole service as it has to be compiled etc. (at least for
> java and c++).

I just mean the binary form of the service descriptor; it would only
be sent a: on request, and b: if the server supports it.
This is for tooling, not generally for runtime (although that usage
perhaps isn't precluded); and there is no point making such tools have
to parse a text file when the binary version is available (format
defined by descriptor.proto). Maybe this is something for separate
discussion, but it is very useful for a service to be self-describing
- but to be workable the caller needs an agreed way of asking "what
can you do?". I might be wrong, but can't the java and c++ versions
generate code *from* the binary version of the descriptor? That would
be a logical way to write such a tool...

> I did not quite understand what you meant by tags

The "ids" discussion you started here:
http://groups.google.com/group/protobuf/browse_thread/thread/4ce58011263f5963
The language and encoding documents refer to the field identifiers as
"tags". I was just using the same terminology.
I didn't understand your second sentence, though - can you explain
what dependency it is that you don't want?

Marc

fre...@gmail.com

Jul 19, 2008, 11:48:35 AM

to Protocol Buffers

I tried to understand what "tag" meant, but now I see. Btw, the parser
uses "number" internally to refer to the tags (i.e.
getNumber()).

The dependency point was that the RPC framework should be a library,
and its protos should be independent of the information sent
using it. I wrote that when I tried, and failed, to understand what you
meant by tags.

I have created two messages to facilitate it, described
below; improvements and suggestions are welcome as always:

PROTOBUF_REQUEST message:

The PROTOBUF_REQUEST message is sent when the other party wants to know
what the protobuffer describing this server looks like.

The message is sent like this:
msg_len(2),Message{code=PROTOBUF_REQUEST}

PROTOBUF_RESPONSE message:

The PROTOBUF_RESPONSE message is sent as an answer to the
PROTOBUF_REQUEST; if the protobuffer is unavailable, the buffer field
will be empty.

The message is sent like this:
msg_len(2),Message{code=PROTOBUF_RESPONSE,buffer=the protobuffer or empty}

Marc Gravell

Jul 19, 2008, 12:44:13 PM

to Protocol Buffers

I'll be honest, I can't visualise (from that description) what the
data on the wire would look like... but also - re having enums for
request/response - isn't that largely implicit? If we're (as a client)
sending a message, isn't it a request? etc. I'm not an expert on RPC,
so please tell me if I'm being daft.

Something else worth considering; all those nasty things like
identity. Privacy can probably be deferred to the transport (ssl, ssh,
etc) - but it would be good to know who the caller is, for example
using something akin to http headers. But ideally part of the message
header, not the transport.

As an example, I'd like to be able to write an http-based .proto RPC
server (aside: mime type?), but only know about http[s] as an
implementation detail, leaving everything like identity to the proto
code. A sockets server would, apart from the transport layer, work
exactly the same.

Marc

fre...@gmail.com

Jul 20, 2008, 4:48:45 AM

to Protocol Buffers

Response message:
msg_len(2),Message{code=PROTOBUF_RESPONSE,buffer=the protobuffer or empty}
The buffer field just carries a string representation of the protobuffer; I would
suggest using the input to
com.google.protobuf.Descriptors.FileDescriptor.internalBuildGeneratedFileFrom

The reason for having message codes is in case we want bidirectional
messages over the same connection, so the server can push data and
the client doesn't need to poll (i.e. the stream protocol has no concept of
client and server, just peers).
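
A small sketch of why the codes matter once both sides are peers: the same handler has to tell an incoming call from an answer to one of its own calls. The enum values, the `Peer` class, and the tuple layout are hypothetical and only stand in for the real Message fields.

```python
from enum import IntEnum


class MessageType(IntEnum):
    # Hypothetical codes; the real values are defined by the specification.
    REQUEST = 1
    RESPONSE = 2


class Peer:
    """Handles both directions on one connection: no fixed client or server role."""

    def __init__(self, methods):
        self.methods = methods  # local methods the other peer may call
        self.results = []       # answers to calls this peer made

    def handle(self, code, name, body):
        if code == MessageType.REQUEST:
            # Incoming call: run the local method and build a response.
            return (MessageType.RESPONSE, name, self.methods[name](body))
        if code == MessageType.RESPONSE:
            # Answer to one of our own calls.
            self.results.append((name, body))
            return None
        raise ValueError("unknown message code: %r" % code)


a = Peer({"echo": lambda body: body})
b = Peer({"upper": lambda body: body.upper()})
reply = b.handle(MessageType.REQUEST, "upper", b"ping")  # a calls b
a.handle(*reply)                                         # b's answer reaches a
```

Either peer can start a call at any time, which is what lets a "server" push data without being polled.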

When it comes to security I have thought about three situations:

Use external security: e.g. a named pipe on top of ssh.

Use challenge-response: the server and the client each have a field that
the other party has to respond to (this doesn't provide encryption).

Use ssl: the connection is secured point to point by ssl, and the user
is authenticated in the protocol (e.g. by a login() method) or by
ssl certificates.

The ssl solution seems the simplest one. It has the advantage
that we don't have to deal with authentication-specific issues.

The reason for using ssl is to protect against middlemen and to avoid
sending any passwords over the wire in plain text.
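
For the challenge-response option, one common way to fill in the details is an HMAC over a random nonce. This is only a sketch under loud assumptions: the thread does not specify any algorithm, and the pre-shared secret, HMAC-SHA256, and the function names here are all illustrative choices.

```python
import hashlib
import hmac
import os


def make_challenge() -> bytes:
    """Random nonce that the verifying side sends to the other peer."""
    return os.urandom(16)


def answer(secret: bytes, challenge: bytes) -> bytes:
    """Prove knowledge of the shared secret without sending the secret itself."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def verify(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Check the peer's answer using a constant-time comparison."""
    return hmac.compare_digest(answer(secret, challenge), response)


challenge = make_challenge()
good = verify(b"shared secret", challenge, answer(b"shared secret", challenge))
bad = verify(b"shared secret", challenge, answer(b"wrong secret", challenge))
```

As noted above, this authenticates the peer but does not encrypt anything, so it does not replace ssl.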

Btw, it seems we have to use strings to identify methods, as Google
doesn't want to have method tags.

fre...@gmail.com

Jul 20, 2008, 4:50:34 AM

to Protocol Buffers

I could however drop the init codes now that we have protocol buffers
to ensure backwards compatibility.

fre...@gmail.com

Jul 20, 2008, 7:28:37 AM

to Protocol Buffers

I have implemented the protobuffer solution and pushed it to the
repository.
While implementing the PROTOBUF_RESPONSE (which I renamed to
DESCRIPTOR_RESPONSE) functionality, I realized that the way to
serialize descriptors had to be defined, and therefore I have updated
the specification ( http://protorpc.likbilen.com/Protorpcdocprotobuf.html ).

[CPR]-AL.exe

Jul 21, 2008, 3:54:15 AM

to Protocol Buffers

Johan Euphrosine,

Could you be so kind as to share your Python RPC implementation for
Protocol Buffers? That would be great!

On 19 Jul, 14:31, "Johan Euphrosine" <pro...@aminche.com> wrote:
> Forwarding to the list for public feedback and discussion:
>
> What do you think of the idea of defining the RPC protocol in a
> ".proto" file?
>
> And having each (protobuf) message prefixed by its length in the wire
> protocol?
>
> That way one can use an already existing protocol like NetString or
> Int32StringPrefixed to read messages one by one, and decode each
> message using protobuf to dispatch the call (or the answer).
>
> On Fri, Jul 18, 2008 at 5:17 PM, Johan Euphrosine <pro...@aminche.com> wrote:
> > Sure, I'll take a look at implementing it next week.
> >
> > I'm looking forward to using the same 'format codes' you did,
> > instead of doing some protobuf nesting I don't feel very comfortable with:
> > http://bazaar.launchpad.net/~proppy/txprotobuf/master/annotate/23?fil...
> >
> > On 7/18/08, Frederik M.J.V. <freq...@iskrembilen.com> wrote:
> >> I am creating an RPC protocol using Google protocol buffers, and I have made an
> >> implementation for Java. I see that you have made an implementation for Python
> >> and wondered if we could cooperate on the protocol to make the RPC
> >> implementations work with each other.
> >>
> >> I have documented the protocol I use at
> >> http://iskrembilen.com/freqmod/web/Protorpcdoc.html, but am open to
> >> suggestions and improvements. My Java implementation is available from
> >> http://kdemod.iskrembilen.com/git/gitweb.cgi?p=freqmod/protorpc/.git;...

Johan Euphrosine

Jul 21, 2008, 4:38:49 AM

to [CPR]-AL.exe, Protocol Buffers

Sure,
I already announced it here:
http://groups.google.com/group/protobuf/browse_thread/thread/8973583ec1dc5a8f/8c650c653cae7f7d?lnk=gst&q=txprotobuf#8c650c653cae7f7d

You can clone it using bazaar:
bzr branch lp:txprotobuf

2008/7/21 [CPR]-AL.exe <CPR.A...@gmail.com>:

--
bou ^

[CPR]-AL.exe

Jul 21, 2008, 4:48:19 AM

to Protocol Buffers

Oh, found here: http://bazaar.launchpad.net/~proppy/txprotobuf/master/files

Thank you, Johan ;)

Marc Gravell

Jul 21, 2008, 5:43:50 AM

to Protocol Buffers

Re your suggested .proto for RPC: can I suggest an addition? It feels
that even if we don't need them now, support for headers is a "must"
for the future; so perhaps bump the buffer tag a bit? Heck, why not 8?
That way we have space before and after, still in the single-byte
zone.

That way, at a later date we can add an "optional MessageHeader header
= 6" or whatever, which could define (for example) "optional bytes
authToken = 1" etc. So common, known headers can be given specific low
ids, with custom headers having to settle for high ids, or possibly
even name/value pairs.

But my main point here: headers should precede the body, and it would
be nice to be able to send a message with the tags in the correct
(ascending) sequence.

Does that make sense?

Marc

Marc Gravell

Jul 21, 2008, 6:03:07 AM

to Protocol Buffers

I replied a few minutes ago but I think it got lost ;-(

I was saying that you might want to bump the body tag up a bit, to
leave space for a future "headers" tag (with a lower index). As long
as everything is <=15 that doesn't cost anything, but gives us the
opportunity to insert a headers type representing known headers
(treating custom headers either as extensions with higher tag numbers,
or as name/value pairs). But in particular, I can predict the need to
have (in the non-existent "MessageHeaders" object) something like an
"optional bytes authToken=1" etc.

Either way, any future headers really need to precede the body, and it
would be nice to send the tags in ascending order...
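
The "<=15 costs nothing" point follows from how a field key is encoded: the key is the varint of (field_number << 3) | wire_type, and a varint byte carries 7 bits of payload. A quick sketch (the helper name `key_size` is made up for this illustration):

```python
def key_size(field_number: int, wire_type: int = 2) -> int:
    """Number of bytes the varint-encoded field key occupies on the wire."""
    key = (field_number << 3) | wire_type
    size = 1
    while key >= 0x80:  # each varint byte holds 7 bits of the key
        key >>= 7
        size += 1
    return size


sizes = {n: key_size(n) for n in (1, 8, 9, 15, 16)}
```

Field numbers 1 through 15 all fit in one key byte, and 16 is the first to need two, so moving the body to tag 8 or 9 leaves room for headers without adding bytes.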

Marc

fre...@gmail.com

Jul 21, 2008, 1:12:15 PM

to Protocol Buffers

Is

message Message {
  optional Type type = 1;
  optional uint32 id = 2;
  optional buffer auth = 3;
  optional buffer header = 4;
  optional string name = 5;

  optional bytes buffer = 9;
}

, where the header field is currently unused, ok?

If not, please post a suggestion for what you want the "Message" proto
part to look like.

Marc Gravell

Jul 21, 2008, 3:14:11 PM

to Protocol Buffers

(should "buffer" [3 & 4] be "bytes"?)
Actually, until we know what it looks like I propose we don't put
anything there... just leave a gap in the tags so that we /can/
later...
What is the id here? I'd also suggest a more descriptive name than
"Type", which will cause issues in a few languages...
Re the name "buffer" [9] - suggest something more descriptive?

Isn't the name "required" for dispatch? Something like:

message Message {
  optional MessageType type = 1;
  optional uint32 id = 2;   // remind me what this is?
  // gap for future
  required string name = 5; // required for dispatch?

  optional bytes body = 9;
}

Marc

fre...@gmail.com

Jul 21, 2008, 4:53:08 PM

to Protocol Buffers

OK, a good suggestion; I'll use that if nobody else has any comments.

message Message {
  optional MessageType type = 1; // do you want to rename the field too,
                                 // and do you have any suggestions for a field name?
  optional uint32 id = 2;        // to track which response answers which
                                 // request in asynchronous communication
  required string name = 5;      // name of the method; required for requests,
                                 // as Google doesn't want to implement number tags on methods

  optional bytes body = 9;
}

The id is assigned when a request is created, and every reply to that
request (response, canceled, failed etc.) should send that id back to
make sure that the reply reaches the right caller when doing
asynchronous communication.
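
A minimal sketch of that id bookkeeping, assuming callbacks as the delivery mechanism; the `PendingCalls` class and its method names are hypothetical, not taken from either implementation.

```python
import itertools


class PendingCalls:
    """Tracks outstanding requests so that replies reach the right caller."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._callbacks = {}

    def register(self, callback) -> int:
        """Assign an id to a new request and remember who is waiting for it."""
        request_id = next(self._next_id)
        self._callbacks[request_id] = callback
        return request_id

    def resolve(self, request_id: int, body: bytes) -> None:
        """Deliver a reply (response, failure, cancel) to the original caller."""
        callback = self._callbacks.pop(request_id)
        callback(body)


received = []
pending = PendingCalls()
first = pending.register(lambda body: received.append(("first", body)))
second = pending.register(lambda body: received.append(("second", body)))

# Replies may arrive in any order; the id routes each one correctly.
pending.resolve(second, b"B")
pending.resolve(first, b"A")
```

Popping the entry on resolve means a duplicate or unknown id raises `KeyError` instead of silently reaching the wrong caller.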

In any case the names are not important as the method buffers will be
wire compatible (AFAIK) as long as the fields have the same numbers
and types.

Marc Gravell

Jul 21, 2008, 5:00:10 PM

to Protocol Buffers

> In any case the names are not important

Agreed; the runtime will be fine - but it makes life easier for us
mere mortals ;-p

Thanks for the additional info on id - that makes sense. And actually
[d'oh!] for some message-types maybe there *isn't* a name - so perhaps
it is optional after all? (you know more about your intention for
MessageType...).

Marc

fre...@gmail.com

Jul 22, 2008, 4:42:40 PM

to Protocol Buffers

Name is only necessary for REQUEST; everything else uses the id or
is method independent.

Btw, the first release of pyrorpc protorpc for Python has been pushed to the git
repository (linked from protorpc.likbilen.com). This version does not
support sockets, but uses an internal file emulation for testing. It
requires a Python implementation patched for the "method not
implemented" error. Expect socket support soon.

Johan Euphrosine

Jul 25, 2008, 11:48:55 AM

to fre...@gmail.com, Protocol Buffers

Hi,

I've begun implementation of your rpc protocol in a new branch:

http://bazaar.launchpad.net/~proppy/txprotobuf/freqmod/files

Tests are passing with these modifications:
http://bazaar.launchpad.net/~proppy/txprotobuf/freqmod/revision/37
http://bazaar.launchpad.net/~proppy/txprotobuf/freqmod/revision/38

Note that for now it uses an Int32 prefix; yours is using Int16, right?

You can pull the code with:
bzr branch lp:~proppy/txprotobuf/freqmod

--
bou ^

fre...@gmail.com

Jul 25, 2008, 12:46:30 PM

to Protocol Buffers

On Jul 25, 5:48 pm, "Johan Euphrosine" <pro...@aminche.com>

> Note that for now it uses Int32 prefix, yours is using Int16 right ?

OK, we can standardize on uint32 (little endian); I'll update my
specification and code.

If you want to use my code and try to get it included in Twisted, I can
make the _python_ version available under the MIT license (just send
me a mail or post a message and I will put it in the git repository).

Glenn Tarbox

Jul 25, 2008, 1:05:36 PM

to fre...@gmail.com, Protocol Buffers

On Fri, 25 Jul 2008 09:46:30 -0700, <fre...@gmail.com> wrote:

> On Jul 25, 5:48 pm, "Johan Euphrosine" <pro...@aminche.com>
>
>> Note that for now it uses Int32 prefix, yours is using Int16 right ?
>
> Ok we can standardize on uint32 (little endian), i'll update my
> specification and code.

I'd like to verify: I think they're the same, but it's the "network"
encoding for a 32-bit unsigned integer we're standardizing on. Java uses
that encoding natively. Python requires the "!" prefix in the pack format:

i=pack("!I",length)

I don't know what C++ requires to disambiguate, but I'm sure it's in there.
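
For reference, a short sketch of what the two candidate encodings produce with Python's `struct` module; the value 300 is just an arbitrary example length.

```python
import struct

length = 300  # arbitrary example; 300 == 0x012C

network = struct.pack("!I", length)  # network byte order (big endian)
little = struct.pack("<I", length)   # little endian, for comparison

# Decoding must use the same format character that was used for encoding.
(decoded,) = struct.unpack("!I", network)
```

The two byte strings are mirror images of each other, so both ends have to agree on one format before the prefix can be read correctly.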

>
> If you want to use my code and try to get it included in Twisted, I can
> make the _python_ version available under the MIT license (just send
> me a mail or post a message and I will put it in the git repository).

I've been thinking a bit about all this. It seems to me that it wouldn't
be hard to go one step further and venture into true distributed objects.
I reference the Twisted-related Foolscap project:

http://foolscap.lothar.com/trac

which is really the next version of Twisted's PB architecture. I'm not
entirely sure of the history, but I think the author got a job, so the
projects are somewhat disjoint. Still, it seems clear that the intent is to
merge it all with Twisted (but I don't actually know).

I've been using Foolscap and it's very solid. The notion of the Tub is
also very useful in terms of object management semantics and could be
"easily" (noting that nothing is easy) implemented in the "usual"
languages.

The one needed extension is the notion of a generalized object reference.
Host:port:object typically works fine and there's a URL definition in
Foolscap.

As one generalizes, things get trickier, as you're all too aware. Defining
a few messages and services in protobuf might get us most of the way there
and hopefully avoid the generalized problem black hole (i.e. the endless
threads on idempotency etc.).

-glenn

--
Glenn H. Tarbox, PhD || gl...@tarbox.org
"Don't worry about people stealing your ideas. If your ideas are any
good you'll have to ram them down peoples throats" -- Howard Aiken

Frederik M.J.V.

Jul 25, 2008, 1:49:25 PM

to Glenn Tarbox, Protocol Buffers

> On Fri, 25 Jul 2008 09:46:30 -0700, <fre...@gmail.com> wrote:
> > On Jul 25, 5:48 pm, "Johan Euphrosine" <pro...@aminche.com>
> >
> >> Note that for now it uses Int32 prefix, yours is using Int16 right ?
> >
> > Ok we can standardize on uint32 (little endian), i'll update my
> > specification and code.
>
> I'd like to verify: I think they're the same but its the "network"
> encoding for 32 bit unsigned integer we're standardizing on. Java uses
> that encoding natively. Python requires the "pack" prefix:
>
> i=pack("!I",length)
>

Well, "network encoding" is big endian, i.e. the most significant part comes
first: by analogy, one thousand and forty two is written "1042", where little
endian would write "2401".
I will swap to network byte order (big endian) if I don't hear anything.

> don't know what C++ requires to disambiguate but I'm sure its in there.

It is just bit shifting plus bitwise OR and AND (I am using that in Java too).
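
As a sketch of the shift-and-mask approach for a big-endian uint32 (written in Python for consistency with the `pack` example; the function names are made up for this illustration):

```python
def encode_uint32_be(value: int) -> bytes:
    """Big-endian (network order) encoding using only shifts and masks."""
    return bytes([
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    ])


def decode_uint32_be(data: bytes) -> int:
    """Inverse of encode_uint32_be: combine four bytes with shifts and OR."""
    return (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3]


encoded = encode_uint32_be(1042)  # 1042 == 0x0412
```

The same four expressions translate almost verbatim to Java or C++, which is why no library support is strictly needed there.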

> > If you want to use my code and try to get it included in Twisted, I can
> > make the _python_ version available under the MIT license (just send
> > me a mail or post a message and I will put it in the git repository).

>
> I've been thinking a bit about all this. It seems to me that it wouldn't
> be hard to go one step further and venture into true distributed objects.
> I reference the twisted-related Foolscap project:
>

...

> As one generalizes, things get trickier, as you're all too aware. Defining
> a few messages and services in protobuf might get us most of the way there
> and hopefully avoid the generalized problem black hole (i.e. the endless
> threads on idempotency etc.).
>

The Foolscap architecture looks interesting, but I want to make a simple
library (like protobuf is) with no external dependencies except protobuffers at
the core. The TwoWayBase shares the idea of having no server and client when
it comes to the protocol layer.

Then one could build more interesting things on top of that. Therefore I am
implementing sockets and a socket server without Twisted too (it's only
about 300 lines of code).