Deserializing Messages of unknown type at compile-time
Message-ID: <c5c51be00809110140y7c1780e5ibe5e8a283f1402ab@mail.gmail.com>
Date: Thu, 11 Sep 2008 16:40:35 +0800
From: "Alex Loddengaard" <alexloddenga...@gmail.com>
To: Chris <turingt...@gmail.com>
Subject: Re: Deserializing Messages of unknown type at compile-time
Cc: "Kenton Varda" <ken...@google.com>,
"Protocol Buffers" <protobuf@googlegroups.com>
In-Reply-To: <48C8D46C.7010...@gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_Part_23019_20907920.1221122436039"
References: <e69a4eeb-9a58-49a8-bd0a-3697d607e...@x16g2000prn.googlegroups.com>
<4112ecad0809081016s7dab89eaw10850c523ccc3...@mail.gmail.com>
<c5c51be00809081818t5baa04d4tc51cb6ced8db0...@mail.gmail.com>
<4112ecad0809081827xf632cf0s4410de4f7701c...@mail.gmail.com>
<4112ecad0809081828l3accc78dl86b95f177c830...@mail.gmail.com>
<c5c51be00809081842t27948604p51ac9956c44c8...@mail.gmail.com>
<c5c51be00809081947n2239a78ak83a52d85c77f9...@mail.gmail.com>
<c5c51be00809082111x14f66412s8ed6c4413769d...@mail.gmail.com>
<4112ecad0809090956i3034f221mb2c4d93383c6d...@mail.gmail.com>
<48C8D46C.7010...@gmail.com>
------=_Part_23019_20907920.1221122436039
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Hi Chris,
Once I learned that Messages are not self-delimiting (thanks, Kenton!), I
started working with Hadoop's source to stop the trailing bits from being
included in the InputStream. I've since fixed this issue, kind of at least
;).
Perhaps a good general solution is to let a user put an option in a
.proto file or a Message declaration that makes Messages self-delimiting.
That way users who want speed don't need to use it, and users who want
convenience can. I expect the implementation would be tricky, though.
Thanks for the follow-up, Chris. For now I'm good to go! Let me know if I
can provide any other feedback.
Alex
On Thu, Sep 11, 2008 at 4:18 PM, Chris <turingt...@gmail.com> wrote:
> Hi Alex,
>
> Kenton Varda wrote:
>
>> On Mon, Sep 8, 2008 at 9:11 PM, Alex Loddengaard
>> <alexloddenga...@gmail.com> wrote:
>>
>>> I have a follow-up question:
>>>
>>> Will using
>>> /messageInstance.newBuilderForType().mergeFrom(input).build();/
>>> work for a stream that contains trailing binary information?
>>
>> No, it won't work. Protocol buffers are not self-delimiting. They assume
>> that the input you provide is supposed to be one complete message, not a
>> message possibly followed by other stuff.
>>
>> You will need to somehow communicate the size of the message and make sure
>> to limit the input to that size.
>>
> Aha. This <binary>message<binary> case is one of the heretofore
> hypothetical use cases I am discussing in the adjacent thread on this
> mailing list / group. The thread is online at
>
>
> http://groups.google.com/group/protobuf/browse_thread/thread/b0ce2c7d8b05896e?hl=en
> and was spawned from
>
> http://groups.google.com/group/protobuf/browse_thread/thread/b0ce2c7d8b05896e?hl=en#
>
> It is mainly Jon, Kenton, and me slowly forming a consensus on the
> right API for delimited messages. I had proposed simply writing the length
> (as a varint) before the message, and Kenton demonstrated C++ code for this.
> Jon proposed adding a field number / wire type tag before the length and
> message, which makes it look much more like a protocol-buffer field on the
> wire.
>
> What do you need, Alex?
>
> --
> Chris
>
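
For reference, the two framings discussed in the quoted thread can be
sketched in plain Java. This is only an illustrative sketch under stated
assumptions, not the API the list settled on: the class name
DelimitedFraming, its method names, and the field number used in main are
all made up for the example, and a byte array stands in for a serialized
Message so that no protobuf classes are needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class; the names are illustrative, not protobuf API.
public class DelimitedFraming {
    private static final int WIRETYPE_LENGTH_DELIMITED = 2;

    // Writes an unsigned base-128 varint, least significant 7 bits first.
    public static void writeVarint(OutputStream out, int value) throws IOException {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Reads a varint of at most 5 bytes (enough for a 32-bit value).
    public static int readVarint(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            int b = in.read();
            if (b < 0) throw new EOFException("stream ended inside a varint");
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new IOException("varint too long");
    }

    // Plain length prefix: varint(length) followed by the message bytes.
    public static void writeDelimited(OutputStream out, byte[] payload) throws IOException {
        writeVarint(out, payload.length);
        out.write(payload);
    }

    // Reads exactly one length-prefixed message and leaves the rest of the stream untouched.
    public static byte[] readDelimited(InputStream in) throws IOException {
        int length = readVarint(in);
        if (length < 0) throw new IOException("negative length");
        byte[] payload = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(payload, off, length - off);
            if (n < 0) throw new EOFException("stream ended inside a message");
            off += n;
        }
        return payload;
    }

    // Tagged variant: varint((fieldNumber << 3) | wire type 2), then length and bytes.
    public static void writeTaggedDelimited(OutputStream out, int fieldNumber, byte[] payload)
            throws IOException {
        writeVarint(out, (fieldNumber << 3) | WIRETYPE_LENGTH_DELIMITED);
        writeDelimited(out, payload);
    }

    // Reads one tagged message, rejecting anything that is not length-delimited.
    public static byte[] readTaggedDelimited(InputStream in) throws IOException {
        int tag = readVarint(in);
        if ((tag & 0x7) != WIRETYPE_LENGTH_DELIMITED) {
            throw new IOException("unexpected wire type " + (tag & 0x7));
        }
        return readDelimited(in);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeDelimited(out, new byte[] {0x08, 0x2A});      // stand-in for serialized message bytes
        out.write(new byte[] {(byte) 0xDE, (byte) 0xAD});  // trailing non-message bytes

        InputStream in = new ByteArrayInputStream(out.toByteArray());
        byte[] message = readDelimited(in);
        // Prints: 2 message bytes, 2 trailing bytes left
        System.out.println(message.length + " message bytes, " + in.available()
                + " trailing bytes left");
    }
}
```

With a real Message, the bytes returned by readDelimited could then be
passed to messageInstance.newBuilderForType().mergeFrom(bytes).build(), so
the parser only ever sees one complete message. The tagged variant costs an
extra byte or so per message but makes the frame look like an ordinary
length-delimited field on the wire.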
------=_Part_23019_20907920.1221122436039--