2.1.0 release is up

Kenton Varda

May 13, 2009, 7:04:25 PM
to Protocol Buffers
http://code.google.com/p/protobuf/downloads/list

Aaaand, I just realized that CHANGES.txt still has the release date as ????.  :(

/me is not very good at release engineering.

Oh well.

Kenton Varda

May 13, 2009, 7:06:04 PM
to Protocol Buffers
Here's the major changes (from CHANGES.txt):

  General
  * Repeated fields of primitive types (types other than string, group, and
    nested messages) may now use the option [packed = true] to get a more
    efficient encoding.  In the new encoding, the entire list is written
    as a single byte blob using the "length-delimited" wire type.  Within
    this blob, the individual values are encoded the same way they would
    be normally except without a tag before each value (thus, they are
    tightly "packed").
  * For each field, the generated code contains an integer constant assigned
    to the field number.  For example, the .proto file:
      message Foo { optional int32 bar_baz = 123; }
    would generate the following constants, all with the integer value 123:
      C++:     Foo::kBarBazFieldNumber
      Java:    Foo.BAR_BAZ_FIELD_NUMBER
      Python:  Foo.BAR_BAZ_FIELD_NUMBER
    Constants are also generated for extensions, with the same naming scheme.
    These constants may be used as switch cases.
  * Updated bundled Google Test to version 1.3.0.  Google Test is now bundled
    in its verbatim form as a nested autoconf package, so you can drop in any
    other version of Google Test if needed.
  * optimize_for = SPEED is now the default, by popular demand.  Use
    optimize_for = CODE_SIZE if code size is more important in your app.
  * It is now an error to define a default value for a repeated field.
    Previously, this was silently ignored (it had no effect on the generated
    code).
  * Fields can now be marked deprecated like:
      optional int32 foo = 1 [deprecated = true];
    Currently this does not have any actual effect, but in the future the code
    generators may generate deprecation annotations in each language.
  * Cross-compiling should now be possible using the --with-protoc option to
    configure.  See README.txt for more info.
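
The packed encoding described in the first item above can be sketched in plain Python, without the protobuf library. This is only an illustration of the wire layout, not the library's implementation: the helper names (encode_varint, make_tag, encode_unpacked, encode_packed) and the field number 4 are made up for the example, and it handles only non-negative varint values.

```python
# Sketch of unpacked vs. packed encoding for a repeated varint field.
# All helper names here are hypothetical, not part of the protobuf API.

WIRETYPE_VARINT = 0
WIRETYPE_LENGTH_DELIMITED = 2


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low = value & 0x7F
        value >>= 7
        if value:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)         # last byte, high bit clear
            return bytes(out)


def make_tag(field_number, wire_type):
    """A tag is the field number and wire type combined into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_unpacked(field_number, values):
    """Normal repeated encoding: one tag in front of every value."""
    tag = make_tag(field_number, WIRETYPE_VARINT)
    return b"".join(tag + encode_varint(v) for v in values)


def encode_packed(field_number, values):
    """Packed encoding: one tag, one byte length, then the bare values."""
    payload = b"".join(encode_varint(v) for v in values)
    tag = make_tag(field_number, WIRETYPE_LENGTH_DELIMITED)
    return tag + encode_varint(len(payload)) + payload


values = [3, 270, 86942]
unpacked = encode_unpacked(4, values)
packed = encode_packed(4, values)
print(unpacked.hex())
print(packed.hex())
```

With only three values the saving is a single byte; the per-value tag overhead that packing removes grows with the length of the list.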

  protoc
  * --error_format=msvs option causes errors to be printed in Visual Studio
    format, which should allow them to be clicked on in the build log to go
    directly to the error location.
  * The type name resolver will no longer resolve type names to fields.  For
    example, this now works:
      message Foo {}
      message Bar {
        optional int32 Foo = 1;
        optional Foo baz = 2;
      }
    Previously, the type of "baz" would resolve to "Bar.Foo", and you'd get
    an error because Bar.Foo is a field, not a type.  Now the type of "baz"
    resolves to the message type Foo.  This change is unlikely to make a
    difference to anyone who follows the Protocol Buffers style guide.
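
A toy model of the resolver change, written in Python for illustration. It is not protoc's actual code: the symbol table layout and the function names (candidate_names, resolve_first_hit, resolve_type) are assumptions made up for this sketch. It only shows the difference between returning the first symbol found and skipping symbols that are not types.

```python
# Hypothetical symbol table: fully-qualified name -> kind of symbol.
# It mirrors the Foo/Bar example above.
SYMBOLS = {
    "Foo": "message",
    "Bar": "message",
    "Bar.Foo": "field",
    "Bar.baz": "field",
}

TYPE_KINDS = {"message", "enum"}


def candidate_names(name, scope):
    """Yield fully-qualified candidates from the innermost scope outward."""
    parts = scope.split(".") if scope else []
    while parts:
        yield ".".join(parts + [name])
        parts.pop()
    yield name


def resolve_first_hit(name, scope):
    """Old behaviour: return the first symbol found, whatever its kind."""
    for candidate in candidate_names(name, scope):
        if candidate in SYMBOLS:
            return candidate, SYMBOLS[candidate]
    return None


def resolve_type(name, scope):
    """New behaviour: ignore non-type symbols and keep searching outward."""
    for candidate in candidate_names(name, scope):
        if SYMBOLS.get(candidate) in TYPE_KINDS:
            return candidate, SYMBOLS[candidate]
    return None


print(resolve_first_hit("Foo", "Bar"))
print(resolve_type("Foo", "Bar"))
```

In this model the old lookup stops at the field Bar.Foo, while the new one skips it and reaches the top-level message Foo.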

  C++
  * Several optimizations, including but not limited to:
    - Serialization, especially to flat arrays, is 10%-50% faster, possibly
      more for small objects.
    - Several descriptor operations which previously required locking no longer
      do.
    - Descriptors are now constructed lazily on first use, rather than at
      process startup time.  This should save memory in programs which do not
      use descriptors or reflection.
    - UnknownFieldSet completely redesigned to be more efficient (especially in
      terms of memory usage).
    - Various optimizations to reduce code size (though the serialization speed
      optimizations increased code size).
  * Message interface has method ParseFromBoundedZeroCopyStream() which parses
    a limited number of bytes from an input stream rather than parsing until
    EOF.
  * GzipInputStream and GzipOutputStream support reading/writing gzip- or
    zlib-compressed streams if zlib is available.
    (google/protobuf/io/gzip_stream.h)
  * DescriptorPool::FindAllExtensions() and corresponding
    DescriptorDatabase::FindAllExtensions() can be used to enumerate all
    extensions of a given type.
  * For each enum type Foo, protoc will generate functions:
      const string& Foo_Name(Foo value);
      bool Foo_Parse(const string& name, Foo* result);
    The former returns the name of the enum constant corresponding to the given
    value while the latter finds the value corresponding to a name.
  * RepeatedField and RepeatedPtrField now have back-insertion iterators.
  * String fields now have setters that take a char* and a size, in addition
    to the existing ones that took char* or const string&.
  * DescriptorPool::AllowUnknownDependencies() may be used to tell
    DescriptorPool to create placeholder descriptors for unknown entities
    referenced in a FileDescriptorProto.  This can allow you to parse a .proto
    file without having access to other .proto files that it imports, for
    example.
  * Updated gtest to latest version.  The gtest package is now included as a
    nested autoconf package, so it should be possible to drop new versions into
    the "gtest" subdirectory without modification.

  Java
  * Fixed bug where Message.mergeFrom(Message) failed to merge extensions.
  * Message interface has new method toBuilder() which is equivalent to
    newBuilderForType().mergeFrom(this).
  * All enums now implement the ProtocolMessageEnum interface.
  * Setting a field to null now throws NullPointerException.
  * Fixed tendency for TextFormat's parsing to overflow the stack when
    parsing large string values.  The underlying problem is with Java's
    regex implementation (which unfortunately uses recursive backtracking
    rather than building an NFA).  Worked around by making use of possessive
    quantifiers.
  * Generated service classes now also generate pure interfaces.  For a service
    Foo, Foo.Interface is a pure interface containing all of the service's
    defined methods.  Foo.newReflectiveService() can be called to wrap an
    instance of this interface in a class that implements the generic
    RpcService interface, which provides reflection support that is usually
    needed by RPC server implementations.
  * RPC interfaces now support blocking operation in addition to non-blocking.
    The protocol compiler generates separate blocking and non-blocking stubs
    which operate against separate blocking and non-blocking RPC interfaces.
    RPC implementations will have to implement the new interfaces in order to
    support blocking mode.
  * New I/O methods parseDelimitedFrom(), mergeDelimitedFrom(), and
    writeDelimitedTo() read and write "delimited" messages from/to a stream,
    meaning that the message size precedes the data.  This way, you can write
    multiple messages to a stream without having to worry about delimiting
    them yourself.
  * Throw a more descriptive exception when build() is double-called.
  * Add a method to query whether CodedInputStream is at the end of the input
    stream.
  * Add a method to reset a CodedInputStream's size counter; useful when
    reading many messages with the same stream.
  * equals() and hashCode() now account for unknown fields.
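
The "delimited" framing mentioned above (message size first, then the data) can be sketched in Python. This is an illustration only, not the Java API: write_delimited and read_delimited are hypothetical helper names, the payloads are opaque byte strings standing in for serialized messages, and the size prefix is assumed to be a varint, which is my understanding of what the Java methods write.

```python
import io


def write_varint(stream, value):
    """Write a non-negative integer as a base-128 varint."""
    while value > 0x7F:
        stream.write(bytes([(value & 0x7F) | 0x80]))
        value >>= 7
    stream.write(bytes([value]))


def read_varint(stream):
    """Read a varint; return None if the stream is already at its end."""
    result = 0
    shift = 0
    while True:
        byte = stream.read(1)
        if not byte:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        result |= (byte[0] & 0x7F) << shift
        if not (byte[0] & 0x80):
            return result
        shift += 7


def write_delimited(stream, payload):
    """Write the payload size, then the payload itself."""
    write_varint(stream, len(payload))
    stream.write(payload)


def read_delimited(stream):
    """Read one size-prefixed payload; return None at end of stream."""
    size = read_varint(stream)
    if size is None:
        return None
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("stream ended inside a message")
    return payload


# Write several payloads back to back, then read them all again.
buffer = io.BytesIO()
for message in (b"first", b"", b"x" * 300):
    write_delimited(buffer, message)

buffer.seek(0)
messages = []
while True:
    item = read_delimited(buffer)
    if item is None:
        break
    messages.append(item)
print([len(m) for m in messages])
```

Because every payload carries its own size, the reader never has to guess where one message stops and the next begins, and an empty message is still distinguishable from end of stream.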

  Python
  * Added slicing support for repeated scalar fields. Added slice retrieval and
    removal of repeated composite fields.
  * Updated RPC interfaces to allow for blocking operation.  A client may
    now pass None for a callback when making an RPC, in which case the
    call will block until the response is received, and the response
    object will be returned directly to the caller.  This interface change
    cannot be used in practice until RPC implementations are updated to
    implement it.
  * Changes to input_stream.py should make protobuf compatible with appengine.

Kenton Varda

May 13, 2009, 7:21:05 PM
to Protocol Buffers
Updated documentation covering all this has been submitted and should go live in a couple hours.

Henner Zeller

May 13, 2009, 7:41:01 PM
to Kenton Varda, Protocol Buffers
Thanks for releasing!

Good enough, these things happen. Thanks for your continuous support
and hard work!

-h

Peter K.

May 13, 2009, 8:35:32 PM
to Protocol Buffers
Good job, Kenton!

Thanks for your efforts.

Ciao,

Peter K.

clint.foster

May 14, 2009, 10:18:28 AM
to Protocol Buffers
It's very nice to see support in the API for length-prefixed messages
and blocking RPCs. Both will reduce the amount of boilerplate code
needed for many protobuf applications.

Antony Dovgal

May 14, 2009, 10:27:21 AM
to clint.foster, Protocol Buffers
On 14.05.2009 18:18, clint.foster wrote:
> It's very nice to see support in the API for length-prefixed messages

Yes, native support for this kind of feature would be very welcome.

--
Wbr,
Antony Dovgal

Kenton Varda

May 14, 2009, 4:02:47 PM
to Antony Dovgal, clint.foster, Protocol Buffers
Yep, it's there in Java.  I didn't get the chance to add the equivalent support to C++ or Python yet, but if someone wants to submit a patch, go for it.

Chris

May 15, 2009, 11:51:11 AM
to Protocol Buffers
Kenton Varda wrote:
> Here's the major changes (from CHANGES.txt):
>
> General
>   * Repeated fields of primitive types (types other than string, group, and
>     nested messages) may now use the option [packed = true] to get a more
>     efficient encoding.  In the new encoding, the entire list is written
>     as a single byte blob using the "length-delimited" wire type.  Within
>     this blob, the individual values are encoded the same way they would
>     be normally except without a tag before each value (thus, they are
>     tightly "packed").
I see http://code.google.com/apis/protocolbuffers/docs/proto.html has
been updated.
I will add Haskell support for this.

>   * For each field, the generated code contains an integer constant assigned
>     to the field number.  For example, the .proto file:
>       message Foo { optional int32 bar_baz = 123; }
>     would generate the following constants, all with the integer value 123:
>       C++:     Foo::kBarBazFieldNumber
>       Java:    Foo.BAR_BAZ_FIELD_NUMBER
>       Python:  Foo.BAR_BAZ_FIELD_NUMBER
>     Constants are also generated for extensions, with the same naming scheme.
>     These constants may be used as switch cases.
Currently the wire layer has the field number baked in; it is never
exposed except through the reflection API.

Not hard to define a bunch of Int values. But in Haskell these cannot
be used as case targets. To do that I have to create the Int values as
Enum constructors. Which is less good.

For now I'll just ignore them, and add a note about it to the user.
This will be a demand driven feature.


> other version of Google Test if needed.

>   * It is now an error to define a default value for a repeated field.
>     Previously, this was silently ignored (it had no effect on the generated
>     code).

easy


>   * Fields can now be marked deprecated like:
>       optional int32 foo = 1 [deprecated = true];
>     Currently this does not have any actual effect, but in the future the code
>     generators may generate deprecation annotations in each language.

easy
> protoc


>   * The type name resolver will no longer resolve type names to fields.  For
>     example, this now works:
>       message Foo {}
>       message Bar {
>         optional int32 Foo = 1;
>         optional Foo baz = 2;
>       }
>     Previously, the type of "baz" would resolve to "Bar.Foo", and you'd get
>     an error because Bar.Foo is a field, not a type.  Now the type of "baz"
>     resolves to the message type Foo.  This change is unlikely to make a
>     difference to anyone who follows the Protocol Buffers style guide.

Ack, the Haskell version needs to be updated to track this change.
This means I have to go back and understand the name resolution module
in my Haskell implementation.
Hmmm....
It currently has a "resolve in environment" that returns the first hit.
I'll have to update that.
> C++


>   * DescriptorPool::AllowUnknownDependencies() may be used to tell
>     DescriptorPool to create placeholder descriptors for unknown entities
>     referenced in a FileDescriptorProto.  This can allow you to parse a .proto
>     file without having access to other .proto files that it imports, for
>     example.

hmmm....odd.
> Java


>   * New I/O methods parseDelimitedFrom(), mergeDelimitedFrom(), and
>     writeDelimitedTo() read and write "delimited" messages from/to a stream,
>     meaning that the message size precedes the data.  This way, you can write
>     multiple messages to a stream without having to worry about delimiting
>     them yourself.

This will help responding to that FAQ.

Chris Kuklewicz

May 17, 2009, 9:57:36 AM
to Protocol Buffers
I am patching the Haskell implementation and I have a follow-up
question to this:

On May 14, 12:06 am, Kenton Varda <ken...@google.com> wrote:
>   * The type name resolver will no longer resolve type names to fields.  For
>     example, this now works:
>       message Foo {}
>       message Bar {
>         optional int32 Foo = 1;
>         optional Foo baz = 2;
>       }
>     Previously, the type of "baz" would resolve to "Bar.Foo", and you'd get
>     an error because Bar.Foo is a field, not a type.  Now the type of "baz"
>     resolves to the message type Foo.  This change is unlikely to make a
>     difference to anyone who follows the Protocol Buffers style guide.

You did not fix this similar case, where the "int32 Baz" field causes
an error when trying to extend the "message Baz":

package test_resolve;

message Foo {
  optional int32 Baz = 2;

  extend Baz {
    optional int32 nonsense = 76335;
  }
}

message Baz {
  extensions 100 to max;
}

I will make the Haskell version compatible with protoc-2.1.0 but
perhaps you want to make the above a legal proto file in the future.

What do people think?

Kenton Varda

May 18, 2009, 2:06:20 PM
to Chris Kuklewicz, Protocol Buffers
You're right, this should have been handled too.  Oh well, I'll stick it on my TODO list for a later release.

Hopefully most open source users follow the style guide, making this irrelevant.  Inside Google we unfortunately have a lot of code that predates the style guide and uses CamelCase field names.  People kept getting confused as to why code like this didn't work:
  optional Foo Foo = 1;
even though similar code also would not work in any of our main programming languages (C++, Java, Python).  Eventually I caved and made it a non-error so that people would stop complaining.

Chris

May 19, 2009, 3:08:51 AM
to Protocol Buffers
As for the improved name resolution:

Kenton Varda wrote:
> On Sun, May 17, 2009 at 6:57 AM, Chris Kuklewicz <turin...@gmail.com> wrote:
>
>
> What do people think?
>
>
> You're right, this should have been handled too. Oh well, I'll stick
> it on my TODO list for a later release.

I am quite happy to have helped. The two name resolution functions were
side by side in my code; making the decision to fix only one looked
odd. I will immediately support resolving extendee names to Messages,
ignoring Fields and other things.

As for the "packed" fields, I just now got my Haskell version to the
next stage:
(1) the new runtime and converter both compile with "packed" support
(2) it can convert the new unittest.proto into Haskell code with
"packed" support
(3) the generated Haskell code compiles against new runtime with
"packed" support
(4) it has regenerated its own descriptor.proto and been recompiled
(enums needed an extra line to get packed fields efficiently)

So the next stage is to test the behavior and see if it can
inter-operate with itself and with packed files from protobuf-2.1.0.

Making the extension fields also "packable" was tedious but did not
require redesigning anything. Whew. The "unknown" field support did
not need updating at all.

As for the newly exposed field number constants: I cannot make them a
proper enum data type in Haskell because those are closed definitions
and so could not include any of the extension fields outside the
message's own proto file. I could still make them type safe constants,
but these could not be used as targets of a case statement. The data is
available through reflection, so I will wait to implement anything else
until an actual person comes to me with a use case that I can make
design decisions for.

As for delimiting messages by prepending the length: I already had these
commands, so all I did was change the documentation from "author's
extension" to "compatible with protobuf-2.1.0". Not that I actually
tested it...

--
Chris
