Why reinvent the wheel?


Kalki70

Nov 8, 2010, 10:34:11 PM
to Protocol Buffers, kal...@pobox.com
Hello,

I just discovered this developer tool, and I can't understand why it
was invented. Why didn't Google use ASN.1, which is a standard and is
used for exactly this: a language- and platform-independent description
of data to be encoded later as XML or as various binary formats that
can be faster and more efficient?

All this is like reinventing ASN.1.

For instance, the example shown on the web page:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

In ASN.1, a standard that has been used for over 20 years, the same
could be written something like this (I haven't used it in a while, so
I may have made some mistakes):


PhoneType ::= ENUMERATED { MOBILE, HOME, WORK }

PhoneNumber ::= SEQUENCE
{
    number [1] IA5String,
    type   [2] PhoneType DEFAULT HOME
}

Person ::= SEQUENCE
{
    name  [1] IA5String,
    id    [2] INTEGER,
    email [3] OCTET STRING OPTIONAL,
    phone [4] SET OF PhoneNumber
}

Best regards,

Luis

Kenton Varda

Nov 9, 2010, 12:42:51 AM
to Kalki70, Protocol Buffers, kal...@pobox.com
My understanding of ASN.1 is that it has no affordance for forwards- and backwards-compatibility, which is critical in distributed systems where the components are constantly changing.


--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to prot...@googlegroups.com.
To unsubscribe from this group, send email to protobuf+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.


Kenton Varda

Nov 9, 2010, 12:59:15 AM
to Kalki70, Protocol Buffers, kal...@pobox.com
OK, I looked into this again (something I do once every few years when someone points it out).

ASN.1 *by default* has no extensibility, but you can use tags, as I see you have done in your example.  This should not be an option.  Everything should be extensible by default, because people are very bad at predicting whether they will need to extend something later.

The bigger problem with ASN.1, though, is that it is way over-complicated.  It has way too many primitive types.  It has options that are not needed.  The encoding, even though it is binary, is much larger than protocol buffers'.  The definition syntax looks nothing like modern programming languages.  And worst of all, it's very hard to find good ASN.1 documentation on the web.

It is also hard to draw a fair comparison without identifying a particular implementation of ASN.1 to compare against.  Most implementations I've seen are rudimentary at best.  They might generate some basic code, but they don't offer things like descriptors and reflection.

So yeah.  Basically, Protocol Buffers is a simpler, cleaner, smaller, faster, more robust, and easier-to-understand ASN.1.
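The extensibility-by-default that Kenton describes falls out of protobuf's wire format: every field is prefixed with a key that packs the field number and a wire type, so a decoder can skip any field it does not recognize. A minimal sketch of that mechanism in Python (illustrative only, not the official implementation):

```python
def decode_varint(data, pos):
    """Decode one base-128 varint starting at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        b = data[pos]
        result |= (b & 0x7F) << shift
        pos += 1
        if not (b & 0x80):       # high bit clear: last byte of the varint
            return result, pos
        shift += 7

def decode_key(data, pos):
    """A field key is a varint holding (field_number << 3) | wire_type."""
    key, pos = decode_varint(data, pos)
    return key >> 3, key & 0x07, pos

def skip_field(data, pos, wire_type):
    """Advance past a field's value; this is how decoders ignore unknown fields."""
    if wire_type == 0:                     # varint
        return decode_varint(data, pos)[1]
    if wire_type == 1:                     # fixed 64-bit
        return pos + 8
    if wire_type == 2:                     # length-delimited (strings, submessages)
        length, pos = decode_varint(data, pos)
        return pos + length
    if wire_type == 5:                     # fixed 32-bit
        return pos + 4
    raise ValueError("unsupported wire type")
```

A decoder built this way tolerates fields added later: it reads the key, and if the field number is unknown, it calls `skip_field` and moves on, which is exactly the forward-compatibility property under discussion.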

Christopher Smith

Nov 9, 2010, 2:11:19 AM
to Kenton Varda, Kalki70, Protocol Buffers, kal...@pobox.com
On Mon, Nov 8, 2010 at 9:59 PM, Kenton Varda <ken...@google.com> wrote:
The bigger problem with ASN.1, though, is that it is way over-complicated.

THIS
 
 It has way too many primitive types.  It has options that are not needed.  The encoding, even though it is binary, is much larger than protocol buffers'.

Actually, the PER encoding isn't too bad, although it doesn't encode ints using varint style encoding, which tends to help with most data sets.
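The varint scheme referenced here stores seven payload bits per byte and uses the high bit as a continuation flag, so small integers take a single byte; a sketch for unsigned values (signed types go through zigzag encoding first):

```python
def encode_varint(value):
    """Encode a non-negative int as a protobuf-style base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

# Small values, which dominate most data sets, fit in one byte:
print(len(encode_varint(1)))          # 1
print(len(encode_varint(300)))        # 2
print(len(encode_varint(2**32 - 1)))  # 5 (slightly worse than a fixed 4-byte int32)
```

This is why varints "tend to help with most data sets": typical values shrink, and only values near the top of the 32-bit range pay a one-byte penalty.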
 
 The definition syntax looks nothing like modern programming languages.  And worst of all, it's very hard to find good ASN.1 documentation on the web.

Yup, this one too. On the plus side, you can find the standards well defined, which helps when building independent implementations.
 
It is also hard to draw a fair comparison without identifying a particular implementation of ASN.1 to compare against.  Most implementations I've seen are rudimentary at best.  They might generate some basic code, but they don't offer things like descriptors and reflection.

Complexity yields rudimentary implementations.
 
So yeah.  Basically, Protocol Buffers is a simpler, cleaner, smaller, faster, more robust, and easier-to-understand ASN.1.

:-)

--
Chris

multijon

Nov 9, 2010, 8:13:00 AM
to Protocol Buffers
As a side note, the company I worked at used ASN.1 for five years to
encode all of its product's communication messages (using PER
encoding), with what was supposed to be a highly optimized
implementation of ASN.1.

One of my last projects at the company was to try to convert our
encoding method (and the underlying data structures) from ASN.1 to
Protobuf. A project that was estimated to be long and tiring turned
out to be rather easy, eliminating plenty of memory allocations that
are unnecessary in protobuf (but necessary in ASN.1), thus both
improving performance and decreasing the memory footprint of our
product by 50-70% (!).

So yeah, I'll second Kenton's description of Protobuf as a 'simpler,
cleaner, smaller, faster, more robust and easier-to-understand ASN.1'.

Jon

On Nov 9, 12:59 am, Kenton Varda <ken...@google.com> wrote:
> OK, I looked into this again (something I do once every few years when
> someone points it out).
>
> ASN.1 *by default* has no extensibility, but you can use tags, as I see you
> have done in your example.  This should not be an option.  Everything should
> be extensible by default, because people are very bad at predicting whether
> they will need to extend something later.
>
> The bigger problem with ASN.1, though, is that it is way over-complicated.
>  It has way too many primitive types.  It has options that are not needed.
>  The encoding, even though it is binary, is much larger than protocol
> buffers'.  The definition syntax looks nothing like modern programming
> languages.  And worse of all, it's very hard to find good ASN.1
> documentation on the web.
>
> It is also hard to draw a fair comparison without identifying a particular
> implementation of ASN.1 to compare against.  Most implementations I've seen
> are rudimentary at best.  They might generate some basic code, but they
> don't offer things like descriptors and reflection.
>
> So yeah.  Basically, Protocol Buffers is a simpler, cleaner, smaller,
> faster, more robust, and easier-to-understand ASN.1.
>
> On Mon, Nov 8, 2010 at 9:42 PM, Kenton Varda <ken...@google.com> wrote:
> > My understanding of ASN.1 is that it has no affordance for forwards- and
> > backwards-compatibility, which is critical in distributed systems where the
> > components are constantly changing.
>

Kalki70

Nov 9, 2010, 9:04:31 AM
to Protocol Buffers


On Nov 9, 2:42 am, Kenton Varda <ken...@google.com> wrote:
> My understanding of ASN.1 is that it has no affordance for forwards- and
> backwards-compatibility, which is critical in distributed systems where the
> components are constantly changing.

Unfortunately, you are wrong: it provides forwards and backwards
compatibility, and that is why it is still used in so many protocols,
like telecommunications protocols, that keep changing every year.

Best regards,

Luis

Kalki70

Nov 9, 2010, 9:15:00 AM
to Protocol Buffers
Hello again,

On Nov 9, 2:59 am, Kenton Varda <ken...@google.com> wrote:
> OK, I looked into this again (something I do once every few years when
> someone points it out).
>
> ASN.1 *by default* has no extensibility, but you can use tags, as I see you
> have done in your example.  This should not be an option.  Everything should
> be extensible by default, because people are very bad at predicting whether
> they will need to extend something later.

You can extend it even without using tags. I used tags to show an
encoding more similar to Protobuf's.

>
> The bigger problem with ASN.1, though, is that it is way over-complicated.
>  It has way too many primitive types.  It has options that are not needed.
>  The encoding, even though it is binary, is much larger than protocol
> buffers'.  The definition syntax looks nothing like modern programming
> languages.  And worse of all, it's very hard to find good ASN.1
> documentation on the web.
>

You saw in my example that the syntax is quite similar to that of
protobuf. Yes, it CAN be very complicated, but it doesn't need to be.
You can use it in a simpler way. You are not forced to use all the
primitive types. The encoding can be shorter or bigger, depending on
the encoding rules used. PER is a good example of a short encoding, if
length is important in a specific project.
And the best part is that all these encodings are STANDARD. Why
create a proprietary implementation if there is a standard?
It is like Microsoft using their proprietary formats for office
documents, instead of open standards.
What if tomorrow Microsoft says: "Oh, I need something simpler than
ASN.1, so we will create a different model"? Then we will have a
different version of "protobuf". And like this, many companies could
develop their own implementations.


> It is also hard to draw a fair comparison without identifying a particular
> implementation of ASN.1 to compare against.  Most implementations I've seen
> are rudimentary at best.  They might generate some basic code, but they
> don't offer things like descriptors and reflection.
>
Well, Google, with all their resources, could have, instead of
creating "something like ASN.1, but different", put some effort into
developing some APIs, like those from protobuf, but for ASN.1. They
could have supported maybe a subset of full ASN.1, but they would
still be using a standard, and it would be easier to communicate with
existing systems that support ASN.1.

> So yeah.  Basically, Protocol Buffers is a simpler, cleaner, smaller,
> faster, more robust, and easier-to-understand ASN.1.

Oh, come on, you are not being serious. You can claim many of those
things, but what do you mean, for instance, by "faster"?
ASN.1 has no speed. The speed comes from the ASN.1 compiler. "More
robust"? I see there are, as with any development, bugs that are
being fixed. Better to stick with something that has been used for
over 20 years, if you think about "more robust".
"Easier to understand"? Well, you saw my example, and it is very easy
to understand.
>
> On Mon, Nov 8, 2010 at 9:42 PM, Kenton Varda <ken...@google.com> wrote:
> > My understanding of ASN.1 is that it has no affordance for forwards- and
> > backwards-compatibility, which is critical in distributed systems where the
> > components are constantly changing.
>

Kalki70

Nov 9, 2010, 9:21:44 AM
to Protocol Buffers


On Nov 9, 10:13 am, multijon <multi...@gmail.com> wrote:
> As a side note, the company I worked at used ASN.1 for five years to
> encode all of its product's communication messages (Using PER
> encoding), with what was supposed to be a highly optimized
> implementation of ASN.1.
>
> One of my last projects in the company was to try and convert our
> encoding method (and the underlying data structure) from ASN.1 to
> Protobuf. A project that was estimated to be long and tiring turned
> out to be rather easy, eliminating plenty of unnecessary (in protobuf,
> but necessary in ASN.1) memory allocations, thus both speeding
> performance and decreasing the memory footprint of our product by
> 50-70% (!).

Again I must insist on this. ASN.1 doesn't use memory allocations.
It is an abstract language to describe data, like the abstract syntax
created from scratch for protobuf, but very similar to a simplified
ASN.1. That is what the name means: "Abstract Syntax Notation".
Maybe the ASN.1 compiler that you used made too many memory
allocations or was not fast enough. There are some very good ones,
like the one from OSS Nokalva.
But Google could have stuck to this existing standard, ASN.1, and
developed their own compiler, supporting maybe not all of ASN.1, just
the part needed in protobuf. And then we could have the best of both
worlds: a good, simple, fast compiler that creates simple-to-use
APIs and is compatible with an already existing standard.

>
> So yeah, I'll join Kenton's description of Protobuf to be a 'simpler,
> cleaner, smaller, faster, more robust and easier-to-understand ASN.1'.

I already made my comments on this.

Best regards,

Luis

Kalki70

Nov 9, 2010, 9:44:43 AM
to Protocol Buffers
Oh, I just found out that you are the developer. It seems I am not the
only one who thinks you reinvented the wheel:

http://google-opensource.blogspot.com/2008/07/protocol-buffers-googles-data.html

As someone mentioned there:

"The apparent complexity of ASN.1 is largely due to its flexibility -
if you're using only the sort of functionality that pbuffer gives you,
it would be pretty much the same, I would think."

Luis.

On Nov 9, 2:42 am, Kenton Varda <ken...@google.com> wrote:
> My understanding of ASN.1 is that it has no affordance for forwards- and
> backwards-compatibility, which is critical in distributed systems where the
> components are constantly changing.
>

Henner Zeller

Nov 9, 2010, 10:42:12 AM
to Kalki70, Protocol Buffers
There are some standards that pack many different ways to do things
under one umbrella. This is because they were decided by committee,
with many different companies involved that all wanted to bring in
'their way'. The multitude of ways to encode things with ASN.1 (why
is there more than one?), or the choice between encoding with tags or
without (from what I gathered from this conversation), means that
there is more than one way to do things.

How much is a communication standard worth if there are a myriad ways
to do things? You end up with implementations that only support part
of it, and suddenly the standard becomes worthless because two
different implementations don't support the same options on both
sides.

This discussion reminds me of SOAP. In the beginning, there was
XML-RPC: an extremely simple way to communicate using XML, with some
small shortcomings, but a developer could get started in seconds by
reading a simple example exchange.
Then standards committees came in and started developing SOAP: out
came a 'standard' that easily runs to 5000 printed pages, with all
the different ways to encode things in XML, different transport
schemes, etc. Same problem for many years: many implementations that
couldn't speak to each other. Complicated to use (yes, I wrote a book
about it, and I'll never touch SOAP again).
It got a bit better, but people moved on and don't use SOAP anymore.

Protocol buffers are simple, and there is only one way to do things.
Simplicity usually wins over developers. This is why Google developed
Protocol Buffers in the early 2000s. They're putting it out here for
others to use, but you don't have to.


Christopher Smith

Nov 9, 2010, 12:01:33 PM
to Kalki70, Protocol Buffers
On Tue, Nov 9, 2010 at 6:15 AM, Kalki70 <kalk...@gmail.com> wrote:
On Nov 9, 2:59 am, Kenton Varda <ken...@google.com> wrote:
> OK, I looked into this again (something I do once every few years when
> someone points it out).
>
> ASN.1 *by default* has no extensibility, but you can use tags, as I see you
> have done in your example.  This should not be an option.  Everything should
> be extensible by default, because people are very bad at predicting whether
> they will need to extend something later.

You can extend it even without using tags. I used tags to show an
encoding more similar to Protobuf's.

Without tags it is not extensible in the same sense as protocol buffers.

> The bigger problem with ASN.1, though, is that it is way over-complicated.
>  It has way too many primitive types.  It has options that are not needed.
>  The encoding, even though it is binary, is much larger than protocol
> buffers'.  The definition syntax looks nothing like modern programming
> languages.  And worse of all, it's very hard to find good ASN.1
> documentation on the web.
>

You saw in my example that the syntax is quite similar to that of
protobuf. Yes, it CAN be very complicated, but it doesn't need to be.
You can use it in a simpler way. You are not forced to use all the
primitive types.

You are looking at it merely from the perspective of someone wishing to use ASN.1, not someone implementing it. The problem is that the complexity of implementing ASN.1 in itself brings with it a number of shortcomings.
 
The encoding can be shorter or bigger, depending on
the encoding rules used. PER is a good example of a short encoding, if
length is important in a specific project.

PER's encoding of ints is a great example of ASN.1's disadvantages.

Most of the compactness in PER severely limits extensibility, as it relies on the decoder having a complete knowledge of the encoded data structure. Even in such cases, if you have fields which normally have small values (2^21 or less) but occasionally may have larger values (and this is a pretty common scenario), the protocol buffer encoding mechanism is going to be much more compact. Even outside of that case, the sheer number of types supported by ASN.1 requires that in order for PER encodings to be extensible, the "preamble" for fields must take up far more space than it does with protocol buffers.
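The 2^21 threshold above can be checked with simple varint size arithmetic: a varint carries seven payload bits per byte, so any value below 2^21 fits in three bytes, while a fixed 32-bit encoding always spends four. A quick sketch (unsigned values assumed):

```python
def varint_size(value):
    """Bytes needed to encode a non-negative int as a base-128 varint."""
    size = 1
    while value >= 0x80:  # each byte carries 7 payload bits
        value >>= 7
        size += 1
    return size

# Values under 2**21 stay within 3 bytes; a fixed 32-bit field is always 4.
print(varint_size(2**21 - 1))  # 3
print(varint_size(2**21))      # 4
```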
 
And the best part is that all these encodings are STANDARD. Why
create a proprietary implementation if there is a standard?

I think this question has already been answered, but it is worth pointing out that the fact that the marketplace of ideas has produced Hessian, Avro, Thrift, etc., suggests there is real demand that ASN.1 has not met.
 
It is like Microsoft using their proprietary formats for office
documents, instead of open standards.

No, it actually is quite different. The initial implementation of PB was meant for encoding data that was not to be shared with outside parties (and we are all glad that that data isn't going to be shared). Secondly, the PB implementation is far, far simpler than the ASN.1 standard. Finally, Google provides their complete implementation of an encoder/decoder as open source.
 
What if tomorrow Microsoft says: "Oh, I need something simpler than
ASN.1, so we will create a different model"? Then we will have a
different version of "protobuf". And like this, many companies could
develop their own implementations.

This is the reality now.
 
> It is also hard to draw a fair comparison without identifying a particular
> implementation of ASN.1 to compare against.  Most implementations I've seen
> are rudimentary at best.  They might generate some basic code, but they
> don't offer things like descriptors and reflection.

Well, Google, with all their resources, could have, instead of

When protocol buffers were developed, "with all their resources" wasn't nearly as impressive-sounding as it is now. The reality is that Google had very limited resources, and more importantly, it would have wasted them without realizing any advantage (and while certainly incurring several disadvantages) for its business.
 
creating "something like ASN.1, but different", put some effort into
developing some APIs, like those from protobuf, but for ASN.1. They
could have supported maybe a subset of full ASN.1, but they would
still be using a standard, and it would be easier to communicate with
existing systems that support ASN.1.

I think you are assuming that being able to communicate with existing ASN.1 systems would be one of the goals. That's a pretty huge assumption. But hey, let's assume that for a moment.

There are, last I checked, a half dozen encoding formats for ASN.1. Let's say you implemented just one (PER). There are two variants of PER (aligned and not aligned). Even if you restrict yourself to one variant, you still have ~18 different field types to handle. Even if you restrict yourself to a subset of about 5 that represent functionality inside protocol buffers, you have range encodings which represent 2^n variations, while protocol buffers essentially restricts you to 2. So the end result is at best, to get down to the functionality in PB's, the "subset" here is at best about 1% of what is defined as ASN.1. I guarantee you that had that been PB's implementation, it would have provided effectively no benefit to someone wishing to communicate with a system using ASN.1, while simultaneously not achieving the other goals set out by PB's as effectively as this implementation. In short: nobody would use it.
 
> So yeah.  Basically, Protocol Buffers is a simpler, cleaner, smaller,
> faster, more robust, and easier-to-understand ASN.1.

Oh, come on, you are not being serious. You can claim many of those
things, but what do you mean, for instance, by "faster"?
ASN.1 has no speed. The speed comes from the ASN.1 compiler.

Try finding an ASN.1 compiler that produces faster encoders and decoders. You can't even get close. The complexity of ASN.1 alone seriously gets in the way of being able to make them efficient, and even if you overcome that you still end up spending a lot more time implementing features that takes away from time for tuning.
 
"More
robust"? I see there are, as with any development, bugs that are
being fixed. Better to stick with something that has been used for
over 20 years, if you think about "more robust".

I'm sorry, but I've used a variety of ASN.1 implementations and protocol buffers, and there is no question that the ASN.1 implementations, even the 20 year old ones, are far buggier. PB's simplicity gives it an unfair advantage, but that's the point.
 
"Easier to understand"? Well, you saw my example, and it is very easy
to understand.

I actually had to bring up a developer on ASN.1 recently. He still doesn't feel he really understands it after working on it for several weeks and getting a lot of coaching from me. Even more recently he had to learn protocol buffers. Without any support, in one day he had his solution working and within a week he felt like he had a complete understanding...

ASN.1 has been around for over two decades, as you point out. The fact that despite this, when Google needed PB's, there wasn't an open source solution that met their needs is kind of a testament in itself about ASN.1's shortcomings. To this day I'd be really surprised if you could point to an ASN.1 implementation that would meet the needs of most people using protocol buffers (in my case, I looked, and didn't find anything close). The fact that over the last 25 years there have been many, many other encoding standards defined (including XML, which now has its own ASN.1 encoding rule) makes a very clear case that ASN.1 is not the ideal solution for a lot of situations.

It turns out in practice, defining things like encoding standards is very much a process of making trade offs rather than "good vs. bad" decisions, and ASN.1 clearly made a whole host of different trade offs than protocol buffers. Obviously in your case that makes ASN.1 a better choice for you, but that is exactly why protocol buffers might be a better solution for people in a different context.

--
Chris

Christopher Smith

Nov 9, 2010, 12:15:12 PM
to Kalki70, Protocol Buffers
On Tue, Nov 9, 2010 at 6:21 AM, Kalki70 <kalk...@gmail.com> wrote:
On Nov 9, 10:13 am, multijon <multi...@gmail.com> wrote:
> As a side note, the company I worked at used ASN.1 for five years to
> encode all of its product's communication messages (Using PER
> encoding), with what was supposed to be a highly optimized
> implementation of ASN.1.
>
> One of my last projects in the company was to try and convert our
> encoding method (and the underlying data structure) from ASN.1 to
> Protobuf. A project that was estimated to be long and tiring turned
> out to be rather easy, eliminating plenty of unnecessary (in protobuf,
> but necessary in ASN.1) memory allocations, thus both speeding
> performance and decreasing the memory footprint of our product by
> 50-70% (!).

Again I must insist on this. ASN.1 doesn't use memory allocations.

Yes, but the implementations do. Try getting an ASN.1 implementation as efficient as protocol buffers. It takes a lot more effort than implementing protocol buffers from scratch. That's part of the advantage.
 
There are some very good ones, like the one from OSS Nokalva.

First, they provide about a dozen different products for ASN.1, which by itself says a lot about ASN.1's complexity. Secondly, the tool isn't available as open source. Additionally, the solution is so "cheap" that they don't list pricing (I'm trying to remember the pricing from the last time I looked, but it escapes me). Finally, the last time I tested it, the encode/decode wasn't nearly as fast in C++, let alone Java. There isn't even an implementation for Python or a variety of other languages that have very fast and fully compatible implementations of protocol buffers. Those are some huge advantages for protocol buffers in my mind, despite OSS having devoted far more resources to tackling the problem than everyone collectively has on protocol buffers.
 
--
Chris

Christopher Smith

Nov 9, 2010, 12:25:21 PM
to Kalki70, Protocol Buffers
On Tue, Nov 9, 2010 at 6:44 AM, Kalki70 <kalk...@gmail.com> wrote:
Oh, I just found out that you are the developer. It seems I am not the
only one who thinks you reinvented the wheel :

http://google-opensource.blogspot.com/2008/07/protocol-buffers-googles-data.html

Yes, this is not a new line of thinking.
 
As someone mentioned there :

"The apparent complexity of ASN.1 is largely due to its flexibility -
if you're using only the sort of functionality that pbuffer gives you,
it would be pretty much the same, I would think."

I think what you are failing to appreciate is that that flexibility in and of itself imposes a huge toll. Think of C vs. C++.

--
Chris

Kenton Varda

Nov 9, 2010, 1:35:03 PM
to Kalki70, Protocol Buffers
On Tue, Nov 9, 2010 at 6:15 AM, Kalki70 <kalk...@gmail.com> wrote:
Well, Google, with all their resources, could have, instead of
creating "something like ASN.1, but different", put some effort into
developing some APIs, like those from protobuf, but for ASN.1. They
could have supported maybe a subset of full ASN.1, but they would
still be using a standard, and it would be easier to communicate with
existing systems that support ASN.1.

As Chris astutely points out, the complexity of ASN.1 makes it much harder to produce a high-quality implementation.  Google's resources are not unlimited.  Engineering time spent implementing obscure primitive types or other unnecessary features is time NOT spent improving the speed and robustness of the implementation.
 
Oh, come on, you are not being serious. You can claim many of those
things, but what do you mean, for instance, by "faster"?
ASN.1 has no speed. The speed comes from the ASN.1 compiler.

I challenge you, then, to show me an ASN.1 implementation that is faster than our C++ protobuf implementation.  Given the 10+-year head start, there should be one, right?
 
"More
robust"? I see there are, as with any development, bugs that are
being fixed. Better to stick with something that has been used for
over 20 years, if you think about "more robust".

There are bugs, but not critical ones.  Remember that essentially *all* of Google's server-to-server communications use protobufs.
 
"Easier to understand"? Well, you saw my example, and it is very easy
to understand.

Sorry, but no.  The syntax is not intuitive to C++ or Java developers the way protobuf's syntax is.

Dave Bailey

Nov 9, 2010, 6:54:53 PM
to Protocol Buffers
On Nov 9, 6:15 am, Kalki70 <kalki...@gmail.com> wrote:
> Hello again,
>
> On Nov 9, 2:59 am, Kenton Varda <ken...@google.com> wrote:
[...]
> > The bigger problem with ASN.1, though, is that it is way over-complicated.
> >  It has way too many primitive types.  It has options that are not needed.
> >  The encoding, even though it is binary, is much larger than protocol
> > buffers'.  The definition syntax looks nothing like modern programming
> > languages.  And worse of all, it's very hard to find good ASN.1
> > documentation on the web.
>
> You saw on my example that syntax is quite similar to that of
> protobuf. Yes, it CAN be very complicated, but it doesn't need to be.
> You can use it in a simpler way. You are not forced to use all
> primitive types. The encoding can be shorter or bigger, depending on
> the enconding rules used. PER is a good example of short encoding, if
> length is important in a specific project.
> And the best part is that all these encodings are STANDARD. Why to
> create a propietary implementation if there is a standard?
> It is like microsoft using their propietary formats for offiice
> documents, instead on open standards.
[...]

It's not a proprietary implementation if the entire specification and
implementation source code can be downloaded by anyone who wants it,
not just for C++/Java/Python, but for many other languages as well.

> > It is also hard to draw a fair comparison without identifying a particular
> > implementation of ASN.1 to compare against.  Most implementations I've seen
> > are rudimentary at best.  They might generate some basic code, but they
> > don't offer things like descriptors and reflection.
>
> Well, Google, with all their resources, could have, instead of
> creating "something like ASN.1, but different", put some effort
> developing some apis, like those from protobuf, but for ASN.1. They
> could have supported maybe a subset of full ASN.1, but they would
> still be using an standard, and it would be easier to communicate with
> existing systems that support ASN.1

I can't speak for Google, but maybe they had no need to communicate
with existing systems that support ASN.1. Maybe they were building
something from scratch, and therefore had the opportunity to develop
something optimal for their particular problems.

The thing is that this is just the way of the world, and I think it's
a good thing. Imagine if nobody ever said to themselves, I know there
is an XYZ (operating system, RDBMS, web server, ...) out there, but
maybe I can make an implementation that works better for me. I'll
call it Linux. Or MySQL, or nginx, or who knows what. And I'm sure,
as those systems were built, the authors had to deal with people
asking them why they were reinventing the wheel.

> > So yeah.  Basically, Protocol Buffers is a simpler, cleaner, smaller,
> > faster, more robust, and easier-to-understand ASN.1.
>
> Oh, come on, you are not being serious. You can say many of those
> things. What do you mean, for instance : "faster" ??
> ASN.1 has no speed. The speed comes from the ASN.1 compiler. "More
> robust" ?? I see there are, like with any development, bugs that are
> being fixed. Better to stick with somethign that has been used for
> over 20 years, if you think about "More robust":

I think "faster" in this case refers to the run time performance of
the generated code, in terms of encoding and decoding a particular
block of structured data.

With regard to bugs, I've been using the C++ protobuf implementation
provided by Google for about a year and a half, and I have never
experienced a crash or bug of any kind, through many, many terabytes
of protobuf I/O, tens of billions of messages with multiple levels of
nested submessages and dozens of fields. So as far as I am concerned,
that is robust enough for me. And if I have any questions about the
way the implementation works, I can just go look at the code. The
simplicity of the specification and implementation are a huge
advantage there.

-dave

Austin Ziegler

unread,
Nov 10, 2010, 7:24:33 AM11/10/10
to prot...@googlegroups.com
On 2010-11-09, at 12:01, Christopher Smith <cbs...@gmail.com> wrote:
On Tue, Nov 9, 2010 at 6:15 AM, Kalki70 <kalk...@gmail.com> wrote:
On Nov 9, 2:59 am, Kenton Varda <ken...@google.com> wrote:
> The bigger problem with ASN.1, though, is that it is way over-complicated.
>  It has way too many primitive types.  It has options that are not needed.
>  The encoding, even though it is binary, is much larger than protocol
> buffers'.  The definition syntax looks nothing like modern programming
> languages.  And worst of all, it's very hard to find good ASN.1
> documentation on the web.

You saw in my example that the syntax is quite similar to that of
protobuf. Yes, it CAN be very complicated, but it doesn't need to be.
You can use it in a simpler way. You are not forced to use all
primitive types.

You are looking at it merely from the perspective of someone wishing to use ASN.1, not someone implementing it. The problem is that the complexity of implementing ASN.1 brings with it a number of shortcomings.

In a project (LDAP support for Ruby) that I've been involved with for a few years, we have to deal with ASN.1 BER encoding; a parser/generator can't really choose to do a subset of the encoding. It's a lot more complex to use the BER encoder than it would be if LDAP were based on something self-describing like PB.
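For readers unfamiliar with BER: everything is a tag-length-value (TLV) triple. Even at its absolute simplest it looks like the toy Python below, which handles only small non-negative INTEGERs with short-form lengths (a real BER codec must also cope with long-form and indefinite lengths, two's-complement negatives, constructed types, and so on, which is where the complexity explodes):

```python
def ber_encode_integer(n: int) -> bytes:
    """BER-encode a small non-negative INTEGER as a TLV triple:
    universal tag 0x02, short-form definite length, one content octet."""
    if not 0 <= n <= 127:
        raise ValueError("toy encoder: only 0..127 supported")
    return bytes([0x02, 1, n])  # tag, length, value

def ber_decode_tlv(data: bytes):
    """Decode one short-form TLV, returning (tag, content bytes)."""
    tag, length = data[0], data[1]
    assert length < 0x80, "long-form length not handled by this toy"
    return tag, data[2:2 + length]
```

The gap between this toy and a conforming BER implementation is exactly the burden Austin is describing.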

ASN.1 is over-complex and under-understood pretty much universally.

-a

Oliver Jowett

unread,
Nov 11, 2010, 8:19:18 AM11/11/10
to Kalki70, Protocol Buffers
On Wed, Nov 10, 2010 at 3:21 AM, Kalki70 <kalk...@gmail.com> wrote:

> Maybe the ASN.1 compiler that you used used too many memory
> allocations or was not too fast. There are some very good ones, like
> those from OSS Nokalva.

I've used both OSS Nokalva's ASN.1 to Java compiler and protobuf in
anger. protobuf is at least as fast, provides a better API (especially
if you want to do any reflection), and is less buggy than OSS's
product. Being able to build protobuf from source makes our build
process a lot simpler, too.

We actually use both in our system - OSS when we must talk ASN.1 in
external protocols, and protobuf for our internal protocols where we
are not implementing to an external specification.

I think that protobuf's simplicity is a large part of why its
implementation is better than the various ASN.1 products out there.
ASN.1 seems to be the Ada of protocol description languages, really..

Oliver

Peter Ondruška

unread,
Mar 21, 2013, 4:14:20 AM3/21/13
to prot...@googlegroups.com
Well, you hit the nail on the head with that note about the M$ Wheel(tm) :D

Henner Zeller

unread,
Mar 21, 2013, 4:17:40 AM3/21/13
to Vic Devin, Protocol Buffers, Kalki70
If you look at the complicated way CORBA works and the simplicity of Protobufs, then you know. You assume infinite resources in companies (and hence assume a political statement behind protobufs), but in reality it is about getting things done. CORBA is a non-starter.


On 21 March 2013 09:04, Vic Devin <vfn...@gmail.com> wrote:

This thread seems to be a bit old, but anyway this topic suddenly became important for me since I started to hear the new magic word "Protobuf".

Now I was a bit surprised to discover that it is actually the same idea as CORBA!

So the question asked at the beginning of this thread, "Why to reinvent the wheel?", is appropriate.

Why did Google need to reinvent the wheel if technologies were already available?

Now I have my own answer: companies don't trust anyone but themselves; they aspire to world domination, don't care about existing standards, and replace them with their own!

Next question is: do I want to live in such a world, where there is no trust and collaboration between people for the common good? My answer is NO, which seems to go against the current "politically correct" way of life of having to be a "competitive" liberal capitalist, set against the competitors (as if they were your worst enemies)!

I have the feeling that the computer industry could be much more efficient if we had better standards and kept improving them, instead of private companies reinventing the wheel!

So when company X is finally dead, it will also bring down the companies that relied on its wheels, unless of course they are quick to replace them!

I understand that long-lasting standards drag old technologies and ways of thinking along with them, but why couldn't they make a new CORBA standard that is the same as Protobuf, if this is so good, but still call it CORBA, so that everyone knows which wheel you are talking about, and so that it doesn't belong to any particular company but benefits us all!?!?

Sure, Microsoft will also tell you that you don't need to reinvent the wheel, provided of course that you use their wheel! ;)

 http://dotnet.dzone.com/articles/don%E2%80%99t-reinvent-wheel-part-1

So as I see it, this is not so much a technological motivation as a political/economic one.


Oliver Jowett

unread,
Mar 21, 2013, 4:24:30 AM3/21/13
to Vic Devin, prot...@googlegroups.com, Kalki70
On Thu, Mar 21, 2013 at 8:04 AM, Vic Devin <vfn...@gmail.com> wrote:

This thread seems to be a bit old, but anyway this topic suddenly became important for me since I started to hear the new magic word "Protobuf".

Now I was a bit surprised to discover that it is actually the same idea as CORBA!


No it's not - protobuf can be used to build an RPC mechanism, but there are many other things that you can use protobuf for that you can't use CORBA for.

For example, I've used it to write persistent EDRs to a file in a structured format, and to stream network messages where there's no simple request/response pairing.
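For anyone wondering what writing records to a file looks like: protobuf messages are not self-delimiting, so the usual convention is to prefix each serialized message with its varint-encoded length. A toy Python sketch of that framing follows (payloads are treated as opaque bytes; the function names are mine, not from any protobuf API):

```python
import io

def write_record(stream, payload: bytes) -> None:
    """Write one length-delimited record: varint length, then payload."""
    n = len(payload)
    while True:
        b = n & 0x7F
        n >>= 7
        stream.write(bytes([(b | 0x80) if n else b]))
        if not n:
            break
    stream.write(payload)

def read_records(stream):
    """Yield the payloads back until a clean EOF at a record boundary."""
    while True:
        length = shift = 0
        while True:
            c = stream.read(1)
            if not c:
                return  # EOF between records
            length |= (c[0] & 0x7F) << shift
            shift += 7
            if not c[0] & 0x80:
                break
        yield stream.read(length)
```

With framing like this, an EDR file is just an append-only sequence of records, and a reader can skip or stream them one at a time.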

The analogy with ASN.1 is a better one (and see my previous comments on that).

Oliver

Vic Devin

unread,
Mar 21, 2013, 4:48:48 AM3/21/13
to prot...@googlegroups.com, Vic Devin, Kalki70
Thanks for your quick answers.

Maybe my point was not so clearly conveyed. What I mean is not to say which technology is better, CORBA, ASN.1, or Protobuf. 
What I mean is that they all try to solve, leaving aside all tech details, the same basic problem, i.e. remote communication between software entities.
We should be referring to this concept in a more standard way, naming it in a standard way.

To make the comparison with the wheel again, we don't call it anything other than "wheel", because the concept is a given and widely and universally understood.
Can you imagine the first cavemen who invented it: one would make it wooden and call it "Spinner", another would make it from marble and call it "Stonner", and another "Crasher".
The poor cavemen hadn't yet abstracted the concept to simply "wheel", away from any particular "implementation details".

As I see it, progress also happens through finding these universal abstractions, and Protobuf doesn't need to claim that it reinvented the way of making two remote modules communicate with each other; it's simply a different (better) implementation of the same (abstract) concept.

Oliver Jowett

unread,
Mar 21, 2013, 5:16:59 AM3/21/13
to Vic Devin, prot...@googlegroups.com, Kalki70
On Thu, Mar 21, 2013 at 8:48 AM, Vic Devin <vfn...@gmail.com> wrote:
 
What I mean is that they all try to solve, leaving aside all tech details, the same basic problem, i.e. remote communication between software entities.

There are plenty of applications of protobuf (and ASN.1 for that matter) that do not involve remote communication at all.

Oliver

Feng Xiao

unread,
Mar 21, 2013, 1:55:24 PM3/21/13
to Vic Devin, Protocol Buffers
On Thu, Mar 21, 2013 at 1:48 AM, Vic Devin <vfn...@gmail.com> wrote:
Thanks for your quick answers.

Maybe my point was not so clearly conveyed. What I mean is not to say which technology is better, CORBA, ASN.1, or Protobuf. 
What I mean is that they all try to solve, leaving aside all tech details, the same basic problem, i.e. remote communication between software entities.
We should be referring to this concept in a more standard way, naming it in a standard way.
What's the standard that defines remote communication? Who defines the standard, and why should we follow it?
 


Vic Devin

unread,
Mar 25, 2013, 7:25:38 AM3/25/13
to prot...@googlegroups.com, Vic Devin
Oliver, 
yes, Protobuf can do many other things besides remote communication, but again my point is that these are all "low level" stuff, which is again in "reinventing the wheel" territory; if the software industry wishes to make a giant leap forward, we should start building applications without even thinking about all these low-level details: RPC, serialization formats, etc.
In fact, thanks to "high level" programming languages we are able to forget the complicated modern CPU architectures which we would have to think about if we were stuck with programming in assembler!
Ideally I want to concentrate on the "business logic", relying on the fact that I don't need to care about the rest.
Feng,
The standard that defines remote communication is (or used to be!?) CORBA. Using a standard is like talking the same language, so there is a bigger chance that we might better communicate and understand each other.

Oliver Jowett

unread,
Mar 25, 2013, 9:06:58 AM3/25/13
to Vic Devin, prot...@googlegroups.com
On Mon, Mar 25, 2013 at 11:25 AM, Vic Devin <vfn...@gmail.com> wrote:
Oliver, 
yes, Protobuf can do many other things besides remote communication, but again my point is that these are all "low level" stuff, which is again in "reinventing the wheel" territory; if the software industry wishes to make a giant leap forward, we should start building applications without even thinking about all these low-level details: RPC, serialization formats, etc.
In fact, thanks to "high level" programming languages we are able to forget the complicated modern CPU architectures which we would have to think about if we were stuck with programming in assembler!
Ideally I want to concentrate on the "business logic", relying on the fact that I don't need to care about the rest.

Not quite sure where you're trying to go with this discussion, but anyway ..

My point is that you'd picked the wrong abstraction to start with. If it's not simple to pick the right abstraction in the first place, why do you think that the standard you end up with is actually going to be universally useful? Same argument applies even at the lower level. Just picking something at random, you wouldn't want to use protobuf's encoding in places where DER is currently used (e.g. digital signatures) because protobuf does not guarantee a particular encoding for a given input.
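The non-canonical-encoding point can be made concrete with a toy reader for varint-only protobuf fields (my own sketch, not any library API; the field numbers and byte strings are illustrative). Because a parser keys fields by number, two different byte strings can decode to the same logical message, which is exactly what disqualifies the format wherever DER's guaranteed unique encoding is required:

```python
def parse_varint_fields(data: bytes) -> dict:
    """Parse a toy message consisting only of varint-typed fields
    (wire type 0) into {field_number: value}. Field order on the
    wire does not affect the result."""
    i = 0
    def varint():
        nonlocal i
        result = shift = 0
        while True:
            byte = data[i]
            i += 1
            result |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                return result
    fields = {}
    while i < len(data):
        key = varint()
        assert key & 0x07 == 0, "toy parser: varint (wire type 0) fields only"
        fields[key >> 3] = varint()
    return fields

# Two distinct encodings of the same logical message {1: 150, 2: 1}:
msg_a = b'\x08\x96\x01\x10\x01'   # field 1 first, then field 2
msg_b = b'\x10\x01\x08\x96\x01'   # field 2 first, then field 1
assert msg_a != msg_b
assert parse_varint_fields(msg_a) == parse_varint_fields(msg_b)
```

If you hashed or signed `msg_a` and `msg_b`, you'd get different digests for the "same" message, which is why a canonical encoding matters for signatures.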

Just trying to ignore the low level concerns usually ends up producing bad software in my experience - yes you can always slap another layer of abstraction on top of something, but there's a cost to that.

Feng,
The standard that defines remote communication is (or used to be!?) CORBA. Using a standard is like talking the same language, so there is a bigger chance that we might better communicate and understand each other.

Sun RPC, ROSE (and the related uses of it in e.g. TCAP etc.), SOAP, and plain RMI spring to mind just off the top of my head. What's special about CORBA?

Having a variety of standards and implementations is actually a good thing here because they have different characteristics and you can pick the one that is appropriate for your situation, rather than being forced to live with the design decisions that someone else happened to make. Have a large toolkit, pick the right tool for the job, and let the best implementation win!

Oliver

Christopher Smith

unread,
Mar 25, 2013, 7:11:31 PM3/25/13
to Vic Devin, Protocol Buffers
On Mon, Mar 25, 2013 at 4:25 AM, Vic Devin <vfn...@gmail.com> wrote:
Oliver, 
yes, Protobuf can do many other things besides remote communication, but again my point is that these are all "low level" stuff, which is again in "reinventing the wheel" territory; if the software industry wishes to make a giant leap forward, we should start building applications without even thinking about all these low-level details: RPC, serialization formats, etc.

I very much doubt that any serialization framework *ever* created a giant leap forward. Sometimes it makes sense to revisit the plumbing and build a better "wheel" because the original design wasn't well suited to how it is currently used (ironically this has, in fact, happened several times with the wheel... nobody uses the original design). In the case of protocol buffers, there were lots of problems with existing frameworks. If you don't perceive any problems, you probably shouldn't bother using protocol buffers. Indeed, your perception of what protocol buffers are suggests a use case where they'd be a bit of a square peg to your round hole anyway.

In fact thanks to "high level" programming languages we are able to forget the complicated modern CPU architectures which we would have to think about if we were stuck with programming in assembler!

There is a certain philosophy that goes along that way. More than a few hundred times it has been demonstrated to be problematic, particularly when working on solutions that are unique in some capacity (sometimes just scale or efficiency requirements). Certainly it's not hard to talk to folks who have switched to protobufs from other serialization frameworks and realized significant wins. It's been a big enough deal that other projects have spun up trying to improve on the advantages of protobufs.
 
Ideally I want to concentrate on the "business logic", relying on the fact that I don't need to care about the rest.

Ideally, I want hardware constraints & failures not to happen. We don't often get to live in an ideal world. I agree though, it'd be nice if it weren't ever an issue. I don't see how fixing the plumbing so it works better somehow gets in the way of other people living in a world where the plumbing is never an issue...
 
The standard that defines remote communication is (or used to be!?) CORBA.

Umm... no.

Which version of CORBA would be that "standard", then? Would that be CORBA 1.0 (hard to believe, since it didn't even have a standardized serialization format for over-the-wire communications), CORBA 2.0, which generally doesn't support encrypted transport and won't get through most firewalls? Or was it CORBA 3, with its Component Model? Is that using IIOP (which, ironically, doesn't work so well over large portions of the Internet), or HTIOP, or SSLIOP, or ZIOP? Presumably it is an implementation with a POA, because without that most stuff doesn't work together at all?

CORBA never really took over. It had a brief window of opportunity when Netscape integrated IIOP in to the browser, but that has long since passed. To this day NFS still talks ONC-RPC, and SMB is basically Microsoft's evil DCE RPC variant. I think there are still some ugly DCOM things floating around as well. If you talk to Facebook/Twitter/most other web services, you're mucking around with JSON or XML (over SOAP or REST), and of course there's all that WSDL out there. Heck, even Java programs that used RMI (mostly all dead and buried at this point), where CORBA compatibility was in theory just a configuration flag you set, generally eschewed using RMI-IIOP and instead went with JRMP whenever possible.

More importantly though, lots of people don't even use protocol buffers for remote communications, but rather for storing data (arguably this was the primary use case at Google originally as well), and CORBA's solutions in that space were *anything* but broadly used and for most people would be solving the wrong problem (seriously, who would want to store web log data in something like ObjectStore?!).
 
Using a standard is like talking the same language, so there is a bigger chance that we might better communicate and understand each other.

Yes, but one problem with a lot of standards (CORBA included) is that in practice they end up representing multiple ways of doing things, making it so that implementing "the standard" is like implementing half a dozen standards (actually, CORBA and ASN.1 both have so much functionality and are flexible enough that I'm sure there is a way to make them compatible with a particular use case of protocol buffers with minimal effort), which in practice means that instead of implementing one thing well, you end up implementing several things poorly. CORBA was complicated enough that it actually didn't work terribly well at all until after the hype about it had died down.

Honestly, your entire line of reasoning could just as easily have applied to CORBA when it came out (indeed, IIRC CORBA was so much a reinvention of the wheel that HP's original CORBA ORB was implemented on top of their existing RPC solution).

I get that you don't see the advantage of using protocol buffers for your circumstance, and given you know the problem better than anyone else in the discussion, odds are you are correct. But please don't try to argue that this makes CORBA a standard or a better solution for everyone's needs than protocol buffers.

--Chris



--
Chris

Christopher Smith

unread,
Mar 28, 2013, 7:35:01 PM3/28/13
to Vic Devin, Protocol Buffers
Not to flog a dead horse TOO much, but while researching some Thrift-related frustrations today, I stumbled across a somewhat old blog entry by Lev Walkin, developer of the asn1c compiler. http://lionet.info/asn1c/blog/2010/07/18/thrift-semantics/

I think he does a pretty good job pointing out some of the problems with ASN.1, while pointing out mistakes some of the "newer" serialization formats made where they failed to learn from ASN.1's history. Of course, I imagine lots of people disagree about what is a "mistake" and what is a "design decision", but in those differences of opinion lies precisely the reason why one would "reinvent the wheel": one person's "yet another wheel" is another person's "train tracks".
--
Chris