Why not allow null values in Java for scalar types?

James

Dec 17, 2010, 11:25:56 AM

to Protocol Buffers

I am new to Protocol Buffers and was really surprised that no
compiler option is available to control how scalar types are
mapped in Java when generating the message objects. I understand that
a null int is not possible in C++, but isn't it
really backwards to have to call another method in order to check if a
value has been set in Java? A null reference in Java does
exactly this!

In reading the reference docs:
http://code.google.com/apis/protocolbuffers/docs/reference/java-generated.html

The compiler will generate the following accessor methods in both the
message class and its builder:

* bool hasFoo(): Returns true if the field is set.
* int getFoo(): Returns the current value of the field. If the
field is not set, returns the default value.

Why not give protoc another control flag that would map all primitive
data types to their corresponding wrapper objects? (i.e., this would just map
int -> Integer, double -> Double, or long -> Long.) This surely
would not break the protocol, but would give a much cleaner and more "Java
like" way of programming. I just feel like I am working in Java but
with a "C++ like" interface. Java now has auto-boxing, so if
you like working with ints and doubles instead of their Object
counterparts, you would still be okay...

Wouldn't you rather code like a "Java Programmer" by writing:

if(getFoo() != null)
{
// do something with getFoo()
}

instead of:

if(hasFoo())
{
// do something with getFoo()
}

Think of how this works when you are trying to pass the data along
to another method that already takes an Integer object and
is OK with the reference being null.

I would rather write:

doSomethingWithFoo(getFoo());

instead of

doSomethingWithFoo(hasFoo() ? getFoo() : null);
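
The has/get pairing above can be hidden behind a small helper today. This is only a minimal sketch: `FooMessage` is a hand-written, hypothetical stand-in for a generated message with one optional int field, and `fooOrNull` is an assumed helper name, not something protoc generates.

```java
// Hypothetical helper class; not part of the protobuf API.
final class NullableAccess {
    private NullableAccess() {}

    // Returns the boxed value if the field is set, otherwise null.
    static Integer fooOrNull(FooMessage m) {
        return m.hasFoo() ? Integer.valueOf(m.getFoo()) : null;
    }

    public static void main(String[] args) {
        System.out.println(fooOrNull(FooMessage.withFoo(42)));
        System.out.println(fooOrNull(FooMessage.withoutFoo()));
    }
}

// Hand-written stand-in that mimics the documented hasFoo()/getFoo() accessors.
final class FooMessage {
    private final boolean hasFoo;
    private final int foo;

    private FooMessage(boolean hasFoo, int foo) {
        this.hasFoo = hasFoo;
        this.foo = foo;
    }

    static FooMessage withFoo(int value) { return new FooMessage(true, value); }

    static FooMessage withoutFoo() { return new FooMessage(false, 0); }

    boolean hasFoo() { return hasFoo; }

    // Returns the default value (0 here) when the field is not set.
    int getFoo() { return foo; }
}
```

With such a helper the call site becomes doSomethingWithFoo(NullableAccess.fooOrNull(msg)), at the cost of one extra method per field.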

Kenton Varda

Dec 22, 2010, 7:09:39 PM

to James, Protocol Buffers

There are problems with this:

1) Boxing and unboxing primitives is relatively expensive, compared to just passing them as primitives. If performance matters to you at all (and for many protobuf users, it does), you probably don't want this.

2) If you accept messages from untrusted sources, your resulting code will be more prone to security problems. For example, say you write some code where the field "foo" is declared optional but, in practice, is always set. It's likely that you're going to end up with some code that *assumes* that it is set, and so doesn't check for null. This code may pass all your tests because people unfortunately don't usually test against invalid inputs. If some malicious user then sends you a message missing that field, your code is going to crash. The protobuf design avoids that by having the getters return a default value if the field is not set, so if you forget to check, it's not a huge deal.
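
The failure mode in point 2 can be sketched as follows. Both message classes are hand-written, hypothetical stand-ins (not generated code): one returns null for an unset field, the other returns a default the way the protobuf getters do. The calling code assumes the field is always set.

```java
final class MissingFieldDemo {
    private MissingFieldDemo() {}

    // Hypothetical "nullable" style: an unset field is represented as null.
    static final class NullableMsg {
        private final Integer foo;
        NullableMsg(Integer foo) { this.foo = foo; }
        Integer getFoo() { return foo; }
    }

    // Hypothetical protobuf style: an unset field yields the default value.
    static final class DefaultingMsg {
        private final boolean hasFoo;
        private final int foo;
        DefaultingMsg(boolean hasFoo, int foo) {
            this.hasFoo = hasFoo;
            this.foo = foo;
        }
        boolean hasFoo() { return hasFoo; }
        int getFoo() { return hasFoo ? foo : 0; }
    }

    // Caller that forgot to check for a missing field.
    // Unboxing a null Integer throws NullPointerException.
    static int doubledNullable(NullableMsg m) { return m.getFoo() * 2; }

    // Same caller against the defaulting getter: no exception, uses 0.
    static int doubledDefaulting(DefaultingMsg m) { return m.getFoo() * 2; }

    public static void main(String[] args) {
        System.out.println(doubledDefaulting(new DefaultingMsg(false, 0)));
        try {
            doubledNullable(new NullableMsg(null));
        } catch (NullPointerException e) {
            System.out.println("NullPointerException on missing field");
        }
    }
}
```

Whether a silent default is preferable to an exception depends on the application; the sketch only shows why a forgotten check is less likely to crash with the default-returning getter.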

David Yu

Dec 23, 2010, 5:59:41 AM

to Kenton Varda, James, Protocol Buffers

On Thu, Dec 23, 2010 at 8:09 AM, Kenton Varda <ken...@google.com> wrote:

> There are problems with this:
>
> 1) Boxing and unboxing primitives is relatively expensive, compared to just passing them as primitives. If performance matters to you at all (and for many protobuf users, it does), you probably don't want this.

The cached instances used by boxing (Integer.valueOf, etc.) will help a little bit, at least for small values.
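
A small sketch of what that cache does and does not cover. `BoxingCacheDemo` and `sameInstance` are illustrative names. `Integer.valueOf` is documented to always cache values from -128 to 127; values outside that range may or may not be cached, so the second call below makes no promise either way.

```java
final class BoxingCacheDemo {
    private BoxingCacheDemo() {}

    // True when boxing the same value twice yields the same object,
    // i.e. no new allocation happened for the second box.
    static boolean sameInstance(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // deliberate reference comparison
    }

    public static void main(String[] args) {
        // Inside the range that is always cached.
        System.out.println(sameInstance(127));
        // Outside that range: result depends on the JVM and its settings.
        System.out.println(sameInstance(100000));
    }
}
```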

Also, with objects, you will not need the boolean field hasX to check if the field is set. You just need to check for null.