Current implementation of equals may have undesirable consequences.

130 views
Skip to first unread message

James

unread,
Aug 16, 2009, 4:10:51 PM8/16/09
to Project Lombok
I've been having some issues working with the a class using the @Data
annotation. I've discovered something while looking through the
lombok code and would like to see what some other people out there
think.

I noticed that currently lombok requires the instances of the two
classes to be the same to be equal. This could be an issue you have
an object that is wrapped with a proxy. It's class is not the same as
the actual object, but should be treated as such.

I think instead of requiring the class to be the same, it should be
looking for the case that either of them should be able to be cast to
the other (common ancestry at some point other than object).

Any other thoughts on this?

James

unread,
Aug 16, 2009, 4:13:21 PM8/16/09
to Project Lombok
A possible way to do this may be to do a test of instanceof instead of
getClass() == getClass.

Reinier Zwitserloot

unread,
Aug 16, 2009, 7:14:38 PM8/16/09
to Project Lombok
The contract for the equals method states that equals should be both
transitive and commutative. The logical conclusion of that rule means
that 2 objects can only be equal if they are of the same type.
Therefore, lombok's method of generating equals will not be changing -
any equals implementation that employs instanceof while still
comparing all fields is *BY DEFINITION* broken.

And now, for a long story about why this is:

transitive means:

if a.equals(b) and b.equals(c), then a.equals(c) MUST be true.

commutative means:

if a.equals(b) then b.equals(a) MUST be true.


Imagine we have Shape, which holds x and y coordinates, and we have a
subclass of shape named Circle that adds a radius to this.

We'll make Shape s, at coordinates (10,10), and Circle c1, at
coordinates (10, 10) with a radius of 50. We'll also make circle c2,
also at coordinates (10, 10), but with a radius of 100.

If we use your idea and use instanceof, then s.equals(c1) would be
true, as c1 is a Shape, and it has the same coordinates as s. However,
c1.equals(s) would be false, as a Shape is not an 'instanceof' a
Circle. However, if s.equals(c1) but !c1.equals(s), our equals
implementation is broken, as it's not commutative.

So, let's expand on your proposal and try to fix it via adding a new
rule: if ( other.equals(this) ) return true;

However, that's no help - we'd still have a broken equals
implementation. After all:

c1.equals(c2) = FALSE (radii not equal)
c1.equals(s) = TRUE (true because s.equals(this) was true)
s.equals(c2) = TRUE (true because c2 is instanceof Shape and has the
same coords as s).

And thus, we have broken the transitivity rule - by transitivity,
c1.equals(c2) should have been true too.

The only way to fix it is for the equals implementation of circle to
ignore the radius field entirely, but then c1.equals(c2) would be
true, which is rather silly - why are 2 circles with different radii
equals to each other? That's a preposterous notion.

But, let's roll with that. Let's ignore radius. Where do we stop? If
circle should ignore its fields, then why shouldn't shape ignore its
fields as well? But if every class ignores all fields when computing
equality, then the only correct equals implementation is "return
true;" - which is useless - all objects would be equal to each other!

So, if we are to go this way, then one MUST specify the target type.
Thus, something like:

@EqualityViaAncestor(Shape.class) @Data
public class Circle extends Shape {
private int radius;
}


And this is getting rather complicated. It also becomes hard to read -
you'd have to know that the EqualityViaAncestor thing means that the
'radius' field is completely ignored for hashCode and equals. The
equals implementation should strictly call super and nothing else. And
therein lies the crux of the matter:

Let's say lombok supported all this. We've just made equality more
complicated by a large factor, we've added another annotation, and for
*WHAT*? The only benefit is that you don't have to write:

public boolean equals(Object other) {
return super.equals(other);
}

public int hashCode() {
return super.hashCode();
}

That's it. Oodles of complexity to avoid writing 5 lines of simple,
short code.


That's not at all how lombok is designed to work. We have a complex
thing, that is rarely used, for which you save a tiny amount of
boilerplate. That's the opposite of lombok's goals, which is to take
simple things, used often, that saves lots of boilerplate.

James

unread,
Aug 16, 2009, 9:16:42 PM8/16/09
to Project Lombok
Ok. I guess the thought had not entirely occurred to me that
instanceof isn't symmetric. It makes sense, I just had not taken the
time to think about it.

All of this being said, there are still some other possible side
effects.

Consider this scenario...
We are using the Circle class as you have it defined above that
implements an interface ICircle.
In the application, we have two instances of ICircle, one that was
created by a framework and as such was created as a proxy of the
actual Circle object. The other is a direct instantiation of the
circle class. We set both instances to have the same exact data.

According to the equals method that lombok provides, they will not be
equal because the class types are exactly not be equal. However, they
will both compute to the same hashcode since the underlying hashcode
comes from the Circle class.

As a result if we have a HashMap<ICircle>, we really don't know which
object we will get back out. Will it be the proxy or the direct
instance? If they were equal, it really wouldn't matter, but the
objects say they are not equal so this becomes ambiguous.


I guess this all makes sense, but it kind of limits the usefulness of
the generation of equals and hashcode. In the applications that I'm
working with, all instances of classes will be proxies due to usage of
AOP. Under this definition of equals, the current lombok code won't
work.

Reinier Zwitserloot

unread,
Aug 16, 2009, 9:27:01 PM8/16/09
to Project Lombok
No equals method will work at all. HashMaps simply cannot be used with
objects that might be proxied if what you're saying is true. It's a
problem with AOP, not with lombok.

If memory serves, Effective Java, which, if anything can be called
that, is the java bible, suggests lombok's exact implementation as the
only proper way to write an equals method.

Peter Becker

unread,
Aug 16, 2009, 11:00:26 PM8/16/09
to project...@googlegroups.com
The only proper solution would be having a hash structure using an
implementation of hashCode() and equals(..) external to the objects. The
whole idea of having those two methods on Object was bad to start with,
but we have to live with it. Equivalence is contextual and not intrinsic.

Peter

Peter Becker

unread,
Aug 16, 2009, 11:04:23 PM8/16/09
to project...@googlegroups.com
Reinier Zwitserloot wrote:
[...]

> If memory serves, Effective Java, which, if anything can be called
> that, is the java bible, suggests lombok's exact implementation as the
> only proper way to write an equals method.
>
IIRC the first edition actually proposed the wrong version (using
"instanceof") and it got fixed in the second edition. Eclipse's
generation code used to do the wrong thing, too. It's a very common
mistake since people want to believe you could actually have a useful
Object.equals(..) implementation in the context of polymorphism.

Peter

James

unread,
Aug 17, 2009, 12:44:21 AM8/17/09
to Project Lombok
Thanks for the comments. Not exactly the answers I was hoping for,
but I think they are the correct answer. As I was doing a little bit
of reading on this, I did find a bit of a (small) controversy about
using instanceof vs getClass, and I do believe that the getClass is
the correct solution. It may lead to some un-obvious hiccups, but I
think it makes the most sense in terms of correctness & simplicity.

Thanks again for your feedback on this.

Reinier Zwitserloot

unread,
Aug 17, 2009, 2:27:20 AM8/17/09
to Project Lombok
Peter:

Exactly right. Comparable/Comparator is a much better API design. Two
objects can be equal, or not, in a gazillion different ways. You'd
have to have an external 'equalator'. But, as you say, we're stuck
with this ugly design :(. By the way, Javolution is an alternative to
java.util that actually supports such a construct, if you're
interested.


James:

It's unfortunate such an on-the-surface simple thing as equality ends
up causing this many headaches. Whenever I get around to writing up
the FAQ, I'll trawl the forums for issues, and this one will
definitely get a place there. Thanks for checking out lombok, and
apologies that I had to disappoint you.

James

unread,
Aug 17, 2009, 3:29:49 AM8/17/09
to Project Lombok
I really like what I've seen from lombok. I know I've had a few
issues (largely due to trying to use AOP), but I'm very impressed with
the simplicity and capabilities that lombok provides. I'm also really
looking forward to the concept of the property annotation I read you
guys are working on. Due to the issue of equality, I'll have to do a
little bit of re-thinking on my design, but in retrospect it is
probably something that would have caused me grief if I hadn't caught
it this early on.

Thanks again.

Maaartin

unread,
Oct 12, 2009, 8:33:09 PM10/12/09
to Project Lombok
On Aug 17, 1:14 am, Reinier Zwitserloot <reini...@gmail.com> wrote:
...
> The contract for the equals method states that equals should be both
> transitive and commutative. The logical conclusion of that rule means
> that 2 objects can only be equal if they are of the same type.

Although I agree with everything you wrote below, I disagree with
this. You only need to guarantee that no subclass overrides the method
in a non-compatible way, e.g., by comparing additional fields. The
problem is you can hardly achieve it in a meaningfull way.

You can do it by using final, but most of the time you need it is
because of using proxies (otherwise having "equal" instances of
different classes hardly makes any sense). But e.g., lazy hibernate
proxies can be only generated if there're no final methods at all.

On Aug 17, 3:16 am, James <warchild...@gmail.com> wrote:
...
> According to the equals method that lombok provides, they will not be
> equal because the class types are exactly not be equal.  However, they
> will both compute to the same hashcode since the underlying hashcode
> comes from the Circle class.

Having the same hashCode for two objects is never a problem, you could
even set hashCode=0 for all objects and have still everything working
(but very very slow).

> As a result if we have a HashMap<ICircle>, we really don't know which
> object we will get back out.  Will it be the proxy or the direct
> instance?

IMHO, this can't be a consequence of same hashCode. However, I see
it's a problem.

Michael Lorton

unread,
Oct 12, 2009, 9:27:55 PM10/12/09
to project...@googlegroups.com
On Aug 17, 1:14 am, Reinier Zwitserloot <reini...@gmail.com> wrote:

The contract for the equals method states that equals should be both
transitive and commutative. The logical conclusion of that rule means
that 2 objects can only be equal if they are of the same type.

Does that follow?  Consider:

Integer two = 2;
Double twoPointZero = 2.0;
assert(two.equals(twoPointZero));

That may not be a perfect example (because of floating-point imprecision) but I can easily imagine several closely interoperating types where objects of different classes are "equal".

M.


Reinier Zwitserloot

unread,
Oct 12, 2009, 10:33:22 PM10/12/09
to Project Lombok
Yes, it follows.

Reason #1: Try it. you'll get an AssertError; new Integer(2) is NOT
equal to new Double(2.0)!!! This is a little wonky, because "2 == 2.0"
evaluates to true.

Here's the theory: An integer and a double are never equals to each
other. It's comparing apples and oranges. The .equals() method
actually allows you to compare apples to oranges; "someString".equals
(new ArrayList()); is perfectly legit, and obviously returns false.
Similarly, new Integer(2).equals(new Double(2.0)) is perfectly legit
and returns false because they aren't even the same type.

However, java math operations (which == is, if it involves number
types on both sides) do not allow using two different types. However,
in certain circumstances, a cast to convert one of the two elements to
the other (or, in rare cases, convert -both-) is implicit, if one of
them can be cast to the other with no loss (such as int to double,
byte to int, or float to double). That's what's really happening in '2
== 2.0' - it does not mean that int 2 is equal to double 2.0. It
simply means that '(double)2 == 2.0' is true, and that '2 == 2.0' has
that cast in it too, it's just implicit.

Fun source of puzzlers, of course. Still, .equals() of Integer and
Double don't break the rule.

Reason #2: It follows if you accept certain pragmatic axiomas. For
example, you can get around the rule if a subclass is not allowed to
change the state space in any way. For example, one can argue that
ArrayList and LinkedList are both Lists in the purest sense - they do
not have extra (publically visible) state. This isn't like a Shape and
a Circle, where the Circle adds 'radius' to the state space. ArrayList
and LinkedList break the rule, in fact: They will be equals() to each
other if they both contain the same elements in the same order.
However, this kind of implementation-based splitting is extremely
rare, and lombok does not cater to the rare cases. Another way around
it is if you say that the new state space is not considered in equals
comparisons, but then you'd have the weird situation that 2 circles
are equal if their center is in the same place, even if their radii
are different, which is silly. Another one is where you extend a class
just to add features that purely operate on the original space. This
is kind of like LinkedList, which adds Stack methods to an ordinary
list. This kind of design sucks in too many ways to count, and lombok
tends not to cater to outdated patterns.

Either way, implementing this stuff is impossible to automate, and
thus, even if it was both common and a good idea, lombok just cannot
go there. See my first post in this thread for why automating it is
hard to impossible.


NB: Just as an FYI, comparing 2 with 2.0 is not going to go wrong due
to floating point imprecision. A 'double' can express with perfect
accuracy every single number that is expressable as an 'int'. A double
can NOT represent every legal long without loss, but it CAN represent
every feasible timestamp (in getCurrentTimeMillis() format) - at least
every timestamp between the big bang and the heat death of the
universe. Things go pearshaped if you try to store non-integral
numbers in them and expect perfect accuracy.

Martin Grajcar

unread,
Oct 15, 2009, 2:56:56 AM10/15/09
to project...@googlegroups.com
Hi!

Reinier Zwitserloot wrote:
> Yes, it follows.

No it doesn't. Again, I agree with nearly everything you wrote,
but I'm quite sure my example (given at the end of this posting) works
(at least in principle, I never tested it).

My example is surely not very practical, especially there's little hope for autogenerated equals.

> Reason #1: Try it. you'll get an AssertError; new Integer(2) is NOT
> equal to new Double(2.0)!!! This is a little wonky, because "2 == 2.0"
> evaluates to true.

I think he just wanted to demostrate the principle.

> Here's the theory: An integer and a double are never equals to each
> other. It's comparing apples and oranges. The .equals() method
> actually allows you to compare apples to oranges; "someString".equals
> (new ArrayList()); is perfectly legit, and obviously returns false.
> Similarly, new Integer(2).equals(new Double(2.0)) is perfectly legit
> and returns false because they aren't even the same type.
>
> However, java math operations (which == is, if it involves number
> types on both sides) do not allow using two different types. However,
> in certain circumstances, a cast to convert one of the two elements to
> the other (or, in rare cases, convert -both-) is implicit, if one of
> them can be cast to the other with no loss (such as int to double,
> byte to int, or float to double).

Three of the implicit conversions are lossy: int or long to float and long to double.
Either sun didn't know better, or they prefered to mimic C.

> That's what's really happening in '2
> == 2.0' - it does not mean that int 2 is equal to double 2.0. It
> simply means that '(double)2 == 2.0' is true, and that '2 == 2.0' has
> that cast in it too, it's just implicit.
>
> Fun source of puzzlers, of course. Still, .equals() of Integer and
> Double don't break the rule.
>
> Reason #2: It follows if you accept certain pragmatic axiomas. For
> example, you can get around the rule if a subclass is not allowed to
> change the state space in any way. For example, one can argue that
> ArrayList and LinkedList are both Lists in the purest sense - they do
> not have extra (publically visible) state. This isn't like a Shape and
> a Circle, where the Circle adds 'radius' to the state space. ArrayList
> and LinkedList break the rule, in fact: They will be equals() to each
> other if they both contain the same elements in the same order.
> However, this kind of implementation-based splitting is extremely
> rare, and lombok does not cater to the rare cases. Another way around
> it is if you say that the new state space is not considered in equals
> comparisons, but then you'd have the weird situation that 2 circles
> are equal if their center is in the same place, even if their radii
> are different, which is silly.

I think even in such a case there can be an useful equals like this:

Shape.equals(Object o) {
...
if (!(o instanceof Shape)) return false;
Shape s = (Shape) o;
if (getCenter().equals(s.getCenter()) return false;
if (s.getClass() == Shape.class) return true;
return s.equals(this); // delegate to subclass as it's better qualified
}

But there're at least two problems with it (despite it's overcomplicated):
It's quite errorprone on it's own and additionally you must override equals in each subclass.
I believe I could get it work even with some Square-s and Rectangle-s, but it's not easy.

> Another one is where you extend a class
> just to add features that purely operate on the original space. This
> is kind of like LinkedList, which adds Stack methods to an ordinary
> list. This kind of design sucks in too many ways to count, and lombok
> tends not to cater to outdated patterns.
>
> Either way, implementing this stuff is impossible to automate, and
> thus, even if it was both common and a good idea, lombok just cannot
> go there. See my first post in this thread for why automating it is
> hard to impossible.
>
>
> NB: Just as an FYI, comparing 2 with 2.0 is not going to go wrong due
> to floating point imprecision. A 'double' can express with perfect
> accuracy every single number that is expressable as an 'int'. A double
> can NOT represent every legal long without loss, but it CAN represent
> every feasible timestamp (in getCurrentTimeMillis() format) - at least
> every timestamp between the big bang and the heat death of the
> universe. Things go pearshaped if you try to store non-integral
> numbers in them and expect perfect accuracy.

Regards, Maaartin.

public abstract class Matrix {
Matrix(int size) {
this.size = size; // disallow subclassing outside of the package
}

@Override
public boolean equals(Object obj) {
if (this == obj) return true;
if (obj == null) return false;
if (!(obj instanceof Matrix)) return false;
final Matrix other = (Matrix) obj;
if (size != other.size) return false;
for (int i=0; i<size; ++i) {
for (int j=0; j<size; ++j) {
if (get(i, j) != other.get(i, j)) return false;
}
}
return true;
}

public abstract double get(int i, int j);

public int getSize() {
return size;
}

@Override
public final int hashCode() {
return size; // very stupid and inefficient but correct
}

private final int size;
}

public class DiagonalMatrix extends Matrix {
public DiagonalMatrix(int size) {
super(size);
data = new double[size];
}

@Override
public double get(int i, int j) {
return i!=j ? 0 : data[i];
}

private final double[] data;
}

public class IdentityMatrix extends Matrix {
public IdentityMatrix(int size) {
super(size);
}

@Override
public double get(int i, int j) {
return i!=j ? 0 : 1;
}
}

public class SquareMatrix extends Matrix {
public SquareMatrix(int size) {
super(size);
data = new double[size][size];
}

@Override
public double get(int i, int j) {
return data[i][j];
}

private final double[][] data;
}

Reinier Zwitserloot

unread,
Oct 15, 2009, 6:24:55 AM10/15/09
to Project Lombok
Well, he may just have been using that for illustration, but, so was
I: sun takes their 'equals should return false when comparing 2
different types' *VERY* seriously, and breaks it pretty much only for
java.util.LinkedList and ArrayList.

As you say, there's no hope to auto-generate this stuff, so I don't
really see what lombok should be doing different, here. Lombok is not
a programming language; it makes common boilerplate easy for you. It
has no real business trying to create non-common boilerplate, or
common non-boilerplate. Fortunately your matrix situation is fairly
uncommon.

Sun knew about the lossy nature of long/double to float conversion.
They were unfortunately trying to mimic C.

Your equals example for shape is going to lead to a lot of mistakes.
People subclassing it with new state (such as a circle, adding
'radius', or a square, adding width) will probably misunderstand. They
SHOULD always return false when given a shape. People subclassing
without new state (say, a ShapeThatCanPrintHelloWorld) will likely
just call super.equals() to test if the Shape bits are equal (the
center), and in doing so, create an endless loop!


Your Matrix example is indeed 'correct' - though there are an awful
lot of caveats. For example, Matrix' constructor is package private,
and it has to be, because you MUST control all subclasses of this
thing lest some unthinking soul decided to create a ColouredMatrix
that also stores a colour for each matrix, and thus breaks the equals
contract. If this was common, we'd have to worry about it, but this
isn't the standard design. A Matrix should just contain all the
numbers directly, and you have some utility methods to cook up
identity and diagonal matrices. The argument in favour of hierarchy is
that you can create enormous identity and diagonal matrices in a
fraction of the memory requirements, and even create 'lazy' matrices
that don't calculate its contents until you need them. If you need
that, you'd either have to rewrite Matrix to optionally take a
'ValuesProvider', which doesn't really feel right either. The notion
of equality of java just falls flat here. What java really needs is
contextual equality, so that you can do:

Equality.equals(Matrix.class, matrixA, matrixB) - meaning: are object
A and B equal *IF* they are both treated as vanilla matrices? That
way, a red identity matrix of size 2 and a green identitymatrix of
size 2 should be equal, as the context explicitly says that the colour
is irrelevant for this comparison.

With such a structure, you can easily do this subclass stuff.
Technically, a DiagonalMatrix({1, 1}) is not equal to an IdentityMatrix
(2) in the "IdentityMatrix" context, as a DiagonalMatrix isn't an
IdentityMatrix. It's just kind of pointless to use DiagonalMatrix as
an equality context. So, it all works out.

Comparator was done right. Equality was done wrong. Eh. Them's the
breaks. Lombok can't change anything in rt.jar; at best, we can add to
it and reshuffle some names around. We can't mess with fundamental
rules, even if they fall short in situations like this.

Kevin Wong

unread,
Oct 15, 2009, 10:54:45 PM10/15/09
to Project Lombok
Lots of equals discussion and solutions here: http://tal.forum2.org/equals.
There are several solutions that claim to handle the inheritance case,
though I'm not sure if it will be practical to use any in Lombok.

jpp

unread,
Oct 16, 2009, 5:09:41 AM10/16/09
to Project Lombok
> If memory serves, Effective Java, which, if anything can be called
> that, is the java bible, suggests lombok's exact implementation as the
> only proper way to write an equals method

My Effective Java, Second Edition makes the following suggestions (pp.
42–43):

1. Use the == operator to check if the argument is a reference to this
object. […]
2. Use of instanceof operator to check if the argument has the correct
type. […]
3. Cast the argument to the correct type. […]
4. For each “significant” field in the class, check if that field of
the argument matches the corresponding field of this object.

Later (p. 46), it (consistently) proposes an implementation of equals
() for a PhoneNumber class that contains three short fields:

@Override public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof PhoneNumber))
return false;
PhoneNumber pn = (PhoneNumber)o;
return pn.lineNumber == lineNumber
&& pn.prefix == prefix
&& pn.areaCode == areaCode;
}

There are no corrections about this in the errata at
http://java.sun.com/docs/books/effective/errata.html. Based on the
discussion in this thread, which convinced me that instanceof is, if
not directly faulty, much more dangerous than getClass(), should we
take this to Josh?

Reinier Zwitserloot

unread,
Oct 16, 2009, 6:08:13 AM10/16/09
to Project Lombok
Yes, please do. I thought for sure that had been updated in a newer
edition.

On Oct 16, 11:09 am, jpp <jppel...@gmail.com> wrote:
> > If memory serves, Effective Java, which, if anything can be called
> > that, is the java bible, suggests lombok's exact implementation as the
> > only proper way to write an equals method
>
> My Effective Java, Second Edition makes the following suggestions (pp.
> 42–43):
>
> 1. Use the == operator to check if the argument is a reference to this
> object. […]
> 2. Use of instanceof operator to check if the argument has the correct
> type. […]
> 3. Cast the argument to the correct type. […]
> 4. For each “significant” field in the class, check if that field of
> the argument matches the corresponding field of this object.
>
> Later (p. 46), it (consistently) proposes an implementation of equals
> () for a PhoneNumber class that contains three short fields:
>
> @Override public boolean equals(Object o) {
>   if (o == this)
>     return true;
>   if (!(o instanceof PhoneNumber))
>     return false;
>   PhoneNumber pn = (PhoneNumber)o;
>   return pn.lineNumber == lineNumber
>       && pn.prefix == prefix
>       && pn.areaCode == areaCode;
>
> }
>
> There are no corrections about this in the errata athttp://java.sun.com/docs/books/effective/errata.html. Based on the
Reply all
Reply to author
Forward
0 new messages