@Data classes: problems with equals and hashCode()

3,872 views
Skip to first unread message

v6ak

unread,
Aug 4, 2009, 3:00:44 AM8/4/09
to Project Lombok
http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals%28java.lang.Object%29
http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode%28%29

The fourth equals's rule is "# It is consistent: for any non-null
reference values x and y, multiple invocations of x.equals(y)
consistently return true or consistently return false, provided no
information used in equals comparisons on the objects is modified.".
There is the problem: If I have a mutable @Data class, the equals
(Object) method violate this rule.
Solution: if I have a mutable class, it should not generate these
methods.

The second equals's rule is "# It is symmetric: for any non-null
reference values x and y, x.equals(y) should return true if and only
if y.equals(x) returns true."
So if I have an immutable class, it should be final because of the
equals(Object) and hashCode() methods.

Do you understand?

v6ak

unread,
Aug 4, 2009, 3:26:08 AM8/4/09
to Project Lombok
Well, immutable data classes needn't to be final, but they should have
final equals and hashCode.

You currently can use this hack:
import lombok.Data;

public @Data class MutableDataClass {

private int x;

@Override
public boolean equals(Object obj) {
return super.equals(obj);
}

@Override
public int hashCode(){
return super.hashCode();
}

public static void main(String[] args) {
System.out.println(new MutableDataClass().equals(new MutableDataClass
()));
System.out.println(new MutableDataClass().hashCode() == new
MutableDataClass().hashCode());
}

}

The output should be "false\nfalse" or (possibly) "false\ntrue". If
you comment or ommit hasCode and equals, the output is "true\ntrue".

The consistency are important in collenctions, especially in Set<E>
and Map<K, V>.

On 4 srp, 09:00, v6ak <vit...@gmail.com> wrote:
> http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals%28...http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode%...

Reinier Zwitserloot

unread,
Aug 4, 2009, 7:47:55 AM8/4/09
to Project Lombok
v6ak: I think you misread those rules.

It says, for example: *provided no information used in equals
comparisons on the object is modified*.

That's why you can have equals implementations for mutable objects. In
fact, the JDK core libraries contain tons of mutable objects with
hashCode/equals implementations. Think of ArrayList for example!


I'm fairly sure our equals/hashCode implementations are top-of-the-
line, following every rule and yet being as flexible as is possible
under those circumstances. The only rule it breaks is the no-
exceptions rule: hashCode and equals should never throw exceptions,
but you CAN get one in rare occasions: If you have an array that
contains itself, you'll get a StackOverflowError. However, the core
libraries also break this rule (ArrayList also throws a
StackOverflowError if it contains itself), and there's no viable way
to solve it.

v6ak

unread,
Aug 4, 2009, 12:15:39 PM8/4/09
to Project Lombok
I misunderstood the quoted part of the rule ("provided no information
used in equals comparisons on the object is modified"). I thought that
it must not provide any information about the modification.

Well, I understand it now, I think.

The second problem is not a problem because of:
if (o.getClass() != this.getClass()) return false;
instead of:
if (! (o instanceof ThisClass )) return false;
.
However it could be strange if I have data class Foo with fields x and
y, two instances of foo (foo1 and foo2) and foo1.getX().equals
(foo2.getX()) && foo1.getY().equals(foo2.getY()) && (!foo1.equals
(foo2)) is true. It can be caused by subclassing.


On 4 srp, 13:47, Reinier Zwitserloot <reini...@gmail.com> wrote:
> v6ak: I think you misread those rules.
>
> It says, for example: *provided no information used in equals
> comparisons on the object is modified*.
>
> That's why you can have equals implementations for mutable objects. In
> fact, the JDK core libraries contain tons of mutable objects with
> hashCode/equals implementations. Think of ArrayList for example!
>
> I'm fairly sure our equals/hashCode implementations are top-of-the-
> line, following every rule and yet being as flexible as is possible
> under those circumstances. The only rule it breaks is the no-
> exceptions rule: hashCode and equals should never throw exceptions,
> but you CAN get one in rare occasions: If you have an array that
> contains itself, you'll get a StackOverflowError. However, the core
> libraries also break this rule (ArrayList also throws a
> StackOverflowError if it contains itself), and there's no viable way
> to solve it.
>
> On Aug 4, 9:00 am, v6ak <vit...@gmail.com> wrote:
>
> >http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals%28......

Peter Becker

unread,
Aug 5, 2009, 4:52:31 AM8/5/09
to project...@googlegroups.com
v6ak wrote:
> http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals%28java.lang.Object%29
> http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode%28%29
>
[...]

> The second equals's rule is "# It is symmetric: for any non-null
> reference values x and y, x.equals(y) should return true if and only
> if y.equals(x) returns true."
> So if I have an immutable class, it should be final because of the
> equals(Object) and hashCode() methods.
>
I'm a bit behind with my email, but I don't think this part has been
answered yet.

A class declared to be immutable should IMO always be final, otherwise
you can't guarantee that all of its instances are immutable. After all
instances of a subclass are instances of the superclass, too. A common
problem you could get if you don't follow the rule is that someone uses
the supposedly immutable class as a key for a hash structure, but the
instance used has a hashCode() implementation using a mutable member --
which is a bad idea, but very common. The "no subclass" rule changes if
you either trust the world or have tooling in place to guarantee the
immutability by other means than the compiler.

You can implement polymorphic equals(..) and hashCode() without breaking
the contract, though: you just need to make sure you test for class
equality, not subclassing. You need that in general and it is a common
mistake to use "instanceof" in an equals(..) implementation. AFAIK it is
always either wrong or trivially the same as the class equality (if the
class is final). The assumption of a natural notion of equality is
questionable anyway, but unfortunately Java lacks the equivalent of a
Comparator.

Peter


Reinier Zwitserloot

unread,
Aug 5, 2009, 7:30:16 AM8/5/09
to Project Lombok
That's in fact exactly what lombok does:

equals will check if the incoming object's class is the same as its
own class. This makes it fully polymorphic, and you can in fact chain
equals calls by using the @EqualsAndHashCode(callSuper=true)
mechanism. The documentation states that you should be careful as not
all equals methods are implemented correctly, but it works fine if the
entire hierarchy all have lombok-generated equals methods. (in code,
lombok does this: if ( getClass() != o.getClass() ) return false; -
see the pre/post example on the features pages.

Immutable classes should probably be final, yes. However, it would be
rather overly presumptuous for lombok to toss a 'final' on your class
just because you're using @Data on a class with all-final fields. It
would reduce flexibility, it doesn't really reduce any boilerplate
("final" is 6 characters, and conveys important semantic information,
so it fulfills neither of the requirements to be boilerplate: It is
neither overly long, nor pure noise), and it would make lombok that
much more non-intuitive: If you don't already know that lombok semi-
magically makes your classes final if you @Data them when they contain
all-final fields, you'd never jump to that conclusion.

So, just keep manually writing 'final' in there:

public final @Data class Point { private int x, y; }


At some point an @Immutable annotation may become popular. It's a bit
beyond lombok's current technical abilities to add this anytime soon,
though. Also, there are issues of practicality involved:

- A file is immutable in the technical sense (in that its one 'name'
field is final), but as a practical sense, files aren't immutable at
all: They represent a file, and if I call '.delete()' on it, obviously
the state represented by the file object changes. This is endemic for
any 'pointer'-like object. If we call such objects immutable, then
immutable is relatively useless - the only practical consideration you
can give them, is that making defensive copies is useless. You can't
memoize any of the method calls (memoize = remember the result for a
given input, and cache it, automatically), and you can't streamline
aspects of its execution because the code isn't guaranteed to be side-
effect-free either.

So, let's try this again but this time with @SideEffectFree and
@Memoizable, which means, respectively: Calling this method NEVER
causes any state changes (example: .toLowerCase() on String, or .get()
on a map, mutable or not, as long as it isn't an access-indexed
linkedhashmap), and: If called on the same object with the same input,
the result will never change (again, .toLowerCase() would qualify for
that one). These are more practical:

Not checking the result of a @SideEffectFree method, or creating one
with a void return type, is pointless. Warn - it's neccessarily a bug.
This would catch "bigInteger.plus(otherBigInteger);" being erroneously
used as a statement to change 'bigInteger' (it doesn't change
anything; it returns a new BigInteger with the result).

@SideEffectFree, and especially @Memoized can be aggressively
optimized by the JVM.

On the flip side, these bring their own issues:

- should you be able to opt out of a compile-time SideEffectFree
check? Logging is obviously a side-effect, but you should probably be
able to stick special annotations on the various log method that say:
I know this LOOKS like a side-effect, but treat it as if its side-
effect-free.

- Should memoizable/sideeffectfree be an implementation detail (THIS
particular method so happens to be sef/memoizable, but methods
overriding this one don't have to be - also means sticking memoizable
or sef on an abstract method is pointless, kind of like sticking
'synchronized' on those), or should it strictly inherit: That way you
can create immutable hierarchies, for example. immutability and object
hierarchies are problematic in java today, as you can't enforce
immutability on your children. On the other hand, making it a part of
the interface requires major changes to the JVM and verifier. It also
moves sef/memoizable from the 'compiler, please double-check I didn't
mess up and accidentally introduced something mutable in here' to the
'compiler, you MUST ensure this class doesn't actually change
anything, my code depends on it', which would make implementing it
much, much more difficult.

- Should SEF and Memoized by rolled into one annotation? Memoized
isn't necessarily SEF, but a non-SEF memoizable isn't very practical;
caching the value would stop the side-effect, so the only practical
thing you can do is call the memoizable anyway, but already continue
the main flow in the CPU pipeline as you already know what the result
is going to be. SEF also doesn't mean memoizable. HashMap.get(key) is
SEF, but obviously not memoizable. So, we appear to need: "SEF" and
"SEFAndMemoizable", but clearly those need better names. How many java
programmers are even going to understand these concepts?

- There are lots of borderline classes where it's not at all obvious:
I already mentioned java.io.File, but think about java.util.Random:
grabbing random values technically changes the state of the Random
object, but in practice, especially for thread-safe implementations of
it, Random appears to be stateless. You get a random number when you
call .next() on it, regardless of what other code has been doing to
that object in the mean time. Yet another argument for allowing a
programmer to override what the compiler can infer.


On Aug 5, 10:52 am, Peter Becker <peter.becker...@gmail.com> wrote:
> v6ak wrote:
> >http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals%28...
> >http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode%...

Peter Becker

unread,
Aug 6, 2009, 4:22:52 AM8/6/09
to project...@googlegroups.com
Reinier Zwitserloot wrote:
> That's in fact exactly what lombok does:
>
I expected that, I was just trying to clear up what seemed a
misconception on v6ak's side -- not only in the email I replied to, but
also in another one where he was questioning the direct class comparison.
> equals will check if the incoming object's class is the same as its
> own class. This makes it fully polymorphic, and you can in fact chain
> equals calls by using the @EqualsAndHashCode(callSuper=true)
> mechanism. The documentation states that you should be careful as not
> all equals methods are implemented correctly, but it works fine if the
> entire hierarchy all have lombok-generated equals methods. (in code,
> lombok does this: if ( getClass() != o.getClass() ) return false; -
> see the pre/post example on the features pages.
>
> Immutable classes should probably be final, yes. However, it would be
> rather overly presumptuous for lombok to toss a 'final' on your class
> just because you're using @Data on a class with all-final fields. It
> would reduce flexibility, it doesn't really reduce any boilerplate
> ("final" is 6 characters, and conveys important semantic information,
> so it fulfills neither of the requirements to be boilerplate: It is
> neither overly long, nor pure noise), and it would make lombok that
> much more non-intuitive: If you don't already know that lombok semi-
> magically makes your classes final if you @Data them when they contain
> all-final fields, you'd never jump to that conclusion.
>
> So, just keep manually writing 'final' in there:
>
> public final @Data class Point { private int x, y; }
>
That's all perfectly fine with me. I would not expect @Data to imply
final and it is easy enough to add that keyword. There are even cases
where I would trust the world enough to have a hierarchy of immutables:
whenever that world is closed enough.

[@Immutable vs. @SideEffectFree]

We had this discussion on the JavaPosse list before and I think we
disagree on the semantics of "immutable" and the usefulness of "side
effect free". To briefly restate my view:

"immutable" means that the object never changes observed state (or at
least not during normal operation). That includes not being able to be
changed from the outside, it most certainly includes any non-memory
state changed through the object's methods.

"side effect free" applies only to methods and means that calling the
method does not change anything at all. If you extend that to the object
level (which you are the only person I know to do), then I'd say it
means that no method on the object changes any state in the whole
system. You might then extend that notion to some less pure form by
allowing logging etc.

Neither of these is a subset of the other. An object could be SEF, but
still be mutable through means external to its interface. Dually an
object could be immutable, but have side effects by changing some state
outside its own.

I believe the only sensible thing to implement is a notion of a value
object. A value object is basically a read-only C-style struct made out
of other value objects, baselined by some primitives. It is by
implication SEF (assuming no global variables are used). "Primitive" in
this context does of course not necessarily refer to the notion of
Java's primitives.

The other option I can think of would be trying to model the boundaries
of the object's state. It's an aggregation vs. composition thing: if you
reference an object, you might treat the object's state as part of your
own or just the reference as such. This affects not only the notion of
immutability, but also questions such as value identity, serialization
and cloning. I don't know how feasible such an approach would be in
general, I am pretty sure Java is quite far from letting you be that
expressive.

Peter

Vít Šesták

unread,
Aug 6, 2009, 7:34:26 AM8/6/09
to project...@googlegroups.com
Well, I did the misconception because
* misunderstanding Object.equals(Object) consistency rule
* misunderstanding lombok's Object.equals(Object) implementation
If it was like I though, it would not be misconception, I think. It would disable some surely bad usage only.

However, there is one idea your agree: Immutable classes should be final. It would be useful if you recieve warning about @Data classes with all fields final that are not final except @Data(suppressNonFinalWarning = true).

v6ak
Reply all
Reply to author
Forward
0 new messages