Cached hashCode

Maaartin-1

Mar 28, 2011, 9:18:34 AM

to project...@googlegroups.com

What about

@EqualsAndHashCode(cached=true)

caching the computed hashCode? The generated methods could look like

public int hashCode() {
    if ($hashCode != 0) return $hashCode;
    ... do the usual computation
    return $hashCode = result;
}
// String.hashCode() works this way.

public boolean equals(Object o) {
    // from http://projectlombok.org/features/EqualsAndHashCode.html
    if (o == this) return true;
    if (o == null) return false;
    if (!(o instanceof Square)) return false;
    Square other = (Square) o;

    // the only new code
    if (this.$hashCode != other.$hashCode
            && this.$hashCode != 0
            && other.$hashCode != 0) return false;

    ... do the usual computation
}
// This could speed up equals in many cases.
// For whatever reason, String.equals doesn't work this way.

public void setName(String name) {
    $hashCode = 0;
    this.name = name;
}
// Lombok generated setters should invalidate the cache,
// except for fields excluded from EAHC.
// Self-made mutators are "Somebody Else's Problem",
// the main use is for immutable classes, anyway.
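
A minimal hand-written sketch of what the proposal above might expand to for a small class. The class name `CachedSquare`, the plain field name `cachedHash` (standing in for `$hashCode`) and the hash formula are illustrative assumptions, not actual Lombok output.

```java
import java.util.Objects;

// Hypothetical stand-in for a class annotated with the proposed cached=true.
final class CachedSquare {
    private String name;
    private int width;
    private int cachedHash; // 0 means "not computed yet"

    CachedSquare(String name, int width) {
        this.name = name;
        this.width = width;
    }

    // A generated setter would invalidate the cache before changing the field.
    void setName(String name) {
        cachedHash = 0;
        this.name = name;
    }

    @Override public int hashCode() {
        int h = cachedHash;
        if (h != 0) return h;
        h = 31 * Objects.hashCode(name) + width; // "the usual computation"
        cachedHash = h;
        return h;
    }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof CachedSquare)) return false;
        CachedSquare other = (CachedSquare) o;
        int a = cachedHash, b = other.cachedHash;
        // Cheap reject: only valid when both hashes have already been computed.
        if (a != 0 && b != 0 && a != b) return false;
        return width == other.width && Objects.equals(name, other.name);
    }
}
```

Note that an object whose real hash happens to be 0 is simply recomputed on every call, which is the same trade-off String.hashCode() makes.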

What do you think about it?

Regards, Maaartin.

Nikolas Everett

Mar 28, 2011, 10:03:35 AM

to project...@googlegroups.com

I like it in general but would like to have a think about how to warn people about that whole self made setters thing.

It would also be nice to know how much of a performance improvement this nets you. In particular I'm thinking about immutable classes in which I typically use Guava's immutable collections which seem to cache their hashcodes already.

While I think about it this whole thing falls apart if you expose mutable collections which is another thing to warn people about.

Nik

--
You received this message because you are subscribed to the Google
Groups group for http://projectlombok.org/

To post to this group, send email to project...@googlegroups.com
To unsubscribe from this group, send email to
project-lombo...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/project-lombok?hl=en

Reinier Zwitserloot

Mar 28, 2011, 11:43:48 AM

to project...@googlegroups.com

I rather doubt this is worth the hassle. Here are the cons:

(A) Invalidating the hash on mutable classes. This is a big deal - what if the superclass has some setters? What if you write your own setter or anything else that mutates this class? What if some object you refer to in a field contains a reference to something else, that something else is mutated, and as a result that ref changed hashCode and thus our class should change along with it?

(B) Does this actually net you any serious speed? The only significant gain here is if you include a field whose hashCode() takes very long to calculate and which doesn't itself cache, *AND* your object is 100% immutable, including all refs. How many classes even exist for which that applies? Isn't the solution to make THOSE classes cache their hashcodes? j.u.collection stuff doesn't cache hashes for the same reason lombok doesn't. It doesn't know it can safely do so.

(C) It's another parameter. We want to avoid implementing everything that seems useful to a vanishingly small subset of users, because once we go down that path, @Data is going to end up with 55 parameters. I think we can all agree @Data with no parameters is a lot better than @Data with 55 parameters but applicable in about 5% more situations than @Data currently is.
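
Point (A) can be reproduced in a few lines of plain Java. This is only a sketch: the class `Basket` and its helper `freshHash()` are hypothetical names invented for the illustration. Every field is final, yet the cached value goes stale as soon as the referenced list changes.

```java
import java.util.List;

// Hypothetical class: all fields are final, but the cached hash can still go stale.
final class Basket {
    private final List<String> items; // final reference, mutable contents
    private int cachedHash;           // 0 means "not computed yet"

    Basket(List<String> items) { this.items = items; }

    // What hashCode() would return without any caching.
    int freshHash() { return items.hashCode(); }

    @Override public int hashCode() {
        if (cachedHash == 0) cachedHash = freshHash();
        return cachedHash;
    }

    @Override public boolean equals(Object o) {
        return o instanceof Basket && items.equals(((Basket) o).items);
    }
}
```

If the caller keeps a reference to the list and adds an element after the first hashCode() call, hashCode() keeps returning the old value while equals() already sees the new contents.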

--Reinier Zwitserloot

Maaartin-1

Mar 28, 2011, 5:02:11 PM

to project...@googlegroups.com

On 11-03-28 16:03, Nikolas Everett wrote:
> I like it in general but would like to have a think about how to warn
> people about that whole self made setters thing.

I don't think that anything beyond something like this is necessary:

Note that the hashCode doesn't get updated automatically when the object
changes (except by setters generated by Lombok for the class itself).
Use it for immutable objects or make sure that $hashCode=0 is assigned
in each mutator.

> It would also be nice to know how much of a performance improvement this
> nets you.

No idea, maybe it's not worth it. When I wrote hashCode() for immutable
objects manually, I often did it, since it's so easy; but I've never
measured it. It may be a useless premature optimization.

> In particular I'm thinking about immutable classes in which I
> typically use Guava's immutable collections which seem to cache their
> hashcodes already.

AFAIK, only the hashCode of the whole collection gets cached
(RegularImmutableSet.hashCode), not the values for the entries.

> While I think about it this whole thing falls apart if you expose
> mutable collections which is another thing to warn people about.

Whenever you use hashCode with mutable classes you may get a problem.

(continued below)

On 11-03-28 17:43, Reinier Zwitserloot wrote:
> I rather doubt this is worth the hassle. Here are the cons:
>
> (A) Invalidating the hash on mutable classes. This is a big deal - what
> if the superclass has some setters? What if you write your own setter or
> anything else that mutates this class? What if some object you refer to
> in a field contains a reference to something else, that something else
> is mutated, and as a result that ref changed hashCode and thus our class
> should change along with it?

Sure. All of this should invalidate the hashCode, however, that's the
reason for cached=false by default. Obviously, in case you refer to
anything mutable without your control (including the superclass part of
the object), you can't use the caching. When you write any mutator
yourself, you can use the caching but must set $hashCode=0.

> (B) Does this actually net you any serious speed? The only significant
> gain here is if you include a field whose hashCode() takes very long to
> calculate and which doesn't itself cache, *AND* your object is 100%
> immutable, including all refs. How many classes even exist for which
> that applies? Isn't the solution to make THOSE classes cache their
> hashcodes?

I can't tell, but maybe somebody else will.

> j.u.collection stuff doesn't cache hashes for the same
> reason lombok doesn't. It doesn't know it can safely do so.

Actually, any HashSet could cache the hashcodes without any risk. In
case they ever change, the HashSet is already completely broken. But for
the other sets and for values in any Map, I agree.
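
The claim that a HashSet is already broken once an element's hashCode changes can be checked with a small sketch. `MutableKey` and `BrokenSetDemo` are hypothetical names; the behaviour described is what I'd expect from the OpenJDK implementation, where the lookup goes to the bucket for the new hash.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical key type whose hashCode depends on a mutable field.
final class MutableKey {
    int id;
    MutableKey(int id) { this.id = id; }

    @Override public int hashCode() { return id; }

    @Override public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).id == id;
    }
}

final class BrokenSetDemo {
    // Adds a key, mutates it, then asks the set whether it still contains it.
    static boolean stillFoundAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey(1);
        set.add(key);
        key.id = 2; // hashCode changes while the key sits in the set
        return set.contains(key);
    }
}
```

The element is still physically in the set (its size stays 1), but it can no longer be found, so caching the hash would not make anything worse.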

> (C) It's another parameter. We want to avoid implementing everything
> that seems useful to a vanishingly small subset of users, because once
> we go down that path, @Data is going to end up with 55 parameters. I
> think we can all agree @Data with no parameters is a lot better than
> @Data with 55 parameters but applicable in about 5% more situations than
> @Data currently is.

I agree with this. So forget it unless somebody comes up with a lot of
use cases.

Regards, Maaartin.

Maaartin G

Sep 7, 2013, 11:35:56 AM

to project...@googlegroups.com

I'd like to revive my idea of cached hashCode as the gains may be pretty huge:

https://microbenchmarks.appspot.com/runs/595365ac-302e-454e-a9ec-40e8d2cb2f32#r:scenario.benchmarkSpec.parameters.slow,scenario.benchmarkSpec.parameters.stringLength&c:scenario.benchmarkSpec.parameters.prefixed

Actually, instead of caching I'm proposing hashCode precomputation for immutable classes. I don't care about mutable ones, as they shouldn't be used as map keys anyway, so their hashCode and equals are not really important.

While Lombok greatly simplifies writing immutable classes, there's no way to add this trivial optimization to @RequiredArgsConstructor and @EqualsAndHashCode yourself. So for performance-critical classes you have to write the boilerplate manually, as in FastElement here:

https://dl.dropboxusercontent.com/u/4971686/published/maaartin/lombok/CachedHashCodeBenchmark.java

The benchmarked elements contain two strings of length `stringLength`; depending on `prefixed`, they differ in the first or the last character. Strings with a long common prefix often come from URLs or file names.

Reinier Zwitserloot

Sep 7, 2013, 2:42:49 PM

to project-lombok

Using some creative semi double locks (actually lock free; having 2 threads calculating the hashcode simultaneously is suboptimal but does not result in a wrong answer, therefore, it's fine) should make it cheap enough to calculate this on demand once (or, in case of multiple threads calling hashCode for the first time virtually simultaneously, 2 or 3 times at most).
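
One way to read the lock-free idea above is the "racy single-check" idiom, which is also how String.hashCode() caches its value. The sketch below is an assumption about what such generated code could look like; `LazyPoint` and its fields are hypothetical.

```java
// Hypothetical immutable class with a lazily cached, lock-free hashCode.
final class LazyPoint {
    private final int x, y;
    private int hash; // deliberately not volatile; 0 means "not computed yet"

    LazyPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override public int hashCode() {
        int h = hash;        // read the field exactly once into a local
        if (h == 0) {
            h = 31 * x + y;  // deterministic, so racing threads compute the same value
            hash = h;        // int writes are atomic; the worst case is a repeated computation
        }
        return h;
    }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof LazyPoint)) return false;
        LazyPoint other = (LazyPoint) o;
        return x == other.x && y == other.y;
    }
}
```

Reading the field into a local first matters: returning `hash` directly after the check would be a second racy read, which the memory model allows to observe a stale 0.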

I guess the implementation should involve a new annotation (@CacheHashcode?).

Should we put in something for writing your own hashCode algorithm? That's a bit complicated, as the @EqAHC code has to take that into account for generating its 'hey, you should either override both or neither implementation' stuff.

--Reinier Zwitserloot

Fabrizio Giudici

Sep 7, 2013, 3:15:04 PM

to project-lombok, Reinier Zwitserloot

On Sat, 07 Sep 2013 20:42:49 +0200, Reinier Zwitserloot
<rei...@zwitserloot.com> wrote:

> I guess the implementation should involve a new annotation
> (@CacheHashcode?).

What about enabling this feature when there's the JSR-305 @Immutable
annotation together with @EqualsAndHashCode?

--
Fabrizio Giudici - Java Architect @ Tidalwave s.a.s.
"We make Java work. Everywhere."
http://tidalwave.it/fabrizio/blog - fabrizio...@tidalwave.it

Lenny Primak

Sep 7, 2013, 3:15:57 PM

to project...@googlegroups.com, Reinier Zwitserloot

I second that.

Maaartin G

Sep 7, 2013, 7:38:56 PM

to project...@googlegroups.com

On Saturday, September 7, 2013 8:42:49 PM UTC+2, Reinier Zwitserloot wrote:

> Using some creative semi double locks (actually lock free; having 2 threads calculating the hashcode simultaneously is suboptimal but does not result in a wrong answer, therefore, it's fine) should make it cheap enough to calculate this on demand once (or, in case of multiple threads calling hashCode for the first time virtually simultaneously, 2 or 3 times at most).

Maybe simply ignoring concurrency as in

http://guava-libraries.googlecode.com/git/guava/src/com/google/common/collect/ImmutableEnumSet.java

might be right, but I'm not sure. A problem could occur if one thread writes the computed hashCode after another thread has invalidated it because of a change.

The immutable case is probably the more important one and allows a final field, as in

http://guava-libraries.googlecode.com/git/guava/src/com/google/common/collect/RegularImmutableSet.java

But I'm afraid making it transient is impossible with standard serialization.

> I guess the implementation should involve a new annotation (@CacheHashcode?).

I don't know.... maybe @CachedHashCode? :D But seriously, it depends on what it should do:

1. cache hashCode (or initialize a final field hashCode?)

2. use hashCode in equals

I can't see any case when doing only one thing makes sense. So probably an argument to EqHC should do.

> Should we put in something for writing your own hashCode algorithm? That's a bit complicated, as the @EqAHC code has to take that into account for generating its 'hey, you should either override both or neither implementation' stuff.

I'm not sure... the standard multiplication by 31 is rather stupid, but nobody seems to care. The method is pretty magic, as generating hash collisions even for strings of length two needs some invention (2 chars fit into an int, so WHY?). Using the same "certified" multiplier for Arrays.hashCode only adds insult to injury. Any bigger odd multiplier would be better and I can't see any disadvantage.
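
The two-character collisions mentioned above are easy to show: "Aa" and "BB" both hash to 2112 under the multiplier 31. The helper below repeats the same polynomial recurrence with the multiplier as a parameter; the class name `StringHashCollision` and the alternative multiplier 65599 used in the note are arbitrary choices for illustration, not a recommendation.

```java
// Hypothetical helper: String.hashCode()'s recurrence with a configurable multiplier.
final class StringHashCollision {
    static int polyHash(String s, int multiplier) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = multiplier * h + s.charAt(i);
        }
        return h;
    }
}
```

With multiplier 31, `polyHash("Aa", 31)` and `polyHash("BB", 31)` are equal, matching `"Aa".hashCode() == "BB".hashCode()`. With 65599 this particular pair no longer collides, although that says nothing about collisions in general.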

To the actual question: While something like

@EqAHC(cachedHashCode=true, acceptExistingHashCode=true)

doesn't look nice, it's acceptable for me. A separate annotation for the former would be probably nicer; the latter (suppressing the @EqAHC warning) is probably rare enough, so that some ugliness is fine.

On Saturday, September 7, 2013 9:15:04 PM UTC+2, Fabrizio Giudici wrote:

> What about enabling this feature when there's the JSR-305 @Immutable
> annotation together with @EqualsAndHashCode?

Sounds good except for

1. JSR-305 might be dead

http://code.google.com/p/guava-libraries/issues/detail?id=1218

http://old.nabble.com/Is-JSR305-dead--td34759199.html

2. For some tiny objects the added int may be seen as too expensive.

And again, detecting any @Immutable annotation doesn't seem to be right. Creating @lombok.Immutable will be proved wrong in 20+ years, when @javax.annotation.Immutable gets finally standardized.

Roel Spilker

Sep 8, 2013, 10:33:34 AM

to project...@googlegroups.com

I think it should be a property of the @EAHC annotation, default false, don't care about multithreading, use an Integer field for easy uninitialized checks. Don't take mutability into account. Just listen to the property of the annotation.

That's easy to explain and easy to implement.

Fabrizio Giudici

Sep 8, 2013, 12:05:13 PM

to project...@googlegroups.com, Maaartin G

On Sun, 08 Sep 2013 01:38:56 +0200, Maaartin G <maaar...@gmail.com>
wrote:

> Sounds good except for
> 1. JSR-305 might be dead
> http://code.google.com/p/guava-libraries/issues/detail?id=1218
> http://old.nabble.com/Is-JSR305-dead--td34759199.html

Given that JSR-305 is informative only (I mean, there's no working code to
maintain), what's the problem with its dormant status?

Martin Grajcar

Sep 8, 2013, 2:12:28 PM

to project...@googlegroups.com

On Sun, Sep 8, 2013 at 4:33 PM, Roel Spilker <r.sp...@gmail.com> wrote:

> I think it should be a property of the @EAHC annotation, default false, don't care about multithreading, use an Integer field for easy uninitialized checks. Don't take mutability into account. Just listen to the property of the annotation.

IMHO, Integer is wrong. It's nice and it's perfectly clear when it's uninitialized, so I'd go for it in normal code. But not in a library or in generated code, as it's slow (indirection) and eats too much memory (AFAIK 24 bytes without compressed oops). Something like

public int hashCode() {
    if ($hashCode == 0) $recomputeHashCode();
    return $hashCode;
}

is slow when hashCode happens to be 0, which is rather improbable. Using Integer is slow always (usually not that slow; depending on cache misses).

There are at least the following workarounds:

0. Simply ignore it. This gets done in String.hashCode.

1. Simply replace 0 by 1. This gets done in JDK for identityHashCode. IMHO, having twice as many ones is rather harmless.

2. Simply replace 0 by some other hash computed using some different formula (e.g. the same computation with a different PRIME). Unfortunately, this may return 0 again (which is improbable and can be solved by falling back to 1).

3. Use an additional field like

public int hashCode() {
    if ($hashCode == 0 && !$hashCodeIsInitialized) $recomputeHashCode();
    return $hashCode;
}

This costs in theory one byte for the boolean. Practically it costs 0 or 8 bytes due to the alignment.
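
Workaround 1 from the list above ("replace 0 by 1") could look roughly like this. It is a sketch under assumptions: `ZeroSafePair` and its hash formula are hypothetical, chosen so that two empty strings really do hash to 0 before the replacement.

```java
// Hypothetical class using 0 as the "not computed" marker and remapping a real 0 to 1.
final class ZeroSafePair {
    private final String a, b;
    private int hash; // 0 is reserved for "not computed yet"

    ZeroSafePair(String a, String b) {
        this.a = a;
        this.b = b;
    }

    @Override public int hashCode() {
        int h = hash;
        if (h == 0) {
            h = 31 * a.hashCode() + b.hashCode();
            if (h == 0) h = 1; // a genuine 0 would otherwise be recomputed on every call
            hash = h;
        }
        return h;
    }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof ZeroSafePair)) return false;
        ZeroSafePair other = (ZeroSafePair) o;
        return a.equals(other.a) && b.equals(other.b);
    }
}
```

The remapping is applied identically to every instance, so equal objects still return equal hash codes; the only cost is that the value 1 becomes slightly more likely.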

> That's easy to explain and easy to implement.

Still some open questions:

- Should the hashCode field be user visible?

- Should it be invalidated by generated mutators?

- Should user-defined mutators be augmented by invalidation code?

- Should all fields be forced to be private?

Roel Spilker

Sep 8, 2013, 5:32:38 PM

to project...@googlegroups.com

I like the 'replace 0 by 1' approach.

Martin Grajcar

Sep 9, 2013, 1:27:39 PM

to project...@googlegroups.com

On Sun, Sep 8, 2013 at 11:32 PM, Roel Spilker <r.sp...@gmail.com> wrote:

> I like the 'replace 0 by 1' approach.

I also think it should suffice.

For immutable classes something like http://code.google.com/p/projectlombok/issues/detail?id=570 might be better:

@RequiredArgsConstructor @EqualsAndHashCode(cached=true)
class Element {
    private final String a, b;

    @PostConstruct
    private final int hashCode = 97531 * a.hashCode() + b.hashCode();
}

Since hashCode is explicitly initialized, it won't be included in the ctor arguments.

Because of @PostConstruct, the initialization gets moved to the end of the ctor.

So you get

@RequiredArgsConstructor @EqualsAndHashCode(cached=true)
class Element {
    private final String a, b;
    private final int hashCode;

    Element(String a, String b) {
        this.a = a;
        this.b = b;
        this.hashCode = 97531 * a.hashCode() + b.hashCode();
    }

    public int hashCode() {
        return hashCode;
    }

    .... standard generated equals
}

No idea if this is the way to go. Unlike the ordering specified in http://docs.oracle.com/javase/specs/jls/se7/html/jls-12.html#jls-12.5, @PostConstruct makes sense to me. Here it allows extending the generated ctor by custom code, making all variables private, and defining custom hashCode....

Reinier Zwitserloot

Sep 23, 2013, 8:23:37 PM

to project...@googlegroups.com

@Immutable is a bad idea in java. It can't work. It makes no sense, and whatever incomplete definition you can come up for it is useless (in that you can't use it as an indicator of memoizability, it does not guarantee deterministic results based on parameter input remaining unchanged, it does not imply that calling a method has no side-effects and therefore any code ignoring the return value is by definition a bug – so what's left?).

What does 'immutable' mean? That all fields are final? Alright, include a List field that isn't immutable and hashCodes can change.

You could add the further restriction that all fields also need to themselves be immutable types, but:

(A) someone is going to have to delve into the rt.jar and all other popular java libraries and add @Immutable annotations all over the place. Without them, it'll all fall apart.

(B) There's a more general problem that for example many classes have Lists in them, which are actually ImmutableList. ImmutableList (the type) is immutable. List isn't. Now what?

Even granting for now that the above isn't an actual problem, you're still not in the clear. I can make a fake mutable field by creating a static IdentityHashMap. Sure, I'm intentionally trying to mess with the system, but I'm just showing you can get around whatever rule you like to come up with, as long as you continue to try and frame the issue as 'immutability'. Immutability is a property of fields. It has zip squat to do with methods. Classes have methods. Therefore, classes cannot possibly be called 'immutable' in practice. It makes no sense.
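
The "fake mutable field" trick can be sketched as follows. `SneakyCounter` is a hypothetical class made up for the illustration: its only instance field is final and of type String, yet its hashCode changes over time because the real state lives in a static IdentityHashMap. The sketch is not thread-safe and never removes entries.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical class that looks immutable by any field-based rule, but is not.
final class SneakyCounter {
    // Hidden per-instance state, keyed by object identity (not by hashCode()).
    private static final Map<SneakyCounter, Integer> HIDDEN = new IdentityHashMap<>();

    private final String label;

    SneakyCounter(String label) { this.label = label; }

    // Mutates the "immutable" object through the static map.
    void bump() { HIDDEN.merge(this, 1, Integer::sum); }

    @Override public int hashCode() {
        return label.hashCode() + HIDDEN.getOrDefault(this, 0);
    }
}
```

Since IdentityHashMap relies on System.identityHashCode rather than on the overridden hashCode(), the lookup inside hashCode() does not recurse.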

Some cases in point:

* java.io.File is as immutable as it gets: 1 field, it's final, and its type is itself immutable (String). And yet, almost all of its methods are NOT memoizable. You can call .length() on it, and then call it again later on, and the value can change. This proves immutable classes no matter how stringent the definition, aren't memoizable (immediately throwing out the notion of caching hashCode on immutables).

* any non-final class can't be immutable by any definition. Unless javac enforces it on subclasses, which it won't, because my name isn't Mark Reinhold.

* Let's turn it around: What does the presence of hashCode() on a non-immutable class imply? You shouldn't use mutable keys in maps and set's behaviour will be non-sensical, so what's left? If @Immutable is shorthand for 'memoize hashCode please', that's tantamount to saying we should always memoize it, period.

The only concept I'm willing to encode as an annotation is 'side effect free' (A property of methods, never of fields, unlike immutability which is the reverse), and even that one is iffy. It has nothing to do with hashcode generation, caching, or immutability though. It doesn't imply memoization.

Martin Grajcar

Sep 26, 2013, 9:13:32 PM

to project...@googlegroups.com

On Tue, Sep 24, 2013 at 2:23 AM, Reinier Zwitserloot <rein...@gmail.com> wrote:

> @Immutable is a bad idea in java. It can't work.

You're replying to me, but it was Fabrizio who mentioned the @Immutable annotation. I just noted that for immutable classes, precomputation can be done instead of memoization.

I agree that no definition can really work, but some fuzzy idea of immutability does. We may agree that String is immutable, although it contains currently two mutable fields (hash and hash32). We might even agree that File is mutable, although it contains no mutable fields.

All of Guava's ImmutableCollections are immutable only assuming that their content is immutable too, and there's no way to enforce this. Yet the collections get used a lot and work fine.

> It makes no sense, and whatever incomplete definition you can come up for it is useless (in that you can't use it as an indicator of memoizability, it does not guarantee deterministic results based on parameter input remaining unchanged, it does not imply that calling a method has no side-effects and therefore any code ignoring the return value is by definition a bug – so what's left?).
>
> What does 'immutable' mean? That all fields are final? Alright, include a List field that isn't immutable and hashCodes can change.
>
> You could add the further restriction that all fields also need to themselves be immutable types, but:
>
> (A) someone is going to have to delve into the rt.jar and all other popular java libraries and add @Immutable annotations all over the place. Without them, it'll all fall apart.
>
> (B) There's a more general problem that for example many classes have Lists in them, which are actually ImmutableList. ImmutableList (the type) is immutable. List isn't. Now what?
>
> Even granting for now that the above isn't an actual problem, you're still not in the clear. I can make a fake mutable field by creating a static IdentityHashMap. Sure, I'm intentionally trying to mess with the system, but I'm just showing you can get around whatever rule you like to come up with, as long as you continue to try and frame the issue as 'immutability'. Immutability is a property of fields. It has zip squat to do with methods. Classes have methods. Therefore, classes cannot possibly be called 'immutable' in practice. It makes no sense.
>
> Some cases in point:
>
> * java.io.File is as immutable as it gets: 1 field, it's final, and its type is itself immutable (String). And yet, almost all of its methods are NOT memoizable. You can call .length() on it, and then call it again later on, and the value can change. This proves immutable classes no matter how stringent the definition, aren't memoizable (immediately throwing out the notion of caching hashCode on immutables).
>
> * any non-final class can't be immutable by any definition. Unless javac enforces it on subclasses, which it won't, because my name isn't Mark Reinhold.
>
> * Let's turn it around: What does the presence of hashCode() on a non-immutable class imply? You shouldn't use mutable keys in maps and set's behaviour will be non-sensical, so what's left?

You shouldn't, but often you have to: all Hibernate entities must be mutable, but you need them as keys in maps and sets. Serialization often enforces mutability, and so do many other frameworks.

Mutability of map keys is fine as long as only fields not involved in EqAHC are mutable. Even changing the involved fields is OK as long as it happens before the object gets used in the map; this is what actually happens with the entities.

> If @Immutable is shorthand for 'memoize hashCode please', that's tantamount to saying we should always memoize it, period.

Actually, memoization like this could work: When you put an object into a set, this is usually the first time when hashCode gets called. Afterwards it mustn't change. However, this might hit someone one day.

> The only concept I'm willing to encode as an annotation is 'side effect free' (A property of methods, never of fields, unlike immutability which is the reverse),

How would you call a final class with all methods being side effect free?

Is a memoizing method like String.hashCode or String.hash32 side effect free?

> and even that one is iffy. It has nothing to do with hashcode generation, caching, or immutability though. It doesn't imply memoization.

Agreed. But a class which both

- is final or has only private fields

- and has only side effect free methods

can memoize all its methods, right? (When they get overridden, then the memoization will be ineffective).

But this all is important to memoization in general, which was not exactly my point. I'm just thinking about something like having the following three possibilities for @EqualsAndHashCode:

- normal - this is the standard behavior and should be the default as all others cost memory

- precomputed - add a private final field and use it in equals

- memoized - add a private lazily initialized field and use it in equals
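
To make the "precomputed" variant concrete, here is a hand-written sketch of a class with a final hash field that equals consults first. The name `PrecomputedElement` and the multiplier 31 are assumptions for illustration; the "memoized" variant would differ only in filling the field lazily inside hashCode().

```java
// Hypothetical example of the "precomputed" strategy.
final class PrecomputedElement {
    private final String a, b;
    private final int hash; // computed once, at the end of the constructor

    PrecomputedElement(String a, String b) {
        this.a = a;
        this.b = b;
        this.hash = 31 * a.hashCode() + b.hashCode();
    }

    @Override public int hashCode() {
        return hash;
    }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof PrecomputedElement)) return false;
        PrecomputedElement other = (PrecomputedElement) o;
        if (hash != other.hash) return false; // different hashes imply different objects
        return a.equals(other.a) && b.equals(other.b); // same hash may still be a collision
    }
}
```

Because the field is final and always valid, no "not computed yet" marker is needed, which is one plausible reason for the speed difference reported below.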

I'm not claiming that all three possibilities are necessary. Unfortunately, it looks like the precomputation is much faster than the memoization, see my updated benchmark and its results. I didn't expect such a difference and I'm trying to find a bug in my benchmark.

Surprisingly, replacing

    if (!(obj instanceof MediumElement)) return false;
    final MediumElement other = (MediumElement) obj;
    if (hashCode() != other.hashCode()) return false;

by

    if (!(obj instanceof MediumElement)) return false;
    if (hashCode() != obj.hashCode()) return false;
    final MediumElement other = (MediumElement) obj;

converts the use of the cached hashCode in equals into a pessimization. Sure, there's a megamorphic call to hashCode, but Hotspot should know better.

Reinier Zwitserloot

Oct 4, 2013, 8:32:28 AM

to project...@googlegroups.com

A class that has only private fields, AND has only SEF methods seems like it should be able to memoize everything, but, nope:

Let's say one of the methods (let's call it 'foo') calls another method ('bar') as part of calculating its return value. If 'foo' isn't overridden in a subclass, but 'bar' is, then foo may no longer be memoizable. Therefore, 'foo' would start emitting results based entirely on when the first call with that parameter list happened, which is bad.
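
A small sketch of the foo/bar problem described above. All names (`Greeter`, `MoodyGreeter`, `mood`) are hypothetical. The base class has only a private field and side-effect-free methods, and memoizes foo(); the subclass overrides only bar().

```java
// Hypothetical base class: private fields only, methods without visible side effects.
class Greeter {
    private String cachedFoo; // memoized result of foo()

    String bar() { return "hello"; }

    // Memoized on the assumption that bar() always returns the same value.
    String foo() {
        if (cachedFoo == null) cachedFoo = bar() + "!";
        return cachedFoo;
    }
}

// The subclass overrides bar() but not foo(), and brings its own mutable state.
class MoodyGreeter extends Greeter {
    String mood = "calm";

    @Override String bar() { return mood; }
}
```

After the first call, foo() on a MoodyGreeter keeps returning "calm!" even when mood has been changed, so the result depends on when foo() was first called.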

I'm a little bit confused on your 2 code snippets. Which one appears to be faster in your caliper tests?

I can foresee adding something like @CacheHashCode(strategy=CachingStrategy.PRECOMPUTE / CachingStrategy.LAZY). I guess we could just add this to @EqualsAndHashCode instead, but I can foresee a @Memoize annotation which will add memoization to any method, and PRECOMPUTE can still be useful there, at least for methods with no argument lists (or possibly methods where there is a finite list of possible argument values). Also, it seems reasonable to offer the option for lombok to generate the appropriate boilerplate to precompute/lazily cache a hand-written hashCode method.

Martin Grajcar

Oct 9, 2013, 7:46:19 AM

to project...@googlegroups.com

On Fri, Oct 4, 2013 at 2:32 PM, Reinier Zwitserloot <rein...@gmail.com> wrote:

> A class that has only private fields, AND has only SEF methods seems like it should be able to memoize everything, but, nope:
>
> Let's say one of the methods (let's call it 'foo') calls another method ('bar') as part of calculating its return value. If 'foo' isn't overridden in a subclass, but 'bar' is, then foo may no longer be memoizable. Therefore, 'foo' would start emitting results based entirely on when the first call with that parameter list happened, which is bad.

I see I was wrong.

> I'm a little bit confused on your 2 code snippets. Which one appears to be faster in your caliper tests?

I see it's confusing. Both snippets are related to caching and there are actually 4 cases:

- precomputation with equals using hashCode, 2 ns.

- caching with equals using hashCode as in the first snippet, 4-5 ns.

- caching with equals using hashCode as in the second snippet, 12 ns.

- normal, 6-22 ns depending on length

I wrote there's a megamorphic call to hashCode in the second snippet, but that's not true; what counts is the number of implementations seen at the given call site, so it's monomorphic. For sure we can say that the second snippet is bad, but I'd like to know what's going on there. But that's a different story.

> I can foresee adding something like @CacheHashCode(strategy=CachingStrategy.PRECOMPUTE / CachingStrategy.LAZY). I guess we could just add this to @EqualsAndHashCode instead, but I can foresee a @Memoize annotation which will add memoization to any method,

With some limited cache in case of methods with an infinite list of argument values?

> and PRECOMPUTE can still be useful there, at least for methods with no argument lists (or possibly methods where there is a finite list of possible argument values).

Agreed. This all makes sense, however, @EqualsAndHashCode needs a special syntax as there's no method you could place the @Memoize annotation on. Moreover, it's not the memoization alone, it's also the usage of the memoized hashCode in equals.

I could imagine something like @EqualsAndHashCode(onHashCode=@_(@Memoize)) which is a bit strange. It would make sense if there were more reasons to put some annotations on generated hashCode, but I can't see any. It also assumes that lombok interprets the annotations placed by lombok itself. It's probably a too strange idea.

> Also, it seems reasonable to offer the option for lombok to generate the appropriate boilerplate to precompute/lazily cache a hand-written hashCode method.

Yes, but again, not only precompute/memoize hashCode, but also use it in equals. I guess there's no need to state this explicitly: equals should use hashCode if and only if it's precomputed or memoized.

Maybe something like

    enum CachingStrategy {NONE, LAZY, PRECOMPUTE}

    @EqualsAndHashCode(strategy=CachingStrategy.LAZY, acceptExisting=true)
    @Memoize(strategy=CachingStrategy.LAZY)

could cover it all. Here, CachingStrategy.NONE is currently useless, but seems to make sense for completeness. Something like acceptExisting=true could be generally useful e.g. to make lombok generate things which would otherwise be skipped because of existing methods.

Reinier Zwitserloot

Oct 9, 2013, 9:06:55 AM

to project-lombok

Okay, so we have some concerns here:

* @Memoize should maybe be a lombok feature; it has a few parameters. One of them is the strategy (LAZY or PRECOMPUTE). PRECOMPUTE is not legal unless the method has no arguments, or arguments have limited values: booleans, enums, bytes, chars, and shorts will be allowed, but that's it, and I guess a hardcoded limit on the combinatorial explosion (no @Memoizing a method with 3 'char' params). Probably 2^20 max. I can't imagine it's useful to precompute anything bigger than that. Furthermore, @Memoize has an option to set a maximum cache size. LAZY caches will always run as an MRU (probably LinkedHashMap, I guess?). This option is not legal on PRECOMPUTE. Also, PRECOMPUTE will not use LinkedHashMap; I guess it should use an array + a function to map all inputs onto a single number from 0-maxcombinatorial. About enums: Do we try and precompute 'null', or skip it? About short/byte/etc: Do we also allow Byte, Short, and Character? If so, do we also memoize the results when calling with 'null'? About exceptions: Do we memoize the exception? It would have a funky stacktrace. Or do we just throw the type and message of the exception, losing the cause and breaking if the type thrown does not have a constructor with 1 parameter (String), or do we just not care about it, thus causing an exception on class load for PRECOMPUTE, and never memoizing the exception for LAZY?

* @EqualsAndHashCode will gain an option to memoize hashCode. The parameter is LAZY or PRECOMPUTE; something like @EqualsAndHashCode(cacheHashCode=CacheStrategy.PRECOMPUTE).

* If hashCode() is handwritten but @Memoized, @EqualsAndHashCode will not generate an equals method anyway, because it either generates both, or neither. However, if (cacheHashCode=something) is provided, equals() will automatically call .hashCode() on itself and on the argument and if they don't return the same number, it returns false immediately. If they are the same number, equals runs as normal (because 2 objects with the same hash may not be .equals() to each other, it may just be a collision). It is not possible to force this behaviour any other way.

That sound about right?

--Reinier Zwitserloot

Martin Grajcar

Oct 9, 2013, 10:07:15 AM

to project...@googlegroups.com

On Wed, Oct 9, 2013 at 3:06 PM, Reinier Zwitserloot <rei...@zwitserloot.com> wrote:

Okay, so we have some concerns here:

* @Memoize should maybe be a lombok feature; it has a few parameters. One of them is the strategy (LAZY or PRECOMPUTE). PRECOMPUTE is not legal unless the method has no arguments, or arguments have limited values: booleans, enums, bytes, chars, and shorts will be allowed, but that's it, and I guess a hardcoded limit on the combinatorial explosion (no @Memoizing a method with 3 'char' params). Probably 2^20 max. I can't imagine it's useful to precompute anything bigger than that.

Agreed.

Furthermore, @Memoize has an option to set a maximum cache size. LAZY caches will always run as an MRU (probably LinkedHashMap, I guess?).

Some ConcurrentLinkedHashMap would be better, but there's none in the JDK and I don't think you want to generate it. :D Nor do you want to depend on Guava's cache.
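
A minimal sketch of what a size-bounded LAZY cache built only on the JDK could look like, assuming an access-ordered LinkedHashMap wrapped in Collections.synchronizedMap. The class name LazyMemo and its methods are hypothetical and only illustrate the idea; this is not code that Lombok generates.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Lombok: a bounded cache for a one-argument function.
final class LazyMemo<K, V> {
    private final Map<K, V> cache;
    private final Function<K, V> compute;

    LazyMemo(final int maxSize, Function<K, V> compute) {
        this.compute = compute;
        // accessOrder=true keeps entries ordered from least to most recently accessed,
        // so the "eldest" entry is the least recently used one.
        this.cache = Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize; // evict once the limit is exceeded
            }
        });
    }

    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            // Two threads may both compute here; harmless for a pure function.
            // Null results are never cached in this sketch.
            value = compute.apply(key);
            cache.put(key, value);
        }
        return value;
    }

    int size() {
        return cache.size();
    }
}
```

The synchronized wrapper serializes every access, which is the cost of staying inside the JDK; a real concurrent LRU map would avoid that bottleneck.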

This option is not legal on PRECOMPUTE. Also, PRECOMPUTE will not use LinkedHashMap; I guess it should use an array + a function to map all inputs onto a single number from 0-maxcombinatorial.

Agreed.
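
As a rough illustration of the "array + index function" idea for PRECOMPUTE, here is a hand-written sketch for a static method taking a boolean and an enum. All names (PrecomputedShipping, Size, computeCost, cost) are made up for the example, and null arguments are not handled, since that question is still open in this thread.

```java
// Hypothetical example of what PRECOMPUTE might expand to; not actual Lombok output.
final class PrecomputedShipping {
    enum Size { SMALL, MEDIUM, LARGE }

    private static final Size[] SIZES = Size.values();
    // One slot per argument combination: 2 boolean values * 3 enum constants.
    private static final int[] TABLE = new int[2 * SIZES.length];

    static {
        // Fill the table once; an exception thrown here would surface at class initialization.
        for (boolean express : new boolean[] {false, true}) {
            for (Size size : SIZES) {
                TABLE[index(express, size)] = computeCost(express, size);
            }
        }
    }

    // Maps every argument combination onto a unique index in [0, TABLE.length).
    private static int index(boolean express, Size size) {
        return (express ? 1 : 0) * SIZES.length + size.ordinal();
    }

    // Stand-in for the expensive computation being memoized.
    static int computeCost(boolean express, Size size) {
        int base = 5 * (size.ordinal() + 1);
        return express ? base * 2 : base;
    }

    // The memoized entry point: a single array lookup.
    static int cost(boolean express, Size size) {
        return TABLE[index(express, size)];
    }
}
```

With more parameters the index function is the same mixed-radix encoding, each argument multiplied by the product of the sizes of the arguments after it, which is also where the proposed cap on the number of combinations would be checked.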

About enums: Do we try and precompute 'null', or skip it?

I'd say yes, unless the argument is @NonNull.

About short/byte/etc: Do we also allow Byte, Short, and Character?

Currently, I don't care.

If so, do we also memoize the results when calling with 'null'?

As above.

About exceptions: Do we memoize the exception?

No. Exceptions are slow anyway, and performance-aware code should avoid them. Caching them looks like needless work to me.

It would have a funky stacktrace. Or do we just throw the type and message of the exception, losing the cause and breaking if the type thrown does not have a constructor with 1 parameter (String), or do we just not care about it, thus causing an exception on class load for PRECOMPUTE,

This won't happen in class loading but in the constructor (unless you mean static methods). I'd go for the solution which is fastest for the exception-less path. This is most probably throwing during precomputation.

and never memoizing the exception for LAZY?

* @EqualsAndHashCode will gain an option to memoize hashCode. The parameter is LAZY or PRECOMPUTE; something like @EqualsAndHashCode(cacheHashCode=CacheStrategy.PRECOMPUTE).

* If hashCode() is handwritten but @Memoized, @EqualsAndHashCode will not generate an equals method anyway, because it either generates both, or neither.

That's why I suggested acceptExisting=true.

However, if (cacheHashCode=something) is provided, equals() will automatically call .hashCode() on itself and on the argument and if they don't return the same number, it returns false immediately.

This is the very idea which led me to the second snippet above. It looked like calling hashCode early could speed things up, but it was slower. I guess the usual prelude (identity and instanceof checks and cast) is necessary. After the prelude, comparing the cached hashes is cheap and should be done before looking at the regular fields.
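
To make the ordering concrete, here is a hand-written sketch of an immutable class with a lazily cached hash, where equals runs the usual prelude first and only then compares the cached values. The field names and the use of 0 as the "not computed yet" marker are assumptions for illustration, not something Lombok generates today.

```java
import java.util.Objects;

// Hand-written illustration of the proposed behaviour; field names are hypothetical.
final class Square {
    private final int width;
    private final String name;
    private transient int cachedHash; // 0 means "not computed yet", as in String

    Square(int width, String name) {
        this.width = width;
        this.name = name;
    }

    @Override
    public int hashCode() {
        int h = cachedHash;
        if (h == 0) {
            h = 31 * Integer.hashCode(width) + Objects.hashCode(name);
            cachedHash = h; // a hash that happens to be 0 is simply recomputed each time
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        // The usual prelude comes first.
        if (o == this) return true;
        if (!(o instanceof Square)) return false;
        Square other = (Square) o;

        // Cheap rejection: only meaningful when both hashes have already been computed.
        int a = this.cachedHash;
        int b = other.cachedHash;
        if (a != 0 && b != 0 && a != b) return false;

        // Equal or unknown hashes prove nothing, so compare the fields as usual.
        return width == other.width && Objects.equals(name, other.name);
    }
}
```

Note that equals never calls hashCode() itself; it only reads what is already cached, so objects that have never been hashed pay nothing extra.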

Martin Grajcar

Aug 28, 2014, 1:03:35 PM

to project...@googlegroups.com

Any news on this? I'm afraid the general precomputation/caching could get pretty complicated, but I'd like to use it for hashCode soon. Actually, I don't care about caching as I know that the hashCode gets used a lot for the classes concerned, so I'd be happy with