caffeine v3 feedback

99 views
Skip to first unread message

Benjamin Manes

unread,
Jan 4, 2021, 2:44:16 AM1/4/21
to guava-discuss
Hi all,

I am making progress on a Java 11 release of Caffeine, which is currently Java 8 based. Since this is a major version bump we can make some incompatible changes if we restrict those to small ones with a relatively low impact. Are there any aspects of Guava's Cache / Caffeine that you would reconsider?

The current list of changes are,
1. Replace Unsafe with VarHandles
2. Use the Platform Logger api instead of j.u.Logger
3. Linearizable cache refresh if intermediate removal (Guava inserts a potentially stale value)
4. LoadingCache#refresh now returns the future
5. If a refresh is in-flight, an explicit refresh should return that future rather than trigger another reload (now matches Guava) 
6. Minor api tidiness

These changes are nearing completion and reside in the v3 branch.

A change that I am considering is to introduce Caffeine#evicttionListener(RemovalListener), which performs the callback atomically when the entry is being evicted. This would resolve Guava #1481 and #1991, where the removal listener notification outside of the cache's lock was problematic. Caffeine currently offers this using its CacheWriter, but it was a mistake to borrow that concept from other libraries. To serve this purpose it is verbose, confusing, and is incompatible with weak keys and asynchronous caches. The interception of writes is not very useful, thanks to Map#compute methods which should be preferred. Since it is not commonly used, do you think we should softly deprecate and keep it or use the major version bump to allow removing the class entirely? If we did the latter, then an intermediate 2.x release would deprecate and add the feature to help smooth the transition.

Please let me know if you have suggestions for further refinements.

Thanks and happy new year,
Ben

Chris Povirk

unread,
Jan 8, 2021, 2:05:23 PM1/8/21
to Benjamin Manes, guava-discuss
Hi, Ben,

I haven't kept up closely with the Caffeine API, but from what I've seen, you've fixed a number of the issues I'm aware of in Guava's Cache. The remaining weirdnesses I can think of are mostly surface-level things that are hard to improve upon:
  • the assumption that maximumSize is exact: I suppose you could use names like "approximateMaximumSize," but I doubt it's worth changing, especially when you would want to continue to recognize the old name in string specs. (And then users will think there's a difference between "maximumSize" and "approximateMaximumSize"... :))
  • the assumption that expiration -- especially notification of listeners -- takes effect immediately after the given delay (rather than during cleanup): You already pick different terms to distinguish the 2 concepts ("expire" vs. "removal"), and you already offer the ability to attach a scheduler for periodic cleanup, so I'm not sure there's much you can do.
  • confusion around "weak values": If memory serves, we've heard from confused users who have heavyweight values that they want to have go away, so they reach for weakValues(). But what they actually usually want is weakKeys(): Yes, they want the value to disappear (the way that weak references disappear, so they think "weakValues"), but they want it to happen when the value will no longer be needed again, as determined by when the key disappears. It's possible that a longer name could express this better ("evictAfterKeyUnreachable" or something???), but again that leads to concerns about existing string specs.
  • confusion about how weakKeys() triggers == behavior: But again, are we really going to try to change the method name to something longer -- way longer if we simultaneously try to solve the previous bullet's problem? :)
  • "get": I occasionally feel bad about how many things are called "get": Optional.get (which is cheap but may throw), Supplier.get (which typically is implemented to be cheap and not to throw), (Loading)Cache.get (which can trigger very expensive operations), Future.get (which can block and be interrupted), .... But it's hard to muster excitement for changing the name, especially when there are so many variants of get* that would also need to be changed. And what would we change it to? "request?" That doesn't suggest a new name for getIfPresent....
  • tighter generics: I still wonder from time to time if the world is really better with getIfPresent(Object) than it would be with getIfPresent(K): It does permit some additional correct calls, but it seems much more likely to trigger bugs. Arguably the right way to solve that is with static analysis (though static analysis misses things), but I sometimes wish we had done things differently with some of our types. (And we do have an example of using the tighter type in Multimap.get, which works pretty well. And Multimap, like Cache, offers the escape hatch of an asMap view for people who really want to pass an Object.) Overall, I am conflicted.
Other, more knowledgeable cache users: Please do chime in!

--Chris

Benjamin Manes

unread,
Jan 8, 2021, 7:14:13 PM1/8/21
to Chris Povirk, guava-discuss
Thanks Chris!

Those are all great points. To a large extent our hands are tied due to 4M monthly downloads. That is a pittance compared to Guava, but large enough to be mindful of what users can tolerate.

I agree with you about unclear naming. We were perhaps too technical by using these implementation terms rather than behavioral ones or too vague by keeping those terse. If Guava deprecates or adds aliases then I would likely mirror those new conventions. It may not be worth us worrying about.

The key equality is unfortunately very confusing. While not a fix, I am considering supporting custom equivalences. Guava does but made it unavailable for end users.

Another oversight is CacheLoader's loadAll(Iterable<K> keys). This is unfriendly because most data fetching code works with collections as the base type. This might require copying the collection to be usable to those apis. As only LocalCache calls this method and supplies a Set, it should be oriented towards implementers who would prefer a more specific data type.

From your list, I think only changing the generic argument is actionable as it matches the vast majority of existing code. The rest are all concerns that, in hindsight, we should have spent more time bikeshedding.

Best,
Ben

Joachim Durchholz

unread,
Jan 9, 2021, 3:29:50 AM1/9/21
to guava-...@googlegroups.com
Some peanut gallery feedback from my side:

Am 08.01.21 um 20:05 schrieb 'Chris Povirk' via guava-discuss:
> # "get": I occasionally feel bad about how many things are called "get":
> Optional.get (which is cheap but may throw), Supplier.get (which
> /typically/ is implemented to be cheap and not to throw),
> (Loading)Cache.get (which can trigger very expensive operations),
> Future.get (which can block and be interrupted), .... But it's hard to
> muster excitement for changing the name, especially when there are so
> many variants of get* that would also need to be changed. And what would
> we change it to? "request?" That doesn't suggest a new name for
> getIfPresent....

I believe that people expect "get" to be a getter just like with Java Beans.
They expect getters to be cheap and not throw, but then they are aware
of exceptions (getting an item from a linear list *can* be expensive,
getting data from a file-/network-based Bean *can* throw).

So I believe it's fine if a get() method has performance or exception
specialty, provided the method Javadoc alerts them to the situation.

Regards,
Jo
Reply all
Reply to author
Forward
0 new messages