I thought a fun way to kick off the new year would be to see what a Java 8 version of Guava's cache might look like. It took about two weeks to write a feature compatible implementation built on top of Java 8's rewrite of ConcurrentHashMap (CHM). I call this new cache Caffeine.
The design is based on my original proposal from when Kevin, Charles, and I began to discuss a caching library. At the time significant effort had already been put into soft- and weak-reference-based caching, based on the belief that it was ideal to delegate caching to the garbage collector and obtain the concurrency for free. As that is no longer the recommended strategy, this implementation is a fork of ConcurrentLinkedHashMap (CLHM), where a decorator sits on top of a computing map. An unexpected advantage is that the new ConcurrentHashMap was designed not to be vulnerable to the denial-of-service attack via hash collisions, which Guava still is. On the API side, Caffeine uses a fork of Guava's API with some Java 8 idioms and exposes a few advanced features.
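To make the decorator-over-a-computing-map idea concrete, here is a minimal sketch of the shape such a design could take on Java 8's ConcurrentHashMap, which provides atomic computeIfAbsent out of the box. The class and hook names (SketchCache, afterRead, afterWrite) are illustrative inventions, not Caffeine's actual internals:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative sketch only: the cache delegates storage and atomic
// computation to ConcurrentHashMap, layering policy hooks on top.
final class SketchCache<K, V> {
  private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<>();

  public V get(K key, Function<? super K, ? extends V> loader) {
    boolean[] loaded = {false};  // effectively-final holder for the lambda
    V value = data.computeIfAbsent(key, k -> {
      loaded[0] = true;
      return loader.apply(k);
    });
    if (loaded[0]) {
      afterWrite(key);  // e.g. record the insertion for the eviction policy
    } else {
      afterRead(key);   // e.g. buffer an LRU reorder
    }
    return value;
  }

  private void afterRead(K key) { /* policy hook */ }
  private void afterWrite(K key) { /* policy hook */ }
}
```

The hash table handles the hard concurrency problems; the decorator only has to keep its policy metadata in sync.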
While there is a lot of work to do before switching focus to performance, the initial benchmarks show a clear improvement. Of course, the quick prototyping of features significantly degraded performance compared to CLHM, but hopefully that can be restored. There are also a number of optimization ideas worth exploring that never made it into CLHM, so with effort the final implementation may outperform it.
The benchmarks below use JMH on a 4c/8HT MacBook. The read/write thread count was set to the low value of 4 due to a long-standing Guava bug that can cause an OutOfMemoryError (crash) at high concurrency for synthetic workloads. The compute benchmark was set to 32 threads to show the impact of lock contention and the benefits of prescreening. Past benchmarking with CLHM demonstrated that it scaled in line with the backing hash table, which is now based on per-node locks. Guava retains the segmented locking strategy, which is known to degrade noticeably at around 8-12 cores.
|                         | random reads/sec | random writes/sec | compute same key/sec | compute random key/sec |
|-------------------------|------------------|-------------------|----------------------|------------------------|
| ConcurrentLinkedHashMap | 76,569,121 (4t)  | 16,566,954 (4t)   | -                    | -                      |
| Caffeine                | 45,437,638 (4t)  | 18,228,124 (4t)   | 723,280,326 (32t)    | 90,563,072 (32t)       |
| Guava                   | 22,740,644 (4t)  | 11,087,604 (4t)   | 45,185,485 (32t)     | 66,620,017 (32t)       |
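The prescreening that helps the same-key compute numbers is, I assume, the familiar pattern of trying a lock-free read before falling back to the locking compute. A sketch, under that assumption:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class Prescreen {
  // Prescreening: a lock-free get() first, so threads requesting an
  // already-present key never touch the per-node lock that
  // computeIfAbsent acquires. Only a miss pays for the computation.
  static <K, V> V getOrLoad(ConcurrentHashMap<K, V> map, K key,
                            Function<? super K, ? extends V> loader) {
    V value = map.get(key);
    return (value != null) ? value : map.computeIfAbsent(key, loader);
  }
}
```

With 32 threads hammering the same key, the prescreen turns almost every call into a plain volatile read, which is why the same-key compute throughput can exceed the random-key case.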
So far most of the work has been building a robust test suite based on specification metadata. This matches my proposal on the internal wiki back when the tests were a mix of Bob's, Charles's, and mine (those that I ported over). Guava's current suite shows a lot of work by Charles et al., but is also fairly brittle in regards to combinational testing. In contrast, by using the @CacheSpec annotation to specify the constraints, injecting contextual information, and moving low-level validation into assert matchers, the tests are very generic. There are currently 180,000 tests that execute in 24s.
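To give a flavor of specification-driven combinational testing, here is a toy illustration: an annotation declares the configurations a test supports, and a tiny runner expands them into the cross product of cases. The real @CacheSpec in Caffeine's suite is far richer; the attribute names below are invented for the example:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class SpecDemo {
  // Toy stand-in for @CacheSpec: each attribute lists the settings the
  // annotated test must pass under; the runner generates every combination.
  @Retention(RetentionPolicy.RUNTIME)
  @interface CacheSpec {
    int[] maximumSize();
    boolean[] expires();
  }

  @CacheSpec(maximumSize = {0, 100}, expires = {true, false})
  public void evictionTest() { /* would exercise a cache built per case */ }

  /** Expands the annotation metadata into the cross product of test cases. */
  static List<String> expand() {
    try {
      Method method = SpecDemo.class.getMethod("evictionTest");
      CacheSpec spec = method.getAnnotation(CacheSpec.class);
      List<String> cases = new ArrayList<>();
      for (int size : spec.maximumSize()) {
        for (boolean expires : spec.expires()) {
          cases.add("maximumSize=" + size + ", expires=" + expires);
        }
      }
      return cases;
    } catch (ReflectiveOperationException e) {
      throw new AssertionError(e);
    }
  }
}
```

One test method multiplied across every declared configuration is how a suite reaches 180,000 executions without 180,000 hand-written tests.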
While there are benefits like a true maximum size threshold (single LRU chain), there are some disadvantages in this design. Guava's weak/soft reference caching is primary, with all other features tacked on. Bob's goal in doing this was to reduce the memory footprint, as ReferenceMap's wrappers were replaced with embedded values in the hash entry object. By having forked the hash table, we were able to add the LRU, expiration, and weight fields directly to the entry as well. By choosing not to fork the new Java 8 CHM, all of these fields must instead be held in a wrapper on top of the value. To maintain a single maximum size / expiration policy (not segmented), the weight field (an integer) must always be present. This is because its unused bits are leveraged in a state machine that resolves race conditions between the hash table and the LRU policy.
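The weight-field trick can be sketched as follows. This is my own illustrative encoding, not necessarily Caffeine's: the top bits of the packed integer track an entry's life-cycle state (alive, retired from the map but still in the LRU chain, or dead), so a single compare-and-set transitions the state when the hash table and the policy race on removal:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of packing an entry's life-cycle state into the unused high
// bits of its weight field; the exact encoding in Caffeine may differ.
final class NodeState {
  static final int ALIVE       = 0;
  static final int RETIRED     = 1 << 30;  // removed from the map, still in LRU
  static final int DEAD        = 2 << 30;  // removed from both
  static final int STATE_MASK  = 3 << 30;
  static final int WEIGHT_MASK = ~STATE_MASK;

  private final AtomicInteger packed;

  NodeState(int weight) {
    packed = new AtomicInteger(weight & WEIGHT_MASK);  // starts ALIVE
  }

  int weight() { return packed.get() & WEIGHT_MASK; }
  int state()  { return packed.get() & STATE_MASK; }

  /** Attempts an ALIVE -> RETIRED transition; fails if another thread won the race. */
  boolean tryRetire() {
    int current = packed.get();
    return ((current & STATE_MASK) == ALIVE)
        && packed.compareAndSet(current, (current & WEIGHT_MASK) | RETIRED);
  }
}
```

Because the state rides along in the same atomic integer as the weight, the two can never disagree, which is what resolves the race between removal from the table and removal from the policy.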
There is a lot of work left to do before I would call this production ready. I'd estimate 2-4 work weeks to mature the implementation code from a prototype sprint to something that is clean and elegant. Another 3-4 work weeks should be dedicated to performance improvements: small optimizations, improving buffer management, and reducing memory overhead. A future release should focus on adding tracing and a simulator for both users (cache sizing) and developers (evaluating alternatives to LRU).
Hopefully this work will factor into the larger discussion of Guava's future as the team begins to tackle Java 8.
Happy new year,
Ben