LIRS Cache


pron

Feb 19, 2012, 5:25:16 PM2/19/12
to guava-...@googlegroups.com
Hi. 
I just watched the interesting "Concurrent Caching at Google" talk on infoq.com, and I was wondering if there's any news regarding when the LIRS eviction policy will be available in Guava. Does anyone know?

Kevin Bourrillion

Feb 21, 2012, 12:31:09 AM2/21/12
to pron, guava-...@googlegroups.com
I'm afraid it's a very low priority.

Just implementing it would be the easiest part.  Doing the Science[TM] required to actually prove it's worthwhile is hard, and time-consuming.  At this point we have more to gain by just helping users to more easily tune->monitor->tune using the existing algorithm.

On Sun, Feb 19, 2012 at 2:25 PM, pron <ron.pr...@gmail.com> wrote:
Hi. 
I just watched the interesting "Concurrent Caching at Google" talk on infoq.com, and I was wondering if there's any news regarding when the LIRS eviction policy will be available in Guava. Does anyone know?



--
Kevin Bourrillion @ Google
Java Core Libraries Team
http://guava-libraries.googlecode.com

Pat Farrell

Feb 21, 2012, 12:42:30 AM2/21/12
to Kevin Bourrillion, pron, guava-...@googlegroups.com
On Tue, Feb 21, 2012 at 12:31 AM, Kevin Bourrillion <kev...@google.com> wrote:
I'm afraid it's a very low priority.

Just implementing it would be the easiest part.  Doing the Science[TM] required to actually prove it's worthwhile is hard, and time-consuming.


I thought that the "Science" was done, and it showed that Denning's Working Set model was close to theoretically optimal. Way better than LRU or most of the other common contenders.
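As a rough illustration of Denning's model (not a quote from his paper; all names here are made up), the working set W(t, tau) is simply the set of items referenced during the last tau references:

```java
import java.util.*;

// Sketch of Denning's working-set model: the working set W(t, tau) is the
// set of items referenced during the last tau references. Illustrative
// names only; not taken from any library.
class WorkingSet {
    private final int tau;                              // window size, in references
    private final Map<String, Integer> lastUse = new HashMap<>();
    private int clock = 0;                              // virtual time = reference count

    WorkingSet(int tau) { this.tau = tau; }

    void reference(String item) { lastUse.put(item, clock++); }

    // Items referenced within the last tau references form the working set.
    Set<String> current() {
        Set<String> ws = new HashSet<>();
        for (Map.Entry<String, Integer> e : lastUse.entrySet())
            if (clock - e.getValue() <= tau) ws.add(e.getKey());
        return ws;
    }
}
```

A working-set policy keeps exactly `current()` resident, letting the resident set grow and shrink with the program's locality instead of using a fixed capacity.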

Ben Manes

Feb 21, 2012, 1:15:53 AM2/21/12
to guava-discuss
The paper "The Performance Impact of Kernel Prefetching on Buffer Cache Replacement Algorithms" and my synthetic benchmarks indicated that LIRS is a clear win. It is low priority because it can be viewed as an internal optimization, whereas previous work has focused on core algorithms, features, and APIs. Performance optimizations were deferred, and LIRS, relative to LRU, falls into that category.

LIRS has additional complexities in a concurrent setting compared to LRU. The constant resizing of the stack and queue can cause race conditions with weighted entries. This isn't an issue in the near term because CacheBuilder uses a segmented hash table, but the removal of write segments, as in CLHM and CHMv8, turns that race condition into an implementation challenge. It caused me trouble in my efforts to implement LIRS in CLHM, which is my playground before helping Charles bring the approaches into Guava.
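For readers unfamiliar with the stack and queue mentioned above, here is a minimal single-threaded, unweighted LIRS sketch (a simplification of the LIRS paper's algorithm; this is illustrative code, not Guava or CLHM internals). Stack S tracks recency, queue Q holds the cold ("HIR") resident entries, and a cold entry that is re-accessed while still on S is promoted to hot ("LIR"):

```java
import java.util.*;

// Simplified LIRS: no weights, no concurrency. Stack S orders entries by
// recency; queue Q holds resident HIR (cold) entries in FIFO order.
class LirsCache {
    enum State { LIR, HIR_RES, HIR_NONRES }
    private final int lirCap, hirCap;
    private final Map<String, State> state = new HashMap<>();
    private final Deque<String> s = new ArrayDeque<>(); // recency stack, first = top
    private final Deque<String> q = new ArrayDeque<>(); // resident HIR, first = oldest

    LirsCache(int lirCap, int hirCap) { this.lirCap = lirCap; this.hirCap = hirCap; }

    boolean isResident(String k) {
        State st = state.get(k);
        return st == State.LIR || st == State.HIR_RES;
    }

    void access(String k) {
        State st = state.get(k);
        if (st == State.LIR) {                       // hit on a hot entry
            s.remove(k); s.addFirst(k);
            prune();
        } else if (st == State.HIR_RES) {            // hit on a cold resident entry
            boolean inStack = s.contains(k);
            s.remove(k); s.addFirst(k);
            if (inStack) {                           // reuse distance beats an LIR entry
                q.remove(k);
                state.put(k, State.LIR);
                demoteBottomLir();
            } else {                                 // not recent enough: stay cold
                q.remove(k); q.addLast(k);
            }
        } else {                                     // miss (new or non-resident)
            if (lirCount() < lirCap) {               // warm-up: fill the LIR set first
                state.put(k, State.LIR);
                s.remove(k); s.addFirst(k);
                return;
            }
            boolean inStack = st == State.HIR_NONRES; // non-resident but still on S
            s.remove(k); s.addFirst(k);
            if (inStack) {                           // promote straight to LIR
                state.put(k, State.LIR);
                demoteBottomLir();
            } else {
                state.put(k, State.HIR_RES);
                q.addLast(k);
                evictIfNeeded();
            }
        }
    }

    private void demoteBottomLir() {
        String bottom = s.pollLast();                // bottom of S is always LIR
        state.put(bottom, State.HIR_RES);
        q.addLast(bottom);
        prune();
        evictIfNeeded();
    }

    private void evictIfNeeded() {
        while (q.size() > hirCap) {
            String victim = q.pollFirst();
            if (s.contains(victim)) state.put(victim, State.HIR_NONRES);
            else state.remove(victim);
        }
    }

    private void prune() {                           // keep an LIR entry at the bottom of S
        while (!s.isEmpty() && state.get(s.peekLast()) != State.LIR) {
            String dropped = s.pollLast();
            if (state.get(dropped) == State.HIR_NONRES) state.remove(dropped);
        }
    }

    private int lirCount() {
        int n = 0;
        for (State v : state.values()) if (v == State.LIR) n++;
        return n;
    }
}
```

The weighted-entry problem described above shows up here as soon as entries contribute more than one unit to `lirCap`/`hirCap`: promotions and demotions can then resize both structures at once, which is where the races come from.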

There are a lot of performance-related areas to explore. I like LIRS, but I also dislike making design decisions that may not be forward compatible if ever ported to CHMv8's hash-table design.

-Ben

Louis Wasserman

Feb 21, 2012, 10:53:31 AM2/21/12
to Ben Manes, guava-discuss
Hooray for science!

Kevin Bourrillion

Feb 21, 2012, 12:52:09 PM2/21/12
to Pat Farrell, pron, guava-...@googlegroups.com
Yeah, I was totally cryptic.  I'm sure the theoretical Science[TM] is quite sound.  However, we still have to make an attempt to

(a) quantify how much it will actually help our actual services X Y Z in production (ideally, capture real-life trace logs and run simulations against them)
(b) have the means to effectively demonstrate in retrospect that it really did that (CacheStats paves the way...)

Obviously we don't want to present to our superiors, "This paper said LIRS would be better, so we did what it said, done."

Neither of these things is prohibitively difficult; it's just that we have many things competing for our time.

pron

Feb 21, 2012, 2:04:05 PM2/21/12
to guava-...@googlegroups.com, Pat Farrell, pron
I see, thanks.
JBoss has implemented LIRS eviction for Infinispan here (though it's lacking weighted entries; I'll try to add that myself when I get around to it). They're using the same approach as the Guava cache, a segmented hash table with "amortized locking" (access queues), which they've discussed here. They're reporting that "One of our partner companies replaced their crucial caching component with BoundedConcurrentHashMap and realized a 54% performance improvement on the Berlin SPARQL benchmark for their flagship product".

From reading Ben Manes's responses here as well as on the CLHM dev-list, I realize that LIRS eviction is hard to virtually impossible to implement in a lock-free manner, and so will be incompatible with CLHM when CHMv8 arrives (from my understanding CHMv8 is pretty similar to Cliff Click's NBHM, right?). Because CLHM wraps a CHM instance, whatever performance benefits CHMv8 provides will be lost if it is used in conjunction with a cache eviction algorithm that uses striped locks.

But the Guava cache (as well as Infinispan's BoundedCHM) doesn't wrap CHM; it re-implements the current version, entangled with its own code, so it would be interesting to see whether LIRS's benefits could outweigh those of a lock-free, LRU CLHM.

Benjamin Manes

Feb 21, 2012, 2:28:59 PM2/21/12
to pron, guava-...@googlegroups.com, Pat Farrell
To clarify, LIRS is fine in CLHM if all entries have the same weight: 1. It is incompatible if entries vary in weight, in particular if an entry's weight changes over its lifetime. To resolve this, multiple policies must be implemented based on the configuration constraints. This would allow LIRS for singleton weights and LRU / GDSF / ??? if weighted entries are specified.
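A weighted LRU of the kind mentioned above is straightforward by comparison: track the total weight and evict from the cold end until the total fits again. A minimal sketch (illustrative names, not CLHM or Guava code), built on an access-ordered LinkedHashMap:

```java
import java.util.*;

// Illustrative weighted LRU: each entry carries a weight, and eviction
// removes least-recently-used entries until the total weight fits.
// An entry's weight may change on re-put, per the discussion above.
class WeightedLruCache<K> {
    private final long maxWeight;
    private long totalWeight = 0;
    private final LinkedHashMap<K, Long> weights =
            new LinkedHashMap<>(16, 0.75f, true);   // true = access order

    WeightedLruCache(long maxWeight) { this.maxWeight = maxWeight; }

    void put(K key, long weight) {
        Long old = weights.put(key, weight);        // re-put may change the weight
        totalWeight += weight - (old == null ? 0 : old);
        evict();
    }

    boolean contains(K key) { return weights.containsKey(key); }

    void touch(K key) { weights.get(key); }         // get() records an access

    private void evict() {                          // drop eldest until under budget
        Iterator<Map.Entry<K, Long>> it = weights.entrySet().iterator();
        while (totalWeight > maxWeight && it.hasNext()) {
            totalWeight -= it.next().getValue();
            it.remove();
        }
    }
}
```

With LIRS, by contrast, a weight change can force the stack and queue to resize simultaneously, which is exactly the race discussed earlier in the thread.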

CLHM is a wrapper that is compatible with any ConcurrentMap implementation, so it will work with CHM / CHMv8 / NBHM / etc. It does not use striped locking internally so it inherits all of the concurrency characteristics of the hash-table design. It provides a global eviction policy managed by a non-blocking lock and spreads work across multiple buffers to avoid contention for hot entries.
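The buffer idea described above can be sketched as follows (a simplification of CLHM's design, not its actual code; class and method names are invented): reads append to a concurrent buffer, and the recency order is updated in batches under a lock that readers only ever try, never wait on.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.locks.*;

// Sketch of the buffer-and-try-lock idea: reads are recorded in a buffer
// and applied to the LRU order in batches under a non-blocking lock, so
// readers never wait on the eviction policy.
class BufferedLruOrder<K> {
    private final Queue<K> readBuffer = new ConcurrentLinkedQueue<>();
    private final Lock drainLock = new ReentrantLock();
    private final LinkedHashSet<K> lruOrder = new LinkedHashSet<>(); // guarded by drainLock

    void recordRead(K key) {
        readBuffer.add(key);              // cheap append, no contention on the policy
        tryDrain();                       // best effort; skip if another thread drains
    }

    private void tryDrain() {
        if (!drainLock.tryLock()) return; // never block the reader
        try {
            K key;
            while ((key = readBuffer.poll()) != null) {
                lruOrder.remove(key);     // reinsert at the MRU position
                lruOrder.add(key);
            }
        } finally {
            drainLock.unlock();
        }
    }

    // Least-recently-used key (the eviction candidate), for inspection.
    K eldest() {
        drainLock.lock();
        try {
            Iterator<K> it = lruOrder.iterator();
            return it.hasNext() ? it.next() : null;
        } finally {
            drainLock.unlock();
        }
    }
}
```

The point of the design is that reordering work is amortized and decoupled from the hash table, so the map retains whatever concurrency characteristics it already had.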

CHMv8 is a different design from NBHM. It is essentially lock-free; if I recall correctly, buckets are locked, but they usually contain only a single entry or span just a few. The design tries to maintain better performance than NBHM when scaling from low-core to high-core systems, whereas NBHM is designed primarily for high-core machines (e.g. 1024 cores).

The goal for CLHM is to provide a reference implementation for Guava, JBoss, JSR-166, etc. with respect to algorithmic designs. It is therefore more researchy, trying to reach the algorithmic ideals by solving the hardest problems (e.g. no striped locks). Guava and JBoss then took a more pragmatic approach to provide their additional feature sets. A JSR-166 implementation will hopefully be a marriage of Doug's excellent CHMv8, CLHM, and CacheBuilder.

-Ben


Pat Farrell

Feb 21, 2012, 4:42:34 PM2/21/12
to Benjamin Manes, pron, guava-...@googlegroups.com
I just read the quoted paper. The Science of this is perhaps well defined for the specific set of requirements for a file-system cache in the kernel of a Linux-like operating system.

From the discussion, I was thinking that this was a candidate for an optimal cache strategy in general, which it clearly is not. Belady's paper addresses how one approaches an optimal strategy, and it's trivial to see that the commonly used LRU fails seriously in common use cases.

The Science says that most strategies differ by only small amounts of performance for typical usage, so worrying about the last 5% of performance is, in general, premature optimization.
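For reference, Belady's optimal policy (OPT/MIN) is easy to state offline: with the full future reference string known, evict the resident entry whose next use is farthest away (or never comes). A small sketch, useful only as the upper bound that real policies are measured against (illustrative code, not from Belady's paper):

```java
import java.util.*;

// Offline Belady OPT: evict the resident entry whose next use is farthest
// in the future. Requires knowing the whole reference string, so it is a
// benchmark yardstick, not an implementable online policy.
class BeladyOpt {
    static int hits(List<String> refs, int capacity) {
        Set<String> resident = new HashSet<>();
        int hits = 0;
        for (int i = 0; i < refs.size(); i++) {
            String k = refs.get(i);
            if (resident.contains(k)) { hits++; continue; }
            if (resident.size() == capacity) {
                String victim = null;
                int farthest = -1;
                for (String r : resident) {   // farthest next use (or never used again)
                    int next = refs.subList(i + 1, refs.size()).indexOf(r);
                    int dist = next < 0 ? Integer.MAX_VALUE : next;
                    if (dist > farthest) { farthest = dist; victim = r; }
                }
                resident.remove(victim);
            }
            resident.add(k);
        }
        return hits;
    }
}
```

Running any online policy's hit count against `BeladyOpt.hits` on the same trace shows how much of the theoretically available headroom it actually captures.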

pron

Feb 21, 2012, 4:56:08 PM2/21/12
to guava-...@googlegroups.com, Benjamin Manes, pron
Well, it is certainly true that one of the problems LIRS tries to tackle, that of sequential page access, is very common in file systems, and quite rare in Java business caches.

Benjamin Manes

Feb 21, 2012, 5:09:25 PM2/21/12
to pron, guava-...@googlegroups.com
In my synthetic benchmark I use a scrambled Zipfian distribution, which matches real-world Java server behavior. I think it provided a 5-10% hit-rate improvement on average. This is a win when caches sit in front of I/O calls, such as in Cassandra's key/row caches. However, in most cases application caches can be layered in front of a larger remote cache (memcache), minimizing the penalty significantly.
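A toy version of that kind of benchmark (not Ben's actual harness; the distribution parameters, the multiplicative scramble, and all names are arbitrary choices for illustration) samples keys from a scrambled Zipfian distribution and measures the hit rate of a plain LRU cache:

```java
import java.util.*;

// Toy benchmark: sample keys from a scrambled Zipfian distribution and
// measure the hit rate of a plain LRU cache (access-ordered LinkedHashMap).
class ZipfLruBench {
    static double lruHitRate(int items, double skew, int cacheSize, int samples, long seed) {
        // Precompute the Zipfian CDF over `items` ranks.
        double[] cdf = new double[items];
        double sum = 0;
        for (int i = 1; i <= items; i++) sum += 1.0 / Math.pow(i, skew);
        double acc = 0;
        for (int i = 1; i <= items; i++) {
            acc += 1.0 / Math.pow(i, skew) / sum;
            cdf[i - 1] = acc;
        }
        // Plain LRU via an access-ordered LinkedHashMap with bounded size.
        LinkedHashMap<Integer, Boolean> lru =
                new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> e) {
                return size() > cacheSize;
            }
        };
        Random rnd = new Random(seed);
        int hits = 0;
        for (int n = 0; n < samples; n++) {
            int rank = Arrays.binarySearch(cdf, rnd.nextDouble());
            if (rank < 0) rank = -rank - 1;          // insertion point = sampled rank
            int key = rank * 0x9E3779B9;             // scramble: popular ranks scatter
            if (lru.get(key) != null) hits++;
            else lru.put(key, Boolean.TRUE);
        }
        return (double) hits / samples;
    }
}
```

Swapping the LRU for another policy over the same key stream is what turns this into a policy comparison; the scramble matters because it breaks up any accidental locality in the rank ordering.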

As Kevin said, a tool for tuning cache sizes is a precondition both for providing sizing hints to any eviction policy and for evaluating policies against real-world applications. The complexity and potential gains of LIRS didn't outweigh the higher-priority tasks of core features, API, and knowledge transfer.

Min

Feb 27, 2012, 7:32:55 PM2/27/12
to guava-discuss
Have you considered using the ARC algorithm, which seems to outperform both LRU and LIRS?

http://www.almaden.ibm.com/cs/people/dmodha/ARC.pdf

--min

Benjamin Manes

Feb 27, 2012, 10:56:07 PM2/27/12
to Min, guava-discuss
ARC is patented by IBM and, to the best of our knowledge, is not licensed for use in open-source projects.
