LoadingCache refresh query


Wallace Wadge

Dec 23, 2011, 6:27:03 AM
to guava-discuss
Let's say that our LoadingCache refresh op needs to be performed
regularly. Currently in v12 the refresh op only kicks in whenever you
attempt to get the key in question. This is fine but as far as I can
see there's currently no way to force the map to perform a refresh on
any items that have expired, for example by calling cleanUp
periodically as I would do to force expireAfterWrite/Access to
trigger.

Or rather, no way except for doing something silly like:

for (K k : cache.asMap().keySet()) {
    cache.getIfPresent(k);
}


Is this a deliberate omission?

Louis Wasserman

Dec 23, 2011, 8:02:47 AM
to Wallace Wadge, guava-discuss
Are you looking for refreshAfterWrite?

Wallace Wadge

Dec 23, 2011, 9:00:17 AM
to guava-discuss
No, I'm looking for something to trigger the refreshAfterWrite after, say, 30 minutes, without waiting for a cache.get() request to kick it off first.

Wallace

Louis Wasserman

Dec 23, 2011, 9:24:18 AM
to Wallace Wadge, guava-discuss
For a Cache configured with refreshAfterWrite, Cache.cleanUp() would trigger a refresh on any entries that are more than thirty minutes old.  ScheduledExecutorService.scheduleAtFixedRate could call Cache.cleanUp() every <interval>.  Does that work for you?

Louis Wasserman

Dec 23, 2011, 9:30:15 AM
to Wallace Wadge, guava-discuss
For reference, the reasoning here is that continuous cleanup (as opposed to "maintenance is occasionally performed") would require a separate thread, which would require you to make many more choices: do it on a regular schedule?  In what thread?  Rather than try to make these decisions for you, Guava gives you the cleanUp() method and lets you set up scheduled calls to it if (and only if) you want to, how you want to.
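
A minimal sketch of the caller-owned scheduling described here, using only the JDK. The class and method names (`PeriodicMaintenance`, `schedule`, `demo`) are hypothetical; for a Guava cache the `maintenance` argument would typically be `cache::cleanUp`, and in the demo a latch countdown stands in for it so that the sketch has no Guava dependency.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Sketch: run a cache's maintenance hook on a fixed schedule from a caller-owned thread. */
final class PeriodicMaintenance {

  /**
   * Schedules {@code maintenance} (for a Guava cache, typically {@code cache::cleanUp})
   * to run every {@code period}. The caller owns the executor and must shut it down.
   */
  static ScheduledFuture<?> schedule(
      ScheduledExecutorService executor, Runnable maintenance, long period, TimeUnit unit) {
    return executor.scheduleAtFixedRate(maintenance, period, period, unit);
  }

  /** Demo: returns true if at least {@code runs} maintenance runs happen within 5 seconds. */
  static boolean demo(int runs) {
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    CountDownLatch latch = new CountDownLatch(runs);
    try {
      // latch::countDown stands in for cache::cleanUp (assumption for this sketch).
      schedule(executor, latch::countDown, 10, TimeUnit.MILLISECONDS);
      return latch.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      executor.shutdownNow();
    }
  }
}
```

The design choice mirrors the reasoning above: the application, not the cache, decides which thread runs maintenance and how often.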

Louis Wasserman

Dec 23, 2011, 10:22:32 AM
to Greg Steffensen, Wallace Wadge, guava-discuss
To clarify, I quote from the CacheBuilder javadoc:

Certain cache configurations will result in the accrual of periodic maintenance tasks which will be performed during write operations, or during occasional read operations in the absence of writes. The Cache.cleanUp() method of the returned cache will also perform maintenance, but calling it should not be necessary with a high throughput cache. Only caches built with removalListener, expireAfterWrite, expireAfterAccess, weakKeys, weakValues, or softValues perform periodic maintenance. 

To summarize, most of the time, this sort of maintenance is performed in small increments in the course of normal cache usage.  Only if cache access is extremely rare does this sort of thing need to be performed manually.
On Fri, Dec 23, 2011 at 2:34 PM, Greg Steffensen <greg.st...@gmail.com> wrote:
You don't need to call cleanup though, right?  Even without calling it, cleanup operations will be performed eventually, I assume?

Raymond Rishty

Dec 23, 2011, 11:10:33 AM
to Louis Wasserman, Greg Steffensen, Wallace Wadge, guava-discuss
Okay, I thought I understood, but now I think I'm confused. If I want to guarantee that a user will never trigger a database call, will expireAfterWrite do that, or do I need to have an executor that periodically invalidates and refreshes the cache? The latter is (essentially) what we're doing now, and has been a success for our high-throughput (millions of hits / day) web app. Of course, if I could accomplish this with a simple expireAfterWrite, that would be awesome.

Wallace Wadge

Dec 23, 2011, 11:17:34 AM
to Louis Wasserman, guava-discuss
I understand this and it makes perfect sense (I use this technique already). However, as far as I can tell, calling cleanUp only cares about access or write expiry and ignores the refreshAfterWrite setting, which is why it looks like an omission to me.


Wallace


Louis Wasserman

Dec 23, 2011, 11:59:55 AM
to Raymond Rishty, Greg Steffensen, Wallace Wadge, guava-discuss
Raymond, I'm not clear on which bit of code could potentially trigger a database call...

Let me try to pick this apart and give a complete explanation, so you can figure out the correct answer for your situation.

Refreshing and expiration are different things.  

When an entry gets refreshed, CacheLoader.reload(key, oldValue) is called, which returns a ListenableFuture<V>.  The default implementation of CacheLoader.reload synchronously calls CacheLoader.load, but if you want this to be asynchronous, you should override CacheLoader.reload.  If you do this, the cache continues to return the old value when it is queried, until the new value has finished loading.  Once the new value is loaded, it replaces the old value atomically.

When an entry gets expired, it is removed from the cache.  No new computation is triggered, until the key is requested again, at which point the value is loaded from scratch.

CacheBuilder supports refreshAfterWrite and expireAfterWrite, which specify that cache entries should be refreshed or expired (respectively) after some specified duration has passed since the entry's creation, or since the most recent replacement of its value.

Wallace, it does look to me like the situation as you describe.  From the CacheBuilder javadoc:

Currently automatic refreshes are performed when the first stale request for an entry occurs. The request triggering refresh will make a blocking call to CacheLoader.reload(K, V) and immediately return the new value if the returned future is complete, and the old value otherwise.

The alternative you seem to want is more analogous to expireAfterWrite, which accrues maintenance operations which are performed on all keys, not just keys that have been requested.

I'm not sure why this was done the way it was; fry would know more.  As it stands, you should consider filing an issue -- I'm curious to learn the explanation myself.  That said, the code you propose is basically how I would deal with things for the moment, though your solution has disadvantages -- most prominently, ruining the LRU ordering.

Wallace Wadge

Dec 23, 2011, 12:28:23 PM
to Louis Wasserman, Raymond Rishty, Greg Steffensen, guava-discuss
Louis, that's exactly what I meant. In my use case, I have a handle to a TCP connection that goes wonky if I leave it idle for too long but is expensive to reconnect, so I want to be able to refresh the connection in a separate thread whether it has been used or not. The current setup forces me to give back a potentially stale connection while I asynchronously go about creating a new one.

I will file a new issue.

Wallace

Raymond Rishty

Dec 23, 2011, 12:43:12 PM
to Wallace Wadge, Louis Wasserman, Greg Steffensen, guava-discuss
Oh goodness. I saw refreshAfterWrite, and mentally replaced it with the old method I was familiar with, expireAfterWrite. Consider me shamed.

To ask more directly, it seems as though the refreshAfterWrite value causes the cache entry to be refreshed "after" a given amount of time, but that it is in fact triggered by a call to .get, as opposed to having a task scheduled to refresh the key when the countdown ends. Which means that the next user to try to access that key will have to wait for the CacheLoader.load (which, in my case, is invariably a database call). Our requirement is to get that cache refreshed before it expires and any user "takes the hit", which means we have another thread working to keep the cache up to date.

Louis Wasserman

Dec 25, 2011, 5:29:46 PM
to Raymond Rishty, Wallace Wadge, Greg Steffensen, guava-discuss
What you state is not correct.
> To ask more directly, it seems as though the refreshAfterWrite value causes the cache entry to be refreshed "after" a given amount of time, but that it is in fact triggered by a call to .get, as opposed to having a task scheduled to refresh the key when the countdown ends.
Correct.
> Which means that the next user to try to access that key will have to wait for the CacheLoader.load (which, in my case, is invariably a database call).
This does not follow.  Override the method

ListenableFuture<V> CacheLoader.reload(K key, V oldValue) 

to do the database call asynchronously.  The default is to do it synchronously, but as the Javadoc says, you should probably override it if you're using refreshAfterWrite.

If you do this, then while the key is still being refreshed, the old value will be returned with no additional delays to any queries in the meantime.  No user will have to wait any longer than normal, not the first user, not the second user, and nowhere in between.

The result is that refreshes might not happen quite as immediately as you'd like, but no users will have to wait on a database call.
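
A toy, JDK-only model of the behaviour described above may help: a stale read returns the old value at once, at most one reload runs in the background, and the new value is swapped in when it completes. This is not Guava's implementation. `StaleWhileReloading`, the `stale` flag passed to `get`, and the `reloader` function are hypothetical names for this sketch; a real cache decides staleness from the entry's write time rather than from a caller-supplied flag.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Toy single-entry model of "serve the old value while reloading"; not Guava's code. */
final class StaleWhileReloading<V> {
  private final AtomicReference<V> current;
  private final AtomicBoolean reloading = new AtomicBoolean(false);
  private final UnaryOperator<V> reloader; // plays the role of an asynchronous reload
  private final Executor executor;

  StaleWhileReloading(V initial, UnaryOperator<V> reloader, Executor executor) {
    this.current = new AtomicReference<>(initial);
    this.reloader = reloader;
    this.executor = executor;
  }

  /** Returns immediately with the current value; starts at most one reload if stale. */
  V get(boolean stale) {
    if (stale && reloading.compareAndSet(false, true)) {
      V old = current.get();
      CompletableFuture.supplyAsync(() -> reloader.apply(old), executor)
          .whenComplete((fresh, error) -> {
            if (error == null) {
              current.set(fresh); // swap in the new value atomically
            }
            reloading.set(false); // on failure the old value simply stays in place
          });
    }
    return current.get();
  }
}
```

With an executor that runs tasks on another thread, every caller of `get` returns without waiting for the reload, which matches the point that no user has to wait on the database call.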

Is this satisfactory?

Greg Steffensen

Dec 23, 2011, 9:34:19 AM
to Louis Wasserman, Wallace Wadge, guava-discuss
You don't need to call cleanup though, right?  Even without calling it, cleanup operations will be performed eventually, I assume?

Charles Fry

Jan 3, 2012, 11:31:18 AM
to Wallace Wadge, Louis Wasserman, guava-discuss
Wallace, you are correct that refresh won't currently occur in the absence of a user request for a given key. The spec says: "Specifies that active entries are eligible for automatic refresh once a fixed duration has elapsed after the entry's creation, or the most recent replacement of its value." However, it does not guarantee that a refresh will ever occur. Specifically, the cache will heuristically decide when it believes a refresh would be justified.

As currently implemented, and as specified in the javadoc: "Currently automatic refreshes are performed when the first stale request for an entry occurs." Otherwise there is no refresh.

This was intentional, and we believe it is indeed the behavior we currently want.

Note that you can use expireAfterWrite in conjunction with refreshAfterWrite in order to remove cache entries that are not refreshed.

Charles

Wallace Wadge

Jan 3, 2012, 11:36:59 AM
to guava-discuss

On Jan 3, 5:31 pm, Charles Fry <f...@google.com> wrote:
> As currently implemented and specified in the javadoc "Currently automatic
> refreshes are performed when the first stale request for an entry occurs."
> Otherwise no refresh.
>

I understand that and I like that behaviour.

What I'm asking is that, in addition to that standard behaviour, we get
a way to tell the cache to pretend we've just hit all the entries, as I
would do by calling cleanUp for expireAfterWrite/expireAfterAccess.

At the moment, my only option is to loop thru all the entries and
issue a fake get to trigger a refresh where necessary.

Wallace

Louis Wasserman

Jan 3, 2012, 11:39:37 AM
to Wallace Wadge, guava-discuss
Do you never have cache values expire or go away?

Louis Wasserman

Jan 3, 2012, 11:41:54 AM
to Wallace Wadge, guava-discuss
Also, would the following work for you?

cache.getAll(cache.asMap().keySet())

Wallace Wadge

Jan 3, 2012, 11:52:40 AM
to guava-discuss


On Jan 3, 5:39 pm, Louis Wasserman <wasserman.lo...@gmail.com> wrote:
> Do you *never* have cache values expire or go away?
>
> Louis Wasserman
> wasserman.lo...@gmail.com http://profiles.google.com/wasserman.louis
>

Not in this particular use case no.

In my case, I have to hack around VMware's poor Java API
implementation by periodically refreshing a connection to a server.
Basically the service is meant to be up 24/7 but the connections to
the server go bad after a while, therefore I want to refresh them
periodically. Why not simply expire them? Because, again thanks to a
lousy implementation out of my control, the setup cost for each connection
is very expensive, so I want to always have a hot connection at my
disposal. Why not wait till the hit comes in? Because the way refresh
works, I would be giving a potentially stale/broken connection
back to the first client that requests it.

Wallace



Wallace Wadge

Jan 3, 2012, 11:55:14 AM
to guava-discuss


On Jan 3, 5:41 pm, Louis Wasserman <wasserman.lo...@gmail.com> wrote:
> Also, would the following work for you?
>
> cache.getAll(cache.asMap().keySet())
>
> Louis Wasserman
> wasserman.lo...@gmail.com http://profiles.google.com/wasserman.louis
>


That works and is roughly my workaround. I'm only voicing my concern
because it seems inconsistent, from an API point of view, that
cleanup() handles expiry but I need another call to handle refreshes.
It also looks pretty inefficient to hit all entries just in case one
of them needs a refresh.

Wallace

Wallace Wadge

Jan 3, 2012, 11:57:11 AM
to guava-discuss


On Dec 23 2011, 3:34 pm, Greg Steffensen <greg.steffen...@gmail.com>
wrote:
> You don't need to call cleanup though, right?  Even without calling it,
> cleanup operations will be performed eventually, I assume?

Cleanup ops will be performed when you next hit the entries. Calling
cleanup means: do it right now, don't wait for the next hit.

Charles Fry

Jan 3, 2012, 11:59:04 AM
to Wallace Wadge, guava-discuss
Yeah, I see your pain. Unfortunately the benefit of adding something here may not outweigh the added API complexity. It sounds like you want something like LoadingCache.refreshStaleEntries(), but that is a meaningless notion for caches not configured with refreshAfterWrite. I don't think we want to provide multiple different types of refresh behavior, and I do think that most of the time it is right not to refresh entries that aren't actually used.

Still, feel free to file a feature request so that we can keep your use case in mind as refresh continues to evolve.

Charles

Louis Wasserman

Jan 3, 2012, 12:01:18 PM
to Charles Fry, Wallace Wadge, guava-discuss
I claim that most users of refreshAfterWrite() still want entries to expire at some point.  Forcing a refresh on all keys on calls to cleanUp() would be problematic for all of those users, no?

Wallace Wadge

Jan 3, 2012, 12:05:48 PM
to guava-discuss


On Jan 3, 5:59 pm, Charles Fry <f...@google.com> wrote:
> Yeah, I see your pain. Unfortunately the benefit of adding something here
> may not outweigh the added api complexity.

Why add a new call rather than add to what cleanup does? After all
you're already expecting cleanup to be slow. I mean let cleanup()
expire entries and, if so configured, also trigger refreshes.


> Still, feel free to file a feature request so that we can keep your use
> case in mind as refresh continues to evolve.
>


It's here: http://code.google.com/p/guava-libraries/issues/detail?id=835

Wallace

Wallace Wadge

Jan 3, 2012, 12:08:08 PM
to guava-discuss


On Jan 3, 6:01 pm, Louis Wasserman <wasserman.lo...@gmail.com> wrote:
> I claim that most users of refreshAfterWrite() still want entries to expire
> at some point.  Forcing a refresh on all keys on calls to cleanUp() would
> be problematic for all of those users, no?
>

Don't see why they conflict here. The cleanup routine will go thru the
keys, expire all that need to be expired first and if there are
entries that require refreshing, do so then.

Am I missing something else here?

Wallace

Charles Fry

Jan 3, 2012, 12:09:40 PM
to Wallace Wadge, guava-discuss
> Yeah, I see your pain. Unfortunately the benefit of adding something here
> may not outweigh the added api complexity.

> Why add a new call rather than add to what cleanup does? After all
> you're already expecting cleanup to be slow. I mean let cleanup()
> expire entries and, if so configured, also trigger refreshes.

Well, that would be true if we believed that every stale entry should always be refreshed. But we strongly believe that stale entries which have never been accessed should just be allowed to die (if there is a path by which they could die).

That said, maybe it would be conceivable for the expected behavior to differ depending on whether or not any other type of expiration was also used. I could buy an argument that if you really wanted an entry to stay in the cache at all costs then it might as well be kept up to date.

Charles

Louis Wasserman

Jan 3, 2012, 1:09:00 PM
to Charles Fry, Wallace Wadge, guava-discuss
To put it another way, most people expect a cache entry which never gets queried to die, to stop taking up memory and hash table space, and your proposed modification would cost all of those users.

The current behavior when you use both expireAfterWrite and refreshAfterWrite is that:

* Entries become available for refresh at time A
* If an entry is queried before time B, it starts refreshing and resets the clock on both expireAfterWrite and refreshAfterWrite.
* If an entry is not queried before time B, it expires and dies, since it's "not getting used."

Your proposed modification would have the following result:

* Entries become available for refresh at time A
* Entries get refreshed as part of cleanup, resetting the clock
* Entries never reach time B and never expire
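
The two timelines above can be worked through as a small JDK-only example, with A as the refreshAfterWrite duration and B as the expireAfterWrite duration. Everything here is illustrative: `EntryTimeline` and its methods are hypothetical names, the exact boundary handling (`>` versus `>=`) is an assumption, and the second method further assumes that cleanup runs at exact multiples of `period` after the write and that a refresh completes instantly.

```java
/** Worked example of the refresh/expire timeline; names and boundaries are assumptions. */
final class EntryTimeline {
  enum Outcome { FRESH, REFRESH_TRIGGERED, EXPIRED }

  /**
   * What a query sees for an entry of the given age (time since its last write),
   * with refreshAfterWrite = a and expireAfterWrite = b, where a < b.
   */
  static Outcome onQuery(long age, long a, long b) {
    if (age >= b) {
      return Outcome.EXPIRED; // entry is gone; the value is loaded from scratch
    }
    if (age > a) {
      return Outcome.REFRESH_TRIGGERED; // old value served, reload starts, clocks reset
    }
    return Outcome.FRESH; // served as-is
  }

  /**
   * Under the proposed change (cleanup refreshes stale entries), does an entry that is
   * never queried ever expire when cleanup runs every {@code period}? Only if the first
   * cleanup that finds it stale already finds it at or past b.
   */
  static boolean expiresUnderProposal(long a, long b, long period) {
    long firstStaleCleanup = (a / period + 1) * period; // first multiple of period > a
    return firstStaleCleanup >= b;
  }
}
```

For instance, with A = 30, B = 60 and cleanup every 5 time units, the first cleanup that sees the entry as stale runs at age 35, refreshes it and resets the clock, so the entry never reaches B.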

--

Adrian Cole

Jan 3, 2012, 1:29:45 PM
to Charles Fry, Wallace Wadge, guava-discuss
I think this concern is a fairly typical one in pooling resources.

I wonder if we could make a structure that shares code and/or targets
these sorts of issues (ex. refresh on interval, cleanup before/after
use, etc). Here's some rough-in of Pool.
http://code.google.com/p/guava-libraries/issues/detail?id=683

-A

Dimitris Andreou

Jan 3, 2012, 3:43:08 PM
to Adrian Cole, Charles Fry, Wallace Wadge, guava-discuss
This could perhaps be tackled more generally (as a fallback for cases not regular enough to be handled by refreshAfterWrite), with something like:
Cache<K, V> {
  Cache<K, Aged<V>> agedView();
}

Just a thought.

Charles Fry

Jan 3, 2012, 5:03:26 PM
to Dimitris Andreou, Adrian Cole, Wallace Wadge, guava-discuss
But even that isn't really efficient, as it would require O(n) iteration. If there were a time when we all agreed refresh should be deterministically applied, then we could do that in O(1) time during locked cleanup using the linked list already used to track write order.
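
To illustrate why a write-ordered list avoids the O(n) scan, here is a JDK-only sketch. It is not Guava's internal data structure; `WriteOrderIndex` and its methods are hypothetical names, and the sketch assumes that write times passed to `recordWrite` never decrease. Because the oldest write is always at the head, checking whether anything is stale is a constant-time peek, and draining costs time proportional to the number of stale keys only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a write-ordered index: the oldest write is always at the head. */
final class WriteOrderIndex<K> {
  // Insertion-ordered map of key -> last write time; oldest write first.
  private final LinkedHashMap<K, Long> writeTimes = new LinkedHashMap<>();

  /** Records a write; remove-then-put moves the key to the tail of the order. */
  void recordWrite(K key, long now) {
    writeTimes.remove(key);
    writeTimes.put(key, now);
  }

  /**
   * Removes and returns the keys last written more than {@code maxAge} ago.
   * Stops at the first fresh entry, since everything after it is newer.
   */
  List<K> drainStale(long now, long maxAge) {
    List<K> stale = new ArrayList<>();
    Iterator<Map.Entry<K, Long>> it = writeTimes.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<K, Long> oldest = it.next();
      if (now - oldest.getValue() <= maxAge) {
        break;
      }
      stale.add(oldest.getKey());
      it.remove();
    }
    return stale;
  }
}
```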

Dimitris Andreou

Jan 3, 2012, 5:36:31 PM
to Charles Fry, Adrian Cole, Wallace Wadge, guava-discuss
Wasn't really following this thread, and I don't know which scenario you're talking about (probably it's somewhere in the ~20 messages I didn't read :)). If being able to access the age of the entry for an arbitrary key doesn't help (as you seem to imply), please ignore my comment.

Charles Fry

Jan 3, 2012, 5:37:59 PM
to Dimitris Andreou, Adrian Cole, Wallace Wadge, guava-discuss
Not that it doesn't help, I'm just not yet convinced that the API complexity is justified...

Dimitris Andreou

Jan 3, 2012, 6:02:53 PM
to Charles Fry, Adrian Cole, Wallace Wadge, guava-discuss
Yes, I guess we'll have to see how many applicable scenarios come along (e.g. any case where the frequency of refresh is not a constant) - enough at least to warrant an extra method and a simple type to expose the age. (Not sure if Wallace's scenario is such a case, perhaps not). 

Wallace Wadge

Jan 4, 2012, 5:14:53 AM
to guava-discuss


On Jan 4, 12:02 am, Dimitris Andreou <jim.andr...@gmail.com> wrote:
> Yes, I guess we'll have to see how many applicable scenarios come along
> (e.g. any case where the frequency of refresh is not a constant) - enough
> at least to warrant an extra method and a simple type to expose the age.
> (Not sure if Wallace's scenario is such a case, perhaps not).
>

Not in my case, no, but making it more general is of course more
useful.

Actually I wonder if "refresh" is the right semantic word to use, I'd
stick to something like "action" instead. Implemented properly, this
would enable for example a connection pool to leave an item in the map
but just send out a connection keep alive (a simple ping). I have a
feeling that we're closing in on something generic enough that will
add quite some rich functionality to the map.



Stepping back a little, perhaps I could have gotten away with just
using expireAfterWrite and, upon eviction, re-inserting the item, though
it sounds like a bit of a kludge and a bit too implementation-dependent
for my liking.

Wallace


Charles Fry

Jan 4, 2012, 7:26:25 AM
to Wallace Wadge, guava-discuss
> Actually I wonder if "refresh" is the right semantic word to use, I'd
> stick to something like "action" instead. Implemented properly, this
> would enable for example a connection pool to leave an item in the map
> but just send out a connection keep alive (a simple ping). I have a
> feeling that we're closing in on something generic enough that will
> add quite some rich functionality to the map.

Note that refresh is backed by CacheLoader.reload, which has the option of simply keeping the existing value. 

Wallace Wadge

Jan 4, 2012, 11:44:54 AM
to guava-discuss
>
> Note that refresh is backed by CacheLoader.reload, which has the option of
> simply keeping the existing value.

Whoops, yeah, I missed that.

Louis Wasserman

Jan 16, 2012, 2:04:56 PM
to guava-discuss
---------- Forwarded message ----------
From: CemoKoc <cemalet...@gmail.com>
Date: Mon, Jan 16, 2012 at 8:09 AM
Subject: Re: LoadingCache refresh query
To: Louis Wasserman <wasserm...@gmail.com>


Hi,

I am pretty new to Guava and its fancy concurrency library.

Can someone suggest a best-practice asynchronous implementation of
refresh with Guava? I have overridden reload as follows:

public ListenableFuture<List<User>> reload(final Integer key,
List<User> oldValue) throws Exception {

 return ListenableFutureTask.create(new Runnable() {
       @Override
       public void run() {
          cache.put(key,load(key));
       }
 },oldValue);
}

I am not sure whether I should have my own ExecutorService for
asynchronous operations.

In short, I need a little assistance :)

Thanks

On Dec 23 2011, 6:59 pm, Louis Wasserman <wasserman.lo...@gmail.com>
wrote:

>
> When an entry gets *refreshed*, CacheLoader.reload(key, oldValue) is

Charles Fry

Jan 17, 2012, 7:33:35 AM
to Louis Wasserman, guava-discuss
> Can someone suggest a best-practice asynchronous implementation of
> refresh with Guava? I have overridden reload as follows:

public ListenableFuture<List<User>> reload(final Integer key,
List<User> oldValue) throws Exception {

 return ListenableFutureTask.create(new Runnable() {
       @Override
       public void run() {
          cache.put(key,load(key));
       }
 },oldValue);
}

You shouldn't call cache.put, but instead simply return a future of the new value. For example:

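        // Wrap the load in a task; creating the task does not run it.
        ListenableFutureTask<V> task = ListenableFutureTask.create(new Callable<V>() {
          @Override
          public V call() throws Exception {
            return load(key);
          }
        });
        // The task must be handed to an executor, or the future never completes.
        executor.execute(task);
        return task;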

 
> I am not sure whether I should have my own ExecutorService for
> asynchronous operations.

If you're not sure, then don't. Reload doesn't generally need to be overridden, unless you have good reason to do so. :-)

Charles

Anthony Manfredi

May 10, 2012, 10:55:02 AM
to guava-...@googlegroups.com, Louis Wasserman
The documentation for refreshAfterWrite at http://code.google.com/p/guava-libraries/wiki/CachesExplained#Eviction should really say this. Otherwise it is easy to assume that the cache will take care of starting the task you returned.

If you use the example in the documentation (which is missing the executor), you will return a future that is never executed and your cache will never be refreshed.
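
The pitfall is easy to reproduce with the JDK's own `FutureTask`, which Guava's `ListenableFutureTask` extends. The sketch below uses a hypothetical `UnexecutedTaskDemo` class and calls `task.run()` directly as a stand-in for `executor.execute(task)`; it shows that merely creating and returning the task leaves its future incomplete.

```java
import java.util.concurrent.FutureTask;

/** Demonstrates that a future task does nothing until something runs it. */
final class UnexecutedTaskDemo {
  /** Returns {isDone before running, isDone after running}. */
  static boolean[] run() {
    FutureTask<String> task = new FutureTask<String>(() -> "reloaded value");
    boolean doneBefore = task.isDone(); // false: nobody has executed the task yet
    task.run(); // stand-in for what executor.execute(task) does on another thread
    boolean doneAfter = task.isDone(); // true: the result is now available
    return new boolean[] {doneBefore, doneAfter};
  }
}
```

A reload that returns a never-executed task therefore hands back a future that never completes, so the refreshed value never arrives.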

-Anthony

Charles Fry

May 10, 2012, 11:00:22 AM
to Anthony Manfredi, guava-...@googlegroups.com, Louis Wasserman
Good catch. Updated.
