Stale cache issues when using HazelcastLocalCacheRegionFactory


jbellassai

Oct 22, 2013, 11:48:54 AM
to haze...@googlegroups.com
Hi,

We've been experiencing stale cache issues when using HazelcastLocalCacheRegionFactory (Hazelcast 2.6.3, Hibernate 3.6.10.Final) in our project; the query cache in particular seems to be affected.

What appears to be happening is that when Hibernate performs any kind of write (insert, update, delete) on one node, the cached timestamp for that table is updated on that node itself but never propagated to any other node. As a result, updates are visible on the node that performed them, while all other nodes continue to serve stale cached data.
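To make the failure mode concrete, here is a minimal self-contained sketch (plain Java, not the actual Hibernate classes; all names are illustrative) of how the timestamps cache gates query-cache freshness: a cached result is served only if none of the tables it spans was updated after the result was stored. If one node's timestamps entry is never refreshed, that node keeps judging stale results as fresh indefinitely.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the timestamp-based freshness check. If the
// "tableUpdated" event from another node never arrives, isFresh() keeps
// returning true for results that are actually stale.
public class TimestampsCheckSketch {
    // table name -> last-update timestamp, as kept by the timestamps cache
    private final Map<String, Long> lastUpdate = new HashMap<>();

    public void tableUpdated(String table, long ts) {
        lastUpdate.put(table, ts);
    }

    // A cached query result taken at resultTs is fresh only if every
    // table it spans was last updated at or before resultTs.
    public boolean isFresh(long resultTs, String... tables) {
        for (String t : tables) {
            Long ts = lastUpdate.get(t);
            if (ts != null && ts > resultTs) {
                return false; // table changed after the result was cached
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TimestampsCheckSketch node = new TimestampsCheckSketch();
        node.tableUpdated("orders", 100L);
        System.out.println(node.isFresh(150L, "orders")); // true: result cached after the update
        node.tableUpdated("orders", 200L);
        System.out.println(node.isFresh(150L, "orders")); // false: a later update invalidates it
    }
}
```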

For reference, we are setting the following properties:

hibernate.cache.use_second_level_cache=true
hibernate.cache.use_query_cache=true
hibernate.cache.use_minimal_puts=true
hibernate.generate_statistics=false
hibernate.cache.use_structured_entries=false

The one particular Entity that I have been testing with is annotated as such:

@Cacheable
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)


I mentioned that we are using Hazelcast 2.6.3, but I have also tested with 2.6.4 and 2.6.5-SNAPSHOT and hit the same problem.

One thing I did notice in the code is that com.hazelcast.hibernate.local.LocalRegionCache only sends an invalidation message when the update() method is called; when put() is called, the value is updated in the local cache only. On a whim, I changed put() to invoke update() (exactly as is done in com.hazelcast.hibernate.distributed.IMapRegionCache), and my stale cache issue seems to be fixed.
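The change described above might be sketched like this (illustrative Java modeling the reported behavior, not the actual com.hazelcast.hibernate classes; the Consumer stands in for the cluster invalidation topic):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical model of the fix: before the patch, put() only touched
// the local map, while update() also published an invalidation message.
// The patch routes put() through the same notification path, as the
// distributed IMapRegionCache already does.
public class LocalRegionCacheSketch {
    private final Map<Object, Object> cache = new ConcurrentHashMap<>();
    private final Consumer<Object> invalidationTopic; // stands in for the topic

    public LocalRegionCacheSketch(Consumer<Object> invalidationTopic) {
        this.invalidationTopic = invalidationTopic;
    }

    // Before the patch: put() only updated the local map.
    public void putLocalOnly(Object key, Object value) {
        cache.put(key, value);
    }

    // After the patch: put() behaves like update(), so other nodes are
    // told to drop their now-stale entries.
    public void put(Object key, Object value) {
        update(key, value);
    }

    public void update(Object key, Object value) {
        cache.put(key, value);
        invalidationTopic.accept(key); // notify the rest of the cluster
    }

    public Object get(Object key) {
        return cache.get(key);
    }
}
```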

So am I doing something wrong, or is this a bug?

Attached is a patch made against the maintenance-2.x branch.

Thanks!
John
Stale_cache_fix.patch

Mehmet Dogan

Oct 22, 2013, 4:51:50 PM
to haze...@googlegroups.com
Hi John,

Thanks for the patch. Let us investigate the problem. Do you have a unit test that reproduces the issue, or does it just happen randomly during a load test or in production?


@mmdogan

--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hazelcast+...@googlegroups.com.
To post to this group, send email to haze...@googlegroups.com.
Visit this group at http://groups.google.com/group/hazelcast.
For more options, visit https://groups.google.com/groups/opt_out.

Barry Lagerweij

Oct 22, 2013, 5:14:45 PM
to haze...@googlegroups.com
You might want to reconsider using the query cache in the first place, see http://tech.puredanger.com/2009/07/10/hibernate-query-cache/

Also, I really wonder whether it's useful to have a distributed map (such as Hazelcast) for your second-level cache. It's usually faster to simply query the database than to perform a remote call to fetch the data.

Finally, if you're performing many inserts/updates, you'll end up with a lot of cache invalidation messages on all nodes, which could reduce performance instead of improving it.

Just my 2 cents.
Barry



jbellassai

Oct 23, 2013, 9:57:41 AM
to haze...@googlegroups.com
Mehmet,

Thanks! Unfortunately, I do not have a unit test at this time, though I might be able to put one together in the next couple of days if I get some time. We just noticed this issue in a pre-production test environment, and it appears to happen quite consistently -- not randomly.

John

jbellassai

Oct 23, 2013, 10:15:52 AM
to haze...@googlegroups.com
Barry,

Thanks for your input. I had in fact read that blog post recently and have considered the author's points.

To address your points:

1. At this time, we are not using the distributed map implementation for the caches, but are instead using HazelcastLocalCacheRegionFactory, which maintains local non-distributed caches and sends invalidation messages when necessary. I had the same concern as you when doing the initial research, and this seems to be working quite well so far.

2. In our case, the data for which we are using the query cache is updated very infrequently but read VERY frequently. Our load tests show a remarkable improvement when the caches are enabled, in terms of both response time and reduced load on the database server. Even if we are introducing a significant amount of lock contention by using the query cache, for us the benefits seem to far outweigh the costs at this point.

John

Barry Lagerweij

Oct 24, 2013, 2:13:33 AM
to haze...@googlegroups.com
John,

I did not realize that Hazelcast already provides a local (invalidation-only) cache region. Thanks!

We are currently using Hazelcast for peer discovery and a Hazelcast topic for sending eviction messages; the cache implementation we use is EhCache. This seems to work out pretty well for us. One thing we had to add was recovery from a split brain: when nodes reconnect to the cluster after being disconnected, we evict all caches on all nodes. Without this, we saw a lot of stale cache issues. I'm not sure how the Hazelcast cache providers handle this situation.
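The topic-based eviction scheme described above, including the evict-everything-on-rejoin rule, might be modeled roughly like this (plain Java standing in for a Hazelcast ITopic and EhCache; all names are hypothetical):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a cluster-wide eviction channel. The listener list stands in
// for a Hazelcast topic; NodeCache stands in for each node's EhCache.
public class EvictionTopicSketch {
    interface EvictionListener {
        void onEvict(String key); // a single entry was invalidated somewhere
        void onRejoin();          // the cluster healed after a split brain
    }

    private final List<EvictionListener> listeners = new ArrayList<>();

    public void subscribe(EvictionListener l) {
        listeners.add(l);
    }

    public void publishEvict(String key) {
        listeners.forEach(l -> l.onEvict(key));
    }

    // Called when a disconnected member reconnects: rather than try to
    // reconcile what was missed, every node simply drops everything.
    public void memberRejoined() {
        listeners.forEach(EvictionListener::onRejoin);
    }

    static class NodeCache implements EvictionListener {
        final Map<String, String> data = new HashMap<>();

        public void onEvict(String key) { data.remove(key); }
        public void onRejoin()          { data.clear(); }
    }
}
```

Clearing everything on rejoin is a blunt but safe choice: it trades a brief burst of cache misses for the guarantee that no node serves entries that went stale while it was partitioned.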

We faced lots of performance issues with the query cache, which is why we decided to disable it. But you're right, in some use-cases (many reads, occasional write) it might help to improve the performance.

Barry



Mehmet Dogan

Oct 25, 2013, 5:51:20 AM
to haze...@googlegroups.com
Here is the bug report;


Only the timestamps cache should send a replication message when a put or update is received. (For reference: the timestamps cache sends replication messages, the query cache is purely local, and the other user caches send invalidation messages.) A regular put is called only when an entry is inserted into the database.

@mmdogan




jbellassai

Oct 25, 2013, 9:47:35 AM
to haze...@googlegroups.com
Thanks, Mehmet! I see that the issue is tagged with the 3.x label and not 2.x. Can you tell me whether this fix is likely to be merged back into the 2.6.x branch? We cannot update to 3.x quite yet...

John

Mehmet Dogan

Oct 25, 2013, 10:17:35 AM
to haze...@googlegroups.com
Added another pull request for the 2.x branch;


@mmdogan
