jgroups ehcache replication TimeoutExceptions after upgrade


mha...@visiblehealth.com

Mar 11, 2016, 8:53:17 AM
to ehcache-users

  1. What version of Ehcache you are currently using: 3.6.2
  2. Paste the configuration for the Cache/CacheManager you have an issue with: (see below)
  3. Add any name and version of other library or framework you use Ehcache with (e.g. Hibernate): Hibernate-Core 4.3.11
  4. Providing JDK and OS versions may be useful as well: JDK 1.8

First, let me ask: is ehcache-jgroupsreplication compatible with JGroups 3.6.1?

I ask that first because this all worked fine, but we started getting TimeoutException failures after upgrading Hibernate from 4.1 to 4.3 and JGroups from 3.1 to 3.6.

[WARN ] [2016.03.11 00:12:54] jgroups.JGroupsBootstrapManager - Bootstrap of some.Entity did not complete in 300000ms, giving up on bootstrap request to machine2-20306.


I've tried to tune the JGroups config, but we are forced to use pure unicast over TCP (no UDP and no multicast). I am currently running with this config:

<!--
This is a copy of tcp.xml from jgroups-3.6.2.Final.jar, with some changes
  for bind port, initial hosts and an increase to the FD timeout.
-->

<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="urn:org:jgroups"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <TCP bind_port="${ehcacheDistributed.jgroups.bindPort}"
         bind_addr="NON_LOOPBACK"
         recv_buf_size="${tcp.recv_buf_size:5M}"
         send_buf_size="${tcp.send_buf_size:5M}"
         max_bundle_size="64K"
         max_bundle_timeout="30"
         enable_bundling="true"
         use_send_queues="true"
         sock_conn_timeout="300"

         timer_type="new3"
         timer.min_threads="4"
         timer.max_threads="10"
         timer.keep_alive_time="3000"
         timer.queue_max_size="500"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="5000"
         thread_pool.queue_enabled="true"
         thread_pool.queue_max_size="10000"
         thread_pool.rejection_policy="discard"

         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="1"
         oob_thread_pool.max_threads="8"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="100"
         oob_thread_pool.rejection_policy="discard"/>

    <TCPPING async_discovery="true"
             initial_hosts="${ehcacheDistributed.jgroups.tcpping.initialhosts}"
             port_range="1"
             num_initial_members="10"/>
    <MERGE3 min_interval="10000"
            max_interval="30000"/>
    <FD_SOCK/>
    <FD timeout="35000" max_tries="6"/>
    <VERIFY_SUSPECT timeout="1500"/>
    <BARRIER/>
    <pbcast.NAKACK2 use_mcast_xmit="false"
                    discard_delivered_msgs="true"/>
    <UNICAST3/>
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                   max_bytes="4M"/>
    <pbcast.GMS print_local_addr="true" join_timeout="2000"
                view_bundling="true"/>
    <UFC max_credits="2M"
         min_threshold="0.4"/>
    <FRAG2 frag_size="60K"/>
    <pbcast.STATE_SOCK/>
    <pbcast.FLUSH end_flush_timeout="4000"/>
</config>

Here is my cache config:

<?xml version="1.0" encoding="UTF-8"?>
<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="ehcache.xsd">

    <diskStore path="java.io.tmpdir" />

    <cacheManagerPeerProviderFactory class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"/>

    <cache name="org.hibernate.cache.spi.UpdateTimestampsCache"
           maxElementsInMemory="10000" eternal="false" timeToIdleSeconds="120" statistics="true"
           timeToLiveSeconds="120">
        <cacheEventListenerFactory
            class="net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory"
            properties="replicateAsynchronously=true, replicatePuts=true,replicateUpdates=true, replicateUpdatesViaCopy=false,replicateRemovals=true" />
        <bootstrapCacheLoaderFactory
            class="net.sf.ehcache.distribution.jgroups.JGroupsBootstrapCacheLoaderFactory"
            properties="bootstrapAsynchronously=true" />
    </cache>

    <cache name="org.hibernate.cache.internal.StandardQueryCache"
           maxElementsInMemory="10000" eternal="false" timeToIdleSeconds="120" statistics="true"
           timeToLiveSeconds="120">
        <cacheEventListenerFactory
            class="net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory"
            properties="replicateAsynchronously=true, replicatePuts=true,replicateUpdates=true, replicateUpdatesViaCopy=false,replicateRemovals=true" />
        <bootstrapCacheLoaderFactory
            class="net.sf.ehcache.distribution.jgroups.JGroupsBootstrapCacheLoaderFactory"
            properties="bootstrapAsynchronously=true" />
    </cache>

    <defaultCache maxElementsInMemory="35000" eternal="true"
                  timeToIdleSeconds="0" timeToLiveSeconds="0" overflowToDisk="true"
                  maxElementsOnDisk="1000000" diskPersistent="false"
                  diskExpiryThreadIntervalSeconds="120" memoryStoreEvictionPolicy="LRU" statistics="true">
        <cacheEventListenerFactory
            class="net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory"
            properties="replicateAsynchronously=true, replicatePuts=true,replicateUpdates=true, replicateUpdatesViaCopy=false,replicateRemovals=true" />
        <bootstrapCacheLoaderFactory
            class="net.sf.ehcache.distribution.jgroups.JGroupsBootstrapCacheLoaderFactory"
            properties="bootstrapAsynchronously=true" />
    </defaultCache>

</ehcache>
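
The ehcache.xml above declares the JGroups peer provider without any properties, so it does not show how the JGroups stack file reaches Ehcache. A minimal sketch of one way to wire it is below, assuming the peer provider factory accepts a `file` property that names a classpath resource. The file name `jgroups-tcp.xml` is hypothetical, and the poster may be supplying the stack some other way.

```xml
<!-- Sketch only: "jgroups-tcp.xml" is a hypothetical classpath resource
     holding the JGroups stack shown earlier in this post. -->
<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
    properties="file=jgroups-tcp.xml"/>
```

With no `file` (or inline `connect`) property, the replication module would presumably fall back to a default JGroups stack, which is worth ruling out when tuning changes appear to have no effect.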



mha...@visiblehealth.com

Mar 17, 2016, 1:19:08 PM
to ehcache-users
I went back to using the flush-tcp.xml from the JGroups 3.6.1 jar, which didn't help directly, but removing <pbcast.FLUSH/> from that config stopped the TimeoutExceptions. I now get:

[ERROR] [2016.03.16 16:15:35] jgroups.JChannel - JGRP000019: failed passing message to receiver
java.lang.IllegalArgumentException: java.io.InvalidObjectException: Could not find a SessionFactory [uuid=cf87ba82-a0e8-4b78-8abf-2eabe810bec7,name=defaultSessionFactory]


This is an issue I've seen before and reported here: http://forums.terracotta.org/forums/posts/list/7336.page. Alas, we now have hibernate.session_factory_name set, as can be seen in the error. Nevertheless, the cache seems to be working fine, so I may just live with these errors, though I'd love to know what is causing them.
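
For anyone trying the same FLUSH removal, here is a rough sketch of what the end of the stack might look like. It is based on the tail of the tcp.xml-derived config from the first post in this thread, not on the actual contents of flush-tcp.xml, so treat the surrounding protocols and their attributes as assumptions.

```xml
<config xmlns="urn:org:jgroups">
    <!-- Transport, discovery, failure detection, NAKACK2, UNICAST3,
         STABLE and GMS stay as they were; only the tail is shown. -->
    <UFC max_credits="2M" min_threshold="0.4"/>
    <FRAG2 frag_size="60K"/>
    <pbcast.STATE_SOCK/>
    <!-- FLUSH taken out of the stack; the bootstrap timeouts stopped after this.
    <pbcast.FLUSH end_flush_timeout="4000"/>
    -->
</config>
```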

mha...@visiblehealth.com

Mar 20, 2016, 11:34:38 PM
to ehcache-users
See https://sourceforge.net/p/javagroups/discussion/130427/thread/637179ca/ where the JGroups author states:

IMO ehcache working with 3.6.x is pure coincidence... 

I can't say I'd recommend using this combination.  We may limp along with it in production for a while.