ehcache toolkit 3.4 terracotta server restarts with java.lang.ClassCastException

Nadeem Amin

Feb 6, 2018, 4:21:56 PM
to ehcache-users
  1. What version of Ehcache you are currently using: 3.4 (www.ehcache.org/documentation/3.4/)
  2. Paste the configuration for the Cache/CacheManager you have an issue with: see the attached file (similar settings on both nodes)
  3. Add any name and version of other library or framework you use Ehcache with (e.g. Hibernate): none
  4. Providing JDK and OS versions may be useful as well:
    Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
We have 2 nodes running Terracotta, grouped correctly. Most of the time the servers run fine, but occasionally a server crashes with a ClassCastException and loses all the cached data (see the attached logs):
.... by: java.lang.ClassCastException: org.ehcache.clustered.server.internal.messages.EhcacheStateRepoSyncMessage cannot be cast to org.ehcache.clustered.server.internal.messages.EhcacheStateRepoSyncMessage


The problem can be recreated in a 2-node environment by simply stopping node B and restarting it after a minute: this crashes node A. When you then restart node A, it crashes node B.

We have a startup script in crontab that restarts the process in case it is down. Eventually both nodes start, but the cached data is lost.

Has anyone encountered this issue, or does anyone know if I am doing anything wrong?
dev-node-a-tc-config.xml
terracotta-server-class-cast-exp.log

Anthony Dahanne

Feb 6, 2018, 5:59:16 PM
to ehcache-users
Hello Nadeem,
It's weird that your configuration IPs 
<server host="99.999.99.167" name="dev-node-a">
<server host="99.999.99.168" name="dev-node-b">

do not match the IPs / names in your logs:
<server host="99.999.99.115" name="sat-node-a" bind="0.0.0.0">
<server host="99.999.99.116" name="sat-node-b" bind="0.0.0.0">

Anyway, could you please remove this bind IP from your configuration:
<tsa-port bind="99.999.99.167">8090</tsa-port>

so that you only have: <tsa-port>8090</tsa-port>

I'm not saying it's gonna fix your issue, but at least it will be something less to check.

Oh, and please share terracotta server logs from both nodes A and B
Thanks,

Nadeem Amin

Feb 7, 2018, 9:01:13 AM
to ehcache-users
I removed the bind as suggested, but it did not seem to make any difference. I have attached logs from both servers and the config file; IPs and package names are masked. Thanks!!
issuelogs.zip

Anthony Dahanne

Feb 7, 2018, 9:36:50 AM
to ehcache-users
Thank you, that helps.

Unfortunately, there's nothing obviously wrong in your setup; that should work.
But for a reason I don't yet understand, the active server does not want to sync with the passive, and that causes it to crash.

There's only one thing that could help us understand now: a reproducible use case.

Can you try to narrow this down to the simplest client setup that triggers the issue? (For example, cache.put(myObject) with the simplest object that can trigger it.)

If you can do that and put it on GitHub with instructions on how to reproduce it, we'll definitely investigate further.
Thanks,
Anthony

Nadeem Amin

Feb 20, 2018, 1:31:49 PM
to ehcache-users
I spent some time on this and switched to basic objects, with String-based keys and values. The server shutdown pattern still happens: when I stop and restart the active node, it shuts down the passive node and then shuts itself down as well, with the same ClassCastException.