Fail Upgrading Wildfly Cluster from 23.0.2.Final to 24.0.0.Final

D E

Jul 3, 2021, 8:39:26 AM
to WildFly
I have a cluster of 2 hosts (host1 and host2), each with Wildfly 23.0.2.Final, which run under the standalone/configuration/standalone-ha.xml configuration.

Since we need high availability, I upgrade one host at a time from 23.0.2.Final to 24.0.0.Final. However, after upgrading host1, my app fails to deploy with the error pasted at the end of this note.

If both hosts are brought down and then the upgrade happens (cluster cold start), all is well.

I am aware of techniques that let a new cluster form, such as giving each Wildfly version its own address via jboss.default.multicast.address. However, this is not seamless for currently active users whose sessions live in the standard web session cache.

We follow Wildfly Final version upgrades as they are published, and I have seen this issue before, but I don't know where in the release notes I could learn that this will happen. Could this issue be made a little more obvious, please? And can it also be minimized for Final releases?

Or maybe this is some bug, I'm not certain, but any information would be helpful.

Thanks!  Here is the error:

2021-07-02 21:45:30,828 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 89) MSC000001: Failed to start service org.wildfly.clustering.infinispan.cache.ejb.http-remoting-connector: org.jboss.msc.service.StartException in service org.wildfly.clustering.infinispan.cache.ejb.http-remoting-connector: org.infinispan.commons.CacheConfigurationException: Error starting component org.infinispan.statetransfer.StateTransferManager
at org.wildfly.clu...@24.0.0.Final//org.wildfly.clustering.service.FunctionalService.start(FunctionalService.java:66)
at org.wildfly.clu...@24.0.0.Final//org.wildfly.clustering.service.AsyncServiceConfigurator$AsyncService.lambda$start$0(AsyncServiceConfigurator.java:117)
at org.jbos...@2.4.0.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jbos...@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1990)
at org.jbos...@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
at org.jbos...@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
at java.base/java.lang.Thread.run(Thread.java:829)
at org.jbos...@2.4.0.Final//org.jboss.threads.JBossThread.run(JBossThread.java:513)
Caused by: org.infinispan.commons.CacheConfigurationException: Error starting component org.infinispan.statetransfer.StateTransferManager
at org.inf...@12.1.4.Final//org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:572)
at org.inf...@12.1.4.Final//org.infinispan.factories.impl.BasicComponentRegistryImpl.access$700(BasicComponentRegistryImpl.java:30)
at org.inf...@12.1.4.Final//org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.running(BasicComponentRegistryImpl.java:787)
at org.inf...@12.1.4.Final//org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:354)
at org.inf...@12.1.4.Final//org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:250)
at org.inf...@12.1.4.Final//org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:213)
at org.inf...@12.1.4.Final//org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:1015)
at org.inf...@12.1.4.Final//org.infinispan.cache.impl.AbstractDelegatingCache.start(AbstractDelegatingCache.java:512)
at org.inf...@12.1.4.Final//org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:698)
at org.inf...@12.1.4.Final//org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:644)
at org.inf...@12.1.4.Final//org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:533)
at org.inf...@12.1.4.Final//org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:511)
at org.jboss.as.clus...@24.0.0.Final//org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:92)
at org.wildfly.cluste...@24.0.0.Final//org.wildfly.clustering.infinispan.spi.service.CacheServiceConfigurator.get(CacheServiceConfigurator.java:77)
at org.wildfly.cluste...@24.0.0.Final//org.wildfly.clustering.infinispan.spi.service.CacheServiceConfigurator.get(CacheServiceConfigurator.java:55)
at org.wildfly.clu...@24.0.0.Final//org.wildfly.clustering.service.FunctionalService.start(FunctionalService.java:63)
... 7 more
Caused by: java.util.concurrent.CompletionException: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 2 from host2
at org.inf...@12.1.4.Final//org.infinispan.util.concurrent.CompletionStages.join(CompletionStages.java:81)
at org.inf...@12.1.4.Final//org.infinispan.statetransfer.StateTransferManagerImpl.start(StateTransferManagerImpl.java:134)
at org.inf...@12.1.4.Final//org.infinispan.statetransfer.CorePackageImpl$2.start(CorePackageImpl.java:104)
at org.inf...@12.1.4.Final//org.infinispan.statetransfer.CorePackageImpl$2.start(CorePackageImpl.java:83)
at org.inf...@12.1.4.Final//org.infinispan.factories.impl.BasicComponentRegistryImpl.invokeStart(BasicComponentRegistryImpl.java:604)
at org.inf...@12.1.4.Final//org.infinispan.factories.impl.BasicComponentRegistryImpl.doStartWrapper(BasicComponentRegistryImpl.java:595)
at org.inf...@12.1.4.Final//org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:564)
... 22 more
Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 2 from host2
at org.inf...@12.1.4.Final//org.infinispan.remoting.transport.impl.SingleTargetRequest.onTimeout(SingleTargetRequest.java:85)
at org.inf...@12.1.4.Final//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
at org.inf...@12.1.4.Final//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)

Paul Ferraro

Jul 3, 2021, 12:12:28 PM
to WildFly
In general, WildFly does not support heterogeneous clusters containing mixed major versions.
Traditionally, one upgrades WF servers by starting the new server versions in an isolated cluster (e.g., distinct multicast address, bind ports, etc.).

D E

Jul 3, 2021, 3:07:23 PM
to WildFly
Paul, thank you for responding!

Just a small follow-up please:

I believe I can automate this by changing...

<channel name="ee" stack="udp" cluster="ejb"/>

...to add the major version number, for example...

<channel name="ee" stack="udp" cluster="ejb-24"/>

My questions to understand if this is too much or not enough:
  1. Will each major version number change always bring incompatible clustering?
  2. Will all non-major version number changes provide compatible clustering?
  3. If the answer to either of these is "no", then is there a better attribute with which to drive this?
Otherwise, I'll have to assume there is no way to predict this 100% of the time, and should therefore test, observe, and potentially adapt with each upgrade.

Paul Ferraro

Jul 8, 2021, 8:06:46 PM
to WildFly
On Saturday, July 3, 2021 at 3:07:23 PM UTC-4 D E wrote:
Paul, thank you for responding!

Just a small follow-up please:

I believe I can automate this by changing...

<channel name="ee" stack="udp" cluster="ejb"/>

...to add the major version number, for example...

<channel name="ee" stack="udp" cluster="ejb-24"/>

My questions to understand if this is too much or not enough:
  1. Will each major version number change always bring incompatible clustering?
Usually.
  2. Will all non-major version number changes provide compatible clustering?
Minors, probably not (though, since we moved to quarterly releases, we almost never do minor releases). Micros, usually, but no explicit guarantees.
  3. If the answer to either of these is "no", then is there a better attribute with which to drive this?
Isolating clusters via cluster name is feasible so long as it is temporary.
That is, without bind port/address isolation, cluster members can still receive messages sent by members from another cluster, though such messages will be discarded if the cluster name does not match.
Otherwise, using a cluster name that includes the version is actually a pretty good idea.
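
A minimal sketch of how the version-qualified cluster name discussed above could be derived automatically during an upgrade. The helper names (cluster_name, cli_command) are hypothetical and not part of WildFly. The sketch assumes the channel is named "ee" as in the snippet quoted above, and that the resulting management operation is run through jboss-cli against the jgroups subsystem. It keys on the major version only, as in the "ejb-24" example; since minor releases may also be incompatible, the pattern would need widening to cover them.

```python
import re


def cluster_name(base: str, wildfly_version: str) -> str:
    """Suffix the base cluster name with the major version, e.g. 'ejb-24'."""
    match = re.match(r"(\d+)\.", wildfly_version)
    if match is None:
        raise ValueError(f"unrecognized WildFly version: {wildfly_version!r}")
    return f"{base}-{match.group(1)}"


def cli_command(channel: str, name: str) -> str:
    """Build the management operation that sets the channel's cluster name.

    Assumption: the operation is executed via jboss-cli and the server is
    reloaded afterwards so the channel reconnects under the new name.
    """
    return (
        f"/subsystem=jgroups/channel={channel}"
        f":write-attribute(name=cluster,value={name})"
    )


if __name__ == "__main__":
    # Print the operation for the version being rolled out.
    print(cli_command("ee", cluster_name("ejb", "24.0.0.Final")))
```

As noted above, isolation by cluster name alone is best treated as temporary: members still receive and discard each other's messages unless the bind address or ports also differ.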

D E

Jul 8, 2021, 8:23:13 PM
to WildFly
Thanks for confirming!!!

JbossVT

May 3, 2022, 10:25:42 AM
to WildFly
Hi, I have 2 servers in DEV, 2 servers in STG, and 4 servers in PRD running JBoss AS 7.1.
I created 2 VMs, installed WildFly 26, and started them with standalone-ha.xml using the default settings from the install.
After the servers started, all the DEV, STG, and PRD applications started showing the WildFly home page.
As soon as we stopped the WildFly servers, everything went back to normal.
We did not have this issue when I started WildFly with standalone.xml.

Any suggestions please?
Thank you


Paul Ferraro

May 3, 2022, 2:13:48 PM
to WildFly
Your question does not relate to this topic.  Can you start a new topic and add enough detail that someone can diagnose your problem?