WFLYCTL0348: Timeout after [300] seconds, missing START bootstrap > startInitialization in logs

Kevin Washington

Jun 26, 2023, 9:56:22 AM
to WildFly
This only seems to happen when adding a node to an existing cluster. The servers that come up first in the cluster do not have the issue. The following error appears in the logs:

2023-06-24 23:35:38,301 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[
    ("core-service" => "management"),
    ("management-interface" => "http-interface")
]'

I compared some debug logs with a working environment and noticed that the failed server did not have DEBUG [org.jboss.weld.BootstrapTracker] (MSC service thread 1-3) START bootstrap > startInitialization in the log.

There is a data cache set up with this configuration:

            <cache-container name="replicated_cache" marshaller="JBOSS" modules="org.wildfly.clustering.server" statistics-enabled="true">
                <transport lock-timeout="60000"/>
                <replicated-cache name="DataCache" statistics-enabled="true">
                    <transaction locking="OPTIMISTIC" mode="FULL_XA"/>
                    <state-transfer timeout="${env.DataCache_STATE_TRANSFER_TIMEOUT:600000}"/>
                </replicated-cache>
            </cache-container>

I noticed on the working server that
DEBUG [org.infinispan.cache.impl.CacheImpl] (ServerService Thread Pool -- 91) Started cache DataCache on xx.xx.xx.xxx
is in the log, but not in the log of the failed server.  I do see it for the other caches like
Started cache org.infinispan.CONFIG on xxx.xxx.xx.xx

Question - Could the cache not being started prevent startInitialization, which then causes the deployment to fail? Would something like setting the state-transfer timeout to 0 help?
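
As a quick sanity check on the two timeouts involved, here is a minimal Java sketch. It assumes the 300 seconds from the WFLYCTL0348 message and the 600000 default from the state-transfer element above, and it assumes that value is in milliseconds. The class and method names are hypothetical, for illustration only. If the stalled cache start is what holds up boot, the shorter timeout would be the one that shows up in the log.

```java
import java.time.Duration;

public class TimeoutComparison {

    /** True if the container stability timeout expires before the state-transfer timeout. */
    public static boolean stabilityFiresFirst(Duration stability, Duration stateTransfer) {
        return stability.compareTo(stateTransfer) < 0;
    }

    public static void main(String[] args) {
        // From the WFLYCTL0348 message: "Timeout after [300] seconds".
        Duration stabilityTimeout = Duration.ofSeconds(300);
        // Default in the state-transfer element above (assumed to be milliseconds).
        Duration stateTransferTimeout = Duration.ofMillis(600_000);

        System.out.println("stability timeout (min):      " + stabilityTimeout.toMinutes());
        System.out.println("state-transfer timeout (min): " + stateTransferTimeout.toMinutes());
        System.out.println("stability timeout fires first: "
                + stabilityFiresFirst(stabilityTimeout, stateTransferTimeout));
    }
}
```

With these values the stability timeout is the shorter of the two, so a state-transfer timeout message would not be expected before the WFLYCTL0348 error.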

Kevin Washington

Jun 26, 2023, 9:58:24 AM
to WildFly
Version is WildFly 26.1.3.

Kevin Washington

Jun 27, 2023, 4:31:33 PM
to WildFly
The state-transfer timeout was set to 10 minutes by default, so I never saw the state transfer timeout error message because the deployment timeout was 5 minutes. After setting the state-transfer timeout to 0, the servers joining the cluster would start. The timeout happened because an object could not be serialized by the marshaller:

14:47:39,310 WARN  [org.infinispan.PERSISTENCE] (thread-212,ejb,x.x.x.x) ISPN000559: Cannot marshall 'class org.infinispan.marshall.protostream.impl.MarshallableUserObject': java.io.NotSerializableException: com.sun.org.apache.xalan.internal.xsltc.runtime.Hashtable\n"

14:47:39,312 ERROR [org.infinispan.statetransfer.OutboundTransferTask] (thread-212,ejb,x.x.x.x) Failed to send entries to node x.x.x.x: com.sun.org.apache.xalan.internal.xsltc.runtime.Hashtable: org.infinispan.commons.marshall.MarshallingException: com.sun.org.apache.xalan.internal.xsltc.runtime.Hashtable\n"

This in turn caused the state transfer to time out.
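
For anyone hitting the same thing, here is a minimal sketch of a pre-check that tries to write a cache value with plain Java serialization and reports the first class that fails. It is only a rough approximation: the marshaller WildFly actually uses may accept or reject types differently. `SerializationProbe`, `BadValue` and `firstNonSerializableClass` are hypothetical names, and `BadValue` is just a stand-in for a value that holds something like the xalan `Hashtable` from the log.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class SerializationProbe {

    /** Hypothetical cache value: Serializable itself, but holding a member that is not. */
    public static class BadValue implements Serializable {
        private static final long serialVersionUID = 1L;
        // Stand-in for a non-serializable member; java.lang.Object is not Serializable.
        final Object payload = new Object();
    }

    /**
     * Returns null if the object graph can be written with plain Java serialization,
     * otherwise the message of the NotSerializableException (in OpenJDK, the name of
     * the offending class).
     */
    public static String firstNonSerializableClass(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> good = new HashMap<>();
        good.put("k", "v");
        System.out.println("HashMap:  " + firstNonSerializableClass(good));
        System.out.println("BadValue: " + firstNonSerializableClass(new BadValue()));
    }
}
```

Running a check like this against the values that go into DataCache, before they are put in the cache, might surface the offending type without waiting for a new node to join and trigger state transfer.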

Kevin Washington

Jun 30, 2023, 8:10:03 AM
to WildFly
Does the following still apply in WildFly 26.1.3?

If no module is defined, the cache container will be configured to use
JBoss Marshalling.

We are migrating our application from WildFly 21 to WildFly 26.1.3. The marshalling error:

ERROR [org.infinispan.statetransfer.OutboundTransferTask] (thread-212,ejb,x.x.x.x) Failed to send entries to node x.x.x.x: com.sun.org.apache.xalan.internal.xsltc.runtime.Hashtable: org.infinispan.commons.marshall.MarshallingException: com.sun.org.apache.xalan.internal.xsltc.runtime.Hashtable\n"

happens when a new pod joins the cluster. This was not happening in WildFly 21. I assume that we need to use the JBoss marshaller instead of ProtoStream. I tried setting the marshaller and modules to:

marshaller="JBOSS" modules="org.wildfly.clustering.server"

But that did not work. I also removed the marshaller and modules attributes, but I still get the same error. Is there some other configuration I need to set to use the JBoss marshaller?