Wildfly 26 -> 31 TimerService during --start-mode=suspend


D E

Apr 15, 2024, 3:02:01 PM
to WildFly
We have a Startup Singleton which has a PostConstruct that starts a 10 second timer:

@Resource private TimerService timerService;
...
    Collection<Timer> timers = timerService.getTimers();
    for (Timer timer : timers) {
      timer.cancel();
    }
...
    timerService.createSingleActionTimer(TIMER_INTERVAL_MILLISECONDS, new TimerConfig());
...

We also initially run standalone.sh --start-mode=suspend and then later ":resume" from the "cli".

In Wildfly 26 this has worked fine (meaning we think the timer fires after we resume).

In Wildfly 31, the timer does not fire (or fires during the time when Wildfly is in suspend mode and is thus ignored).

If we set the timer to a longer time such as 120 seconds, then we have already come out of suspend mode and the timer fires.

We have not significantly altered the standalone/configuration/standalone-full-ha.xml file that we are using in this regard.


James Perkins

Apr 22, 2024, 11:46:44 AM
to WildFly
Did you migrate the package names for the imports from javax -> jakarta?

Everly, David W

Apr 22, 2024, 11:48:35 AM
to James Perkins, WildFly
Absolutely!  And I expect it would not work with a longer setting if I hadn't.


James Perkins

Apr 22, 2024, 11:56:54 AM
to WildFly
I had assumed you had :) I don't think the behavior here should have changed, but I'm not an EJB expert. Given you are using full-ha, I wonder if the issue could be a result of one of these:

James Perkins

Apr 22, 2024, 11:58:18 AM
to WildFly
Sorry, forgot to add https://issues.redhat.com/browse/WFLY-15349 as well. 

Everly, David W

Apr 22, 2024, 12:30:11 PM
to James Perkins, WildFly
None of those three seem to have words that align with our issue, which is:

  1. we start in suspend mode
  2. in a start singleton a 10 second single action timer is created
  3. 10 seconds later the timer should fire (and does nothing)
  4. 20 seconds later we ":resume" into normal mode
  5. the timer is basically lost

Everly, David W

Apr 22, 2024, 2:29:30 PM
to James Perkins, WildFly
Here is a reproducer.  Start with wildfly-31.0.1.Final.zip and quickstart-31.0.1.Final.zip.

Add the following dependency to quickstart-31.0.1.Final/ejb-timer/pom.xml:

    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <scope>provided</scope>
    </dependency>

Delete all Java files under ejb-timer except TimeoutExample.java, and replace the contents of that file with exactly this:

package org.jboss.as.quickstarts.ejb.timer;

import java.util.Collection;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import jakarta.annotation.PostConstruct;
import jakarta.annotation.Resource;
import jakarta.ejb.Singleton;
import jakarta.ejb.Startup;
import jakarta.ejb.Timeout;
import jakarta.ejb.Timer;
import jakarta.ejb.TimerConfig;
import jakarta.ejb.TimerService;

@Singleton
@Startup
public class TimeoutExample {

    private static final Logger LOGGER = LoggerFactory.getLogger(TimeoutExample.class);

    @Resource private TimerService timerService;

    @Timeout
    public void timeout(Timer timer) {
        cancelTimers();
        LOGGER.info("10 second timer executed");
    }

    @PostConstruct
    public void initialize() {
        cancelTimers();
        timerService.createSingleActionTimer(10000, new TimerConfig());
        LOGGER.info("10 second timer created");
    }

    private void cancelTimers() {
        Collection<Timer> timers = timerService.getTimers();
        for (Timer timer : timers) {
            timer.cancel();
        }
    }
}


Build the war file and drop it into wildfly-31.0.1.Final/standalone/deployments

Then run:

standalone.sh --start-mode=suspend --server-config=standalone-full-ha.xml

Wait 20 seconds, then start the cli, connect, and run: :resume

You will see that the timer is lost.

Interestingly, under this configuration the timer is not lost: standalone.sh --start-mode=suspend --server-config=standalone.xml


Everly, David W

Apr 22, 2024, 3:48:12 PM
to James Perkins, WildFly

Is there a "cli" configuration that we can apply against standalone-full-ha.xml that would get us back in business?

Paul Ferraro

Apr 23, 2024, 8:22:40 AM
to WildFly
It appears that the distributed timer service lacks proper handling (and tests) for when the server is suspended.
While the local timer service will queue a timeout event to be invoked when the server resumes, the distributed timer service does not.
I've filed https://issues.redhat.com/browse/WFLY-19271 and will try to sort this out in the next few days.
In the meantime, you can apply the following CLI commands to switch to the local timer service for non-persistent timers:

start-batch
/subsystem=ejb3/service=timer-service:undefine-attribute(name=transient-timer-management)
/subsystem=ejb3/service=timer-service:write-attribute(name=thread-pool-name, value=default)
end-batch
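
To make the difference described above concrete, here is a rough plain-Java sketch of "queue the timeout while suspended and replay it on resume" versus "drop it". It is only an illustration under stated assumptions: the class name SuspendAwareDispatcher and the queueWhileSuspended flag are made up for this example, and none of this is WildFly's actual timer service code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration only; this is NOT WildFly's timer service code.
final class SuspendAwareDispatcher {
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private final boolean queueWhileSuspended;
    private boolean suspended = true; // mimics starting with --start-mode=suspend

    SuspendAwareDispatcher(boolean queueWhileSuspended) {
        this.queueWhileSuspended = queueWhileSuspended;
    }

    // Called when a timer expires.
    synchronized void onTimeout(Runnable callback) {
        if (!suspended) {
            callback.run();          // normal mode: invoke immediately
        } else if (queueWhileSuspended) {
            pending.add(callback);   // suspended: hold the timeout until resume()
        }
        // Otherwise the timeout is silently dropped, which is the symptom in this thread.
    }

    // Leaves suspend mode and replays held timeouts in arrival order.
    synchronized void resume() {
        suspended = false;
        Runnable next;
        while ((next = pending.poll()) != null) {
            next.run();
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        for (boolean queue : new boolean[] {true, false}) {
            List<String> log = new ArrayList<>();
            SuspendAwareDispatcher dispatcher = new SuspendAwareDispatcher(queue);
            dispatcher.onTimeout(() -> log.add("10 second timer executed"));
            dispatcher.resume();
            System.out.println("queueWhileSuspended=" + queue + " -> " + log);
        }
    }
}
```

With queueWhileSuspended set to true the callback runs once resume() is called; with false it never runs, which matches what the reproducer shows under standalone-full-ha.xml.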

Everly, David W

Apr 23, 2024, 9:19:11 AM
to Paul Ferraro, WildFly
The provided cli gave me cli errors.  I tried this instead and it worked both at the cli level and for my reproducer, but I wonder if there are some side effects that I'm not aware of, given that other parts of the app remain clustered/ha:


embed-server --server-config=standalone-full-ha.xml
batch
/subsystem=ejb3/service=timer-service:undefine-attribute(name=default-transient-timer-management)
/subsystem=ejb3/service=timer-service:undefine-attribute(name=default-persistent-timer-management)
/subsystem=ejb3/service=timer-service:write-attribute(name=thread-pool-name, value=default)
/subsystem=ejb3/service=timer-service:write-attribute(name=default-data-store, value=default-file-store)
run-batch
stop-embedded-server


Paul Ferraro

Jun 19, 2024, 6:26:42 PM
to WildFly
Something I meant to ask when reading the initial post...

In the @PostConstruct of your startup singleton, you first cancel all timers (presumably, these are persistent timers created by previous executions) and then recreate a persistent single action timer.  If you are going to cancel on restart, why use a persistent timer rather than a non-persistent timer?

There is a lingering issue with timers and suspended mode which is captured/described here: https://issues.redhat.com/browse/WFLY-19361

Everly, David W

Jun 20, 2024, 8:27:32 AM
to Paul Ferraro, WildFly
I greatly simplified the reproducer from what was going on in the real-world case.  In the real-world use during runtime, a new timer could be started from many different paths/events, but there should only ever be zero or one timer running at any given time.  So I had a common startTimer method whose logic was to delete all timers and then start one.  I had copied this content into the reproducer.  And there may be a more elegant way of doing it.  The main issue I wanted to convey is that we start suspended and then resume after some time.  The issue for us using the newer Wildfly was that the timer firing during suspend mode is lost.
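
For readers who want to see the "zero or one timer at any time" rule in isolation, here is a minimal sketch that runs outside the container. It is an analogy under stated assumptions, not the real application code: SingleTimer and restart are hypothetical names, and the JDK's ScheduledExecutorService stands in for the EJB TimerService.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a common startTimer method; not EJB TimerService code.
final class SingleTimer implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> current;

    // Cancels any pending timer, then schedules exactly one new single-action timer.
    synchronized ScheduledFuture<?> restart(long delayMillis, Runnable action) {
        if (current != null) {
            current.cancel(false); // has no effect if the task already ran
        }
        current = scheduler.schedule(action, delayMillis, TimeUnit.MILLISECONDS);
        return current;
    }

    // True while a scheduled timer has neither fired nor been cancelled.
    synchronized boolean hasPendingTimer() {
        return current != null && !current.isDone();
    }

    @Override
    public void close() {
        scheduler.shutdownNow(); // stops the scheduler thread so the JVM can exit
    }

    public static void main(String[] args) {
        try (SingleTimer timer = new SingleTimer()) {
            ScheduledFuture<?> first = timer.restart(10_000, () -> {});
            ScheduledFuture<?> second = timer.restart(10_000, () -> {});
            System.out.println("first cancelled: " + first.isCancelled());
            System.out.println("second still pending: " + !second.isDone());
        }
    }
}
```

Every code path that wants a timer calls restart, so the cancel-then-create step lives in one place and callers never have to check whether a timer already exists.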
