WildFly 26.1.1 ActiveMQ - Messages stuck in the queue

Yashaswini Adiga

May 23, 2023, 4:22:06 AM
to WildFly
Hi All,

  We are experiencing a strange problem specific to one queue in our application. A few messages are stuck in the queue for a very long time. The other messages do get processed, but a few remain stuck.

The list-message command on the queue returns an empty result, but the console still shows a message count.
 
Any ideas on what could be the issue?

Miroslav Novak

May 24, 2023, 12:40:12 PM
to WildFly
Hi Yashaswini, 

there might be several reasons for it. It might be that some client did not acknowledge consumption of the messages, so they remain in an "in-flight" state, or the messages are part of an XA transaction which was not recovered, for example because other participants of the XA transaction are not reachable (for example, the DB is down) and the recovery manager cannot roll back TXs in the prepared state on Artemis. Could you provide more information about the scenario in which the queue is used and the nature of the messages being consumed, please?

Thanks,
Mirek

Yashaswini Adiga

Jun 1, 2023, 6:04:52 AM
to WildFly
Hi Mirek,

 Thank you for responding.
 We send XML messages converted to TextMessage and pushed into queue X.
 We have an MDB listening to queue X. We have explicitly set the acknowledge mode to Auto-acknowledge for this MDB.
 maxSession is set to 15.
 This MDB receives the message -> calls a SOAP API -> puts the response on another queue Y.
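 For reference, here is a simplified sketch of the MDB wiring (the class name, JNDI lookups, and the SOAP helper below are illustrative placeholders, not our actual code):

import javax.annotation.Resource;
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.inject.Inject;
import javax.jms.JMSContext;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.Queue;
import javax.jms.TextMessage;

@MessageDriven(name = "QueueXListener", activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationLookup", propertyValue = "java:/jms/queue/X"),
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "acknowledgeMode", propertyValue = "Auto-acknowledge"),
        @ActivationConfigProperty(propertyName = "maxSession", propertyValue = "15")
})
public class QueueXListener implements MessageListener {

    @Inject
    private JMSContext jmsContext;

    @Resource(lookup = "java:/jms/queue/Y")
    private Queue queueY;

    @Override
    public void onMessage(Message message) {
        try {
            String xmlRequest = ((TextMessage) message).getText();
            // Blocking SOAP call; if the remote system hangs, this onMessage invocation hangs with it.
            String xmlResponse = callSoapApi(xmlRequest);
            jmsContext.createProducer().send(queueY, xmlResponse);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private String callSoapApi(String xmlRequest) {
        // Placeholder for the real SOAP client invocation.
        return xmlRequest;
    }
}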
 
         1) We receive a load of messages (10,000) within 3 minutes on queue X.
         2) 9,500 get processed by the MDB.
         3) 1 request of the remaining 500 gets stuck at the SOAP call (because of an issue with the system handling SOAP calls). (Meaning this request has reached onMessage of our MDB.)
         4) However, the remaining 499 requests do not even reach onMessage of our MDB and stay in queue X.
         5) Eventually the request stuck at point 3 is re-triggered (onMessage again) as it reached the transaction timeout (set to 60 minutes using the configuration below).
              Along with this, all the other messages also get processed (all the requests reach onMessage of our MDB), but 8 are left behind. This count of 8 has been consistent in all our tests.
                <coordinator-environment statistics-enabled="true" default-timeout="900"/>
        
         6) The 8 that are left behind are stuck in the queue forever until, of course, we restart WildFly. Please note these never triggered the onMessage of the MDB.

We are trying to understand:
    1) Why the 499 requests did not trigger onMessage when 1 request was stuck at our SOAP call.
    2) Why 8 messages are orphaned when the other messages were processed.

Increasing maxSession to 200 (and the MDB strict pool size also to 200) has helped us get rid of the orphaned messages.
However, we want to understand the reason behind this so we can have an optimal configuration for our system.

Thanks a lot for your help.

Miroslav Novak

Jun 2, 2023, 7:03:57 AM
to WildFly
Hi Yashaswini,

thanks for the description. It helped me a lot to understand your use case. However, I'm afraid I'm shooting blind without a reproducer, so sorry if the advice below doesn't help.

First, you can try to increase the maxSession activation config property on the MDB to increase the number of connections to the Artemis broker. mdb-strict-pool-size increases the total number of MDB instances; however, there might not be enough "healthy" Artemis connections to supply them with messages, and the number of those connections is set by the maxSession property.
@MessageDriven(name = "mdb1",
        activationConfig = {
                @ActivationConfigProperty(propertyName = "maxSession", propertyValue = "60"),
                ...
})
public class MdbWithRemoteOutQueueWithOutQueueLookups implements MessageListener {...

Your case might be explained by a livelock which can happen in the special case of 1 CPU core. The thing is that the Artemis RA uses Netty for connections to the Artemis broker, and there might not be enough threads in the Netty thread pool. By default Netty uses 2 * CPU cores threads, and each Artemis connection can consume 2 threads. If you have just 1 CPU core, then it might happen that a single MDB session consumes all the Netty threads, which would result in all other MDB sessions being stuck, waiting for a thread to be freed so their messages can be consumed.
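A quick way to sanity-check how many cores the JVM actually sees (container CPU limits can make this lower than the physical count; just an illustrative snippet, since the default Netty pool sizing above is derived from this value):

public class CpuCoreCheck {
    public static void main(String[] args) {
        // Number of processors visible to this JVM.
        System.out.println("Available processors: " + Runtime.getRuntime().availableProcessors());
    }
}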

Another issue might be that there is another consumer on queue X (an MDB or a JMS client). In this case Artemis "allocates" part of the messages for this consumer and expects it to consume them. If the consumer is stuck, then it will not free them until its connection is closed.

Another issue might be the BLOCK address-full-policy set in the address-settings for the queue. If the MDB hits max-size-bytes for queue Y, then it might be blocked from sending more messages. However, I would expect lots of errors in the server log in this case.

Thanks,
Mirek

Yashaswini Adiga

Jun 4, 2023, 8:21:12 AM
to WildFly
 Thank you for your inputs, Mirek.

 We are using the default address-full-policy, which I believe is PAGE.
 Also, we don't have any other MDB/JMS client listening for messages on queue X.

 We have 8 CPU cores.
 Is there any way I can check how the Netty threads are being consumed, using a profiling tool or a configuration that can change the number of these threads in the pool?

 Does it have anything to do with the transaction batch size?


Miroslav Novak

Jun 6, 2023, 4:53:07 AM
to Yashaswini Adiga, WildFly
Hi Yashaswini, 

If there are 8 CPU cores, then I think the livelock should not be a problem. Do you think you could create a thread dump during the "transaction timeout" period? It might point us to the problematic code.
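The usual way is jstack <pid> (or jcmd <pid> Thread.print) against the WildFly process. If it is easier to grab it from inside the server, something like this illustrative helper, invoked from any code running in the WildFly JVM, would capture similar information:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {

    // Prints a dump of all live threads, including locked monitors and synchronizers.
    public static void dump() {
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threadMXBean.dumpAllThreads(true, true)) {
            // ThreadInfo.toString() truncates very deep stack traces; jstack/jcmd print the full frames.
            System.out.print(info);
        }
    }
}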

Thanks,
Mirek
