Transaction manager in-transit keeps increasing after a QueryHost Participant fails

528 views
Skip to first unread message

Mumuney Abdlquadri

unread,
Oct 1, 2017, 12:31:52 AM10/1/17
to jpos-...@googlegroups.com
Dear everyone,

I have a transaction manager in place. I have only 3 participants. MyCustom. QueryHost, SendRespone .

I noticed that if QueryHost could not reach the destination it fails but the transaction remains in-transit.

Transaction after transaction the number of in transit grows up to more than 5000.

Eventually the Transaction Manager stops processing transactions.

I planned to add custom AbortParticipant in between QueryHost and SendResponse that responds to the transaction that failed from QueryHost.

However it is dangerous for me to do this. It might be that a reversal from the initial client is more appropriate.

My question is how to effective remove pending transaction from in-transit if it fails from QueryHost.


<txnmgr name="card-control-tm" class="org.jpos.transaction.TransactionManager" logger="Q2">
<property name="queue" value="ZZZ"/>
<property name="sessions" value="2"/>
<property name="max-sessions" value="128"/>
<property name="debug" value="true"/>

<participant class="trx.mngr.Card"/>
<participant class="org.jpos.transaction.participant.QueryHost"/>
<participant class="org.jpos.transaction.participant.SendResponse"/>

</txnmgr>



<key count='1'>-cba-on-us-channel.ready</key>
        <key count='1'>-cba-remote-on-us-channel.ready</key>
        <key count='1'>$TAILLOCK.1215387873</key>
        <key count='1'>-cba-inbranch-pos-channel.ready</key>
         key-count: 4
            gcinfo: 0,0
      -remote-on-us-server: org.jpos.q2.iso.QServer
      tspace:org.jpos.transaction.TransactionManager@487158e1: org.jpos.space.TSpace
        <key count='1'>txnmgr.$CONTEXT.403</key>
        <key count='1'>txnmgr.$CONTEXT.407</key>
        <key count='1'>txnmgr.$STATE.409</key>
        <key count='1'>txnmgr.$STATE.408</key>
        <key count='1'>txnmgr.$STATE.407</key>
        <key count='1'>$TAIL</key>
        <key count='1'>txnmgr.$STATE.406</key>
        <key count='1'>txnmgr.$STATE.405</key>
        <key count='1'>txnmgr.$STATE.404</key>
        <key count='1'>txnmgr.$STATE.403</key>
        <key count='1'>$HEAD</key>
        <key count='1'>txnmgr.$CONTEXT.413</key>
        <key count='1'>txnmgr.$CONTEXT.414</key>
        <key count='1'>txnmgr.$STATE.419</key>
        <key count='1'>txnmgr.$STATE.418</key>
        <key count='1'>txnmgr.$STATE.417</key>
        <key count='1'>txnmgr.$STATE.416</key>
        <key count='1'>txnmgr.$STATE.415</key>
        <key count='1'>txnmgr.$STATE.414</key>
        <key count='1'>txnmgr.$STATE.413</key>
        <key count='1'>txnmgr.$STATE.412</key>
        <key count='1'>txnmgr.$STATE.411</key>
        <key count='1'>txnmgr.$STATE.410</key>
         key-count: 23



channel.-cba-remote-on-us-channel: org.jpos.iso.channel.PostChannel
      server.-fep-on-us-server: org.jpos.iso.ISOServer
        connected=1, rx=1, tx=1, last=1506723934736, idle=40943ms
          10.119.32.84:58993: rx=1, tx=1, last=1506723934736
      txnmgr: org.jpos.transaction.TransactionManager
        in-transit=17, head=420, tail=403, paused=0, outstanding=0, active-sessions=7/128, tps=0, peak=4, avg=0.04
        ng..trx.mngr.Card-prepare max=439, min=0, mean=1.6787 stddev=21.6390 0.5%=0 99.5%=11, 99.9%=439
        org.jpos.transaction.participant.QueryHost-prepare max=1, min=0, mean=0.2000 stddev=0.4000 0.5%=0 99.5%=1, 99.9%=1
        org.jpos.transaction.participant.QueryHost-prepare-for-abort max=1, min=0, mean=0.0052 stddev=0.0720 0.5%=0 99.5%=0, 99.9%=1
        org.jpos.transaction.participant.QueryHost-resume max=4511, min=34, mean=1012.9333 stddev=984.9684 0.5%=34 99.5%=4511, 99.9%=4511
        org.jpos.transaction.participant.QueryHost-snapshot max=1, min=0, mean=0.0052 stddev=0.0719 0.5%=0 99.5%=0, 99.9%=1
        org.jpos.transaction.participant.SendResponse-abort max=4, min=0, mean=0.2623 stddev=0.4740 0.5%=0 99.5%=1, 99.9%=4
        org.jpos.transaction.participant.SendResponse-commit max=1, min=0, mean=0.7667 stddev=0.4230 0.5%=0 99.5%=1, 99.9%=1
        org.jpos.transaction.participant.SendResponse-prepare max=1, min=0, mean=0.1000 stddev=0.3000 0.5%=0 99.5%=1, 99.9%=1
        org.jpos.transaction.participant.SendResponse-prepare-for-abort max=1, min=0, mean=0.0286 stddev=0.1666 0.5%=0 99.5%=1, 99.9%=1
      -cba-on-us-channel: org.jpos.q2.iso.ChannelAdaptor
        tx=384, rx=384, connects=1, last=1506723970864, idle=4816ms
      mux.-cba-on-us-mux: org.jpos.q2.iso.QMUX
        tx=384, rx=382, tx_expired=0, tx_pending=0, rx_expired=2, rx_pending=0, rx_unhandled=0, rx_forwarded=0, connected=true, last=1506723970864, idle=4816ms
        all max=10047, min=14, mean=175.2526 stddev=871.0654 0.5%=14 99.5%=6463, 99.9%=10047
        ok max=6463, min=14, mean=123.7304 stddev=503.0463 0.5%=14 99.5%=3807, 99.9%=6463
      -inbranch-pos-server: org.jpos.q2.iso.QServer
      --on-us-server: org.jpos.q2.iso.QServer
      logger.Q2: org.jpos.util.Logger
      logger.: org.jpos.util.Logger
      logger.Q2.buffered: org.jpos.util.BufferedLogListener
      -cba-remote-on-us-channel: org.jpos.q2.iso.ChannelAdaptor
        tx=362, rx=362, connects=1, last=1506723970883, idle=4798ms
      channel.-cba-on-us-channel: org.jpos.iso.channel.PostChannel


in-transit=230, head=633, tail=403, paused=0, outstanding=0, active-sessions=6/128, tps=0, peak=4, avg=0.02

Mumuney Abdlquadri

unread,
Oct 1, 2017, 7:07:24 AM10/1/17
to jpos-...@googlegroups.com



Or a better questions is how to remove 0200 messages from the in-transit queue when an 0420 comes in?

chhil

unread,
Oct 2, 2017, 8:26:49 AM10/2/17
to jpos-...@googlegroups.com
the framework takes care if that.
In your query host what's your timeout period for the mux request? 
You need to increase your sessions, 2 is too less,  you have practically made the requests sequential. Make the sessions  100. 

-chhil 


--
--
jPOS is licensed under AGPL - free for community usage for your open-source project. Licenses are also available for commercial usage. Please support jPOS, contact: sa...@jpos.org
---
You received this message because you are subscribed to the Google Groups "jPOS Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jpos-users+unsubscribe@googlegroups.com.
To post to this group, send email to jpos-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jpos-users/CAGDHizn%2BytEyoJcTAZ_SObRbokuyS3Qd-919kTSN6v2AuXF_xw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Mumuney Abdlquadri

unread,
Oct 2, 2017, 9:35:43 AM10/2/17
to jpos-...@googlegroups.com
Hi Chhil,

Thanks for taking time to attend to this.

I am using the provided
https://github.com/jpos/jPOS/blob/master/jpos/src/main/java/org/jpos/transaction/participant/QueryHost.java

which has a timeout from code


private static final long DEFAULT_TIMEOUT = 30000L;


long t = Math.max(timeout - chronometer.elapsed(), 1000L); // give at least a second to catch a response

It also uses continuations which should not make it sequential.


"the framework takes care if that."

Did you mean if 0420 is recieved JPos removes the 0200s in-transit?

Regards.

chhil

unread,
Oct 2, 2017, 10:55:11 AM10/2/17
to jpos-...@googlegroups.com
The in transit or states are handled by the framework.

I don't know enough about continuations but sessions is the number if threads that will handle your requests in parallel. You are starving the system with 2 sessions. 

You will need to handle the 0200 0420 logic. The thing is if you send a 420 with a 0200 the host won't know what to reverse. 

-chhil

Mumuney Abdlquadri

unread,
Oct 2, 2017, 11:38:10 AM10/2/17
to jpos-...@googlegroups.com
Ok, Chhil,

I will increase the number of sessions and tighten the QueryHost timeout.
Regards.

Alejandro Revilla

unread,
Oct 2, 2017, 5:44:34 PM10/2/17
to jPOS Users
Mumuney,

May I ask you two things?

1) The jPOS version you're using (q2 --version will tell you, including build/branc)

2) The full context trace for one of these transactions (I'm talking about this: http://jpos.org/blog/2012/02/transactionmanager-debug-and-trace-events/)

We recently make a change to the TransactionManager related to the inTransit counter (observed by Barzilai - thank you!) and I want to make sure that's not the cause of your problem.

Thank you in advance.

Mumuney Abdlquadri

unread,
Oct 3, 2017, 12:00:48 AM10/3/17
to jpos-...@googlegroups.com
Hi Alejandro,

1. jPOS 2.1.0 master/6145612 (2017-05-27 12:39:12 ART)

2.

<log realm="org.jpos.transaction.TransactionManager" at="2017-09-29T20:27:23.516" lifespan="1307ms">
  <commit>
    txnmgr-1:idle:1
    <context>
      TIMESTAMP: Fri Sep 29 20:27:22 WAT 2017
      PROFILER:
       <profiler>
         end [1308.0/1308.0]
       </profiler>
     
      SOURCE: org.jpos.iso.channel.PostChannel@6c4dfd55
      REQUEST:
       <isomsg direction="outgoing">
         <!-- org.jpos.iso.packager.GenericPackager[cfg/post.xml] -->
         <field id="0" value="0200"/>
         <field id="2" value="removed pan"/>
         <field id="3" value="012000"/>
         <field id="4" value="000002000000"/>
         <field id="7" value="0929192621"/>
         <field id="11" value="000506"/>
         <field id="12" value="202621"/>
         <field id="13" value="0929"/>
         <field id="17" value="0929"/>
         <field id="28" value="D00000000"/>
         <field id="30" value="C00000000"/>
         <field id="32" value="639139"/>
         <field id="37" value="000321719160"/>
         <field id="41" value="Removed"/>
         <field id="42" value="000000000000904"/>
         <field id="43" value="removed location"/>
         <field id="49" value="566"/>
         <field id="60" value="2017201720172017"/>
         <field id="102" value="removed account number"/>
       </isomsg>
     
      DESTINATION: dest-on-us-mux
      RESULT:
       <result/>
     
      :paused_transaction:
       id: 1
     
      RESPONSE:
       <isomsg direction="outgoing">
         <!-- org.jpos.iso.packager.GenericPackager[cfg/post.xml] -->
         <field id="0" value="0210"/>
         <field id="2" value="removed pan"/>
         <field id="3" value="012000"/>
         <field id="4" value="000002000000"/>
         <field id="7" value="0929192621"/>
         <field id="11" value="000506"/>
         <field id="12" value="202621"/>
         <field id="13" value="0929"/>
         <field id="17" value="0929"/>
         <field id="28" value="D00000000"/>
         <field id="30" value="C00000000"/>
         <field id="32" value="639139"/>
         <field id="37" value="000321719160"/>
         <field id="38" value="545271"/>
         <field id="39" value="00"/>
         <field id="41" value="Removed"/>
         <field id="42" value="000000000000904"/>
         <field id="43" value="removed location"/>
         <field id="49" value="566"/>
         <field id="54" value="2002566C0000027212172004566C0000000000002001566C000002721217"/>
         <field id="60" value="2017201720172017"/>
         <field id="102" value="removed account number"/>
       </isomsg>
     
    </context>
            prepare: dest.trx.mngr.Card PREPARED READONLY NO_JOIN
            prepare: org.jpos.transaction.participant.QueryHost PREPARED PAUSE READONLY NO_JOIN
            prepare: org.jpos.transaction.participant.SendResponse PREPARED READONLY
             commit: org.jpos.transaction.participant.SendResponse
     in-transit=0, head=2, tail=2, paused=0, outstanding=0, active-sessions=2/128, tps=0, peak=0, avg=0.00, elapsed=1306ms
    <profiler>
      prepare: dest.trx.mngr.Card [438.8/438.8]
      prepare: org.jpos.transaction.participant.QueryHost [0.8/439.7]
      resume [865.1/1304.8]
      prepare: org.jpos.transaction.participant.SendResponse [0.2/1305.1]
       commit: org.jpos.transaction.participant.SendResponse [1.2/1306.4]
      end [1.2/1307.6]
    </profiler>
  </commit>
</log>


This is for a completed transaction.

I think I missed this info earlier. As in-transit grows the number of sessions increases.

<log realm="org.jpos.transaction.TransactionManager" at="2017-09-30T05:26:24.113" lifespan="870ms">
  <commit>
    txnmgr-4:idle:564
    <context>
      TIMESTAMP: Sat Sep 30 05:26:23 WAT 2017
      PROFILER:
       <profiler>
         end [870.2/870.2]
       </profiler>
     
      SOURCE: org.jpos.iso.channel.PostChannel@6dba7e45
      REQUEST:
       <isomsg direction="outgoing">
         <!-- org.jpos.iso.packager.GenericPackager[cfg/post.xml] -->
         <field id="0" value="0200"/>
         <field id="2" value="remove pan"/>
         <field id="3" value="012000"/>
         <field id="4" value="000000500000"/>
         <field id="7" value="0930042521"/>
         <field id="11" value="000508"/>
         <field id="12" value="052521"/>
         <field id="13" value="0930"/>
         <field id="17" value="0930"/>
         <field id="28" value="D00000000"/>
         <field id="30" value="C00000000"/>
         <field id="32" value="639139"/>
         <field id="37" value="000321788236"/>
         <field id="41" value="removed"/>
         <field id="42" value="000000000000039"/>
         <field id="43" value="removed"/>
         <field id="49" value="566"/>
         <field id="60" value="2017201720172017"/>
         <field id="102" value="removed"/>
       </isomsg>
     
      DESTINATION: dest-on-us-mux
      RESULT:
       <result/>
     
      :paused_transaction:
       id: 564
     
      RESPONSE:
       <isomsg direction="outgoing">
         <!-- org.jpos.iso.packager.GenericPackager[cfg/post.xml] -->
         <field id="0" value="0210"/>
         <field id="2" value="remove pan"/>
         <field id="3" value="012000"/>
         <field id="4" value="000000500000"/>
         <field id="7" value="0930042521"/>
         <field id="11" value="000508"/>
         <field id="12" value="052521"/>
         <field id="13" value="0930"/>
         <field id="17" value="0930"/>
         <field id="28" value="D00000000"/>
         <field id="30" value="C00000000"/>
         <field id="32" value="639139"/>
         <field id="37" value="000321788236"/>
         <field id="38" value="830085"/>
         <field id="39" value="00"/>
         <field id="41" value="removed"/>
         <field id="42" value="000000000000039"/>
         <field id="43" value="removed"/>
         <field id="49" value="566"/>
         <field id="54" value="2002566C0000046464322004566C0000000000002001566C000004646432"/>
         <field id="60" value="2017201720172017"/>
         <field id="102" value="removed"/>
       </isomsg>
     
    </context>
            prepare: dest.trx.mngr.Card PREPARED READONLY NO_JOIN
            prepare: org.jpos.transaction.participant.QueryHost PREPARED PAUSE READONLY NO_JOIN
            prepare: org.jpos.transaction.participant.SendResponse PREPARED READONLY
             commit: org.jpos.transaction.participant.SendResponse
     in-transit=162, head=565, tail=403, paused=0, outstanding=0, active-sessions=6/128, tps=0, peak=4, avg=0.02, elapsed=869ms
    <profiler>
      prepare: dest.trx.mngr.Card [137.6/137.6]
      prepare: org.jpos.transaction.participant.QueryHost [0.0/137.7]
      resume [731.2/868.9]
      prepare: org.jpos.transaction.participant.SendResponse [0.1/869.1]
       commit: org.jpos.transaction.participant.SendResponse [0.5/869.7]
      end [0.2/869.9]
    </profiler>
  </commit>
</log>

But a point the transaction manager just stops processing transaction.


--
--
jPOS is licensed under AGPL - free for community usage for your open-source project. Licenses are also available for commercial usage. Please support jPOS, contact: sa...@jpos.org
---
You received this message because you are subscribed to the Google Groups "jPOS Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jpos-users+unsubscribe@googlegroups.com.
To post to this group, send email to jpos-...@googlegroups.com.

Mumuney Abdlquadri

unread,
Oct 3, 2017, 1:32:27 AM10/3/17
to jpos-...@googlegroups.com
Another information:

This issue starts when QueryHost can't reach the destination. I get this in log:


<log realm="org.jpos.q2.iso.ChannelAdaptor" at="2017-09-29T23:11:57.185">
  <warn>
    channel-receiver-inbranch-channel-receive
    Read timeout / EOF - reconnecting
  </warn>
</log>


The transaction is not responded to, my client sends reversal which I forward to destination, because it is not reachable we don't get a response to that either.

And the in-transit grows.

I am using continuations=true, which I believe will call the expired methods at timeout. Not sure how this is called if the destination does not respond

public void expired (Object handBack) {
Context ctx = (Context) handBack;
ctx.resume();
}
Maybe the expired method does not have a way to fail the transaction if the destination does not respond?

Contrast to when continuations=false

if (continuations) {
mux.request(m, t, this, ctx);
return PREPARED | READONLY | PAUSE | NO_JOIN;
} else {
ISOMsg resp = mux.request(m, t);
if (resp != null) {
ctx.put(responseName, resp);
return PREPARED | READONLY | NO_JOIN;
} else {
return result.fail(CMF.HOST_UNREACHABLE, Caller.info(), "'%s' does not respond", muxName).FAIL();
}
}


Does it matter that the 0200 and 0420 are received on different servers?


Alejandro Revilla

unread,
Oct 3, 2017, 11:51:45 PM10/3/17
to jPOS Users
Thank you for the confirmation Mumunay.

I believe your problem would be fixed with the latest 2.1.1-SNAPSHOT, in particular, if you take a look at the commit history for the TransactionManager, you'll see a series of commits addressing this issue in the last couple months:


Can you give it a try? Just bump up your version to 2.1.1-SNAPSHOT.




Mumuney Abdlquadri

unread,
Oct 4, 2017, 6:55:56 AM10/4/17
to jpos-...@googlegroups.com
Hi Alejandro, That's lovely. I will test it and report back.

Inline image 1

Reply all
Reply to author
Forward
0 new messages