Multiple transaction managers in the same jPOS instance


shadei...@gmail.com

Jul 22, 2024, 8:35:43 AM
to jPOS Users
Hi team,

I have set up two transaction manager XMLs in my single jPOS instance, changing the space and queue names from the defaults for one of them. I have multiple servers and channels, and have configured some servers to use one transaction manager while the other servers use the other.
This change arose because I want the two different cards I'm working with to have separate queues. One of the cards needs to be processed within 5 seconds and the other can take as long as 25 seconds.
From the logs, I noticed more transaction expiries from the card with the shorter processing window than from the other, and I suspect the issue may be that both cards use the same queue and space. To address this, I created another txnmgr file and changed the queue and space values to something different.

Is this approach acceptable, and do you think it can reduce the transaction expiries I'm getting on the card that requires a short processing time? I can see both queues started in the logs, but I have found myself in a situation in which the new transaction manager doesn't seem to send the request successfully to the channel and multiplexer. While I have told the user to use an application like Wireshark to check this out, I want to confirm that this approach is acceptable and could help.

Regards,
Shade

Chandrasekhar Rout

Jul 22, 2024, 9:18:05 AM
to jpos-...@googlegroups.com

If you have two different cards, you have to configure two mux and two channel-adaptor XML files.


--
--
jPOS is licensed under AGPL - free for community usage for your open-source project. Licenses are also available for commercial usage. Please support jPOS, contact: sa...@jpos.org
---
You received this message because you are subscribed to the Google Groups "jPOS Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jpos-users+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jpos-users/307c41d7-180f-4985-82e3-3ae7a9010472n%40googlegroups.com.

chhil

Jul 22, 2024, 10:03:21 AM
to jpos-...@googlegroups.com

Your log's profiler will tell you which participants are taking how long. If the host you are talking to is taking longer to respond, then you really cannot do much.
If your participants internally are taking long and you have crossed the 5 seconds before you send the message out, then you need to fine-tune your participants to take less time.

As an example, we take in transactions and, based on the BIN, route each one to its endpoint using the same queue; timeouts are configured per endpoint in that endpoint's mux/channel config. So our txn mgrs are source-driven, i.e. the source can send transactions that go to different endpoints using a single queue.

You can also have transaction managers that are endpoint-driven: based on the BIN, put the message on the right queue, which sends the request to that endpoint.

Either way is fine.
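The source-driven idea (one queue in, route by BIN) can be sketched in plain Java. The BIN prefixes and queue names below are invented for illustration; in a real jPOS participant you would read the PAN from the ISOMsg in the Context.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical BIN-to-queue router: picks the target queue by the
// longest matching PAN prefix. Prefixes and queue names are examples only.
public class BinRouter {
    private final TreeMap<String, String> table = new TreeMap<>();

    public BinRouter(Map<String, String> binToQueue) {
        table.putAll(binToQueue);
    }

    /** Returns the queue for the longest BIN prefix of pan, or a default. */
    public String queueFor(String pan, String defaultQueue) {
        for (int len = Math.min(8, pan.length()); len > 0; len--) {
            String queue = table.get(pan.substring(0, len));
            if (queue != null)
                return queue;
        }
        return defaultQueue;
    }

    public static void main(String[] args) {
        BinRouter r = new BinRouter(Map.of(
            "506099", "txnmgr-verve",   // example BIN prefix
            "4",      "txnmgr-visa"     // example BIN prefix
        ));
        System.out.println(r.queueFor("5060990000000001", "txnmgr-default")); // txnmgr-verve
        System.out.println(r.queueFor("4111111111111111", "txnmgr-default")); // txnmgr-visa
        System.out.println(r.queueFor("6011000000000000", "txnmgr-default")); // txnmgr-default
    }
}
```

Whether this lookup runs in a request-listener (endpoint-driven, picking the queue before the txnmgr) or inside a participant (source-driven, picking the mux) is the design choice chhil describes; either placement works.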

If you are seeing the new txn mgr not send out messages via the configured mux, then you need to look at your config or debug your participants. Your jPOS log file will pretty much tell you where the problem is.
You may want to share your log files and config/deploy files if you cannot figure it out.

-chhil

shadei...@gmail.com

Jul 23, 2024, 3:48:22 AM
to jPOS Users
I have multiple servers, channels and multiplexers as required.
When I used a single transaction manager and queue, they all worked, but the timeouts in the logs for the requests that required a shorter time were far more frequent than for the other transaction type.

I have the queues configured in the servers and the transaction manager. Sample snippets are shown below:

Server.xml
<?xml version="1.0" encoding="UTF-8"?>
<server class="org.jpos.q2.iso.QServer" logger="Q2"
    name="VerveCard_Server">

  <attr name="port" type="java.lang.Integer">56215</attr>
  <attr name="maxSessions" type="java.lang.Integer">200</attr>
  <attr name="minSessions" type="java.lang.Integer">10</attr>

  <channel name="VerveChannel" class="org.jpos.iso.channel.NACChannel"
      logger="Q2" packager="org.jpos.iso.packager.GenericPackager">
    <property name="packager-config" value="cfg/iso93_fin_fepascii.xml" />
  </channel>

  <request-listener class="com.iso8583.jpos.txnmanager.ProcessTxnManager"
      logger="Q2">
    <property name="space" value="transient:verve" />
    <property name="queue" value="txnmgrverve" />
    <property name="timeout" value="27000" />
    <property name="Multiplexer" value="mux.jpos-verve-mux" />
    <property name="ChannelTimeout" value="25" />
  </request-listener>

</server>

Channel XML
<channel-adaptor name="jpos-verve" class="org.jpos.q2.iso.ChannelAdaptor"
    logger="Q2">
  <channel class="org.jpos.iso.channel.NACChannel" logger="Q2"
      packager="org.jpos.iso.packager.GenericPackager">
    <property name="packager-config" value="cfg/iso93_fin_fepascii.xml" />
    <property name="host" value="X.X.X.X" />
    <property name="port" value="56227" />
    <property name="connection-timeout" value="15000" /> <!-- 15 seconds -->
    <property name="timeout" value="300000" />
  </channel>
  <in>jpos-verve-send</in>
  <out>jpos-verve-receive</out>
  <reconnect-delay>10000</reconnect-delay>
  <space>tspace:verve</space>
</channel-adaptor>

Multiplexer XML
<?xml version="1.0" ?>

<mux class="org.jpos.q2.iso.QMUX" logger="Q2" name="jpos-verve-mux">
<in>jpos-verve-receive</in>
<out>jpos-verve-send</out>
<key>11, 12, 37, 41, 42 </key>
<unhandled>jpos-verve-mux.unhandled</unhandled>
</mux>

Part of txnmgr XML
<txnmgr name="txnmgrverve"
    class="org.jpos.transaction.TransactionManager">

  <property name="space" value="transient:verve" />
  <property name="queue" value="txnmgrverve" />
  <property name="max-sessions" value="200" />
  <property name="sessions" value="30" />
  <property name="debug" value="true" />

  <participant class="com.iso8583.jpos.txnparticipant.SwitchParticipant">
    <property name="1200" value="Financial" />
    <property name="1220" value="FinancialOthers" />
    .
    .
    .

I can see the threads started for txnmgrverve, and it timed out in the participant that ought to send it to the channel via the multiplexer.
Can you see anything in this configuration that is wrong?

The other, working one uses the default settings, meaning I have the space as "transient:default" and the queue as "default".


Regards,
Shade

murtuza chhil

Jul 23, 2024, 4:16:38 AM
to jPOS Users

Channel:
<space>tspace:verve</space>

Txnmgr:
<property name="space" value="transient:verve" />

tspace and transient are the same scheme, so both components end up using the same underlying space; there is a race condition over who picks the entry from the space, and that is most likely your issue.



If you have the jpos code you can try stepping into the code for 

SpaceFactory.getSpace("transient:xyz");

SpaceFactory.getSpace("tspace:xyz");


you will hit

if (TSPACE.equals (scheme) || TRANSIENT.equals (scheme)) {
    sp = new TSpace();
}


https://github.com/jpos/jPOS/blob/master/jpos/src/main/java/org/jpos/space/SpaceFactory.java#L129-L146


This is also how the ChannelAdaptor creates the space defined by the property.

sp = grabSpace (persist.getChild ("space"));


protected Space grabSpace (Element e) {
    return SpaceFactory.getSpace (e != null ? e.getText() : "");
}


-chhil

Andrés Alcarraz

Jul 23, 2024, 7:23:38 AM
to jPOS Users

In addition to what Chhil said, you don't need two spaces; just having different queues is enough for the way you are trying to solve your problem.

Furthermore, I don't see how using different queues will improve the time to process the more critical service, unless you know the culprit is the other transaction manager being under heavy load, frequently having many transactions in its queue, while you expect the other service to have far less load.

You would need to verify that that is the issue before making your system more complex. Even if it is the issue, using the default configuration for the QueryHost participant, which uses continuations, would be enough. And if it's not, then the server isn't enough and you would need to split your jPOS deployment into two instances running on different servers. But this is unlikely in a test environment unless you are testing heavy load, or your production service is under such a very heavy load (which seems unlikely), or the server is too modest for that load.

The issue may also be that the destination server is not able to process the load, but we can't say anything about that with the information you have provided so far.

So, my advice: before trying to move forward with this configuration, really assess why your transactions are timing out.

----
Sent from my cellphone, I apologize for the brevity.

Andrés Alcarraz.

Alejandro Revilla

Jul 23, 2024, 7:25:28 AM
to jpos-...@googlegroups.com
Chhil is right. I’d add that unless you have a good reason to use different spaces, you can remove the space configuration and let the system choose the default one, which is usually the fastest one. Just change the queue name for different TMs.
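A minimal sketch of that suggestion, with file and queue names invented for illustration: two transaction managers that omit the space property entirely (so both use the default space) and differ only in queue name. Each QServer's request-listener would then point at the matching queue.

```xml
<!-- deploy/05_txnmgr_fast.xml (illustrative name): the 5-second card -->
<txnmgr name="txnmgr-fast" class="org.jpos.transaction.TransactionManager">
  <!-- no 'space' property: the default space is used -->
  <property name="queue" value="txnmgr-fast" />
  <property name="sessions" value="30" />
  <!-- participants go here -->
</txnmgr>

<!-- deploy/05_txnmgr_slow.xml (illustrative name): the 25-second card -->
<txnmgr name="txnmgr-slow" class="org.jpos.transaction.TransactionManager">
  <property name="queue" value="txnmgr-slow" />
  <property name="sessions" value="30" />
  <!-- participants go here -->
</txnmgr>
```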



shadei...@gmail.com

Jul 23, 2024, 7:46:05 AM
to jPOS Users
Thanks all for your responses.

@Andrés Alcarraz
I'm separating the queues because one of the card types receives a lot more transactions and has a longer timeout.
For instance, that transaction type could have as many as 500,000 transactions, whereas the other one would have barely 200,000. That's why I feel separating them would help.
I have looked repeatedly at the system-monitor logs, and my memory usage at peak is just above 1GB but usually less. I start up the service with 2GB min and max memory, and I haven't really seen it go beyond that. The server has a lot more memory, so I can increase it if you feel that could help.

@Alejandro Revilla
I will remove the space configuration and allow it to use the default one as you suggested for both transaction managers and servers and see how that goes.

Regards,
Shade

Andrés Alcarraz

Jul 23, 2024, 8:12:07 AM
to jpos-...@googlegroups.com
I don't see how this could help, unless you determined that load is the issue and memory the lacking resource, which, as you mentioned, it is not.

What I'm saying is that I don't see how separating into different queues could solve the issue with the shorter-timeout transactions, unless you know the other transactions are the cause of the timeouts. Did you determine that? If yes, how?

I feel like you are trying to solve a different problem than the one you actually have, or better said, a different cause of the problem. However, I can't tell for sure since I don't have enough data; it's just a feeling so far.
Andrés Alcarraz

shadei...@gmail.com

Jul 24, 2024, 10:10:07 AM
to jPOS Users
@Andrés Alcarraz

Thanks for your response. It's more of a hunch than actual knowledge. I haven't had to use two transaction managers before, but I can't see any other reason for the expirations or timeouts experienced.
From the logs I have looked at, I have transactions staying on the queue for quite a while, but those are really fine because they've been configured to take up to 25 seconds. My thought process is that while the system is processing such a request, transactions that expire within 5 seconds come into that same queue, and I'm concerned that those new transactions end up timing out before they can be processed.
So, having two different transaction managers and queues is more about assessing whether the queue could be the issue or not. I haven't found anything else that could be the problem.

Regards,
Yetunde

Andrés Alcarraz

Jul 24, 2024, 10:59:27 AM
to jpos-...@googlegroups.com

If the queue is always full, and increasing the number of sessions doesn't help, then the problem is not that they are in the same queue, but that the hardware, or the software's use of it, isn't enough to process the load. In that case, splitting the transactions into two queues would not help.

That's what I tried to explain in a previous response. The transaction manager processes transactions in parallel, and by using continuations, as is the default in the QueryHost participant, it doesn't even block the queue.

I'm not saying you should not use different transaction managers; you should if the business logic is different. I'm just not sure you are doing it for the right reason.

So far, I don't understand what the exact issue you are facing is. Are the transactions of the fast track being delayed? If you have a hunch that whatever problem you are facing is due to something, then before changing the implementation, you had better find a way to test your hypothesis. I cannot help you do that because you haven't shared enough of your configuration to understand where the blockage could be happening.

If you have the same logic in the two transaction managers with the same participants, and one of those participants is the cause of the delay, then having them in a different transaction manager with a different queue will not solve your issue.

Kind regards.

Andrés Alcarraz

Mark Salter

Jul 25, 2024, 1:34:45 AM
to jpos-...@googlegroups.com

It reads like you are taking from the input queue with a single thread.
Perhaps share your QBean configuration, or describe it accurately, so we might see?

-- 
Mark




shadei...@gmail.com

Aug 3, 2024, 7:07:58 AM
to jPOS Users
Hello Andrés Alcarraz,

I thought about your words on the hardware and considered that it could actually be the issue, as the application had been stable for quite some time before we suddenly had the repeated queues on Postilion.
I had the client provide the server configuration and transaction counts, and I want to share them here to get your thoughts on whether we need to increase anything.

The table below shows the quarterly count of their transactions for the 2 card types:

Quarter      Transaction Volume
Q2 2023      69,992,579
Q3 2023      61,539,061
Q4 2023      84,804,158
Q1 2024      85,942,583
Q2 2024      71,187,508


The application was deployed in 2021 and we only started having issues in late 2023, but back then restarting the services was sufficient to clear the queue and get things working again.
Recently, however, the queue builds up within about an hour of restarting the service, so the hardware may really be the issue.

This is the server's hardware configuration, though I actually restricted the service to 2GB of memory (which now seems a waste of the available resources). Looking at the System Monitor, it doesn't seem to use all of that at any time, but I'm thinking of increasing it and allowing the service to use more of the server's memory.
System Model - VMware Virtual Platform
Processor - Intel(R) Xeon(R) Platinum 8276M CPU @2.20GHz
Installed Physical Memory - 32GB
Available Physical Memory - 26.1GB
Page File Space - 4.75GB

I will appreciate your thought on this.

Regards,
Shade

chhil

Aug 3, 2024, 8:24:32 AM
to jpos-...@googlegroups.com

What's the TPS of transactions coming in from Postilion, and the TPS of your system handling those requests?
If you see the Postilion TPS way higher than your handling rate, you know your system is not processing fast enough.

You should put a timestamp in your first participant (put it in the context) and another timestamp before the transaction leaves your system; the difference tells you how long you are taking.

Your log should show you the head and tail of your queue; the head should be as close to the tail as possible, which indicates you are processing transactions at the rate they are coming in.

2GB does appear to be very low.
How many sessions are configured for your txn mgr?
Do you have sufficient connections to your outbound entity?

Is the transaction build-up due to timeouts, with Postilion sending you SAF'd reversal advices or repeats?

-chhil


shadei...@gmail.com

Aug 3, 2024, 9:41:56 AM
to jPOS Users
Hello Chhil,

Thanks so much for your response. 
From the logs, my TPS is usually around 20 to 30, though I noticed a peak of 1100. I have 800 sessions, with a maximum of 1,200, configured in the transaction manager.
The in-transit queue varies intermittently and is really good when all is well; but suddenly I could have a transaction that seems to take a while, yet still within the expected threshold, and the difference between the head and tail would suddenly increase astronomically. At times I pick out the transactions that seem to have been in the tail longer than expected, but when I look at the logs for those transactions, they still fall within the expected time interval from PREPARE to COMMIT. Could it be that those transactions resided in the queue for quite a while before they were picked up for processing, so what I'm looking at is just the processing time, which then makes everything look okay? For instance, looking at the log below:

<log realm="txnmgr1" at="2024-06-18T17:22:21.011750600" lifespan="228ms">
  <commit>
    txnmgr1-38:idle:266461
    <context>
      c.i.j.u.Constants.REQUEST:
       <isomsg direction="outgoing">
         <!-- org.jpos.iso.packager.GenericPackager[cfg/iso93_fin_fepascii.xml] -->
         <field id="0" value="1200"/>
         <field id="2" value="XXXXXX_________2498"/>
         <field id="3" value="501020"/>
         <field id="4" value="0000000000090000"/>
         <field id="11" value="000000172139"/>
         <field id="12" value="202406____2140"/>
         <field id="15" value="20240618"/>
         <field id="17" value="20240618"/>
         <field id="24" value="200"/>
         <field id="32" value="627629"/>
         <field id="33" value="111111"/>
         <field id="37" value="172766172139"/>
         <field id="41" value="000000002ZBJ2766"/>
         <field id="42" value="2ZB001125190343"/>
         <field id="43" value="WT|RONIS GLOBAL          BARIGA       NG"/>
         <field id="46" value="70NGNC000000000000000000000001D0000000000000000NGN"/>
         <field id="49" value="566"/>
         <field id="59" value="2529807174|0200"/>
         <field id="63" value="00,627629"/>
         <field id="93" value="12345678901"/>
         <field id="94" value="23456712345"/>
         <field id="102" value="XXXXXXXX"/>
         <field id="103" value="XXXXXXXX"/>
         <field id="123" value="FEP"/>
         <field id="124" value="POS"/>
       </isomsg>
     
      c.i.j.u.Constants.RESPONSE:
       <isomsg direction="outgoing">
         <!-- org.jpos.iso.packager.GenericPackager[cfg/iso93_fin_fepascii.xml] -->
         <field id="0" value="1210"/>
         <field id="2" value="XXXXXX_________2498"/>
         <field id="3" value="501020"/>
         <field id="4" value="0000000000090000"/>
         <field id="11" value="000000172139"/>
         <field id="12" value="202406____2140"/>
         <field id="15" value="20240618"/>
         <field id="17" value="20240618"/>
         <field id="32" value="627629"/>
         <field id="33" value="111111"/>
         <field id="37" value="172766172139"/>
         <field id="38" value="UNI000"/>
         <field id="39" value="116"/>
         <field id="41" value="000000002ZBJ2766"/>
         <field id="42" value="2ZB001125190343"/>
         <field id="46" value="70NGND000000000000000000000001D0000000000000000NGN"/>
         <field id="48" value="+0000000000009087+0000000000000687+0000000000000000+0000000000000000+0000000000000687NGN              +0003140500851786+0003140500851786+0000000000000000+0000000000000000+0003140500851786NGN              "/>
         <field id="49" value="NGN"/>
         <field id="59" value="2529807174|0200"/>
         <field id="94" value="23456712345"/>
         <field id="123" value="FEP"/>
         <field id="124" value="POS"/>
         <field id="126" value="XXXXXXXXXX"/>
       </isomsg>
     
    </context>
            prepare: c.i.j.t.SwitchParticipant PREPARED READONLY NO_JOIN
           selector: 'Financial'
            prepare: c.i.j.t.MessageReadParticipant PREPARED
            prepare: c.i.j.t.GetTransactionDetailsFromDBParticipant PREPARED
            prepare: c.i.j.t.GetInformationfromPANDataCode PREPARED
            prepare: c.i.j.t.GetInformationfromPOSDataCodeParticipant PREPARED
            prepare: c.i.j.t.GetTransactionTypeCode:getTranType PREPARED
            prepare: c.i.j.t.GetTransactionAccount:getDebitCredit PREPARED
            prepare: c.i.j.t.GetCardBrandParticipant PREPARED
            prepare: c.i.j.t.ConvertCurrencyParticipant PREPARED
            prepare: c.i.j.t.BuildRateMessageParticipant2 PREPARED
            prepare: c.i.j.t.SendToRateParticipant PREPARED
            prepare: c.i.j.t.ReadRateResponseParticipant PREPARED
            prepare: c.i.j.t.CheckAmountLimitIfNoresponseParticipant PREPARED
            prepare: c.i.j.t.SendToConnect24Participant PREPARED
            prepare: c.i.j.t.AbortImplementationParticipant PREPARED
            prepare: o.j.t.ProtectDebugInfo:protect-debug PREPARED READONLY
            prepare: o.j.t.Debug:debug PREPARED READONLY
             commit: c.i.j.t.MessageReadParticipant
             commit: c.i.j.t.GetTransactionDetailsFromDBParticipant
             commit: c.i.j.t.GetInformationfromPANDataCode
             commit: c.i.j.t.GetInformationfromPOSDataCodeParticipant
             commit: c.i.j.t.GetTransactionTypeCode:getTranType
             commit: c.i.j.t.GetTransactionAccount:getDebitCredit
             commit: c.i.j.t.GetCardBrandParticipant
             commit: c.i.j.t.ConvertCurrencyParticipant
             commit: c.i.j.t. BuildRateMessageParticipant2 
             commit: c.i.j.t. SendToRateParticipant 
             commit: c.i.j.t. ReadRateResponseParticipant 
             commit: c.i.j.t.CheckAmountLimitIfNoresponseParticipant
             commit: c.i.j.t.SendToConnect24Participant
             commit: c.i.j.t.AbortImplementationParticipant
             commit: o.j.t.ProtectDebugInfo:protect-debug
             commit: o.j.t.Debug:debug
     in-transit=25/69, head=266467, tail=266398, paused=0, outstanding=0, active-sessions=800/1200, tps=13, peak=1110, avg=15.64, elapsed=226ms
    <profiler>
      prepare: c.i.j.t.SwitchParticipant [0.0/0.0]
      prepare: c.i.j.t.MessageReadParticipant [4.4/4.4]
      prepare: c.i.j.t.GetTransactionDetailsFromDBParticipant [17.8/22.3]
      prepare: c.i.j.t.GetInformationfromPANDataCode [0.0/22.4]
      prepare: c.i.j.t.GetInformationfromPOSDataCodeParticipant [0.0/22.5]
      prepare: c.i.j.t.GetTransactionTypeCode:getTranType [0.0/22.5]
      prepare: c.i.j.t.GetTransactionAccount:getDebitCredit [0.0/22.5]
      prepare: c.i.j.t.GetCardBrandParticipant [0.0/22.5]
      prepare: c.i.j.t.ConvertCurrencyParticipant [0.2/22.7]
      prepare: c.i.j.t. BuildRateMessageParticipant2  [0.0/22.8]
      prepare: c.i.j.t. SendToRateParticipant  [100.8/123.6]
      prepare: c.i.j.t. ReadRateResponseParticipant  [0.5/124.2]
      prepare: c.i.j.t.CheckAmountLimitIfNoresponseParticipant [0.0/124.3]
      prepare: c.i.j.t.SendToConnect24Participant [100.8/225.1]
      prepare: c.i.j.t.AbortImplementationParticipant [0.0/225.2]
      prepare: o.j.t.ProtectDebugInfo:protect-debug [0.0/225.2]
      prepare: o.j.t.Debug:debug [0.0/225.2]
       commit: c.i.j.t.MessageReadParticipant [0.0/225.2]
       commit: c.i.j.t.GetTransactionDetailsFromDBParticipant [0.0/225.2]
       commit: c.i.j.t.GetInformationfromPANDataCode [0.0/225.3]
       commit: c.i.j.t.GetInformationfromPOSDataCodeParticipant [0.0/225.3]
       commit: c.i.j.t.GetTransactionTypeCode:getTranType [0.0/225.3]
       commit: c.i.j.t.GetTransactionAccount:getDebitCredit [0.0/225.3]
       commit: c.i.j.t.GetCardBrandParticipant [0.0/225.3]
       commit: c.i.j.t.ConvertCurrencyParticipant [0.0/225.3]
       commit: c.i.j.t. BuildRateMessageParticipant2  [0.0/225.3]
       commit: c.i.j.t. SendToRateParticipant  [0.0/225.3]
       commit: c.i.j.t. ReadRateResponseParticipant  [0.0/225.3]
       commit: c.i.j.t.CheckAmountLimitIfNoresponseParticipant [0.0/225.3]
       commit: c.i.j.t.SendToConnect24Participant [0.0/225.3]
       commit: c.i.j.t.AbortImplementationParticipant [0.0/225.4]
       commit: o.j.t.ProtectDebugInfo:protect-debug [0.0/225.4]
       commit: o.j.t.Debug:debug [1.3/226.7]
      end [0.4/227.2]
    </profiler>
  </commit>
</log>

I would conclude that this transaction took 228ms from the time it was received to the time the response was sent out. Is that conclusion correct, or am I missing something there?
The Postilion queue starts to build once it isn't getting responses from the service within the configured duration, and then it starts sending reversals, which also add to the queue. The queue on the Postilion system for the card with the 5-second response time starts growing faster than the one for the card with the 25-second response time, and that is why I considered splitting the queue for the 2 cards, even though they use exactly the same participants.
I feel there are enough outbound connections to the backend system I send the requests to, because once they point directly to that system, the queue goes down and everything goes well. Only when the backend system has a major issue, which is known to all, do they experience such queues. I requested transactions within the same window for 2 days and noticed there were a lot of transactions that timed out when the Q2 application was used, whereas there were next to none when it wasn't.
Since I have 32GB of memory, I can increase it to as much as I feel is required, because the server is used by the Q2 application alone.

I hope I haven't given an information overload, but I hope this gives more context to my concerns and questions so you can advise me further on what to do.

Regards,
Shade

Mark Salter

Aug 3, 2024, 12:02:57 PM
to jpos-...@googlegroups.com

You really need to ask the admins of the Postilion system you are serving responses to what they see from their perspective; that is what I think matters.

For me, you have not yet described the flow or the perceived problem in enough detail to help here.

I wonder if you are perhaps returning a response on a different network connection to the one it arrived on. Is that allowed, and can the requestor do the matching? Do they expect to have to?

-- 
Mark




chhil

Aug 3, 2024, 9:11:33 PM
to jpos-...@googlegroups.com

The timing in the transaction you shared does not include the time from when you received the transaction, put it in a context, and put it in the queue.

So what you need is: when you are preparing the context, add a profiler checkpoint called "inqueue" so that the log includes the in-queue time. Then, when you see the end of the transaction, the in-queue time is included.

We also put a timestamp in the context when populating it. Before sending the transaction out, we calculate the difference of now minus that timestamp and check whether it is beyond the threshold; if so, we don't send it. The threshold would be a time I know will result in Postilion timing out and reversing, so there is no point sending it out.
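The timestamp-and-threshold idea can be sketched with plain java.time; the 4-second threshold and the class/method names here are assumptions for illustration, not jPOS API:

```java
import java.time.Duration;
import java.time.Instant;

// Sketch: stamp the transaction when it enters the queue, and before
// sending it out, drop it if it has already waited past a threshold
// (chosen so Postilion would time out and reverse it anyway).
public class DwellCheck {
    static final Duration THRESHOLD = Duration.ofSeconds(4); // assumed value

    /** Call when the request is first received; store the result in the Context. */
    static Instant stamp() {
        return Instant.now();
    }

    /** True if the transaction should still be forwarded to the endpoint. */
    static boolean shouldForward(Instant enqueuedAt, Instant now) {
        return Duration.between(enqueuedAt, now).compareTo(THRESHOLD) < 0;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-06-18T17:22:20Z");
        System.out.println(shouldForward(t0, t0.plusSeconds(1))); // true: within threshold
        System.out.println(shouldForward(t0, t0.plusSeconds(6))); // false: too old, drop it
    }
}
```

In a participant, the stamp would go into the Context in prepare() of the first participant, and the shouldForward check would run just before the send-to-endpoint participant.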

If the difference between head and tail becomes large, you are not processing fast enough.

I also think the overhead of your large number of sessions is causing an issue. You should do a load test to determine that; I would try something much lower, like 200.

-chhil


Andrés Alcarraz

Aug 3, 2024, 9:24:17 PM
to jPOS Users
I wasn't suggesting that the problem is the hardware; also, we can't say anything from the average number of transactions per quarter, since what matters are the peak hours.

What I was stating is that the solution you were trying won't give you results if the problem is your system not processing fast enough. Regarding the hardware specs, I simply don't know; it depends on lots of things, and you didn't share enough about what your system does. But even if you had, what matters is the tests you can perform on it.


----
Sent from my cellphone, I apologize for the brevity.

Andrés Alcarraz.

shadei...@gmail.com

Aug 4, 2024, 9:06:21 PM
to jPOS Users
Thanks so much for your responses.
One thing it points to for me is that I need to put in a bit more logging and do a load/performance test to see what could be wrong, and then try out various scenarios to see what would help.

@Mark
Postilion just ends up with a lot of 91 response codes, which means it didn't get a response back from the service within the expected time. I'm using all the Q2 artefacts, so I'm not handling things like network connections directly within the code. But I will look at getting more information and doing some additional debugging, while leaving out the multiple-queue configuration that I had been focused on.

@Chhil
Thanks for your suggestion on how to get the exact time difference between when the request enters the queue and when it leaves. That would actually help in seeing how long transactions stay on the queue before they are picked up for processing. I could use that to stop a request from going to the downstream application if it has already overstayed the expected time in the queue. I understand your comment on the possibility of the system not processing fast enough and will look into why that is happening. I increased the sessions because I hoped that would help, but from the logs it actually looks like a number of them are idle, so I will play around with that as well. I have 5 servers with 5 equivalent channels and multiplexers, so I divided the sessions across all servers and then put the total in the txnmgr.

@Andrés,
I understand your comment and agree that the important information is the peak period. I will look towards having performance testing done to determine what the issue could be.

Regards,
Shade

Alejandro Revilla

Aug 5, 2024, 11:03:01 AM
to jpos-...@googlegroups.com
One thing to remember: if your 'tail' gets stuck and doesn't increase, it means one participant got stuck and never returned. You can easily see which participant is the offender by looking at the next SystemMonitor dump, which usually happens every hour; that dump will show the participant's name in the thread dump. You can also force a thread dump using jcmd <pid> Thread.print.


