I am currently working on a SAF process. I am modifying the jpos-ee SAF
to use QMUX asynchronous requests instead of synchronous ones, and to
integrate with ISOResponseListener. The ISOResponseListener class extends
QBean so that it gets deployed during Q2 startup and is configurable.
Everything works fine until there are messages in QMUX waiting to
expire and new messages are put into the QMUX request. Once that
happens, the ISOResponseListener expired method gets called later and
later. I have configured the timeout to be 50 seconds, but once this
happens the timeout fires after something like 5 min, 13 min, 30 min...
Has anyone experienced this problem before? I suspect it has
something to do with the synchronized(ar) block in the QMUX request method.
Please share.
> Everything works fine until there are messages in QMUX waiting to get
> expired and there is new messages put in the QMUX request. Once that
> happens, the ISOResponseListener expired method get called later and
> later. I have configured the timeout to be 50 seconds, but once this
> happens it gets timeout like after 5 min, 13 min, 30 min...
I recall you previously asked about controlling or guaranteeing the timely expiration of messages - I think you had some processing you wanted to do in a timely fashion on timeout (no response?). Did you modify the QMUX code to this end or are you using 'base' jPOS?
Is the Expiration method being called only as messages flow - rather than independently?
> Has anyone experiencing this problem before? I am suspecting it is
> something to do with synchronized(ar) function in QMux request method.
Do you have any logs, or traces showing the activity (or not)?
I'm wondering how we could replicate the problem outside of your environment?
LiNa wrote:
> I am suspecting it is
> something to do with synchronized(ar) function in QMux request method.
I don't see how synchronisation on an object instanced for the current thread could affect multiple threads. So I don't think the synchronisation is the problem - can you say what led you to this conclusion at all?
However I also don't see why the synchronisation (QMUX.java line 256) on the instanced AsyncRequest object is even needed. I am happy to think it is there for a reason, I just don't see it (yet).
> However I also don't see why the synchronisation (QMUX.java line 256) on
> the instanced AsyncRequest object is even needed? I am happy to think
> it is there for a reason, I just don't see it *yet).
If I recall correctly, under some really heavy load situations involving message retransmissions, one could get a response that would trigger a responseReceived on the ISOResponseListener before scheduling the timer task, that's why we have that synchronized block.
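The ordering concern described above can be sketched outside jPOS. This is only an illustration under stated assumptions: PendingRequest, request and outcome are hypothetical names, plain strings stand in for ISOMsg, and the class is not jPOS's AsyncRequest. The request is published and scheduled while holding its own lock, and the response path takes the same lock before cancelling, so a cancel can never run ahead of the schedule call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a pending asynchronous request; not jPOS's AsyncRequest.
final class PendingRequest extends TimerTask {
    private final List<String> events;
    private boolean done;                        // guarded by this object's monitor

    PendingRequest(List<String> events) { this.events = events; }

    /** Called by the thread that receives the response. */
    void responseReceived(String response) {
        synchronized (this) {                    // blocks while request() is still scheduling
            if (done) return;                    // already expired: ignore the late response
            done = true;
            cancel();                            // runs only after any schedule() call has finished
        }
        events.add("response:" + response);
    }

    /** Called by the timer thread when the timeout elapses. */
    @Override public void run() {
        synchronized (this) {
            if (done) return;
            done = true;
        }
        events.add("expired");
    }

    /** Publishes and schedules under one lock, so a racing response cannot cancel first. */
    static PendingRequest request(Map<String, PendingRequest> pending, Timer timer,
                                  String key, long timeoutMillis, List<String> events) {
        PendingRequest pr = new PendingRequest(events);
        synchronized (pr) {
            pending.put(key, pr);                // the response thread can find pr from here on
            if (timeoutMillis > 0)
                timer.schedule(pr, timeoutMillis);
        }
        return pr;
    }

    /** Runs one request with a 100 ms timeout and returns the callbacks that fired. */
    static String outcome(boolean hostReplies) {
        Timer timer = new Timer(true);
        Map<String, PendingRequest> pending = new ConcurrentHashMap<>();
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        PendingRequest pr = request(pending, timer, "000677", 100, events);
        if (hostReplies)
            pending.remove("000677").responseReceived("0310");
        try {
            Thread.sleep(300);                   // let the timeout pass
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pr.responseReceived("late");             // ignored: already answered or expired
        timer.cancel();
        return String.join(",", events);
    }

    public static void main(String[] args) {
        System.out.println(outcome(true));       // only the response callback fires
        System.out.println(outcome(false));      // only the expiry callback fires
    }
}
```

The extra done flag is a design choice of this sketch: it makes sure exactly one of the two callbacks fires per request, whichever thread gets there first.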
Alejandro Revilla wrote:
>> However I also don't see why the synchronisation (QMUX.java line 256) on
>> the instanced AsyncRequest object is even needed? I am happy to think
>> it is there for a reason, I just don't see it *yet).
> If I recall correctly, under some really heavy load situations involving
> message retransmissions, one could get a response that would trigger a
> responseReceived on the ISOResponseListener before scheduling the timer
> task, that's why we have that synchronized block.
Ok, that would be 'interesting' to see that scenario in action - something the Gladiator team experienced?
Do you agree that the synchronisation is unlikely to be the cause of the 'increasing' delay 'LiNa' is seeing?
Thank you for your reply.
After posting this thread, I finally modified the QMUX as below and
the problem ceased.
I see that the DefaultTimer is a static class with only one thread
running. Any reason why it must be a static class? Do you foresee any
problem with my new QMux?
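import java.util.Timer;
import org.jpos.iso.ISOException;
import org.jpos.iso.ISOMsg;
import org.jpos.iso.ISOResponseListener;

public class QMUX extends org.jpos.q2.iso.QMUX {
    public void request (ISOMsg m, long timeout, ISOResponseListener rl, Object handBack)
        throws ISOException
    {
        String key = getKey (m);
        String req = key + ".req";
        m.setDirection(0);
        AsyncRequest ar = new AsyncRequest (rl, handBack);
        // Original scheduling on the shared DefaultTimer, commented out:
        // synchronized (ar) {
        //     if (timeout > 0)
        //         DefaultTimer.getTimer().schedule (ar, timeout);
        // }
        // Replacement: one Timer per request (each new Timer starts its own thread)
        Timer timer = new Timer();
        timer.schedule(ar, timeout);
        sp.out (req, ar, timeout);
        sp.out (out, m, timeout);
    }
}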
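One possible reading of the symptom, offered as an assumption rather than a confirmed diagnosis: java.util.Timer runs all of its tasks on a single thread, so if an expiry callback blocks (for instance by sleeping before a retransmission), every later expiry queued on the same shared timer fires late. The stand-alone sketch below uses only the JDK; the class name SharedTimerDelay and the timings are hypothetical. It measures how late a second task fires on a shared Timer, and then on a small shared ScheduledExecutorService, which is one alternative to creating a new Timer (and therefore a new thread) for every request.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it uses only the JDK, not jPOS.
final class SharedTimerDelay {

    /** How late (ms) a task due at 100 ms fires when an earlier task blocks the single Timer thread. */
    static long latenessOnSharedTimer(long blockMillis) {
        Timer timer = new Timer(true);
        CountDownLatch fired = new CountDownLatch(1);
        long[] firedAt = new long[1];
        long start = System.nanoTime();
        timer.schedule(new TimerTask() {           // first "expired" callback blocks the thread
            @Override public void run() { pause(blockMillis); }
        }, 50);
        timer.schedule(new TimerTask() {           // second expiry is due at 100 ms
            @Override public void run() {
                firedAt[0] = millisSince(start);
                fired.countDown();
            }
        }, 100);
        await(fired);
        timer.cancel();
        return firedAt[0] - 100;
    }

    /** Same scenario on a two-thread scheduled pool: the blocked task occupies only one thread. */
    static long latenessOnPool(long blockMillis) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        CountDownLatch fired = new CountDownLatch(1);
        long[] firedAt = new long[1];
        long start = System.nanoTime();
        pool.schedule(() -> pause(blockMillis), 50, TimeUnit.MILLISECONDS);
        pool.schedule(() -> {
            firedAt[0] = millisSince(start);
            fired.countDown();
        }, 100, TimeUnit.MILLISECONDS);
        await(fired);
        pool.shutdownNow();                        // interrupts the still-sleeping first task
        return firedAt[0] - 100;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static long millisSince(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("shared Timer lateness (ms): " + latenessOnSharedTimer(500));
        System.out.println("thread pool lateness (ms):  " + latenessOnPool(500));
    }
}
```

If the callbacks in the real system never block, this effect would not explain the growing delay, so the sketch is only a way to test the hypothesis. A pool bounds the number of threads, whereas one Timer per request keeps a thread alive for each outstanding request until its task has run.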
Thanks
On Jun 26, 9:07 pm, Alejandro Revilla <a...@jpos.org> wrote:
Below is the trace log. The expiry setting in QMUX is 10 seconds,
while the inter-message delay is 50 seconds. You can see that when the
0300 message is placed initially, the timer works fine, but when another
message (0500) is placed in the queue and there is no response from the
host, the timer gets delayed and ends up being the same for the 0300 and
0500 messages. In between these transactions there are other transactions
with responses.
LiNa wrote:
> Below is the trace log. The setting for expiry in Qmux is 10 seconds
> while inter message delay is 50 seconds. You can see when 0300 message
> is placed initially, the timer works fine but when there is another
> message (0500) placed in the queue and no response from the host, the
> timer get delayed and turn out to be the same for 0300 and 0500
> message. In between these transactions there are other transactions
> with response.
May I ask how many times you will keep sending the repeat messages? Does the recipient expect you to keep sending (and expecting a response) forever? If you just keep repeating, why does the expiration need to be accurate - as long as you get to it, does it matter? Note, I'm not suggesting you don't have a problem (with the delay increasing); I'm just trying to understand your actual need; in the hope it is not unique and there is another - perhaps better - approach.
If the delay on the response coming back is 50 seconds, why don't we see any response at all?
I wonder if you have key clashes between the original and repeat messages. What is your QMUX key (which fields)?
> <log realm="saf" at="Tue Dec 01 12:22:15 MYT 2009.604">
> <debug>
> Timeout after sleep....: Stan : 000677
> </debug>
> </log>
What does the message above indicate? How long is your 'saf' sleeping for - is the duration of this sleep changing or static? Perhaps include the target sleep duration in this debug message so we can see?
On Jun 30, 3:01 pm, Mark Salter <marksal...@talktalk.net> wrote:
> LiNa wrote:
> > Below is the trace log. The setting for expiry in Qmux is 10 seconds
> > while inter message delay is 50 seconds. You can see when 0300 message
> > is placed initially, the timer works fine but when there is another
> > message (0500) placed in the queue and no response from the host, the
> > timer get delayed and turn out to be the same for 0300 and 0500
> > message. In between these transactions there are other transactions
> > with response.
> May I ask how many times you will keep sending the repeat messages?
[Lina] It is configurable in SAF process, currently configured to 100
times.
> Does the recipient expect you to keep sending (and expecting a response)
> forever? If you just keep repeating, why does the expiration need to be
> accurate - as long as you get to it, does it matter? Note, I'm not
> suggesting you don't have a problem (with the delay increasing); I'm
> just trying to understand your actual need; in the hope it is not unique
> and there is another - perhaps better - approach.
[Lina] Yes, the expectation is always to store and forward until a
response is received from the host. This case was discovered
accidentally when the host did not reply, and the spec dictates that
the retry happens within a 1 min interval. If the delay were a few
seconds, I think it would be acceptable, but in this case the delay
is 13 min!
> If the delay on the response coming back is 50 seconds, why don't we see
> any response at all?
[Lina] Actually the timeout waiting for a response is 10 seconds; after
that it goes to sleep for 50 seconds before re-transmission. This case
was discovered accidentally while the host did not reply. Under normal
circumstances, the host should reply within 10 seconds.
> I wonder if you have key clashes between the original and repeat
> messages, what is your QMUX key (which fields).
[Lina] Hm.. you suspect this is the culprit? My QMUX key is only field
11, but the MTI for the repeat is the retransmission MTI.
> > <log realm="saf" at="Tue Dec 01 12:22:15 MYT 2009.604">
> > <debug>
> > Timeout after sleep....: Stan : 000677
> > </debug>
> > </log>
> What does the message above indicate? How long is your 'saf' sleeping
> for - is the duration of this sleep changing or static? Perhaps include
> the duration of target sleep duration in this debug message so we can see?
[Lina] this is the 50 seconds sleep before retransmission (after 10
seconds timeout). The duration is static.
LiNa wrote:
>> May I ask how many times you will keep sending the repeat messages?
> [Lina] It is configurable in SAF process, currently configured to 100
> times.
Ok
>> Does the recipient expect you to keep sending (and expecting a
>> response) forever? If you just keep repeating, why does the
>> expiration need to be accurate - as long as you get to it, does it
>> matter? Note, I'm not suggesting you don't have a problem (with
>> the delay increasing); I'm just trying to understand your actual
>> need; in the hope it is not unique and there is another - perhaps
>> better - approach.
> [Lina] Yes, the expectation is always store and forward until receive
> a response from host. This case is discovered acccidentally when the
> host did not reply, and the specs dictates that the retry is within 1
> min interval. If the delay is few seconds, I think it's acceptable,
> but this case the delay is 13 min!
Ok - I see, but can't see why the delay might be increasing, I think you are going to have to find it in your environment, perhaps my suggestion below will have some benefit for you.
>> If the delay on the response coming back is 50 seconds, why don't
>> we see any response at all?
> [Lina] Actually the timeout waiting for response is 10 seconds, after
> that it will go to sleep for for 50 seconds before re-transmission.
> This case is discovered accidentally while host did not reply. Under
> normal circumstances, host should reply within 10 seconds.
In your testing set-up, shouldn't you also have responses arriving late - so they have timed out - perhaps this will become another test case. Something in your previous post suggested to me that the test system would respond after 50 seconds, so I was expecting to see responses - late or otherwise.
BTW, if your test responder never does respond, are you testing that you stop sending repeats after the configured number of times (100)?
>> I wonder if you have key clashes between the original and repeat
>> messages, what is your QMUX key (which fields).
> [Lina] hm..you suspect this is the culprit? My QMux key is only field
> 11. But MTI for repeat is retransmission MTI
I wonder if this might come into play, but with no responses currently coming back I wonder what the QMUX thinks is going on. In my mind, the moment it checks its space for a request that matches the current response is the moment the expired method on an 'old' request object will be called.
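To make the key-clash concern concrete, here is a small sketch under loud assumptions: keyOf is a hypothetical helper and not QMUX's getKey, plain maps stand in for ISOMsg, field 0 stands for the MTI, the repeat MTI 0301 is made up for illustration, and the repeat is assumed to reuse the original STAN. With a key built from field 11 alone, the original and its repeat land on the same entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; this is not QMUX's getKey implementation.
final class MuxKeySketch {

    /** Builds a matching key from the listed field numbers (field 0 stands for the MTI). */
    static String keyOf(Map<Integer, String> msg, int... fields) {
        StringBuilder sb = new StringBuilder();
        for (int f : fields)
            sb.append('.').append(msg.get(f));
        return sb.toString();
    }

    /** Minimal message stand-in holding only the MTI and the STAN. */
    static Map<Integer, String> msg(String mti, String stan) {
        Map<Integer, String> m = new HashMap<>();
        m.put(0, mti);
        m.put(11, stan);
        return m;
    }

    public static void main(String[] args) {
        Map<Integer, String> original = msg("0300", "000677");
        Map<Integer, String> repeat   = msg("0301", "000677"); // assumed repeat MTI, same STAN

        Map<String, String> pending = new HashMap<>();
        pending.put(keyOf(original, 11), "original");
        // putIfAbsent returns the existing value when the key is already taken.
        String clash = pending.putIfAbsent(keyOf(repeat, 11), "repeat");
        System.out.println("STAN-only key clashes with: " + clash);
    }
}
```

Adding more fields to the key separates the two requests, but the response then has to yield the same key as the request it answers, which this sketch does not attempt to model.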
To suggest a different approach:-
You are currently relying on the Expiration time out to trigger the need for a 'sleep and repeat send'...
Can't you drive your SAF queue by checking for a time out on :-
QMUX.request(ISOMsg request, long timeout)
to populate and maintain your SAF queue?
If the original request times out, add a message to your 'SAF queue'.
In your SAF processing task, periodically try a repeat, if a response to the repeat arrives, delete the entry as done, if not requeue it.
If a response (repeat or original) arrives after your timeout - perhaps the other system is also implementing a similar SAF process - then processing messages arriving on the QMUX.unhandled key will allow you to find and delete your SAF repeat, as it is no longer needed. This is where key matching clashes might matter.
Note, you need to take care when handling this SAF queue to ensure that you don't remove an item but fail to requeue it; perhaps an 'in-progress' marker and an attempt count are needed.
You also probably want to avoid a 'repeat flood'; controlling the frequency and speed of the SAF queue scan might be easier if the SAF processing is decoupled from the primary message exchanges.
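The queue handling suggested above might look roughly like the following. It is a minimal sketch and not the jPOS-EE SAF: SafQueueSketch, its methods and the sender function are hypothetical, strings stand in for ISOMsg, and the sender is assumed to block until a response arrives or its own timeout elapses, returning an empty Optional in the latter case. Each entry carries an in-progress marker and an attempt count, and an entry is only removed once it has been answered or has used up its attempts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of a decoupled SAF queue; not the jPOS-EE SAF implementation.
final class SafQueueSketch {
    private static final class Entry {
        final String key;
        final String message;
        int attempts;
        boolean inProgress;
        Entry(String key, String message) { this.key = key; this.message = message; }
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();
    private final int maxAttempts;

    SafQueueSketch(int maxAttempts) { this.maxAttempts = maxAttempts; }

    /** Called when the original request timed out. */
    synchronized void add(String key, String message) {
        entries.putIfAbsent(key, new Entry(key, message));
    }

    /** Called when a response turns up outside a scan, e.g. a late reply. */
    synchronized boolean complete(String key) {
        return entries.remove(key) != null;
    }

    synchronized int size() { return entries.size(); }

    /** Attempts made so far for key, or -1 if the key is not queued. */
    synchronized int attempts(String key) {
        Entry e = entries.get(key);
        return e == null ? -1 : e.attempts;
    }

    /** One scan: each idle entry is retried once. */
    void scan(Function<String, Optional<String>> sender) {
        List<Entry> batch = new ArrayList<>();
        synchronized (this) {
            for (Entry e : entries.values()) {
                if (!e.inProgress) {
                    e.inProgress = true;              // no other scan may pick it up
                    batch.add(e);
                }
            }
        }
        for (Entry e : batch) {
            Optional<String> response;
            try {
                response = sender.apply(e.message);   // sent outside the lock
            } catch (RuntimeException ex) {
                response = Optional.empty();          // treat a failed send as "no response"
            }
            synchronized (this) {
                e.attempts++;
                e.inProgress = false;                 // always cleared, so no entry is stranded
                if (response.isPresent() || e.attempts >= maxAttempts)
                    entries.remove(e.key, e);         // answered, or given up
            }
        }
    }

    public static void main(String[] args) {
        SafQueueSketch saf = new SafQueueSketch(100);
        saf.add("000677", "0300 repeat");
        int[] calls = {0};
        // Pretend host: silent twice, then answers.
        Function<String, Optional<String>> sender =
            m -> ++calls[0] < 3 ? Optional.<String>empty() : Optional.of("0310");
        while (saf.size() > 0)
            saf.scan(sender);
        System.out.println("delivered after " + calls[0] + " attempts");
    }
}
```

In a real deployment scan would be driven periodically, for example from a scheduled executor at a fixed delay, which is what keeps the repeat rate under control independently of the primary message flow.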
> Do you think the example code in my reply is not good, or that
> implementing a saf.request(request, timeout) method is harder than I
> imagine?
If I understand correctly, your code would send the message via the
SAF, but then would pull the key entry from the space. That would
confuse the MUX and the SAF wouldn't know that there's a reply for the
message, and would in turn re-transmit it.
apr wrote:
>> Do you think the example code in my reply is not good, or that
>> implementing a saf.request(request, timeout) method is harder than I
>> imagine?
> If I understand correctly, your code would send the message via the
> SAF, but then would
> pull the key entry from the space. That would confuse the MUX and the
> SAF wouldn't
> know that there's a reply for the message, and would in turn re-
> transmit it.
Perhaps I am missing something...
The responseKey parm on the send appears to be the space queue into which the saf will place the response (once received) - I assumed this was for the sender's benefit and so he could complete his 'response handling' process whenever it occurred?
Would one use SAF.send() as the way to send any request for which saf was required through the associate mux'd connection?
You are right. At first sight, I thought that would be a key managed
by QMUX, but you suggested to use SAF's ability to provide the
responses back (which is a nice one).
Please ignore my comment.
PS.- Do you have any recommendations on how to write smart answers? :) :)
> You are right. At first sight, I thought that would be a key managed
> by QMUX,
> but you suggested to use SAF's ability to provide the responses back
> (which is a nice
> one).
I would like to add a .request method (matching mux's) so that the sender has the option of sending and waiting as well...
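A send-and-wait method of that kind could be layered on top of a send that delivers the response under a response key. The sketch below is only an illustration: MiniSpace and the Saf interface are hypothetical stand-ins rather than jPOS or jPOS-EE classes, strings stand in for ISOMsg, and the convention of returning null on timeout is borrowed from how a mux request behaves.

```java
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; MiniSpace and Saf are stand-ins, not jPOS or jPOS-EE classes.
final class BlockingSafRequest {

    /** Tiny stand-in for a space: one blocking queue per key. */
    static final class MiniSpace {
        private final ConcurrentMap<String, BlockingQueue<String>> queues = new ConcurrentHashMap<>();

        void out(String key, String value) { queue(key).add(value); }

        /** Takes the next entry under key, or returns null after the timeout. */
        String in(String key, long timeoutMillis) {
            try {
                return queue(key).poll(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }

        private BlockingQueue<String> queue(String key) {
            return queues.computeIfAbsent(key, k -> new LinkedBlockingQueue<>());
        }
    }

    /** Assumed shape of an asynchronous send that writes the response under responseKey. */
    interface Saf {
        void send(String message, String responseKey);
    }

    /** Sends via the SAF and waits for the response; returns null if none arrives in time. */
    static String request(Saf saf, MiniSpace sp, String message, long timeoutMillis) {
        String responseKey = "saf.response." + UUID.randomUUID();   // private key per call
        saf.send(message, responseKey);
        return sp.in(responseKey, timeoutMillis);
    }

    public static void main(String[] args) {
        MiniSpace sp = new MiniSpace();
        Saf answering = (msg, key) -> new Thread(() -> sp.out(key, "0310")).start();
        Saf silent = (msg, key) -> { };
        System.out.println(request(answering, sp, "0300", 1000));   // prints the response
        System.out.println(request(silent, sp, "0300", 100));       // prints null on timeout
    }
}
```

Using a fresh response key per call keeps concurrent callers from taking each other's responses. This MiniSpace never discards its per-key queues, which a real space would handle through entry timeouts.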
> PS.- Do you have a recommendations on how to write smart answers? :) :)
Sorry for the late reply, I was caught up with the project. The reason
why I don't use the jposee SAF is that I want to use QMUX asynchronous
requests. After I modified the timer, ISOResponseListener.expired
gets called exactly on the timeout. Once expired is called, it places
the message back on the SAF queue for retransmission.
> Sorry for late reply, was so caught up with the project.
Understood.
> The reason why I don't use jposee SAF is because I want to use QMUX
> asynch request.
Of course that is your choice. At present though you are using this to implement a SAF process, which the SAF implementation *could* handle for you.
> After I modified the timer, the ISOResponseListener.expired get
> called perfectly on the exact timeout. Once the expired get called,
> it will place the message back to SAF queue for retransmission..
If you are happy with your current position and approach, then great; I am glad you found a solution.