Under approximately 30 requests per second errors like
======
2011-06-27 12:34:55,484 ERROR
[org.mobicents.slee.runtime.eventrouter.routingtask.EventRoutingTaskImpl] Unable
to route event EventContext[event type id =
EventTypeID[name=...,vendor=...,version=1.0] , event = ... , local ac = ... ,
address = null , serviceID = null]. The activity context is gone
======
start to show up in the server.log. Log shows that the response event has been
fired, then endActivity has been invoked, followed by activityEnded callback
(less than 100 ms after invoking endActivity). After that ER complains being
unable to route the event. This leaves the impression that the activity has been
ended before the event has been processed, which was fired before invoking
endActivity. However, I would expect the ActivityEndEvent to be queued after the
event that has been fired before invoking endActivity.
This observation triggers the question whether fireEvent self-consistency
described in JSLEE 1.1 15.14.8 is as well true for endActivity (did not find
this stated clearly in the spec). In other words, are fireEvent and endActivity
invocations from the same thread guaranteed to be processed in the order that
they were invoked?
Having in mind the similarity of activity ending logic as compared to HTTP
Client RA, I tried to reproduce the issue using HTTP Client RA but failed due to
UnrecognizedActivityHandleException at 1-2 requests per second:
======
17:48:01,328 ERROR [HttpClientResourceAdaptor] Unable to fire
EventTypeID[name=net.java.client.slee.resource.http.event.ResponseEvent,vendor
=net.java.client.slee,version=1.0] on activity handle HttpClientSession ID:
564b0124-1a6c-41d6-9365-48b7f2cf0196 , the handle is not mapped
to an activity context
javax.slee.resource.UnrecognizedActivityHandleException: Unable to fire
EventTypeID[name=net.java.client.slee.resource.http.event.ResponseEv
ent,vendor=net.java.client.slee,version=1.0] on activity handle
HttpClientSession ID: 564b0124-1a6c-41d6-9365-48b7f2cf0196 , the handle is n
ot mapped to an activity context
at
org.mobicents.slee.resource.SleeEndpointImpl._fireEvent(SleeEndpointImpl.java:710)
at
org.mobicents.slee.resource.SleeEndpointFireEventNotTransactedExecutor.execute(SleeEndpointFireEventNotTransactedExecutor.java:82
)
at
org.mobicents.slee.resource.SleeEndpointImpl.fireEvent(SleeEndpointImpl.java:530)
at
org.mobicents.slee.resource.SleeEndpointImpl.fireEvent(SleeEndpointImpl.java:449)
at
org.mobicents.client.slee.resource.http.HttpClientResourceAdaptor.processResponseEvent(HttpClientResourceAdaptor.java:342)
at
org.mobicents.client.slee.resource.http.HttpClientResourceAdaptor$AsyncExecuteMethodHandler.run(HttpClientResourceAdaptor.java:40
8)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
======
Anyway, getting back to the original topic, I realize we could try to wait for
eventProcessingSuccessful, then endActivity and possibly some other workarounds,
but the described behaviour of the RA (fire an event, endActivity from the same
thread) seems valid to me.
Any advice greatly appreciated.
--
Vilius Panevėžys
Elitnet
-- Eduardo
..............................................
http://emmartins.blogspot.com
http://redhat.com/solutions/telco
2011/6/27 Vilius Panevėžys <vil...@elitnet.lt>:
-- Eduardo
..............................................
http://emmartins.blogspot.com
http://redhat.com/solutions/telco
As mentioned, the issue is reproducible during a load test, so I've stripped a
small relevant part from tens of megabytes of full log. Also please keep in mind
that I've removed obviously irrelevant parts (e.g. events fired by other RAs,
etc.) but there will still be some unrelated lines. However, every line that is
in the file is in the order that it appeared in the server.log. Symbols "<>"
used for some obfuscation.
ID of the incorrectly handled activity is 72669d48-0c67-4532-b5a9-065007d3a555
and activity with ID 832f855a-fba4-4709-a0e7-3e48cef85d97 represents what is in
my understanding the correct handling - response event routed before
ActivityEndEvent.
I've marked lines of particular interest with ">>>" and "<<<".
Please let me know if you need any additional info.
Thanks,
--
Vilius Panevėžys
Elitnet
For log4j, use this config:
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/"
debug="false">
<appender name="CONSOLE" class="org.apache.log4j.ConsoleAppender">
<errorHandler class="org.jboss.logging.util.OnlyOnceErrorHandler"/>
<param name="Target" value="System.out"/>
<param name="Threshold" value="DEBUG"/>
<layout class="org.apache.log4j.PatternLayout">
<param name="ConversionPattern" value="%d{ABSOLUTE} %-5p [%c{1}] %m%n"/>
</layout>
</appender>
<category name="org.mobicents.slee.runtime">
<priority value="DEBUG" />
</category>
<root>
<level value="INFO"/>
<appender-ref ref="CONSOLE"/>
</root>
</log4j:configuration>
-- Eduardo
..............................................
http://emmartins.blogspot.com
http://redhat.com/solutions/telco
I've reproduced the described issue of seemingly inconsistent response event and
ActivityEndEvent routing on 2.4.1.FINAL release. Please find the corresponding
log fragment attached, all previous provisions apply for this log fragment as well.
As previously mentioned a response event is fired by RA before invoking
endActivity, however, the log shows that ActivityEndEvent gets routed first and
activity context is removed before starting to route the response event,
therefore the routing fails. As many requests following the identical scenario
are handled correctly, this would seem to me as some concurrency issue.
Just for reference: ID of the activity illustrating the fault is
"e690c1e0-afda-403a-a668-45c34465df00".
Looking forward to any insights.
--
Thanks,
Vilius Panevėžys
Elitnet
On 2011-06-27 22:21, Eduardo Martins wrote:
> each=which
>
> -- Eduardo
> ..............................................
> http://emmartins.blogspot.com
> http://redhat.com/solutions/telco
>
>
>
> On Mon, Jun 27, 2011 at 8:20 PM, Eduardo Martins<emma...@gmail.com> wrote:
>> Villius, you will need to try out the latest SLEE release and see if
>> the issue still occurs, sorry but it is impossible for us to help with
>> old releases. I recall that were a few minor fixes on related code,
>> but not in each release(s).
>>
>> For log4j, use this config:
>>
>> <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
>>
>> <log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/"
>> debug="false">
>>
>> <appender name="CONSOLE" class="org.apache.log4j.ConsoleAppender">
>> <errorHandler class="org.jboss.logging.util.OnlyOnceErrorHandler"/>
>> <param name="Target" value="System.out"/>
>> <param name="Threshold" value="DEBUG"/>
>>
>> <layout class="org.apache.log4j.PatternLayout">
>> <param name="ConversionPattern" value="%d{ABSOLUTE} %-5p [%c{1}] %m%n"/>
>> </layout>
>> </appender>
>>
>> <category name="c">
--
Vilius Panevėžys
Elitnet