Activity ends before last event is processed: Unable to route event

82 views
Skip to first unread message

Vilius Panevėžys

unread,
Jun 27, 2011, 11:22:26 AM6/27/11
to mobicent...@googlegroups.com
Hello all,
we're having an issue with a RA which basically sends a request as requested by
SBB and routes the response as event. Activity ending logic is similar to HTTP
client RA: response event is fired followed by call to endActivity from the same
thread.

Under approximately 30 requests per second errors like

======
2011-06-27 12:34:55,484 ERROR
[org.mobicents.slee.runtime.eventrouter.routingtask.EventRoutingTaskImpl] Unable
to route event EventContext[event type id =
EventTypeID[name=...,vendor=...,version=1.0] , event = ... , local ac = ... ,
address = null , serviceID = null]. The activity context is gone
======

start to show up in the server.log. Log shows that the response event has been
fired, then endActivity has been invoked, followed by activityEnded callback
(less than 100 ms after invoking endActivity). After that ER complains being
unable to route the event. This leaves the impression that the activity has been
ended before the event has been processed, which was fired before invoking
endActivity. However, I would expect the ActivityEndEvent to be queued after the
event that has been fired before invoking endActivity.

This observation triggers the question whether fireEvent self-consistency
described in JSLEE 1.1 15.14.8 is as well true for endActivity (did not find
this stated clearly in the spec). In other words, are fireEvent and endActivity
invocations from the same thread guaranteed to be processed in the order that
they were invoked?

Having in mind the similarity of activity ending logic as compared to HTTP
Client RA, I tried to reproduce the issue using HTTP Client RA but failed due to
UnrecognizedActivityHandleException at 1-2 requests per second:

======
17:48:01,328 ERROR [HttpClientResourceAdaptor] Unable to fire
EventTypeID[name=net.java.client.slee.resource.http.event.ResponseEvent,vendor
=net.java.client.slee,version=1.0] on activity handle HttpClientSession ID:
564b0124-1a6c-41d6-9365-48b7f2cf0196 , the handle is not mapped
to an activity context
javax.slee.resource.UnrecognizedActivityHandleException: Unable to fire
EventTypeID[name=net.java.client.slee.resource.http.event.ResponseEv
ent,vendor=net.java.client.slee,version=1.0] on activity handle
HttpClientSession ID: 564b0124-1a6c-41d6-9365-48b7f2cf0196 , the handle is n
ot mapped to an activity context
at
org.mobicents.slee.resource.SleeEndpointImpl._fireEvent(SleeEndpointImpl.java:710)
at
org.mobicents.slee.resource.SleeEndpointFireEventNotTransactedExecutor.execute(SleeEndpointFireEventNotTransactedExecutor.java:82
)
at
org.mobicents.slee.resource.SleeEndpointImpl.fireEvent(SleeEndpointImpl.java:530)
at
org.mobicents.slee.resource.SleeEndpointImpl.fireEvent(SleeEndpointImpl.java:449)
at
org.mobicents.client.slee.resource.http.HttpClientResourceAdaptor.processResponseEvent(HttpClientResourceAdaptor.java:342)
at
org.mobicents.client.slee.resource.http.HttpClientResourceAdaptor$AsyncExecuteMethodHandler.run(HttpClientResourceAdaptor.java:40
8)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
======

Anyway, getting back to the original topic, I realize we could try to wait for
eventProcessingSuccessful, then endActivity and possibly some other workarounds,
but the described behaviour of the RA (fire an event, endActivity from the same
thread) seems valid to me.

Any advice greatly appreciated.


--
Vilius Panevėžys
Elitnet

Eduardo Martins

unread,
Jun 27, 2011, 1:20:24 PM6/27/11
to mobicent...@googlegroups.com
Debug log please, in particular it is of special interest when events
are fired (response vs activity end) and when routing starts. To
reduce the log try using INFO as root level, console as DEBUG, as 2
categories as DEBUG: org.mobicents.slee.runtime.activity and
org.mobicents.slee.runtime.eventrouter

-- Eduardo
..............................................
http://emmartins.blogspot.com
http://redhat.com/solutions/telco

2011/6/27 Vilius Panevėžys <vil...@elitnet.lt>:

Eduardo Martins

unread,
Jun 27, 2011, 1:21:12 PM6/27/11
to mobicent...@googlegroups.com
And mobicents SLEE version and environment...

-- Eduardo
..............................................
http://emmartins.blogspot.com
http://redhat.com/solutions/telco

Vilius Panevėžys

unread,
Jun 27, 2011, 3:07:49 PM6/27/11
to mobicent...@googlegroups.com
Sorry for not mentioning the env, I've assumed it does not matter :) as we're
able to reproduce the issue on Windows XP SP3 running Mobicents 2.4.0 and Centos
5 running Mobicents 2.1.2. Both machines have multiple cores if that matters.

As mentioned, the issue is reproducible during a load test, so I've stripped a
small relevant part from tens of megabytes of full log. Also please keep in mind
that I've removed obviously irrelevant parts (e.g. events fired by other RAs,
etc.) but there will still be some unrelated lines. However, every line that is
in the file is in the order that it appeared in the server.log. Symbols "<>"
used for some obfuscation.

ID of the incorrectly handled activity is 72669d48-0c67-4532-b5a9-065007d3a555
and activity with ID 832f855a-fba4-4709-a0e7-3e48cef85d97 represents what is in
my understanding the correct handling - response event routed before
ActivityEndEvent.

I've marked lines of particular interest with ">>>" and "<<<".

Please let me know if you need any additional info.


Thanks,
--
Vilius Panevėžys
Elitnet

server_trimmed.log

Eduardo Martins

unread,
Jun 27, 2011, 3:20:43 PM6/27/11
to mobicent...@googlegroups.com
Villius, you will need to try out the latest SLEE release and see if
the issue still occurs, sorry but it is impossible for us to help with
old releases. I recall that were a few minor fixes on related code,
but not in each release(s).

For log4j, use this config:

<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">

<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/"
debug="false">

<appender name="CONSOLE" class="org.apache.log4j.ConsoleAppender">
<errorHandler class="org.jboss.logging.util.OnlyOnceErrorHandler"/>
<param name="Target" value="System.out"/>
<param name="Threshold" value="DEBUG"/>

<layout class="org.apache.log4j.PatternLayout">
<param name="ConversionPattern" value="%d{ABSOLUTE} %-5p [%c{1}] %m%n"/>
</layout>
</appender>

<category name="org.mobicents.slee.runtime">
<priority value="DEBUG" />
</category>

<root>
<level value="INFO"/>
<appender-ref ref="CONSOLE"/>
</root>

</log4j:configuration>

Eduardo Martins

unread,
Jun 27, 2011, 3:21:07 PM6/27/11
to mobicent...@googlegroups.com
each=which

-- Eduardo
..............................................
http://emmartins.blogspot.com
http://redhat.com/solutions/telco

Vilius Panevėžys

unread,
Jun 28, 2011, 6:01:02 AM6/28/11
to mobicent...@googlegroups.com
I guess you're advising to eliminate intensive disk IO with the log4j config
sample without FileAppender. I enabled DEBUG level for
"org.mobicents.slee.runtime" but used DailyRollingFileAppender as it's a lot
more convenient to handle the log in the file (the relevant parts might scroll
out of console buffer and if I'd redirect output to a file, don't think that not
using a FileAppender would "save" anything). I think logging to file or to
console is not essential in this case.

I've reproduced the described issue of seemingly inconsistent response event and
ActivityEndEvent routing on 2.4.1.FINAL release. Please find the corresponding
log fragment attached, all previous provisions apply for this log fragment as well.

As previously mentioned a response event is fired by RA before invoking
endActivity, however, the log shows that ActivityEndEvent gets routed first and
activity context is removed before starting to route the response event,
therefore the routing fails. As many requests following the identical scenario
are handled correctly, this would seem to me as some concurrency issue.

Just for reference: ID of the activity illustrating the fault is
"e690c1e0-afda-403a-a668-45c34465df00".

Looking forward to any insights.


--
Thanks,
Vilius Panevėžys
Elitnet


On 2011-06-27 22:21, Eduardo Martins wrote:
> each=which
>
> -- Eduardo
> ..............................................
> http://emmartins.blogspot.com
> http://redhat.com/solutions/telco
>
>
>
> On Mon, Jun 27, 2011 at 8:20 PM, Eduardo Martins<emma...@gmail.com> wrote:
>> Villius, you will need to try out the latest SLEE release and see if
>> the issue still occurs, sorry but it is impossible for us to help with
>> old releases. I recall that were a few minor fixes on related code,
>> but not in each release(s).
>>
>> For log4j, use this config:
>>
>> <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
>>
>> <log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/"
>> debug="false">
>>
>> <appender name="CONSOLE" class="org.apache.log4j.ConsoleAppender">
>> <errorHandler class="org.jboss.logging.util.OnlyOnceErrorHandler"/>
>> <param name="Target" value="System.out"/>
>> <param name="Threshold" value="DEBUG"/>
>>
>> <layout class="org.apache.log4j.PatternLayout">
>> <param name="ConversionPattern" value="%d{ABSOLUTE} %-5p [%c{1}] %m%n"/>
>> </layout>
>> </appender>
>>

>> <category name="c">

server_trimmed_2.4.1.log

Vilius Panevėžys

unread,
Jun 28, 2011, 9:07:33 AM6/28/11
to mobicent...@googlegroups.com
Issue created:
http://code.google.com/p/mobicents/issues/detail?id=2689


--
Vilius Panevėžys
Elitnet

Vilius Panevėžys

unread,
Jul 15, 2011, 11:04:22 AM7/15/11
to mobicent...@googlegroups.com
For the sake of a historical record if anyone stumbles upon this thread: the
issue has been resolved by a patch attached to the issue.


--
Vilius Panevėžys
Elitnet

Reply all
Reply to author
Forward
0 new messages