TAO VERSION: 1.6a_p4
ACE VERSION: 5.6a_p4
HOST MACHINE and OPERATING SYSTEM: Windows 7 64 bit
COMPILER NAME AND VERSION (AND PATCHLEVEL): MS Visual C++ 2005
BUILD METHOD USED: project file?
DOES THE PROBLEM AFFECT: EXECUTION
SYNOPSIS: Deadlock
DESCRIPTION:
I have an application with 3 threads. Two of them run the orb_run method
(ORB1, ORB2) and the last thread is a client (CLIENT). Two remote methods
execute in the context of these threads:
1) METHOD1 - always called from CLIENT; it calls METHOD2
2) METHOD2 - issues a batch of AMI calls and waits for them to complete
There are two ways execution can proceed. The good one:
1) CLIENT calls METHOD1
2) ORB1 or ORB2 gets the upcall for METHOD1, which runs METHOD2
3) The remaining ORB thread gets the upcall for METHOD2 and executes it
4) CLIENT runs the reactor's select method
The bad one (deadlock):
1) CLIENT calls METHOD1
2) ORB1 or ORB2 gets the upcall for METHOD1, which runs METHOD2 (it is now
a follower, waiting for the event)
3) CLIENT gets the upcall for METHOD2 and tries to execute it (it blocks
waiting for AMI responses, as no threads are left to process the AMI calls)
4) The remaining ORB thread sits in the
TAO_Leader_Follower::wait_for_client_leader_to_complete method.
I believe that in the bad case the CLIENT thread doesn't hand
leadership to another thread. Is there a way to correct this
problem? I would be very happy to receive any comments.
Best regards,
Vadim
For any issues related to the OCI distribution, contact OCI. We did fix some
AMI deadlock problems on svn head; please download x.7.6, which you can
obtain from http://download.dre.vanderbilt.edu, and give that a try.
Johnny
Hi Vadim,
> I have an application with 3 threads. Two of them run the orb_run method
> (ORB1, ORB2) and the last thread is a client (CLIENT). Two remote methods
> execute in the context of these threads:
> 1) METHOD1 - always called from CLIENT; it calls METHOD2
> 2) METHOD2 - issues a batch of AMI calls and waits for them to complete
>
Can you clarify a couple of points? First, is the client thread making a
collocated invocation on METHOD1? Second, what is the motivation for
having METHOD2 make a bunch of AMI calls and then block waiting for
responses?
> There are two ways execution can proceed. The good one:
> 1) CLIENT calls METHOD1
> 2) ORB1 or ORB2 gets the upcall for METHOD1, which runs METHOD2
> 3) The remaining ORB thread gets the upcall for METHOD2 and executes it
> 4) CLIENT runs the reactor's select method
>
So in 4) the CLIENT thread is waiting for the result from the METHOD1
upcall, and so it handles the AMI responses as they come in. The first
ORB thread could equally be handling these responses, as it too is a
client thread waiting for the result of an upcall. The second ORB thread
is blocked in some wait state for the result of all the AMI calls.
> The bad one (deadlock):
> 1) CLIENT calls METHOD1
> 2) ORB1 or ORB2 gets the upcall for METHOD1, which runs METHOD2 (it is now
> a follower, waiting for the event)
> 3) CLIENT gets the upcall for METHOD2 and tries to execute it (it blocks
> waiting for AMI responses, as no threads are left to process the AMI calls)
> 4) The remaining ORB thread sits in the
> TAO_Leader_Follower::wait_for_client_leader_to_complete method.
>
With the leader-follower wait strategy, threads that make invocations
enter the thread pool along with ORB run threads. This allows heavily
loaded ORBs to make use of as many resources as possible. Whenever an
event occurs, such as an incoming request or reply, the thread currently
waiting in select() takes it and a new thread is pulled from the pool to
wait in select(). The threads in the pool are either clients waiting for
replies or threads given to the ORB via a call to ORB::run(). If a
client thread is selected to be the "leader" and call select(), the other
threads have to defer to it because of the problem of nested upcalls.
The description at this point gets a little hairy, so I'll stop here. I'm
sure you can find papers describing the theory behind the leader-follower
strategy under $TAO_ROOT/docs, online, or in the TAO Developer's Guide.
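
To make the hand-off described above concrete, here is a rough sketch in plain standard C++. It is a toy model, not TAO's TAO_Leader_Follower code: the class LeaderFollowerPool, its members, and run_demo are all names made up for illustration, and a condition variable stands in for select(). It only shows the basic rule that the leader takes one event and gives up leadership before processing it; it does not model the client-leader case where the other threads have to defer.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy model only: not TAO's TAO_Leader_Follower implementation.
class LeaderFollowerPool {
public:
    // Event source; stands in for a request or reply arriving on a socket.
    void post(int event) {
        { std::lock_guard<std::mutex> guard(mutex_); events_.push(event); }
        cv_.notify_all();
    }

    // Tell the pool threads to drain the queue and then return.
    void shutdown() {
        { std::lock_guard<std::mutex> guard(mutex_); done_ = true; }
        cv_.notify_all();
    }

    // Body of every pool thread (ORB run thread or waiting client).
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // Follower: sleep until there is no leader.
            cv_.wait(lock, [this] { return !has_leader_; });
            assert(!has_leader_);
            has_leader_ = true;
            // Leader: the only thread waiting for events (select() in TAO).
            cv_.wait(lock, [this] { return done_ || !events_.empty(); });
            const bool have_event = !events_.empty();
            if (have_event) events_.pop();
            // Hand leadership over before doing any work.
            has_leader_ = false;
            cv_.notify_all();
            if (!have_event) return;  // shut down and nothing left to do
            lock.unlock();
            // ... the upcall or reply handling would run here, unlocked ...
            lock.lock();
            ++handled_;
        }
    }

    int handled() {
        std::lock_guard<std::mutex> guard(mutex_);
        return handled_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<int> events_;
    bool has_leader_ = false;
    bool done_ = false;
    int handled_ = 0;
};

// Hypothetical driver: start the pool, post events, return how many were handled.
int run_demo(int threads, int events) {
    LeaderFollowerPool pool;
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i)
        workers.emplace_back([&pool] { pool.run(); });
    for (int e = 0; e < events; ++e) pool.post(e);
    pool.shutdown();
    for (auto& t : workers) t.join();
    return pool.handled();
}
```

If a thread blocks inside the "upcall" step without having released leadership, nobody is left to wait for events, which is the shape of the deadlock you describe.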
> I believe that in the bad case the CLIENT thread doesn't hand
> leadership to another thread. Is there a way to correct this
> problem? I would be very happy to receive any comments.
>
Sure. There are a few alternative strategies. I would first look at
refactoring your METHOD2 code so that you do not block waiting for the
AMI responses before returning. I assume AMI is used because you have to
invoke a number of long-running calls on other services, and the results
have to be gathered before proceeding. See if you can push that wait
back down to the originating client method. That way you don't have
servant implementation code blocking and leading to a deadlock.
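
As a rough illustration of that refactoring, here is an analogy in standard C++ using std::async and std::future. It does not use the TAO AMI API; remote_call, method2_start and client_gather are hypothetical names invented for this sketch. The point is only the shape: the METHOD2 stand-in starts the calls and returns at once, and the originating client is the one that waits.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Hypothetical stand-in for one long-running call on another service.
int remote_call(int x) { return x * x; }

// Stand-in for METHOD2 after refactoring: start the calls and return
// immediately with the pending results instead of blocking in the servant.
std::vector<std::future<int>> method2_start(const std::vector<int>& inputs) {
    std::vector<std::future<int>> pending;
    for (int x : inputs)
        pending.push_back(std::async(std::launch::async, remote_call, x));
    return pending;
}

// Stand-in for the originating client method: the wait now happens here.
int client_gather(std::vector<std::future<int>>& pending) {
    int sum = 0;
    for (auto& f : pending) {
        assert(f.valid());  // each result may be collected only once
        sum += f.get();     // blocks the client, not a servant thread
    }
    return sum;
}
```

In the real application the equivalent would be for the client side to issue the AMI calls (or to be told about the outstanding work) and collect the replies itself, so that no upcall ever sits blocked.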
If that isn't possible, I suggest you use the CSD_ThreadPool strategy
for your servant implementations. This can be used in conjunction with
the Leader Follower wait strategy in that invocations are handled by a
thread pool distinct from your CLIENT, ORB1, ORB2 threads.
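
The idea behind handing invocations to a separate pool can be sketched in standard C++ as below. This is not the CSD_ThreadPool implementation or its configuration; DispatchPool and dispatch_and_count are hypothetical names for illustration. The threads that receive requests only enqueue them, and a distinct set of workers runs the servant code, so a servant that blocks does not tie up a thread that should be waiting for events.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy request queue served by its own worker threads.
class DispatchPool {
public:
    explicit DispatchPool(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            workers_.emplace_back([this] { work(); });
    }

    // Drains whatever is still queued, then joins the workers.
    ~DispatchPool() {
        { std::lock_guard<std::mutex> guard(mutex_); stop_ = true; }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }

    // Called by the receiving (ORB-side) thread; returns without running req.
    void enqueue(std::function<void()> req) {
        assert(req);  // an empty request would be a caller bug
        { std::lock_guard<std::mutex> guard(mutex_); queue_.push(std::move(req)); }
        cv_.notify_one();
    }

private:
    void work() {
        for (;;) {
            std::function<void()> req;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and fully drained
                req = std::move(queue_.front());
                queue_.pop();
            }
            req();  // servant code runs here, off the receiving thread
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> queue_;
    bool stop_ = false;
    std::vector<std::thread> workers_;  // declared last: started after the rest
};

// Hypothetical helper: dispatch some requests and count how many were served.
int dispatch_and_count(unsigned workers, int requests) {
    std::atomic<int> served{0};
    {
        DispatchPool pool(workers);
        for (int i = 0; i < requests; ++i)
            pool.enqueue([&served] { ++served; });
    }  // destructor drains the queue and joins the workers
    return served.load();
}
```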
I suggest you open a support contract with OCI and we can work further
with you to resolve these issues.
Best regards,
Phil
--
Phil Mesnier
Principal Software Engineer and Partner, http://www.ociweb.com
Object Computing, Inc. +01.314.579.0066 x225