
WaitForSingleObject On Thread Handle In DLL_PROCESS_DETACH


Le Chaud Lapin

Nov 22, 2006, 8:23:54 PM
Hello,

I know the documentation says that waiting for death of a thread should
not be done in DLL_PROCESS_DETACH, but the software model I have would
really benefit from being able to do so, so I am doing it, and I
believe it is the source of a hang. Why does DisableThreadLibraryCalls
not help?

Is there a way to have one thread wait on the death of another in
DLL_PROCESS_DETACH?

TIA,

-Le Chaud Lapin-

Skywing [MVP]

Nov 22, 2006, 8:49:32 PM
This just will not work in most circumstances (as far as waiting for a
thread in *this process* to exit).

Here's the trouble: There is an internal NTDLL critical section that is
acquired whenever a call-out to DllMain happens (the so-called loader lock).
Now, consider that you are already in DllMain, which means that the current
thread already owns the loader lock.

If you have a thread that is cleanly terminating, that thread will (when you
call ExitThread, or return from the thread function) make a call to an
internal NTDLL routine which acquires the loader lock and then scans the
list of loaded DLLs, and for each DLL that does not have thread call-outs
disabled, makes a DllMain(DLL_THREAD_DETACH) call.

With this information, you should see the problem. Effectively, you have
this situation:

Thread-A: Has LoaderLock acquired, in DllMain at
WaitForSingleObject(Thread-B)
Thread-B: Inside the NTDLL routine to make call-outs for DLL_THREAD_DETACH
(called by ExitThread) and waiting on the LoaderLock

As you can see, neither thread will be able to make forward progress and a
deadlock results.
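
The Thread-A / Thread-B picture above can be modeled in portable C++. This is only a sketch: a plain std::mutex stands in for the loader lock, a flag plus condition variable stands in for the thread handle becoming signaled, and the function name worker_finishes_in_time is made up for illustration. The waiter uses a timeout so the model reports the stall instead of hanging the way the real WaitForSingleObject(INFINITE) would.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns true if the worker was observed to finish within `timeout`.
// `hold_loader_lock` models whether the waiting thread is inside DllMain.
bool worker_finishes_in_time(bool hold_loader_lock,
                             std::chrono::milliseconds timeout) {
    std::mutex loader_lock;             // stand-in for the NTDLL loader lock
    std::mutex state_mutex;
    std::condition_variable state_cv;
    bool exited = false;                // stand-in for the signaled thread handle

    // Thread-A: entering DllMain means the loader lock is already owned.
    std::unique_lock<std::mutex> in_dllmain(loader_lock, std::defer_lock);
    if (hold_loader_lock) in_dllmain.lock();

    std::thread worker([&] {
        // Thread-B: the exit path needs the loader lock to deliver
        // DLL_THREAD_DETACH before the thread can become signaled.
        std::lock_guard<std::mutex> detach_callouts(loader_lock);
        std::lock_guard<std::mutex> guard(state_mutex);
        exited = true;
        state_cv.notify_all();
    });

    bool finished;
    {
        std::unique_lock<std::mutex> wait_lock(state_mutex);
        finished = state_cv.wait_for(wait_lock, timeout, [&] { return exited; });
    }

    // Leaving "DllMain" releases the lock, which lets the worker finish.
    if (in_dllmain.owns_lock()) in_dllmain.unlock();
    worker.join();
    return finished;
}
```

With hold_loader_lock set to true the wait times out, because the worker cannot get past the lock until the waiter gives up. With it set to false the worker exits promptly and the wait succeeds.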

If you find yourself needing to be able to wait on threads in DllMain, then
your design is just not done correctly. There is not really any good way to
recover from this situation.

Note that a similar problem happens with thread creation, too. When a
thread is created, the first thing it does is call out to NTDLL, acquire the
LoaderLock, and make DllMain(DLL_THREAD_ATTACH) calls. As a result, you
can't block on a new thread in DllMain or you'll deadlock, just the same as
you can't block on waiting for a thread in the current process to exit
cleanly in DllMain.

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net
"Le Chaud Lapin" <jaibu...@gmail.com> wrote in message
news:1164245034.8...@k70g2000cwa.googlegroups.com...

Le Chaud Lapin

Nov 22, 2006, 9:57:16 PM
Skywing [MVP] wrote:
> Note that a similar problem happens with thread creation, too. When a
> thread is created, the first thing it does is call out to NTDLL, acquire the
> LoaderLock, and make DllMain(DLL_THREAD_ATTACH) calls. As a result, you
> can't block on a new thread in DllMain or you'll deadlock, just the same as
> you can't block on waiting for a thread in the current process to exit
> cleanly in DllMain.

Thanks for the thorough and clear explanation.

What is strange is that the hang happens only occasionally. The host
process is Explorer.EXE. The DLL is a shell namespace extension. I am
calling two C++ member functions on a static global C++ object:

p.enregister() - DLL_PROCESS_ATTACH
p.deregister() - DLL_PROCESS_DETACH

p.enregister() spawns a thread and goes about its business.

p.deregister() signals to that same thread to terminate, and waits for
it to terminate.

Because p is a global static object, the destructor of p will be
automatically invoked at program termination. The destructor of p, by
design, will call p.deregister() if the DLL is being unloaded
(absolutely necessary to call p.deregister() at some point before
unloading the DLL).

If I simply avoid doing a p.deregister() in DLL_PROCESS_DETACH, an
exception will be thrown, because, as it turns out, p.deregister()
depends on a global static mutex being in existence to work. That
mutex is not guaranteed to remain in existence until all global
objects that need it have been destructed, per C++ translation unit
rules. You can rest assured that explicitly controlling the order of
destruction of global objects by placing all global
constructible/destructible objects in one translation unit will ruin
the form of my code in ways I cannot describe.

So I figured the logical place to do enregister()/deregister() was in
DllMain.

But this still does not explain why hanging occurs only occasionally.
If it is true that the critical section is acquired before traversal of
the module list, then that would be true each time, so hanging should
occur every time. Just to be sure, I will run under the debugger to
make sure it works as I have it most times.

Any other ideas appreciated.

Regards,

-Le Chaud Lapin-

Le Chaud Lapin

Nov 22, 2006, 11:45:21 PM
Le Chaud Lapin wrote:
> But this still does not explain why hanging occurs only occasionally.
> If it is true that the critical section is acquired before traversal of
> the module list, then that would be true each time, so hanging should
> occur every time. Just to be sure, I will run under the debugger to
> make sure it works as I have it most times.

Update:

A more methodical debugging of my system (Explorer.exe + 3 custom DLLs)
revealed that waiting for a thread to terminate using
WaitForSingleObject in DLL_PROCESS_DETACH would work sometimes because
the thread that is being waited on had already pseudo-exited and
entered the "signaled" state (0 was being returned from
WaitForSingleObject). So it appears that I have a race condition: when
I cause Explorer.exe to exit, only rarely does the waiting thread enter
DllMain (perhaps after a few context switches) before the thread being
waited on has had a chance to terminate. In those rare cases the wait
actually blocks, and the result is deadlock, which is good (bad).

Come to think of it, there exists a possibility of disparity between
which thread executes the DLL_PROCESS_ATTACH and which executes the
DLL_PROCESS_DETACH. Sigh. Going to have to rethink this.

Thanks for clearing things up.

-Le Chaud Lapin-

adeb...@club-internet.fr

Nov 23, 2006, 4:12:10 AM

On 23 nov, 03:57, "Le Chaud Lapin" <jaibudu...@gmail.com> wrote:
> Skywing [MVP] wrote:
> > Note that a similar problem happens with thread creation, too. When a
> > thread is created, the first thing it does is call out to NTDLL, acquire the
> > LoaderLock, and make DllMain(DLL_THREAD_ATTACH) calls. As a result, you
> > can't block on a new thread in DllMain or you'll deadlock, just the same as
> > you can't block on waiting for a thread in the current process to exit
> > cleanly in DllMain.
>
> Thanks for the thorough and clear explanation.
>
> What is strange is that the hang happens only occasionally. The host
> process is Explorer.EXE. The DLL is a shell namespace extension. I am
> calling two C++ member functions on a static global C++ object:
>
> p.enregister() - DLL_PROCESS_ATTACH
> p.deregister() - DLL_PROCESS_DETACH
>
> p.enregister() spawns a thread and goes about its business.
>
> p.deregister() signals to that same thread to terminate, and waits for
> it to terminate.
>
> Because p is a global static object, the destructor of p will be
> automatically invoked at program termination. The destructor of p, by
> design, will call p.deregister() if the DLL is being unloaded
> (absolutely necessary to call p.deregister() at some point before
> unloading the DLL).
>
> If I simply avoid doing a p.deregister() in DLL_PROCESS_DETACH, an
> exception will be thrown, because, as it turns out, p.deregister()
> depends on a global static mutex being in existence to work. That
> mutex is not guaranteed to remain in existence until all global
> objects that need it have been destructed, per C++ translation unit
> rules. You can rest assured that explicitly controlling the order of
> destruction of global objects by placing all global
> constructible/destructible objects in one translation unit will ruin
> the form of my code in ways I cannot describe.

Beware that when DLLs come into play, VC does NOT respect C++ rules for
order of destruction of global objects. What happens is that each
DLL/EXE has its own stack of objects to be destroyed (the same stack
that can be accessed through atexit). When a global object is created,
its destruction is queued in the stack of whatever module is running
at construction time.

The "destruction stack" of each module is iterated over and destructors
are called during DLL_PROCESS_DETACH, after the user DllMain function
has been called.
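
To make the ordering concrete, here is a small portable model of the per-module "destruction stack". It is an illustration only, not the CRT's actual data structure: ModuleModel, register_destructor, process_detach, and detach_order are hypothetical names, and the log strings are just labels.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one module (DLL or EXE) and its private exit stack.
struct ModuleModel {
    std::vector<std::function<void()>> exit_stack;  // per-module "atexit" stack
    std::vector<std::string> log;                   // records what ran, in order

    // Called when a global object finishes construction inside this module.
    void register_destructor(const std::string& name) {
        exit_stack.push_back([this, name] { log.push_back("~" + name); });
    }

    // Models process detach: the user's DllMain runs first, then the
    // queued destructors run in reverse order of construction.
    void process_detach() {
        log.push_back("DllMain(DLL_PROCESS_DETACH)");
        while (!exit_stack.empty()) {
            auto destroy = std::move(exit_stack.back());
            exit_stack.pop_back();
            destroy();
        }
    }
};

// Two globals constructed in the order: mutex, then p.
std::vector<std::string> detach_order() {
    ModuleModel dll;
    dll.register_destructor("mutex");
    dll.register_destructor("p");
    dll.process_detach();
    return dll.log;
}
```

In this model the user DllMain entry is logged first, then ~p, then ~mutex, which is why code in DLL_PROCESS_DETACH still sees every global of its own module alive.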

Arnaud
MVP - VC

Pavel Lebedinsky [MSFT]

Nov 23, 2006, 6:52:03 AM
The usual approach is to perform all non-trivial cleanup before
DLL_PROCESS_DETACH (and global destructors) are invoked.

I'm not familiar with shell extensions but I assume they are COM
server DLLs. If the interface between the extension and its host
doesn't provide an explicit cleanup method, you can clean up when
the last object from the DLL is released. Most likely you already
have some kind of module refcount (I think ATL uses one to
implement DllCanUnloadNow), so you can clean up when it
becomes zero.

Note that COM can release your DLL (call FreeLibrary on it) even
if you return FALSE from DllCanUnloadNow. This means that each
worker thread created by the DLL must have its own refcount
to prevent the DLL from being unloaded while the thread is still
running. The way this is usually done is to addref the DLL (by calling
LoadLibrary or GetModuleHandleEx) before you create the thread.
When the worker thread exits it releases its refcount by calling
FreeLibraryAndExitThread.
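
The per-thread refcount idea can be sketched in portable C++ with std::shared_ptr playing the role of the module reference count. This is an analogy under stated assumptions, not Win32 code: Module and worker_keeps_module_loaded are hypothetical names, copying the shared_ptr corresponds to the LoadLibrary / GetModuleHandleEx step, and dropping the last copy corresponds to FreeLibraryAndExitThread.

```cpp
#include <atomic>
#include <future>
#include <memory>
#include <thread>

// Stand-in for the loaded DLL image; "unloaded" flips when the last
// reference goes away.
struct Module {
    explicit Module(std::atomic<bool>& flag) : unloaded(flag) {}
    ~Module() { unloaded = true; }
    std::atomic<bool>& unloaded;
};

// Returns true if the module stayed loaded while the worker was still
// running after the host released it, and was unloaded once the worker ended.
bool worker_keeps_module_loaded() {
    std::atomic<bool> unloaded{false};
    auto host_ref = std::make_shared<Module>(unloaded);  // host's reference

    std::promise<void> host_released;
    std::promise<bool> observed;

    // The worker's reference is taken *before* the thread starts running.
    std::thread worker([worker_ref = host_ref,
                        released = host_released.get_future(),
                        &observed]() mutable {
        released.wait();                                  // host has let go
        observed.set_value(!worker_ref->unloaded.load()); // still loaded?
        worker_ref.reset();                               // worker's final release
    });

    host_ref.reset();              // models the host calling FreeLibrary
    host_released.set_value();
    bool still_loaded = observed.get_future().get();
    worker.join();
    return still_loaded && unloaded.load();
}
```

The point of the ordering is that the reference is acquired on the creating thread, so there is no window in which the worker exists but the module count does not yet include it.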

--
This posting is provided "AS IS" with no warranties, and confers no
rights.

Ben Voigt

Nov 24, 2006, 12:10:28 PM

"Skywing [MVP]" <skywing_...@valhallalegends.com> wrote in message
news:esIjgGqD...@TK2MSFTNGP02.phx.gbl...

> This just will not work in most circumstances (as far as waiting for a
> thread in *this process* to exit).
>
> Here's the trouble: There is an internal NTDLL critical section that is
> acquired whenever a call-out to DllMain happens (the so-called loader
> lock). Now, consider that you are already in DllMain, which means that the
> current thread already owns the loader lock.
>
> If you have a thread that is cleanly terminating, that thread will (when
> you call ExitThread, or return from the thread function) make a call to an
> internal NTDLL routine which acquires the loader lock and then scans the
> list of loaded DLLs, and for each DLL that does not have thread call-outs
> disabled, makes a DllMain(DLL_THREAD_DETACH) call.
>
> With this information, you should see the problem. Effectively, you have
> this situation:
>
> Thread-A: Has LoaderLock acquired, in DllMain at
> WaitForSingleObject(Thread-B)
> Thread-B: Inside the NTDLL routine to make call-outs for DLL_THREAD_DETACH
> (called by ExitThread) and waiting on the LoaderLock

Couldn't this be resolved by waiting on an event instead of the thread
handle, because SetEvent won't need the LoaderLock as ExitThread does? If
you call SetEvent as the last line of your thread procedure this should be
just as good (the thread has not exited but it has finished using any
resources).

Skywing [MVP]

Nov 24, 2006, 2:51:32 PM
You are taking your life into your own hands with respect to that, though.
Is the thread function part of your DLL? If so, then it absolutely must
have completely finished execution and exited before DLL_PROCESS_DETACH, or
you have a race condition where the thread function may get unloaded out
from under the thread (even if there is just the "ret" instruction remaining
after a SetEvent call).

There are also a number of things that the second thread could be doing
which might inadvertently acquire the loader lock. Obviously, anything
dealing with the loaded module list (GetModuleHandle, GetModuleFileName, and
so forth) does this, but there are a number of APIs that may internally make
calls to things which acquire the loader lock. For example, CreateProcess
might try to load DLLs related to enforcement of software restriction
policies on certain systems with that group policy setting configured in a
certain way.

The bottom line is you should really try hard to avoid getting yourself into
this situation. There are a number of things preventing you from cleanly
and easily accomplishing that, and many of them are dependent on
environmental considerations. You don't want to get yourself into a sticky
situation where your program fails in strange ways at a customer site, but
never in your test environment due to this kind of thing.

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Ben Voigt" <r...@nospam.nospam> wrote in message
news:OlwwNt%23DHH...@TK2MSFTNGP03.phx.gbl...

Vladimir Scherbina

Nov 25, 2006, 8:05:01 AM
I think that signaling an object at the end of a thread routine and waiting on
that object in DllMain should work. At the same time I agree with your
bottom line.

--
Vladimir (Windows SDK MVP)

"Skywing [MVP]" <skywing_...@valhallalegends.com> wrote in message

news:ODf9xHAE...@TK2MSFTNGP02.phx.gbl...

Le Chaud Lapin

Nov 26, 2006, 12:14:52 AM
Vladimir Scherbina wrote:
> I think that signalling object at the end of a thread routine and waiting on
> that object in DllMain should work. At the same time I agree with your
> bottom line.

I just took a look at the specification for ExitProcess
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/exitprocess.asp
and, I could be wrong, but it appears that there is no easy solution to
this problem, not even using the SetEvent trick (in general).

The problem has to do with the way threads are brutally preempted when
ExitProcess is invoked. Each thread does, indeed, get its turn at the
module list when ExitProcess is invoked, but the process manager takes
no regard whatsoever for whatever they were doing just before
preemption. This means that each preempted thread could have an
arbitrary thread context just after invocation of ExitProcess (and
while inside DllMain), and there can be no assurance that any thread
will have a chance to exit cleanly. In fact, they simply sit in a
zombie state, suspended, waiting for their chance to have at the module
list. By the time the controlling thread has entered DllMain with
DLL_PROCESS_DETACH, ExitProcess has already been called, so there is no
opportunity for it to tell the other threads, "Hey all, it is time to
die, please set your events and exit."

-Le Chaud Lapin-

Skywing [MVP]

Nov 26, 2006, 11:42:14 AM
ExitProcess does a hard terminate of all threads that aren't the current
thread and then calls NTDLL for process detach. At least on Windows Vista,
this involves acquiring the loader, PEB, and heap (for the default process
heap) locks, hard terminating all threads in the process but the current
thread, then invoking DLL deinitializers and finally terminating the current
thread. This is why the documentation says to not call ExitProcess if you
don't know the state of other threads in your process. The acquisition of
the loader, PEB, and process heap locks before other threads are killed
ensures that after every other thread is killed (assuming all DllMains are well
behaved), there won't be any deadlock between the exiting thread and the
terminated threads. However, there is no guarantee about the validity of
any locks besides these three. Additionally, as far as I know, the whole
logic of acquiring the three critical locks before terminating threads is
new to Vista; downlevel systems are subject to races when you call
ExitProcess with other threads still running.

If the process is damaged (such that it is impossible to acquire one of
those three locks), then ExitProcess will immediately deadlock.

The bottom line is that there is no graceful cleanup for threads other than
the current thread when ExitProcess is called, and that you should have
really already let those threads exit gracefully before you called
ExitProcess in the first place. There is no magical way that ExitProcess
can tell every other thread in the process to finish what it was doing and
clean up properly. Resist the temptation to treat it as such.
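
A minimal sketch of what "let those threads exit gracefully first" can look like, in portable C++. The names BackgroundWorker and clean_shutdown are hypothetical, and the loop body is a placeholder for real work. The key design choice is that stop() is an ordinary function called by the owner before exit or unload, not something run from DllMain or a global destructor.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

class BackgroundWorker {
public:
    void start() {
        stop_requested_ = false;
        thread_ = std::thread([this] {
            // Poll the stop flag between units of (placeholder) work.
            while (!stop_requested_.load()) {
                ++iterations_;
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        });
    }

    // Call from ordinary code before the process exits or the DLL is
    // released; it asks the thread to finish and waits for it to do so.
    void stop() {
        stop_requested_ = true;
        if (thread_.joinable()) thread_.join();
    }

    bool running() const { return thread_.joinable(); }
    long iterations() const { return iterations_.load(); }

private:
    std::atomic<bool> stop_requested_{false};
    std::atomic<long> iterations_{0};
    std::thread thread_;
};

// Start the worker, then shut it down explicitly before returning.
bool clean_shutdown() {
    BackgroundWorker worker;
    worker.start();
    worker.stop();
    return !worker.running();
}
```

Because the join happens outside any loader callback, the worker is free to take whatever locks its exit path needs.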

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net
"Le Chaud Lapin" <jaibu...@gmail.com> wrote in message

news:1164518092....@l12g2000cwl.googlegroups.com...

Le Chaud Lapin

Nov 26, 2006, 12:40:51 PM
Skywing [MVP] wrote:
> The bottom line is that there is no graceful cleanup for threads other than
> the current thread when ExitProcess is called, and that you should have
> really already let those threads exit gracefully before you called
> ExitProcess in the first place. There is no magical way that ExitProcess
> can tell every other thread in the process to finish what it was doing and
> clean up properly. Resist the temptation to treat it as such.

It's not my doing, it's Microsoft's. They are the ones who wrote the
application.

D.S.,

-Le Chaud Lapin-

Pavel Lebedinsky [MSFT]

Nov 26, 2006, 10:44:58 PM
"Vladimir Scherbina" wrote:

>I think that signalling object at the end of a thread routine and waiting
>on that object in DllMain should work. At the same time I agree with your
>bottom line.

No, this doesn't work reliably, because of the race condition that
Skywing mentioned (assuming we're talking about the dynamic
unload case, not the ExitProcess case. With ExitProcess you know
that no other threads are running when you receive DLL_PROCESS_DETACH,
so there's no point in waiting for them).

There is a similar problem with dynamically unloading DLLs and
thread pool callbacks. To solve it, Vista thread pool added a few
cleanup functions like SetEventWhenCallbackReturns and
FreeLibraryWhenCallbackReturns.

Ben Voigt

Dec 4, 2006, 10:13:24 AM

"Skywing [MVP]" <skywing_...@valhallalegends.com> wrote in message
news:ODf9xHAE...@TK2MSFTNGP02.phx.gbl...

> You are taking your life into your own hands with respect to that, though.
> Is the thread function part of your DLL? If so, then it absolutely must
> have completely finished execution and exited before DLL_PROCESS_DETACH,
> or you have a race condition where the thread function may get unloaded
> out from under the thread (even if there is just the "ret" instruction
> remaining after a SetEvent call).

Sorry, I missed that the OP was talking about PROCESS_DETACH, I was replying
to your earlier response about THREAD_DETACH.

If one of the application's threads exits (cleanly, via call to ExitThread
or return from ThreadProc), then the library may need to clean up additional
resources associated with that thread, which could include helper threads.

The correct sequence for that is:

1. Call LoadLibrary and hEvent = CreateEvent before creating the helper
   thread.
2. When processing DLL_THREAD_DETACH, use WaitForSingleObject(hEvent).
3. The helper thread procedure exits by calling SetEvent followed by
   FreeLibraryAndExitThread.

