
SuspendThread ... GetThreadContext returns "false" error code 5 ... why?


Ira Baxter

Aug 9, 2010, 4:57:51 PM
We have an application in which one thread can stop another to inspect
its state, by doing SuspendThread/GetThreadContext/ResumeThread.

Extremely rarely, on a multicore system, GetThreadContext fails and
GetLastError reports error code 5 (Windows system error code "Access
Denied"). We are checking the return status of SuspendThread and
ResumeThread; they aren't complaining, ever.

How can it be the case that I can suspend a thread, but can't access its
context?

This blog
http://www.dcl.hpi.uni-potsdam.de/research/WRK/2009/01/what-does-suspendthread-really-do/
suggests that SuspendThread, when it returns, may have *started* the
suspension of the other thread, but that thread hasn't yet suspended.
In this case, I can kind of see how GetThreadContext would be
problematic, but this seems like a stupid way to define SuspendThread.

(How would the caller of SuspendThread know when the target thread was
actually suspended?)

Any help appreciated.

-- IDB


Daniel Terhell

Aug 11, 2010, 6:28:22 AM
The suspended thread might be temporarily "borrowed" for APC execution. This
might or might not be the problem and it might even stop threads from
suspending. What I would do is try it N times in a loop and sleep if it
fails.
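
A minimal sketch of the "try it N times and sleep if it fails" idea,
written as portable C so the loop can be read on its own. The names
retry_bounded, try_fn, pause_fn and fail_until_zero are hypothetical
and not part of any Windows API; in a real program the operation would
presumably wrap GetThreadContext and the pause would wrap Sleep. It
says nothing about what value of N is appropriate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types, not part of any Windows API. */
typedef bool (*try_fn)(void *arg);     /* returns true on success */
typedef void (*pause_fn)(unsigned ms); /* e.g. a thin wrapper around Sleep() */

/* Calls op up to max_attempts times, pausing between failed attempts.
   Returns the number of attempts used on success, or 0 if all failed. */
static unsigned retry_bounded(try_fn op, void *arg, unsigned max_attempts,
                              pause_fn pause, unsigned pause_ms)
{
    for (unsigned attempt = 1; attempt <= max_attempts; ++attempt) {
        if (op(arg))
            return attempt;
        /* No point pausing after the final failed attempt. */
        if (pause != NULL && attempt < max_attempts)
            pause(pause_ms);
    }
    return 0;
}

/* Stand-in operation for illustration only: fails while the counter
   pointed to by arg is positive, decrementing it on each failure. */
static bool fail_until_zero(void *arg)
{
    int *remaining = arg;
    if (*remaining > 0) {
        --*remaining;
        return false;
    }
    return true;
}
```

Returning the attempt count rather than a plain success flag makes it
possible to log how often the retry path is actually taken.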

//Daniel


"Ira Baxter" <idba...@semdesigns.com> wrote in message
news:e$QsMWAOL...@TK2MSFTNGP02.phx.gbl...

Ira Baxter

Aug 21, 2010, 3:11:56 PM
Daniel,

"Might"? Are you making an educated guess that this could happen (I'm
an old OS designer, and I can understand how one might guess this as a
plausible implementation), or are you asserting this is a real
possibility?

Even if it were, why would the behavior of the external GetThreadContext
function be affected
(e.g., if this were the case, why wouldn't MS have hidden your loop inside
the GetThreadContext call)?

Assuming you are right, *and* assuming that it takes "some time" for
the APC execution to occur, and knowing that the amount of work is
dependent on CPU speeds and whatever code happens to be hiding in APC
processing and how much other work there is to do, how does one
sensibly choose an appropriate N? (Yes, I could pick an arbitrarily big
one, but that isn't "design", it's witchcraft and just sets me up for
failure in the future.)

The MSDN documentation on GetThreadContext is terrible. It does say that
"access denied"
is possible, but it gives no clues as to *why*.

[Can a MS person look into this, please?]


-- IDB

"Daniel Terhell" <dan...@resplendence.com> wrote in message
news:2EC67687-76FF-4277...@microsoft.com...

Hector Santos

Aug 21, 2010, 8:08:03 PM
I think you made an assumption (no reason to expect it ain't valid)
that you have a serial sequence of events:

SuspendThread()
GetThreadContext()
ResumeThread()

Since it's technically possible that SuspendThread could be competing
with ResumeThread, it seems pretty reasonable to expect, when calling a
"data" access function (of any kind), that there might be a
synchronization issue.

I trust when you say you check the status of
SuspendThread()/ResumeThread, that means that under an assumed serial
non-competing model, it would be:

    if (SuspendThread(h) == 0) {          /* previous suspend count was 0 */
        if (GetThreadContext(h, &ctx)) {  /* ctx.ContextFlags set beforehand */
            .....
        }
        if (ResumeThread(h) != 1) {       /* previous count should be 1 */
            ... out of sync ...
        }
    }

If you cannot guarantee that suspension returns N and the matching
resumption returns N+1 (both calls return the previous suspend count),
then you have an indeterminate environment that is more difficult to
predict. You would simply have to accept this possibility and use a
loop or something when you get error 5.
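
To make the return-value bookkeeping concrete, here is a toy model,
assuming only the documented rule that SuspendThread and ResumeThread
each return the thread's previous suspend count. The type toy_thread
and the functions toy_suspend, toy_resume and balanced_pair are
hypothetical stand-ins, not the Windows API, and the model ignores
failure returns and concurrency entirely.

```c
/* Toy stand-in for a thread's suspend count; NOT the Windows API. */
typedef struct {
    unsigned suspend_count;
} toy_thread;

/* Mirrors the documented SuspendThread result: the count before the
   increment. */
static unsigned toy_suspend(toy_thread *t)
{
    return t->suspend_count++;
}

/* Mirrors the documented ResumeThread result: the count before the
   decrement. A result of 0 means the thread was not suspended, and
   the count is left unchanged. */
static unsigned toy_resume(toy_thread *t)
{
    unsigned previous = t->suspend_count;
    if (previous > 0)
        t->suspend_count--;
    return previous;
}

/* Returns 1 if a suspend/resume pair saw the values expected when no
   other suspender is involved (0 from suspend, 1 from resume). */
static int balanced_pair(toy_thread *t)
{
    unsigned before = toy_suspend(t);
    /* ... the context would be inspected here ... */
    unsigned after = toy_resume(t);
    return before == 0 && after == 1;
}
```

If some other party has already suspended the thread, the pair returns
1 and then 2 instead, which is the "out of sync" case.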

On the other hand, if we are trying to figure out the kernel and APC
related issues, I think what begins to happen here is that you run
into OS related issues. Is this VISTA, W7, XP, 2003, 2008, NT or even
95? We had a long thread here within the last year or so regarding
how parent/child threads start and terminate and the timing involved,
with the OP asking the same type of question. He expected a certain
status, and the errors were different depending on a few factors,
including the OS type and # of CPUs. The unpredictability of the
timing was pretty much the consensus of that long long long thread.

--
HLS

Daniel Terhell

Aug 22, 2010, 3:42:53 AM
APCs are a rather complicated topic, but a kernel mode APC can
temporarily borrow the context of a user mode thread that's in an
alertable wait state for its execution, so that the thread is
temporarily removed from its wait state and from the list of waiters
of a dispatcher object. For this reason ResumeThread will not work in
case the thread is borrowed for APC execution (this can also fail).
Also for this reason PulseEvent is not reliable: it might not wake a
thread that's waiting on the event because the thread might be
borrowed for APC execution, and any synchronization algorithm that
relies on this is inherently broken. You can look this up in MSDN
(PulseEvent). It makes sense to me that you cannot call
GetThreadContext on a thread that was removed from its wait state and
used for APC delivery, given the fact that you cannot call it on a
running thread.

I would choose N to be a high number (say 100). Given the low
probability of APC execution on your thread, it becomes statistically
impossible for this to become a show stopper. This is definitely a
design bug in the OS dispatcher, but given the low probability and the
warnings given, it is something that can be easily worked around.

//Daniel


"Ira Baxter" <idba...@semdesigns.com> wrote in message

news:ebEh6SWQ...@TK2MSFTNGP04.phx.gbl...

Daniel Terhell

Aug 22, 2010, 4:20:06 AM
BTW, for an MS article on this, and on how it also affects
Get/SetThreadContext, read the comments below
http://blogs.msdn.com/b/oldnewthing/archive/2005/01/05/346888.aspx

//Daniel
