
How to recover from an EXCEPTION_STACK_OVERFLOW?


Aurelien Regat-Barrel

Nov 20, 2006, 5:13:18 AM
Hi,
I am writing an exception filter for my app that displays a message box
and kills the app. The problem is that it does nothing if the exception
is a stack overflow, because some stack space is needed to execute my
error handling code...
I tried to use _resetstkoflw() but it doesn't help. Is there a way to
recover some stack space in order to execute my handler before killing
the app?
I am thinking about:
- modifying ESP by hand (risky)
- unblocking a thread in my handler, and executing my error handling code
in that thread

Do you have a better idea?

Thanks.

--
Aurélien Regat-Barrel

Jochen Kalmbach [MVP]

Nov 20, 2006, 5:23:39 AM
Hello Aurelien!

> - modifying ESP by hand (risky)

static LONG __stdcall CrashHandlerExceptionFilter(EXCEPTION_POINTERS* pExPtrs)
{
    if (pExPtrs->ExceptionRecord->ExceptionCode == EXCEPTION_STACK_OVERFLOW)
    {
        static char MyStack[1024*128];  // be sure that we have enough space...
        // it assumes that DS and SS are the same!!! (this is the case for Win32)
        // change the stack only if the selectors are the same (this is the case for Win32)
        //__asm push offset MyStack[1024*128];
        //__asm pop esp;
        __asm mov eax, offset MyStack[1024*128];
        __asm mov esp, eax;
    }

    // TODO: ...
}

Aurelien Regat-Barrel

Nov 20, 2006, 8:11:00 AM
Jochen Kalmbach [MVP] wrote:
> Hello Aurelien!

Hello,

Thanks. I guess I can no longer use pExPtrs after modifying esp, nor any
local variable even if declared after the change?

--
Aurélien Regat-Barrel

Jochen Kalmbach [MVP]

Nov 20, 2006, 8:23:56 AM
Hi Aurelien!

>> static LONG __stdcall CrashHandlerExceptionFilter(EXCEPTION_POINTERS* pExPtrs)
>> {
>>     if (pExPtrs->ExceptionRecord->ExceptionCode == EXCEPTION_STACK_OVERFLOW)
>>     {
>>         static char MyStack[1024*128];  // be sure that we have enough space...
>>         // it assumes that DS and SS are the same!!! (this is the case for Win32)
>>         // change the stack only if the selectors are the same (this is the case for Win32)
>>         //__asm push offset MyStack[1024*128];
>>         //__asm pop esp;
>>         __asm mov eax, offset MyStack[1024*128];
>>         __asm mov esp, eax;
>>     }
>>
>>     // TODO: ...
>> }
>
> Thanks. I guess I can no longer use pExPtrs after modifying esp, nor any
> local variable even if declared after the change?

Local variables and parameters are addressed via "ebp", and this still
points to the "correct" value (because I have not changed "ebp").

But you should not use any local variables, because they are still
allocated on the (unswitched) original stack!

By the way: There is *no* reliable way to catch unhandled exceptions
in-process!
The only reliable way is to let your program run under a debugger...

Greetings
Jochen

Aurelien Regat-Barrel

Nov 20, 2006, 10:06:03 AM
Jochen Kalmbach [MVP] wrote:

> Local variables and parameters are addressed via "ebp". And this points
> to the "correct" value (because I have not changed "ebp").
>
> But you should not use any local variables, because this increases the
> (unchanged) stack!
>
> By the way: There is *no* reliable way to catch unhandled exceptions
> in-process!
> The only reliable way is to let your program run under a debugger...

You might have guessed that I am writing a "graceful" exit for my app in
the end-user version. The goal is to display a "sorry for this" message
and log some info about the crash. I found a way to get the name of the
unhandled exception type with VC++. I am not yet sure what I will do
with it, but I give it here, it might help someone (provided "as is"):

#include <iostream>
#include <typeinfo>

#define EXCEPTION_MSC 0xE06D7363 // 0xE0 + 'msc'

if ( ExceptionInfo->ExceptionRecord->ExceptionCode == EXCEPTION_MSC &&
     ExceptionInfo->ExceptionRecord->NumberParameters == 3 )
{
    ULONG_PTR param0 = ExceptionInfo->ExceptionRecord->ExceptionInformation[ 0 ];
    ULONG_PTR param1 = ExceptionInfo->ExceptionRecord->ExceptionInformation[ 1 ];
    ULONG_PTR param2 = ExceptionInfo->ExceptionRecord->ExceptionInformation[ 2 ];

    DWORD magicNumber = static_cast<DWORD>( param0 );
    if ( magicNumber == 0x19930520 ) // 1993/05/20
    {
        void *pExceptionObject = reinterpret_cast<void*>( param1 );
        const _s__ThrowInfo *pThrowInfo =
            reinterpret_cast<const _s__ThrowInfo*>( param2 );
        _TypeDescriptor *pType =
            pThrowInfo->pCatchableTypeArray->arrayOfCatchableTypes[ 0 ]->pType;

        type_info *info = reinterpret_cast<type_info*>( pType );
        std::cout << "Microsoft C++ Exception: " << info->name() << '\n';
    }
}

It's ugly, but it works :-)

--
Aurélien Regat-Barrel

Günter Prossliner

Nov 20, 2006, 10:38:28 AM
Hi Jochen!

> By the way: There is *no* reliable way to catch unhandled exceptions
> in-process!

I have plans to create a managed C++ assembly which uses the
SetUnhandledExceptionFilter API to create a thread that uses
"MiniDumpWriteDump" to create an ("adplus"-like) memory dump (and performs
additional things like compressing, uploading to a server, ...).

According to your statement above, this is not a good idea. Why? On MSDN I
have not read that calling it in-process is unreliable. It just says that it
may not produce a valid stack trace for the calling thread. This is no
problem for me, since the calling thread (which does nothing but call
"MiniDumpWriteDump") will be filtered out by using the
MINIDUMP_CALLBACK_INFORMATION parameter.

Would this work?
Or can I forget about it?

> The only reliable way is to let your ptogram run under a debugger...

One of the first questions I asked myself when thinking about this
project was: "Would it not be better to do this from within another
process (a debugger)?". But AFAIK, in WinNT (XP) debuggers can only be
registered in the Registry, and globally for all applications. Or am
I wrong?

It would be really cool to have a function like "SetDebuggerOnCrash(PCTSTR
debuggerCommandLine)", which specifies the debugger that will be used (just
for this application) on crash.


GP


Dan Mihai [MSFT]

Nov 20, 2006, 11:37:37 AM
I recommend leveraging whenever possible the Error Reporting mechanisms
provided by the Operating System, instead of developing home-grown
solutions. http://msdn.microsoft.com/isv/resources/wer/ has information
about that.

The OS developers encountered many of these fragility problems around error
reporting, solved some of them already and keep working on improving this
application crash feedback loop.

Dan

--

This posting is provided "AS IS" with no warranties, and confers no rights.


"Günter Prossliner" <g.prossliner/gmx/at> wrote in message
news:eRX%23unLDH...@TK2MSFTNGP04.phx.gbl...

Günter Prossliner

Nov 20, 2006, 12:24:18 PM
Hi Dan!

> I recommend leveraging whenever possible the Error Reporting
> mechanisms provided by the Operating System, instead of developing
> home-grown solutions. http://msdn.microsoft.com/isv/resources/wer/
> has information about that.

In general, I would agree with your statement.

But IMO the WER has some significant drawbacks:

* the data is sent to Microsoft instead of your own server
* you have no control over the UI being displayed ("Microsoft" instead of
"Company X").

Not that I think MS is doing something "evil" with the data. The
problem is that the user does not think that MS is responsible for "Product
X", and will probably never send the error report.

> The OS developers encountered many of these fragility problems around
> error reporting, solved some of them already and keep working on
> improving this application crash feedback loop.

I think this is the right direction, but I hope there will be more
customization possible in future versions.


GP


Oleg Starodumov

Nov 20, 2006, 1:24:56 PM
> I have plans to create a managed c++ assembly, which uses the SetUnhandledExceptionFilter API to create a thread which
> uses "MiniDumpWriteDump" to create a ("adplus" like) memory dump (and performs additional things like compressing,
> upload to a server, ...).
>
> According to your statement above, this is no good idea. Why?

It's not necessarily a bad idea, IMO. You simply should be aware that when
you run heavy operations (like MiniDumpWriteDump) in a process
that has just exhibited an unhandled exception, the process state can be
corrupted badly enough for those operations to fail.
In managed applications, though, the probability of such fatal state
corruption is lower than in native ones.

Various issues related to the reliability of exception filters and just-in-time
debugging have been discussed in this newsgroup before; e.g. you may find
this thread interesting:
http://groups.google.fi/group/microsoft.public.win32.programmer.kernel/browse_thread/thread/aa0bff4829bf3a44

>> The only reliable way is to let your program run under a debugger...
>
> One of the first questions I asked myself when thinking about this project was: "Would it not be better to do this
> from within another process (a debugger)?". But AFAIK in WinNT (XP) debuggers can only be registered in the
> Registry, and globally for all applications. Or am I wrong?
>
>

There are two basic options:

1) Use a just-in-time debugger configured in the Registry, system-wide.
The problem with the reliability of just-in-time debugging is the same as with MiniDumpWriteDump -
the JIT debugger has to be started from inside the crashed process, using the CreateProcess
function, and CreateProcess itself can fail because of corruption of the process' state
(e.g. the process heap).

2) Run the application under a debugger all the time (that's probably what Jochen meant,
and that's also what ADPlus (crash mode) and similar tools do).
This is much more reliable than JIT debugging, since it is not affected by the possibly
corrupted internals of the crashed process. Also, it does not depend on the Registry.

Other approaches exist too; see the discussion at the link above (e.g. launch your own
debugger using CreateProcess, use a non-debugging watchdog process that creates
the dump, etc.).

--
Regards,
Oleg
[VC++ MVP http://www.debuginfo.com/]


Skywing [MVP]

Nov 20, 2006, 2:08:03 PM
As Oleg pointed out, calling MiniDumpWriteDump is extremely likely to fail
in a corrupted process. I would either use WER or start some kind of
watcher process ahead of time which looks for exceptions from the watched
process and makes the MiniDumpWriteDump call out of process. With Windows
Vista, the WER support is *much* expanded from XP and is actually a viable
solution for error reporting now.

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net


"Günter Prossliner" <g.prossliner/gmx/at> wrote in message
news:eRX%23unLDH...@TK2MSFTNGP04.phx.gbl...

Skywing [MVP]

Nov 20, 2006, 2:28:16 PM
You cannot rely on this unless the exception filter function itself uses
SEH, has C++ objects with unwinding, or you have explicitly disabled FPO
optimizations...

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Jochen Kalmbach [MVP]" <nospam-Joch...@holzma.de> wrote in message
news:%235x7FcK...@TK2MSFTNGP02.phx.gbl...

RichK

Nov 20, 2006, 3:14:37 PM

Some ideas that have worked for me are:
A) Reduce the number of local variables in your exception handling, so the
handler fits in what little stack is available.
B) Start a different thread to perform the shutdown.
C) On Windows 2003 there is an API, SetThreadStackGuarantee, which arranges
for a specific amount of stack space to be available during the exception
processing of stack overflows. Call this API early in the life of each
thread, and you will have more stack space to work with when a stack
overflow occurs. (This API causes the stack overflow to occur sooner,
leaving more room for the exception handler.)

As you have noticed, gracefully recovering (or even gracefully failing)
from a stack overflow can be difficult and unreliable.

"Aurelien Regat-Barrel" <nospam....@yahoo.fr.invalid> wrote in message
news:uxyOHzID...@TK2MSFTNGP04.phx.gbl...

Jochen Kalmbach [MVP]

Nov 21, 2006, 1:45:17 AM
Hi Günter!

>> By the way: There is *no* reliable way to catch unhandled exceptions
>> in-process!
>

> According to your statement above, this is no good idea. Why?

The reason is simple: if an unhandled exception is thrown, you do not know
what really happened. Of course, this is normal.
Let us create one scenario:
Some API has overwritten some memory region. If this region is in a
data region, not much damage occurs and the OS should be able to
call the unhandled exception filter.
Scenario 2:
Your app has (for some reason) overwritten code regions (for example,
your unhandled exception filter). Then maybe the OS was able to call
your function, but you are not able to do any successful action.
Scenario 3:
The app has overwritten some code region in some OS DLLs
(copy-on-write). Then you call some functions in this DLL and yet another
exception will occur...

Conclusion: There is *no* reliable way to catch *every* unhandled
exception within the same process.


Of course, the same is true for Dan's suggestion. The OS is also unable
to catch all exceptions, because it relies on the
default implementation of the unhandled exception filter inside the
dying process. If scenario 2 or 3 applies, then WER is also not
able to report the problem.


See also other answers...

Greetings
Jochen

Aurelien Regat-Barrel

Nov 21, 2006, 3:41:08 AM
> Scenario 2:
> Your app has (for some reason) overwritten code regions (for example,
> your unhandled exception filter). Then maybe the OS was able to call
> your function, but you are not able to do any successful action.
> Scenario 3:
> The app has overwritten some code region in some OS DLLs
> (copy-on-write). Then you call some functions in this DLL and yet another
> exception will occur...

Is it possible to write to a code region without having called
VirtualProtect, i.e. without explicitly wanting to do so?

--
Aurélien Regat-Barrel

Jochen Kalmbach [MVP]

Nov 21, 2006, 3:49:21 AM
Hi Aurelien!

> Is it possible to write to a code region without having called
> VirtualProtect, i.e. without explicitly wanting to do so?

AFAIK: At least in W2k...

Greetings
Jochen

Aurelien Regat-Barrel

Nov 21, 2006, 7:17:56 AM
RichK wrote:

> Some ideas that have worked for me are:
> A) Reduce the amount of local variables in your exception handling, to get
> the handling to fit in what little stack is available.
> B) Start a different thread to perform the shutdown.
> C) On Windows 2003 there is an API SetThreadStackGuarantee, which arranges
> for a specific amount of stack space to be available during the exception
> processing of stack overflows. Call this API early in the life of each
> thread, and you will have more stack space to work with when a stack
> overflow occurs. (This API causes the stack overflow to occur sooner,
> leaving more room for the exception handler.)

Thanks for your reply. I will have a look at SetThreadStackGuarantee,
whose existence I didn't know about.

> As you have noticed, gracefully recovering (or even gracefully failing)
> from a stack overflow can be difficult and unreliable.

Do you know what _resetstkoflw() is good for?

--
Aurélien Regat-Barrel

Skywing [MVP]

Nov 21, 2006, 10:04:36 AM
It is meant to be used with _alloca allocation failures, at the point of
failure (not as a UEF).

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Aurelien Regat-Barrel" <nospam....@yahoo.fr.invalid> wrote in message

news:OUG1bdWD...@TK2MSFTNGP04.phx.gbl...

Skywing [MVP]

Nov 21, 2006, 10:03:47 AM
Depends on the protection specified for that subsection in the PE header.

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net
"Aurelien Regat-Barrel" <nospam....@yahoo.fr.invalid> wrote in message

news:el57RkUD...@TK2MSFTNGP03.phx.gbl...

Günter Prossliner

Nov 22, 2006, 6:56:40 AM
Hi Oleg!

> You simply should be aware that when you run heavy operations (like
> MiniDumpWriteDump) in the process
> that has just exhibited unhandled exception, the process state can be
> corrupted badly enough for those operations to fail.

I thought about it.

> In managed applications, probability of such fatal state corruptions
> is lower than in native ones, though.

That is exactly what I thought too. The primary purpose is to catch unhandled
_managed_ exceptions. In most cases it should be possible to create a
minidump even if there has been an unhandled managed exception.

In case of a StackOverflowException, I will try to use Jochen's assembler
routine to recover from that.

> Various issues related with reliability of exception filters and
> just-in-time debugging have been discussed in this newsgroup before,
> e.g. you can find this thread interesting:
> http://groups.google.fi/group/microsoft.public.win32.programmer.kernel/browse_thread/thread/aa0bff4829bf3a44

Your posting within this thread (the last one) is quite informative! Maybe I
can implement more than one method to create the minidump:

Method A: The registered custom exception filter calls MiniDumpWriteDump
from a new thread within the same process.
Method B: The registered custom exception filter signals a watchdog process,
which creates the dump.
Method C: The watchdog process is attached as a debugger to the "real"
process and creates the dump without the need to set up a custom exception
filter.

The user of the component may choose what is necessary in the concrete
case.

Maybe you can answer the following question too:
if Method C is implemented, how much performance does it cost? And on
which operations? Where does the performance degradation come from? The
debugger events?

> There are two basic options:
>
> 1) Use just-in-time debugger configured in Registry, system-wide.
> The problem with reliability of just-in-time debugging is the same as
> with MiniDumpWriteDump - JIT debugger has to be started from the
> inside of the crashed process, using CreateProcess function, and
> CreateProcess itself can fail because of corruption of the process'
> state (e.g. process heap).

OK. If I understand this correctly, choosing this method will not be any
safer than creating a thread in-process which creates the dump (new process
vs. new thread).

> 2) Run the application under debugger all the time (that's what
> Jochen meant, probably, and that's also what ADPlus (crash mode) and
> similar tools do).
> This is much more reliable than JIT debugging, since it is not
> affected by the possibly corrupted internals of the crashed process.
> Also, it does not depend on Registry.

This would be Method C.


A last question: can Method C be implemented without installing the
"Debugging Tools for Windows" on the user's machine? This is the main reason
for me to implement my own component. The first solution I tried to
implement was to create a "mini-installation" of the Debugging Tools for
Windows (just cdb, adplus and the needed DLLs). But when looking at
redist.txt, you can see that only "dbghelp.dll", "symsrv.dll" and
"srcsrv.dll" may be redistributed.

"MiniDumpWriteDump" is implemented in "dbghelp.dll", so distribution
would be possible. But can I implement a "watchdog debugger" using just
the redistributable DLLs?

GP


Günter Prossliner

Nov 22, 2006, 7:09:31 AM
Hi Jochen!

> The reason is simple:
> ...
> Conclusion: There is *no* reliable way to catch *every* unhandled
> exception within the same process.

OK. I understand this now. But the primary purpose is to create dumps
from managed exceptions. I think that nearly all managed exceptions can be
caught by the unhandled exception filter (when not using unsafe code).

In case of a StackOverflowException, I will try to use your assembler
routine. This should be possible, or not? In case of an OutOfMemory
condition, there _may_ not be enough memory for creating the dump.

> Of course, the same is true for Dan's suggestion. The OS is also
> unable to catch all exceptions, because it relies on the
> default implementation of the unhandled exception filter inside the
> dying process. If scenario 2 or 3 applies, then WER is also
> not able to report the problem.

What do you think about the Method A, Method B, Method C implementation (see
my answer to Oleg)?


GP


Oleg Starodumov

Nov 22, 2006, 9:06:14 AM

> Maybe you can answer me the following question too:
> If Method C will be implemented, how much performance does it cost? And one
> which operations? Where does the performance - degradation come from? The
> Debugger-Events?
>

Yes, the performance hit comes from dispatching debug events to the debugger.
So if the application generates lots of events (usually exceptions, but also
debug output, module load/unload, thread start/exit, etc.), it can be slowed
down. If there are not too many debug events, performance will not be
seriously affected.

This is if you attach the debugger to an already running application. If you
start the app under the debugger (which you shouldn't do, IMO), there will be
lots of various debug checks enabled by default (heap, etc.), which will hurt
performance too.

> > There are two basic options:
> >
> > 1) Use just-in-time debugger configured in Registry, system-wide.
> > The problem with reliability of just-in-time debugging is the same as
> > with MiniDumpWriteDump - JIT debugger has to be started from the
> > inside of the crashed process, using CreateProcess function, and
> > CreateProcess itself can fail because of corruption of the process'
> > state (e.g. process heap).
>
> Ok. If I understand this correctly, choosing the method will not be any
> safer than creating a Thread in-process which creates the dump (new Process
> vs. new Thread).
>

Yes, though creating a new thread from the filter is not a good idea IMO.
I would rather recommend creating the helper thread beforehand; then
the failing thread would only need to set one event and wait for another -
same as with an external watchdog.

> A last question: Can Method C be implemented without installing the
> "Debugging Tools for Windows" on the users machine?

Yes, it can. The Win32 debugging API does not depend on the presence of the
Debugging Tools. You only need dbghelp.dll to create the dump.

Oleg


Aurelien Regat-Barrel

Nov 23, 2006, 4:12:20 AM
Günter Prossliner wrote:

> In case of an OutOfMemory
> operation, there _may_ be not enough memory for creating the dump.

I want to catch the C++ bad_alloc exception too. To increase the chance of
having enough memory for the execution of my filter, I am thinking about
doing this:

if ( it_is_a_bad_alloc )
{
    HeapDestroy( (HANDLE)_get_heap_handle() );
}

It frees all CRT heap memory in a very abrupt way :-)

According to Oleg Starodumov's article:
http://www.debuginfo.com/articles/effminidumps.html#minidumpnormal
MiniDumpWriteDump( MiniDumpNormal ) will not try to access the
(destroyed) heap memory of my process, so it should be OK. Can you
confirm that point? Or do you see a potential pitfall?

Thanks.

--
Aurélien Regat-Barrel

Oleg Starodumov

Nov 24, 2006, 2:30:03 AM

> I want to catch the C++ bad_alloc exception too. To increase the chance of
> having enough memory for the execution of my filter, I am thinking about
> doing this:
>
> if ( it_is_a_bad_alloc )
> {
>     HeapDestroy( (HANDLE)_get_heap_handle() );
> }
>
> It frees all CRT heap memory in a very abrupt way :-)
>

HeapDestroy could have undesirable side effects if the heap header is corrupted.
IMO a much safer option would be to reserve a range of virtual memory
beforehand, and release it before calling the filter.

Oleg


Günter Prossliner

Nov 24, 2006, 9:05:57 AM
Hello Oleg!


Thank you for your informative answer!

>> Maybe you can answer me the following question too:
>> If Method C will be implemented, how much performance does it cost?
>> And one which operations? Where does the performance - degradation
>> come from? The Debugger-Events?
>>
>
> Yes, the performance hit is coming from dispatching debug events to
> the debugger.
> So if the application generates lots of events (usually exceptions,
> also debug output, module load/unload, thread start/exit, etc.), it
> can be slowed down. If there is not too many debug events,
> performance will not be seriously affected.

Ok. In the "normal program flow" there shall be not too many of them. How
are the calls serialized between the debuggee and the debugger? Over shared
memory?

When I do nothing within the most event-procedures (only unhandled
exceptions will be processed) is it possible to say how much overhead it
will be? Is it possible to subscribe to the needed event(s) only?

> This is if you attach debugger to an already running application. If
> you start the app under debugger (which you shouldn't do IMO), there
> will be lots of various debug checks enabled by default (heap, etc.),
> which will hurt performance too.

The application will not be started under the debugger.

>> Ok. If I understand this correctly, choosing the method will not be
>> any safer than creating a Thread in-process which creates the dump
>> (new Process vs. new Thread).
>
> Yes, though creating a new thread from the filter is not a good idea IMO,
> I would better recommend to create the helper thread beforehand, then
> the failing thread would only need to set one event and wait for another -
> same as with external watchdog.

This is a very good idea! I will go on with that.


I will implement the following methods:

You can configure the DumpHelper by using two modes:

Mode "Fast": It creates a thread which waits for an event to be set, and
then calls "MiniDumpWriteDump" in-process. The event is set from the custom
unhandled exception filter.

Mode "Safe": It starts a watchdog process (actually rundll32.exe with a
"DebuggerMain" exported from my DLL; by using rundll32 I don't have to
deploy anything but the DLL), which attaches to the application as a
debugger and calls "MiniDumpWriteDump" within the watchdog process.

I will forget about the third method (creating a debugger process within the
unhandled exception filter, which then creates the minidump), because
according to your posting it is not any safer than the "Fast" mode.

What do you think about it?


GP


Aurelien Regat-Barrel

Nov 24, 2006, 10:14:23 AM
Oleg Starodumov wrote:

I guess you mean reserve + commit VM.

I do not really care about memory corruption, because I consider it
unrecoverable. The heap header could be corrupted, yes, but the pointer
returned by VirtualAlloc could be altered too. You may say that the risk
of accidentally modifying this little pointer is lower than that of
modifying the big heap header, but I consider both cases unlikely to
occur within my software :-)
An important point here is that I am writing C++ software with very little
direct Win32 interaction, and none of it is tied to memory management.
All memory allocation is done through the CRT functions. So a heap
corruption would have to successfully bypass the VC++ 8 "safe library",
then the CRT checked heap, then the checking features of the Win32 debug
heap, and finally a bad_alloc exception would have to be thrown somewhere.
That's why I consider that scenario very unlikely. Maybe I could call
HeapValidate before HeapDestroy, but ergh, I am not writing software for
the Space Shuttle :-)
If my app crashes because of memory corruption, creating a minidump
won't help me. I think the best way to handle memory corruption is to
crash the app as soon as possible. As I said, I consider it an
unrecoverable error. So if HeapDestroy causes my app to be killed, that is
acceptable behaviour for me.

However, there are specific errors that I am interested in catching and
reporting, and for which writing an unhandled exception filter is difficult:
- stack overflow
- out of memory
For the first one, Jochen gave me an acceptable solution. The second is
a very difficult one, as I need memory to report the error, but I don't
have that memory. So, how can I increase the chance of having sufficient
memory for at least writing a minidump?

I first had the same idea as yours: allocate memory beforehand, and
free it before calling MiniDumpWriteDump. Once you have decided how much
memory to reserve (not so easy a question to answer), this only solves
the case where the memory limit has been reached by your own process. Since
its execution is suspended while we are in the filter, the reserved memory
that you just freed should still be available afterwards. Okay.

But what happens if the memory is being consumed by another process
which asks for memory faster than you can release it? In such a case,
you have to free a lot of memory in order to increase your chance of
successfully reporting the bad_alloc failure. I can see two ways of doing it:
- reserve a lot of memory beforehand (the problem of how much to reserve
still has to be solved) and free it when you need it
- force your process to release as much memory as possible by destroying
the CRT heap

The second approach is radical, and forces you to kill the application
afterwards. As we are in an unhandled exception filter, I think that is
acceptable behaviour.

I don't like reserving a lot of needless memory, and I think the
probability of another process competing with yours for memory is
greater than that of a Win32 heap header corruption. And I also have to
admit that I find the second solution more fun than the first one :-)

Sorry for this "brief" reply; next time I will not forget to include
detailed info about the origin and the creation of the universe :-)

--
Aurélien Regat-Barrel

Aurelien Regat-Barrel

Nov 24, 2006, 10:30:23 AM
Hi,

> Mode "Safe": It starts an watchdog-process (actually rundll32.exe with an
> exported "DebuggerMain" from my dll (by using rundll32 I must not deploy
> anything else but the dll), which attaches to the application as a Debugger,
> and calls "MiniDumpCreateDump" within the watchdog process.

You can also reuse your program itself, started in a debugger mode.
Depending on the command line, it can attach itself to the process whose
PID is given as an argument:

int main( int argc, char *argv[] )
{
    crash_guard guard;

    if ( !guard.start_watchdog( argc, argv ) )
    {
        return 0;
    }

    // normal code goes here
}

The first time start_watchdog() is called, it doesn't find any special
command-line argument, so it starts itself as a child process, giving it
its own PID (a kind of fork()). Then it returns true.
The child process will also call start_watchdog(), find the command-line
argument, and start executing the watchdog without returning. When it
finally does return (after the watched process has exited), it returns
false and execution stops.

Depending on the configuration/command line/registry settings/...,
start_watchdog() can also start a new thread, or simply install a filter.

--
Aurélien Regat-Barrel

Ben Voigt

Nov 24, 2006, 11:11:40 AM

"Günter Prossliner" <g.prossliner/gmx/at> wrote in message
news:%23fUlrG9...@TK2MSFTNGP03.phx.gbl...

My worry here would be that the process might die before the other thread
gets scheduled. Your unhandled exception filter will have to wait on the
other thread somehow.

>
> Mode "Safe": It starts an watchdog-process (actually rundll32.exe with an
> exported "DebuggerMain" from my dll (by using rundll32 I must not deploy
> anything else but the dll), which attaches to the application as a
> Debugger, and calls "MiniDumpCreateDump" within the watchdog process.
>
> I will forget about the third method (creating a debugger-process within
> the unhandled exception filter which creates the Minidump) because
> according to your posting it is not any safer than the "Fast" mode.
>
> What do you think about it?
>

A better version of option 3 is to combine the other two solutions. Run
your watchdog process all the time, but wait for an event before attaching
the debugger. The problem of keeping the process alive until the watchdog
wakes is still present.

>
> GP
>


Aurelien Regat-Barrel

unread,
Nov 24, 2006, 11:36:41 AM11/24/06
to
Hi Ben,

Ben Voigt a écrit :


> A better version of option 3 is to combine the other two solutions. Run
> your watchdog process all the time, but wait for an event before attaching
> the debugger. The problem of keeping the process alive until the watchdog
> wakes is still present.

By starting the watchdog as a child process, you could make it inherit a
kernel object (an event, for example) that it sets once it is ready?

--
Aurélien Regat-Barrel

Ben Voigt

unread,
Nov 24, 2006, 12:04:34 PM11/24/06
to

"Aurelien Regat-Barrel" <nospam....@yahoo.fr.invalid> wrote in message
news:O1RfBc%23DHH...@TK2MSFTNGP06.phx.gbl...

Named events can be used for communication between processes. You might
need one event set by the exception handler "APPNAME_BEGINDUMP" and one set
by the watchdog "APPNAME_DUMPCOMPLETE"
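The signal-and-wait handshake can be sketched in-process with standard C++ primitives standing in for the Win32 named events; in the real cross-process setup you would use CreateEvent("APPNAME_BEGINDUMP")/CreateEvent("APPNAME_DUMPCOMPLETE") with SetEvent and WaitForSingleObject instead. The event names and `dump_written` flag are illustrative only:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

struct Event {                     // minimal manual-reset event
    std::mutex m;
    std::condition_variable cv;
    bool signaled = false;
    void set()  { { std::lock_guard<std::mutex> l(m); signaled = true; } cv.notify_all(); }
    void wait() { std::unique_lock<std::mutex> l(m); cv.wait(l, [this]{ return signaled; }); }
};

Event begin_dump;                  // stands in for "APPNAME_BEGINDUMP"
Event dump_complete;               // stands in for "APPNAME_DUMPCOMPLETE"
bool dump_written = false;

void watchdog() {
    begin_dump.wait();             // woken by the crashing process's filter
    dump_written = true;           // stands in for MiniDumpWriteDump
    dump_complete.set();           // tell the filter it is now safe to die
}

// What the unhandled exception filter does: signal, then block until the
// watchdog confirms the dump, so the process cannot die too early.
void filter_side() {
    begin_dump.set();
    dump_complete.wait();
}
```

This is exactly the "signal and wait" Oleg describes below: the filter never returns control until the dump is confirmed.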

>
> --
> Aurélien Regat-Barrel


Ben Voigt

unread,
Nov 24, 2006, 12:15:32 PM11/24/06
to

> could call HeapValidate before HeapDestroy, but ergh, I am not writing
> software for the Space Shuttle :-)

I wish you were, as I write software that processes the telemetry data
downlinked via satellites, and trust me, some of the "non-critical" parts of
the on-board software don't do nearly as good a job of error handling.


Oleg Starodumov

unread,
Nov 24, 2006, 3:31:43 PM11/24/06
to
>> Yes, the performance hit is coming from dispatching debug events to
>> the debugger.
>> So if the application generates lots of events (usually exceptions,
>> also debug output, module load/unload, thread start/exit, etc.), it
>> can be slowed down. If there is not too many debug events,
>> performance will not be seriously affected.
>
> Ok. In the "normal program flow" there shall be not too many of them. How are the calls serialized between the
> debuggee and the debugger? Over shared memory?
>

There is a debug port the debugger is listening on. When a debug event occurs,
a message will be delivered to that port. While the message is being delivered
and the debugger processes it, the debuggee is suspended. I am not sure that
I understand all the details of this process (I am not a kernel expert and can
be wrong), but the main point here is that delivery of debug events requires
thread context switch, which affects performance the most, even if
the debugger itself processes the event quickly.

> When I do nothing within the most event-procedures (only unhandled exceptions will be processed) is it possible to say
> how much overhead it will be?

IMO the best way would be to measure how your application behaves under
a debugger (use a custom debugger, or WinDbg, which also processes debug
events very quickly, unlike the VS debugger).

I did some tests in the past and found that on an application that raises
lots of exceptions the performance hit was very significant (several times slower
for that particular application). But for most normal applications there should
be no significant slowdown.

> Is it possible to subscribe to the needed event(s) only?

No.

> Mode "Fast": It creates a thread which waits for an event to be set until it calls "MiniDumpCreateDump" in process.
> The event will be set from the custom unhandled exception filter.
>
> Mode "Safe": It starts an watchdog-process (actually rundll32.exe with an exported "DebuggerMain" from my dll (by
> using rundll32 I must not deploy anything else but the dll), which attaches to the application as a Debugger, and
> calls "MiniDumpCreateDump" within the watchdog process.
>
> I will forget about the third method (creating a debugger-process within the unhandled exception filter which creates
> the Minidump) because according to your posting it is not any safer than the "Fast" mode.
>
> What do you think about it?
>

I think it should be OK.

Oleg


Oleg Starodumov

unread,
Nov 24, 2006, 3:38:48 PM11/24/06
to

>> Mode "Fast": It creates a thread which waits for an event to be set until
>> it calls "MiniDumpCreateDump" in process. The event will be set from the custom unhandled exception filter.
>
> My worry here would be that the process might die before the other thread gets scheduled. Your unhandled exception
> filter will have to wait on the other thread somehow.
>

Yes, it has to signal and wait (usually using two events as you have later described).

Oleg

Oleg Starodumov

unread,
Nov 24, 2006, 3:44:47 PM11/24/06
to
> > HeapDestroy could have undesirable side effects if the heap header is corrupted.
> > IMO a much safer option would be to reserve a range of virtual memory
> > beforehand, and release it before calling the filter.
>
> I guess you mean reseve + commit VM.
>

No, I meant only reserve.
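The reserve-only idea can be sketched as follows. On Win32 this would be VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS) at startup and VirtualFree(p, 0, MEM_RELEASE) at the top of the filter; shown here with the POSIX analogue (mmap with PROT_NONE reserves address space without committing usable pages) so the pattern is runnable anywhere. The function names and the 16 MB figure are illustrative:

```cpp
#include <sys/mman.h>
#include <cstddef>

static void*  g_reserve = nullptr;
static size_t g_reserve_size = 16 * 1024 * 1024;  // how much is a judgment call

// Call once at startup: stake out address space without committing pages.
bool reserve_emergency_vm() {
    g_reserve = mmap(nullptr, g_reserve_size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return g_reserve != MAP_FAILED;
}

// Call at the top of the unhandled exception filter, before anything that
// may need address space (e.g. MiniDumpWriteDump).
bool release_emergency_vm() {
    if (g_reserve == nullptr || g_reserve == MAP_FAILED) return false;
    int rc = munmap(g_reserve, g_reserve_size);
    g_reserve = nullptr;
    return rc == 0;
}
```

Because only address space is reserved, the cost at startup is essentially zero, which is the point of reserve-without-commit.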

> I do not really care about memory corruption, because I consider it as unrecoverable. The heap header could be
> corrupted yes, but my pointer returned by VirtualAlloc could be altered too. You may say that the risk of accidentally
> modifying this little pointer is lower than modifying the big heap header,

And the consequences of modifying this little pointer are much less serious, too :)

> but I consider that both cases are unlikely to occur within my software :-)

Absolutely agree, seriously :)
IMO with proper approach to SW testing we can eliminate most of the situations
that can lead to really bad corruptions. That's why I still think that creating minidumps
in process is an acceptable solution.

> However, there are specific errors that I am interested in catching and reporting, and for which writing a unhandled
> exception filter is difficult:
> - stack overflow
> - out of memory
> For the first one, Jochen gave me an acceptable solution. The second is a very difficult one, as I need memory to
> report the error, but I don't have that memory. So, how can I increase the chance to have sufficient memory for at
> least writing a mini dump ?
>
> I first had the same idea as yours : allocate memory beforehand, and free it before calling MiniDumpWriteDump. Once
> you have decided how much memory to reserve (not a so easy question to answer),

Exactly.

> this only solves the case when the memory limit has been filled by your process. Since its execution is suspended as
> we are in a filter, the reserved memory that you just freed should still be available afterward. Okay.
>

I considered only this case.

> But what happens if the memory is widely allocated by an other process which asks for memory faster than you can
> release it ? In such a case, you have to free a lot of memory in order to increase your chance to successfully report
> the bad_alloc failure. I can see two ways of doing it :
> - reserve beforehand a lot of memory (the problem of how much to reserve still has to be solved) and free it when you
> need it
> - force your process to release as much memory as it is possible to do by destroying the CRT heap
>

Are you sure that your CRT heap will be big enough?

Are you sure that the other process is not allocating memory fast enough
to outpace HeapDestroy? And MiniDumpWriteDump after that?

And what is the use of minidump in this case? What extra information
will you get from it, if the reason of the problem is in an external process?

> Sorry for this so brief reply, next time I will not forget to give detailed infos about the origin and the creation of
> the universe :-)
>

It was "brief" but interesting.

Oleg

Skywing [MVP]

unread,
Nov 26, 2006, 11:48:10 AM11/26/06
to
I would recommend the rundll32 option. I have done something very similar
before for a product that I worked on (and in fact also used rundll32 for
this purpose). There are a couple of gotchas to consider:

- I would try to make most of the state relating to how to communicate with
the watchdog process readonly after initialization, to maximize the chance
that it won't be overwritten in a way that later prevents a successful
error-reporting run.
- Make sure you do *all* memory allocations up-front at startup. Never ever
touch the heap (even a private heap!) from your unhandled exception handler.
- Specific to rundll32, make sure that you dispatch window messages in the
watchdog process on the thread that rundll32 calls into you with.
Evidently, rundll32 likes to create a hidden window, for whatever reason,
and if you don't dispatch window messages, I have found that a couple of
programs get upset when broadcasting of messages doesn't work as they
expect. (Stupid and undocumented, but it caused problems for us a couple
of times until I just added a dummy message loop.)
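The first gotcha (state frozen after initialization) can be sketched by putting the reporting state in its own page and flipping it read-only once set up, so a stray write faults immediately instead of silently corrupting the error-reporting path. On Win32 the equivalent call is VirtualProtect with PAGE_READONLY; the POSIX mprotect analogue is shown here, and the `ReporterConfig` layout is purely hypothetical:

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstring>

struct ReporterConfig {            // hypothetical layout, not from the post
    char     pipe_name[64];
    unsigned watchdog_pid;
};

static ReporterConfig* g_cfg = nullptr;

bool init_reporter_config(const char* pipe, unsigned pid) {
    // Place the config in its own anonymous page (zero-filled by the OS).
    size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    void* p = mmap(nullptr, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;
    g_cfg = static_cast<ReporterConfig*>(p);
    std::strncpy(g_cfg->pipe_name, pipe, sizeof g_cfg->pipe_name - 1);
    g_cfg->watchdog_pid = pid;
    // Freeze: any later write through g_cfg now traps instead of corrupting.
    return mprotect(p, page, PROT_READ) == 0;
}
```

Reads remain cheap and allowed; only the one `mprotect`/`VirtualProtect` call is paid at startup.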

Going with the watchdog process approach does not have the performance hit
of a debugger process, and it also does not prevent you from attaching a
debugger later on. It is ever so slightly less robust than a watchdog
process acting as a full debugger, but if you are careful (and make all of
your state immutable after initialization), it's very close to the debugger
option (and with much less overhead and performance hit - every exception,
module load event, and soforth will not cause the entire process to be
suspended while the debugger watchdog inspects the debug event).

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Günter Prossliner" <g.prossliner/gmx/at> wrote in message
news:%23fUlrG9...@TK2MSFTNGP03.phx.gbl...

Skywing [MVP]

unread,
Nov 26, 2006, 11:49:25 AM11/26/06
to
I would tend to prefer inheritance (or handle duplication) of unnamed
objects to named objects whereever possible. You don't have to worry about
the security implications of someone malicious opening your named object and
doing bad things to it, and you don't create a headache for yourself if you
later decide to support multiple instances of your application running on
the same TS session.

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Ben Voigt" <r...@nospam.nospam> wrote in message
news:%23rSF7p%23DHH...@TK2MSFTNGP02.phx.gbl...

Skywing [MVP]

unread,
Nov 26, 2006, 11:55:14 AM11/26/06
to
To add to this, this particular context switch is especially bad because
it is also an address space switch, not just a register context switch,
which has severe implications for things like processor caches, TLBs, and
whatnot.

There is also a lot of overhead if your program frequently creates threads
or loads DLLs, as those events cause the entire process to be suspended and
the debugger to be awoken.

On XP and later, the communication between the debugger and debuggee will by
default use a new underlying API (based on the kernel debug object) instead
of the LPC port model, but the principles are the same as Oleg has outlined.

--

Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Oleg Starodumov" <com-dot-debuginfo-at-oleg> wrote in message
news:%23054ReA...@TK2MSFTNGP06.phx.gbl...

Günter Prossliner

unread,
Nov 28, 2006, 7:24:02 AM11/28/06
to
Hello Skywing!

Thank you for your very informative response!

> I would recommend the rundll32 option. I have done something very
> similar before for a product that I worked on (and in fact also used
> rundll32 for this purpose).

Is this product closed source? Is it possible to get the (relevant) source?

The component I am working on (I am still in the planning phase) will be
Open-Source (I will publish it on codeproject.com).


GP

