
Frame-based exception handling problem on Server 2008


Corinna Vinschen

Feb 21, 2008, 11:54:34 AM
Hi,

while testing Cygwin on Server 2008, I encountered a couple of spurious
hangs in OS functions, taking as much CPU as they can get. This happens
only on Server 2008, in both the 32-bit and 64-bit versions, but it does
not happen on any other Windows version, up to and including Vista SP1.

Debugging turned up that Server 2008 apparently has a problem with
Cygwin's exception handling.

Usually, when using frame-based exception handling, the exception
handlers are organized via a linked list on the stack, starting at the
address referenced by the register %fs:0, using a structure like this.

typedef struct exception_list
{
  struct exception_list *prev;
  exception_handler *handler;
} exception_list;

This is used by Cygwin, too, but with a tweak. There is only one
exception_list entry on the stack (not counting the default handler).
This entry is generated before the application's entry function is
called, and at creation time of any thread. The specific tweak is that
the exception prev pointer points back to itself, instead of to the
default handler. This allowed Linux-like signal handling even for
recurring computational exceptions so far in all Windows releases up to
and including Vista SP1, including all 64 bit versions.

However, this is exactly where the problem lies. If the exception
handler list is an endless loop as described above, certain OS calls on
Server 2008 simply hang endlessly, taking 100% CPU.

A very simple testcase is a division by zero:

int
main ()
{
  return 1 / 0;
}

This is usually handled by our exception handler by either dumping a
stacktrace or core file, or by calling the application's signal handler
for SIGFPE.

However, on Server 2008, our exception handler never gets called when
this happens. The process simply hangs, taking whatever CPU it can
grab.

Since this worked for all Windows versions before 2008, and since we're
not interested in the default exception handler taking over for Cygwin,
we would like to know whether there's a chance that this problem could
be fixed in 2008.

Barring that, it would be nice to learn how we can get our old behaviour
back, even if we don't create an exception handler loop, and if possible
in a unified way which works on previous Windows releases as well.


Thanks in advance,
Corinna

--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat

Jeffrey Tan[MSFT]

Feb 21, 2008, 9:47:23 PM
Hi Corinna,

It seems like you have done some hack for the Windows SEH which is a bit
like the Visual C++ runtime SEH wrapper (which also wraps all the exception
list entries in one function in one place for table-driven dispatch).

Now, it seems that the problem is that the Windows 2008 exception dispatcher
did not call your registered exception_list entry at FS:[0], yes? Have you
tried to set a breakpoint in the user-mode exception dispatcher (should be
in ntdll) in a debugger? If it is called, then you can step through to
understand its algorithm and find out why your handler is not called.

Is it possible for you to provide a simple sample project to reproduce this
problem? If I can reproduce it, we can work on it more efficiently. Also, I
will try to set up a Windows 2008 machine for testing it. Thanks.

Best regards,
Jeffrey Tan
Microsoft Online Community Support
==================================================
Get notification to my posts through email? Please refer to
http://msdn.microsoft.com/subscriptions/managednewsgroups/default.aspx#notifications.

Note: The MSDN Managed Newsgroup support offering is for non-urgent issues
where an initial response from the community or a Microsoft Support
Engineer within 1 business day is acceptable. Please note that each follow
up response may take approximately 2 business days as the support
professional working with you may need further investigation to reach the
most efficient resolution. The offering is not appropriate for situations
that require urgent, real-time or phone-based interactions or complex
project analysis and dump analysis issues. Issues of this nature are best
handled working with a dedicated Microsoft Support Engineer by contacting
Microsoft Customer Support Services (CSS) at
http://msdn.microsoft.com/subscriptions/support/default.aspx.
==================================================
This posting is provided "AS IS" with no warranties, and confers no rights.

roge...@gmail.com

Feb 22, 2008, 7:56:57 AM
On Feb 21, 4:54 pm, Corinna Vinschen <cori...@community.nospam> wrote:
> Hi,
>
> while testing Cygwin on Server 2008, I encountered a couple of spurious
> hangs in OS functions, taking as much CPU it can get.  This happens only
> on Server 2008, both, 32 and 64 bit version, but it does not happen on
> any other Windows version, up to and including Vista SP1.
>
> Debugging turned up that Server 2008 has apparently a problem with
> Cygwin's exception handling.

Curious.

What is the call stack, including kernel entries?

One easy way to get this is using Process Explorer:
Find process,
right click and select Properties..
go to Threads tab
Find thread in loop
Click Stack

That info might help understand the problem.

Out of interest, why does the exception handler chain end in a loop?
I'd always assumed that if an exception handler didn't return the
"continue search" value then there would be no reason to even read
the pointer to prev value.

Regards,
Roger.

Corinna Vinschen

Feb 22, 2008, 9:51:59 AM
Hi Jeffrey,

"Jeffrey Tan[MSFT]" wrote:
> Hi Corinna,
>
> It seems like you have done some hack for the Windows SEH which is a bit
> like the Visual C++ runtime SEH wrapper(which also wraps all the exception
> list entries in one function in one place for table-driver dispatch).

Mmh, that's possible, but I'm not quite sure about this. Cygwin uses
GCC as its compiler and Cygwin is its own runtime environment, so you
probably know infinitely more about the Visual C++ runtime. :-)

> Now, it seems that the problem is that the Windows2008 exception dispatcher
> did not call your registered exception_list entry at FS:[0], yes?

That seems to be the case, yes. If I set a breakpoint on our exception
handler, it's not called. Instead, 2008 hangs, feeding on the CPU. As
soon as I change the exception_list entry so that the prev pointer
points to the default handler, our exception handler is called just fine.

But the drawback is that it's only called once for a given exception. If
the exception handler returns without rectifying the cause of the
exception, the default handler is called the next time. That's not what
we want, for hopefully obvious reasons.

> Have you
> tried to set a breakpoint in the user-mode exception dispatcher(should be
> in ntdll) in debugger? If it is called, then you can step through to
> understand its algorithm and find out why your handler is not called.

I'm sorry, but I don't know exactly how to do that. I can't set a
breakpoint in ntdll from within GDB. I get an I/O error. However, what
I did was to step through and it appears that an endless loop is run in
this area of ntdll.dll:

Dump of assembler code from 0x76f00d6d to 0x76f00da2:

0x76f00d6d <ntdll!RtlTimeToTimeFields+74226>: cmp -0x8(%ebp),%ebx
0x76f00d70 <ntdll!RtlTimeToTimeFields+74229>: jb 0x76eedf10 <ntdll!EtwSetMark+495>
0x76f00d76 <ntdll!RtlTimeToTimeFields+74235>: lea 0x8(%ebx),%eax
0x76f00d79 <ntdll!RtlTimeToTimeFields+74238>: cmp -0xc(%ebp),%eax
0x76f00d7c <ntdll!RtlTimeToTimeFields+74241>: ja 0x76eedf10 <ntdll!EtwSetMark+495>
0x76f00d82 <ntdll!RtlTimeToTimeFields+74247>: test $0x3,%bl
0x76f00d85 <ntdll!RtlTimeToTimeFields+74250>: jne 0x76eedf10 <ntdll!EtwSetMark+495>
0x76f00d8b <ntdll!RtlTimeToTimeFields+74256>: mov 0x4(%ebx),%eax
0x76f00d8e <ntdll!RtlTimeToTimeFields+74259>: cmp -0x8(%ebp),%eax
0x76f00d91 <ntdll!RtlTimeToTimeFields+74262>: jb 0x76f00d9c <ntdll!RtlTimeToTimeFields+74273>
0x76f00d93 <ntdll!RtlTimeToTimeFields+74264>: cmp -0xc(%ebp),%eax
0x76f00d96 <ntdll!RtlTimeToTimeFields+74267>: jb 0x76eedf10 <ntdll!EtwSetMark+495>
0x76f00d9c <ntdll!RtlTimeToTimeFields+74273>: mov (%ebx),%ebx
0x76f00d9e <ntdll!RtlTimeToTimeFields+74275>: cmp %edi,%ebx
0x76f00da0 <ntdll!RtlTimeToTimeFields+74277>: jne 0x76f00d6d <ntdll!RtlTimeToTimeFields+74226>

> Is it possible for you to provide a simple sample project to reproduce this
> problem? If I can reproduce, we can work on it more efficiently. Also, I
> will try to setup a Windows2008 machine for testing it. Thanks.

Below you'll find the source code for a self-contained testcase which
reproduces the behaviour. It just installs the exception
handler on the stack in the main() function. That should do it for
testing purposes.

However, it's using GCC syntax. I built it under Cygwin like this:

gcc -g -mno-cygwin -o exc_test exc_test.c

The same should work fine without -mno-cygwin under MinGW. I don't know
how you can build it in Visual C++. I assume it should just work if you
replace the GCC assembler syntax in the line

extern exception_list_t *_except_list __asm__ ("%fs:0");

with the appropriate Visual C++ syntax.

Without an argument, the exception_list entry is installed with the
prev pointer pointing to itself. With any argument, the exception_list
entry is installed with prev pointing to the previous exception list
entry (the default handler).

On 2008, running the example without an argument never calls the
exception handler; it just hangs. On (for instance) XP it calls our own
exception handler over and over again, which is the desired behaviour.
Running the example with any argument calls our own exception handler
once, but calls the default handler the next time the division by zero
occurs.

Ok, here's the self-contained testcase:

============================ SNIP ============================
#include <stdio.h>
#include <windows.h>

extern DWORD __stdcall RtlUnwind (void *, void *, EXCEPTION_RECORD *, void *);

struct _exception_list;

typedef int (*exception_handler_t) (EXCEPTION_RECORD *,
                                    struct _exception_list *,
                                    CONTEXT *,
                                    void *);

typedef struct _exception_list
{
  struct _exception_list *prev;
  exception_handler_t handler;
} exception_list_t;

typedef struct _cygtls
{
  exception_list_t el;
} cygtls_t;

extern exception_list_t *_except_list __asm__ ("%fs:0");

void
init_exception_handler (cygtls_t *tls, exception_handler_t eh, int do_loop)
{
  tls->el.handler = eh;
  if (do_loop)
    tls->el.prev = &tls->el; // loop back to itself
  else
    tls->el.prev = _except_list; // Just point to default handler
  _except_list = &tls->el;
}

int
handle_exception (EXCEPTION_RECORD *e, exception_list_t *frame,
                  CONTEXT *c, void *dummy)
{
  fputs ("In exception_handler\n", stderr);
  RtlUnwind (frame, (void *) c->Eip, e, 0);
  return 0;
}

int
func ()
{
  return 1 / 0;
}

int
main (int argc, char **argv)
{
  cygtls_t _my_tls;
  init_exception_handler (&_my_tls, handle_exception, argc <= 1);
  fputs ("Before func\n", stderr);
  func ();
  fputs ("After func\n", stderr);
  return 0;
}
============================ SNAP ============================

Ivan Brugiolo [MSFT]

Feb 22, 2008, 11:24:31 AM
Can you elaborate on the NX options of the bcdedit configuration?
My take is that the server SKU has NX enabled by default, which implies
strict validation of the SEH handler against the table generated by the
compiler, while the client SKU has a looser validation, and your code
gets by.

Then, could you use an ntsd/cdb/windbg based debugger to report the stack?
Those debuggers have the ability to use public PDBs to give meaningful
stack traces.

--
This posting is provided "AS IS" with no warranties, and confers no rights.

Use of any included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm

Corinna Vinschen

Feb 22, 2008, 12:06:35 PM
roge...@gmail.com wrote:

> On Feb 21, 4:54 pm, Corinna Vinschen <cori...@community.nospam> wrote:
>> Debugging turned up that Server 2008 has apparently a problem with
>> Cygwin's exception handling.
>
> Curious.
>
> What is the call stack, including kernel entries?
>
> One easy way to get this is using Process Explorer:

Uh, right, I forgot to use ProcessExplorer when I sent my reply
to Jeffrey. I'll follow up with the stack info in my reply to Ivan.
Thanks for the hint.

> Out of interest, why does the exception handler chain end in a loop?
> I'd always assumed that if an exception handler didn't return the
> "continue search" value then there would be no reason to even read
> the pointer to prev value.

Two reasons:

Cygwin is trying to emulate Linux as closely as possible. Exceptions are
converted to POSIX signals. If an application has no signal handler for
a given signal, the behaviour is defined by a system-specific default
behaviour (SIG_DFL), which is either to ignore the signal and just go on
as if nothing happened, or to throw a core dump. If an application has
a signal handler for the given signal, and if the signal handler
returns, it returns to exactly the same spot where the exception
occurred. If the cause of the exception hasn't been rectified, the
signal handler is called again.

The second reason is that the default signal handler interferes with
our exception handling. If the default handler is called, we can't
do the stuff we usually do, like, say, generating a core dump, making
kernel syslog entries, etc.
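
The re-delivery behaviour described in the first reason can be modelled in portable C without touching real signals. This is only a sketch of the semantics, not Cygwin code: the names `div_state`, `fault_handler`, `run_with_handler`, `never_fix` and `fix_on_third` are made up for illustration, and the "fault" is simulated by a return value instead of a real SIGFPE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state of a "faulting" computation. */
typedef struct {
    int dividend;
    int divisor;
    int result;
} div_state;

/* Hypothetical handler type: gets the state and the delivery count. */
typedef void (*fault_handler) (div_state *st, unsigned delivery);

/* Returns false where real code would raise SIGFPE. */
static bool
try_divide (div_state *st)
{
    if (st->divisor == 0)
        return false;
    st->result = st->dividend / st->divisor;
    return true;
}

/* Re-runs the faulting operation after every handler return, the way a
   returning signal handler resumes at the faulting instruction.
   Returns the number of handler deliveries (at most max_deliveries). */
unsigned
run_with_handler (div_state *st, fault_handler handler,
                  unsigned max_deliveries)
{
    unsigned deliveries = 0;

    assert (st != NULL && handler != NULL);
    while (!try_divide (st) && deliveries < max_deliveries)
        handler (st, ++deliveries);
    return deliveries;
}

/* A handler that never repairs the cause of the fault. */
void
never_fix (div_state *st, unsigned delivery)
{
    (void) st;
    (void) delivery;
}

/* A handler that repairs the cause on its third delivery. */
void
fix_on_third (div_state *st, unsigned delivery)
{
    if (delivery == 3)
        st->divisor = 2;
}
```

With `never_fix` the handler is delivered until the artificial `max_deliveries` bound is hit, which corresponds to the "called again and again" case; with `fix_on_third` the operation completes after three deliveries.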

Corinna Vinschen

Feb 22, 2008, 12:28:20 PM
Hi Ivan,

Ivan Brugiolo [MSFT] wrote:
> Can you elaborate on the NX options of the bcdedit configuration ?

Per the output of bcdedit, the NX option is set to OptOut.

> My take is that the server SKU has NX enabled by default, that implies
> strict validation of the SEH-Handler against the table generated by the
> compiler,
> while the client SKU has a looser validation, and your code gets by.

The compiler (GCC) does not generate SEH handlers, except if you use C++
exception handling. This is not the case here. The code is plain C.

The second problem is that our approach works as expected on 2003
Server, as well as on earlier Server versions, so it shouldn't have
anything to do with server vs. client.

Anyway, I switched the 2008 Server to OptIn as in Vista and rebooted.
I verified that the setting was still OptIn after the reboot. The
problem that 2008 Server hangs persists even with NX=OptIn.

> Then, could you use a ntsd/cdb/windbg based debugger to report the stack ?
> Those debuggers have the ability to use public PDBs to give meaningful stack
> traces.

I don't have WinDbg installed. If it's necessary, I can do that, but I
never used it so I'd need some instructions. For the time being, I have
pasted the backtrace from ProcessExplorer of the hanging example code
from my reply to Jeffrey on a 2008 Server:

0 ntdll.dll!RtlTimeToElapsedTimeFields+0x12225
1 ntdll.dll!KiUserExceptionDispatcher+0xf
2 exceptionhandler_example.exe+0x13e9
3 exceptionhandler_example.exe+0x124b
4 exceptionhandler_example.exe+0x1298
5 kernel32.dll!BaseThreadInitThunk+0x12
6 ntdll.dll!RtlInitializeExceptionChain+0x63
7 ntdll.dll!RtlInitializeExceptionChain+0x36

Does that help?

roge...@gmail.com

Feb 22, 2008, 1:30:35 PM
On Feb 22, 5:28 pm, Corinna Vinschen <cori...@community.nospam> wrote:

[snip]

> I haven't windbg installed.  If it's necessary, I can do that, but I
> never used it so I'd need some instructions.  For the time being, I have
> pasted the backtrace from ProcessExplorer in the hanging example code
> from my reply to Jeffrey on a 2008 Server:
>
>   0 ntdll.dll!RtlTimeToElapsedTimeFields+0x12225
>   1 ntdll.dll!KiUserExceptionDispatcher+0xf
>   2 exceptionhandler_example.exe+0x13e9
>   3 exceptionhandler_example.exe+0x124b
>   4 exceptionhandler_example.exe+0x1298
>   5 kernel32.dll!BaseThreadInitThunk+0x12
>   6 ntdll.dll!RtlInitializeExceptionChain+0x63
>   7 ntdll.dll!RtlInitializeExceptionChain+0x36
>
> Does that help?

Process Explorer hasn't got the symbols for ntdll.dll,
so the function name (RtlTimeToElapsedTimeFields)
isn't that relevant...but someone with the same version
of 2008 and symbols installed could find out the
name of the function.

The call site (KiUserExceptionDispatcher+0xf)
may be enough for someone with access to server 2008 to
find out what's being called.

Regards,
Roger.

Ivan Brugiolo [MSFT]

Feb 22, 2008, 6:19:33 PM
Vista SP1 and WinSrv2008 are built from the same code, and
the only SKU difference is the exception validation behavior that
is triggered by the NX boot and CPU options
(among other things: there are specific
appcompat shims to skip exception handler validation for binaries
known to have issues).

That said, the safe exception handler table has nothing to do with C++.
It's a structure referenced in the debug-directory of the PE, and it is
used for exception handler validation.
Using the MS toolset, if you do
`link -dump -all <binary> | findstr /c:"Safe Exception Handler Table"`
you should see that yourself.

To debug the issue, start from KiUserExceptionDispatcher.
That's the first function you can set a breakpoint on in user mode.
Then, with good symbols, observe the code that walks the frame-handler
list, and check whether some Rtl*Validate* function is called.


m

Feb 22, 2008, 8:15:34 PM
If all you want is a signal, then would a vectored exception handler work?
(AddVectoredExceptionHandler)

Alternatively, you could set the default handler
(SetUnhandledExceptionFilter)

IMHO: Your problem is likely the result of Windows scanning the stack of EH
frames before calling them to ensure that there are no 'unsafe' (i.e. hacked
via buffer overrun etc.) frames that might be called.

Corinna Vinschen

Feb 23, 2008, 5:37:00 AM
Ivan Brugiolo [MSFT] wrote:
> Vista SP1 and WinSrv2008 are build from the same code, and,
> the only SKU difference is the exception validation behavior that
> is triggered by the NX boot and CPU options
> (among other things: there are specific
> appcompt shims to skip exception handlers validation for binaries
> known to have issues).
>
> That said, the safe exception handler table has nothing to do with C++.
> It's a structure referenced in the debug-directory of the PE, and it is
> used for exception handler validation.
> Using the MS toolset, if you do
> `link -dump -all <binary> | findstr /c:"Safe Exception Handler Table"`
> you should see that yourself.

Well, no. The above command prints nothing for my executables.

> To debug the issue, start from KiUserExceptionDispatcher.
> That's the first function you can set a breakpoint on in user mode.
> Then, with good symbols, observe the code that walks the frame-handler
> list,
> and, check if some Rtl*Validate* function is called.

The example code I sent in my reply to Jeffrey (message-id:
<fpmnif$7i4$1...@perth.hirmke.de>) is a self-contained testcase. It does
not involve linking against the Cygwin DLL. It shows clearly that the
problem happens *only* on 2008 Server. No other OS has this problem,
regardless of Vista being from the same codebase or not. From my point
of view something is broken specifically in 2008 Server. But your
reply sounds as if this application code has to take the blame.

I'm not fluent in debugging the OS DLLs. I can surely look into this
after I have set up and learned how to use windbg.

OTOH, is there any chance that *you* look into this? After all, you're
obviously much better suited to debug your OS.


Corinna


P.S.: How do I get the symbols for 2008?

Corinna Vinschen

Feb 23, 2008, 5:55:39 AM
m wrote:
> If all you want is a signal, then would a vectored exception handler work?
> (AddVectoredExceptionHandler)

Vectored exception handlers have only been available since XP, but the
Cygwin DLL still supports NT4 and 2000. The current release even runs on
98 and Me and there are actually still people using it on these
systems. But in my desperation I tried it nevertheless. A couple of
tests appeared to work fine on Vista and 2008, but the same code crashed
strangely on XP. I didn't figure out why, though.

> Alternately, your could set the default handler
> (SetUnhandledExceptionFilter)

I tried that, but it doesn't work as expected. For one thing, the
filter function has no access to the frame pointer via the
LPEXCEPTION_POINTERS parameter.

> IMHO: Your problem is likely the result of Windows scanning the stack of EH
> frames before calling them to ensure that there are no 'unsafe' (i.e. hacked
> via buffer overrun etc.) frames that might be called.

That might be the case, but if so, I still wonder why it only
happens on 2008 Server and not on any other OS, from NT4 up to and
including Vista SP1, nor on any 64 bit release before 2008 Server.

If this is something configured into the OS, I'd be already happy
to learn how to switch this off just for all applications linked
against the Cygwin DLL.


Corinna

--
Replies to the above (existing) address are discarded unread.
Please send private mail to corinnaPLOPvinschenPINGde.

Jochen Kalmbach [MVP]

Feb 23, 2008, 7:46:11 AM
Hi Corinna!

>> Alternately, your could set the default handler
>> (SetUnhandledExceptionFilter)
>
> I tried that, but it doesn't work as expected.

Just a small addition:
SetUnhandledExceptionFilter only works in *some* cases. Starting with
VS2005, the CRT sometimes prevents the calling of the ExceptionHandler
and forces WER!
See also: "SetUnhandledExceptionFilter" and VC8
http://blog.kalmbachnet.de/?postid=75

--
Greetings
Jochen

My blog about Win32 and .NET
http://blog.kalmbachnet.de/

Corinna Vinschen

Feb 23, 2008, 4:00:17 PM
Corinna Vinschen wrote:
> Ivan Brugiolo [MSFT] wrote:
>> [...]

>> To debug the issue, start from KiUserExceptionDispatcher.
>> That's the first function you can set a breakpoint on in user mode.
>> Then, with good symbols, observe the code that walks the frame-handler
>> list,
>> and, check if some Rtl*Validate* function is called.
>
> The example code I sent in my reply to Jeffrey (message-id:
> <fpmnif$7i4$1...@perth.hirmke.de>) is a self-contained testcase.
> [...]

> I'm not fluent in debugging the OS DLLs. I can surely look into this
> after I have set up and learned how to use windbg.

Ok, I bit the bullet, installed WinDbg, Visual-C++ and the debug symbols
for 2008. I changed the example code to compile under Visual-C++ and
debugged according to your above instructions. The call stack when it's
in the endless loop looks like this:

ntdll!RtlDispatchException+0x67
ntdll!KiUserExceptionDispatcher+0xf
exceptionhandler_example!func+0x22
exceptionhandler_example!main+0x6c
exceptionhandler_example!__tmainCRTStartup+0x1a8
exceptionhandler_example!mainCRTStartup+0xf
kernel32!BaseThreadInitThunk+0xe
ntdll!__RtlUserThreadStart+0x23
ntdll!_RtlUserThreadStart+0x1b

When stepping from KiUserExceptionDispatcher to RtlDispatchException,
there is no call to any Rtl*Validate* function, nor is there a call to
RtlIsValidHandler (which is one function which looks like a likely
candidate). The endless loop occurs as soon as the PC is at
ntdll!RtlDispatchException+0x67.

This is the full assembler code of the entire loop:

ntdll!RtlDispatchException+0x67:
76e80d6d 3b5df8 cmp ebx,dword ptr [ebp-8]
76e80d70 0f829ad1feff jb ntdll!RtlDispatchException+0x19d (76e6df10)
76e80d76 8d4308 lea eax,[ebx+8]
76e80d79 3b45f4 cmp eax,dword ptr [ebp-0Ch]
76e80d7c 0f878ed1feff ja ntdll!RtlDispatchException+0x19d (76e6df10)
76e80d82 f6c303 test bl,3
76e80d85 0f8585d1feff jne ntdll!RtlDispatchException+0x19d (76e6df10)
76e80d8b 8b4304 mov eax,dword ptr [ebx+4]
76e80d8e 3b45f8 cmp eax,dword ptr [ebp-8]
76e80d91 7209 jb ntdll!RtlDispatchException+0x96 (76e80d9c)
76e80d93 3b45f4 cmp eax,dword ptr [ebp-0Ch]
76e80d96 0f8274d1feff jb ntdll!RtlDispatchException+0x19d (76e6df10)
76e80d9c 8b1b mov ebx,dword ptr [ebx]
76e80d9e 3bdf cmp ebx,edi
76e80da0 75cb jne ntdll!RtlDispatchException+0x67 (76e80d6d) [br=1]

That's it. It stays in this loop until the process is stopped. Is that
enough for you to debug this further? I don't know what it's doing in
this loop, sorry.
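
Read instruction by instruction, the loop above appears to walk the registration chain and sanity-check every record before any handler is called. The following C transcription is only a sketch of that reading, under stated assumptions: `[ebp-8]` and `[ebp-0Ch]` are taken to be the low and high stack limits, `edi` is taken to hold the end-of-chain marker 0xFFFFFFFF, and the names `reg_record`, `walk_chain` and the `WALK_*` results are invented for illustration. The `max_steps` bound does not exist in the disassembly; it is added here so that the sketch terminates on a self-linked record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a registration record; field names are illustrative. */
typedef struct reg_record {
    struct reg_record *prev;  /* [ebx]   : next record to visit */
    uintptr_t handler;        /* [ebx+4] : handler address      */
} reg_record;

/* Assumed value of edi: the end-of-chain marker (0xFFFFFFFF on x86). */
#define CHAIN_END UINTPTR_MAX

typedef enum { WALK_OK, WALK_REJECTED, WALK_STEP_LIMIT } walk_result;

walk_result
walk_chain (const reg_record *head, uintptr_t low, uintptr_t high,
            unsigned max_steps)
{
    const reg_record *rec = head;
    unsigned step;

    assert (low <= high);
    for (step = 0; step < max_steps; step++) {
        uintptr_t addr = (uintptr_t) rec;

        /* cmp ebx,[ebp-8] / lea eax,[ebx+8]; cmp eax,[ebp-0Ch] / test bl,3:
           the record must lie inside the stack and be 4-byte aligned. */
        if (addr < low || addr + sizeof *rec > high || (addr & 3) != 0)
            return WALK_REJECTED;
        /* mov eax,[ebx+4]: a handler address inside the stack is rejected. */
        if (rec->handler >= low && rec->handler < high)
            return WALK_REJECTED;
        /* mov ebx,[ebx]; cmp ebx,edi; jne loop */
        rec = rec->prev;
        if ((uintptr_t) rec == CHAIN_END)
            return WALK_OK;
    }
    return WALK_STEP_LIMIT;  /* the real loop has no such limit */
}
```

Under this reading, a record whose prev pointer refers to itself passes every check but never reaches the end marker, which would match the observed busy loop.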

Below is the code of the example application, changed so that it can be
built with VC++:

=================== SNIP ===================
#include <stdio.h>
#include <windows.h>

//#ifdef __GNUC__
extern DWORD __stdcall RtlUnwind (void *, void *, EXCEPTION_RECORD *, void *);
//#endif

struct _exception_list;

typedef int (*exception_handler_t) (EXCEPTION_RECORD *,
                                    struct _exception_list *,
                                    CONTEXT *,
                                    void *);

typedef struct _exception_list
{
  struct _exception_list *prev;
  exception_handler_t handler;
} exception_list_t;

typedef struct _cygtls
{
  exception_list_t el;
} cygtls_t;

#ifdef __GNUC__
extern exception_list_t *_except_list __asm__ ("%fs:0");
#endif

void
init_exception_handler (cygtls_t *tls, exception_handler_t eh, int do_loop)
{
#ifndef __GNUC__
  // VC++ only: local copy of fs:[0]. With GCC a local of this name
  // would shadow the %fs:0 alias declared above.
  exception_list_t *_except_list;
#endif

  tls->el.handler = eh;
  if (do_loop)
    tls->el.prev = &tls->el; // loop back to itself
  else
#ifdef __GNUC__
    tls->el.prev = _except_list; // Just point to default handler
  _except_list = &tls->el;
#else
    {
      __asm { mov eax, fs:[0]
              mov _except_list, eax }
      tls->el.prev = _except_list;
    }
  _except_list = &tls->el;
  __asm { mov eax, _except_list
          mov fs:[0], eax }
#endif
}

int
handle_exception (EXCEPTION_RECORD *e, exception_list_t *frame,
                  CONTEXT *c, void *dummy)
{
  fputs ("In exception_handler\n", stderr);
  RtlUnwind (frame, (void *) c->Eip, e, 0);
  return 0;
}

int
func (int a, int b)
{
  return a / b;
}

int
main (int argc, char **argv)
{
  cygtls_t _my_tls;
  init_exception_handler (&_my_tls, handle_exception, argc <= 1);
  fputs ("Before func\n", stderr);
  func (1, 0);
  fputs ("After func\n", stderr);
  return 0;
}

=================== SNAP ===================

Corinna

Ivan Brugiolo [MSFT]

Feb 24, 2008, 2:30:06 PM
To enter that loop the process must have been marked for exception
validation.
If you alter the result of NtQueryInformationProcess
in the place marked with ++++ you should get back the client-SKU behavior.
Assuming you can validate that, the next step would be to deploy
an app-compat shim for the applications generated by the compiler below so
that they would receive the legacy behavior when run on modern OSes.

ntdll!RtlDispatchException+0x26:
77f0d153 53 push ebx
77f0d154 57 push edi
77f0d155 8d45f4 lea eax,[ebp-0Ch]
77f0d158 50 push eax
77f0d159 8d45f8 lea eax,[ebp-8]
77f0d15c 50 push eax
77f0d15d e8d2000000 call ntdll!RtlpGetStackLimits (77f0d234)
77f0d162 e886cb0100 call ntdll!zzz_AsmCodeRange_End (77f29ced)
77f0d167 8365f000 and dword ptr [ebp-10h],0
77f0d16b 6a00 push 0
77f0d16d 6a04 push 4
77f0d16f 8bd8 mov ebx,eax
77f0d171 8d45f0 lea eax,[ebp-10h]
77f0d174 50 push eax
77f0d175 6a22 push 22h
77f0d177 83cfff or edi,0FFFFFFFFh
77f0d17a 57 push edi
77f0d17b c6450b01 mov byte ptr [ebp+0Bh],1
77f0d17f e804b90100 call ntdll!ZwQueryInformationProcess (77f28a88) <<<<<<
77f0d184 85c0 test eax,eax
77f0d186 0f8c770d0300 jl ntdll!RtlDispatchException+0xb9 (77f3df03)

ntdll!RtlDispatchException+0x5b:
77f0d18c f645f040 test byte ptr [ebp-10h],40h <<+++++++
77f0d190 0f846d0d0300 je ntdll!RtlDispatchException+0xb9 (77f3df03) <<<<<<<<<<


Corinna Vinschen

Feb 25, 2008, 7:03:27 AM

Confirmed. If I change the test result, our exception handler is called.
The next time RtlDispatchException is called and if I don't change the
test result the endless loop runs off again.

But I don't understand what to do next. Where do I get an "app-compat
shim" and, that's even more important, what application are we talking
about here?

What *I'm* talking about is the Cygwin DLL (http://cygwin.com). It's
not a single application, it's a POSIX emulation layer, which is linked
to literally thousands of applications I have no control over. What
we need is a solution within the Cygwin DLL, which works for all these
applications on all Windows releases.

Still, this endless loop in 2008 looks like a bug to me. It can't be
the right thing to do to get into a tight endless loop within an OS DLL,
just because of specific assembler code in a user-space application.

Ivan Brugiolo [MSFT]

Feb 25, 2008, 12:23:56 PM
The proper fix would be for the compiler to generate
Safe-Handler-table in the debug directory of the PE executable.

You could pursue an OS fix with PSS, which would probably come in the
form of: the handler is invalid, let's not infinite-loop, but raise a
non-continuable exception that will terminate the process.
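
One way a chain walker could avoid spinning on a looped chain is classic cycle detection. The sketch below uses Floyd's two-pointer technique on a model of the registration list; it is not how Windows implements (or would implement) the check, and the names `frame`, `CHAIN_END` and `chain_has_cycle` are hypothetical. It assumes every record either links to another valid record or to the end marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a registration record; names are illustrative only. */
typedef struct frame {
    struct frame *prev;
    void (*handler) (void);
} frame;

/* End-of-chain marker, modelled after the all-ones value used on x86. */
#define CHAIN_END ((frame *) UINTPTR_MAX)

/* Floyd's cycle detection: "fast" advances two records per round,
   "slow" one.  They can only meet again if the chain loops back. */
bool
chain_has_cycle (const frame *head)
{
    const frame *slow = head;
    const frame *fast = head;

    assert (head != NULL);
    while (fast != CHAIN_END && fast->prev != CHAIN_END) {
        slow = slow->prev;
        fast = fast->prev->prev;
        if (slow == fast)
            return true;   /* a caller could now fail instead of spinning */
    }
    return false;          /* reached the end marker */
}
```

A walker that finds a cycle this way could treat the chain as invalid and stop, rather than looping forever; what it should do next (for example, terminating the process) is a policy decision outside this sketch.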

Other mitigations range from placing the system in a less-secure mode,
by disabling handler validation globally, or, using sdbinst.exe and the
related tools to create a shim for the executable using that runtime.


Corinna Vinschen

Feb 25, 2008, 3:38:36 PM
Ivan Brugiolo [MSFT] wrote:
> The proper fix would be for the compiler to generate
> Safe-Handler-table in the debug directory of the PE executable.

Is there public documentation of what a safe-handler table looks like and
what content it has to have to be accepted by the OS? Since GCC can't
generate such a table, we would have to generate this table manually
in the DLL image, if no other solution exists.

> You could pursue with PSS an OS fix, that would probably come in the form
> of:
> the handler is invalid, let's not infinite-loop,
> but raise a non-continuable-exception that will terminate-the-process.

I would have expected that such an OS fix would be in Microsoft's own
interest but if that's really necessary I guess I have to do that.

> Other mitigations range from placing the system in a less-secure mode,
> by disabling handler validation globally, or, using sdbinst.exe and the
> related tools to create a shim for the executable using that runtime.

I already explained this in my previous posting. There is no such thing
as "the" executable. The Cygwin DLL is just that, a DLL, which is
freely available under a GPL-like license. We can't and don't have
control over the executables which are running under Cygwin, nor over
the systems Cygwin is running on. Any solution which requires changing
the running system or applying a local compatibility tweak is simply no
solution at all.

Besides, out of curiosity I tried to use `set __COMPAT_LAYER=Win2000' in
a batch file, right before starting the first Cygwin process. It had
absolutely no effect. The OS hang persists.

Corinna Vinschen

Feb 26, 2008, 3:14:40 AM
Corinna Vinschen wrote:
> Ivan Brugiolo [MSFT] wrote:
>> The proper fix would be for the compiler to generate
>> Safe-Handler-table in the debug directory of the PE executable.
>
> Is there public documentation how a safe-handler table looks like and
> what content it has to have to be accepted by the OS? Since GCC can't
> generate such a table, we would have to generate this table manually
> in the DLL image, if no other solution exists.

It just occurred to me that I don't understand what this has to do with
the safe exception handler table.

In the example code you can see that the exception handler is called
fine (but only once), if the exception list's "prev" pointer points to
the next handler in the chain. Only if the exception list's prev
pointer points back to itself does the RtlDispatchException function
loop endlessly.

So, in one case the exception handler is "safe" at least once, in the
other case it's suddenly "unsafe"? Why?

And what is the tight endless loop supposed to accomplish?

roge...@gmail.com

Feb 26, 2008, 6:38:02 AM
On Feb 22, 2:51 pm, Corinna Vinschen <cori...@community.nospam> wrote:

[snip]

> Ok, here's the self-contained testcase:

Note that this code won't work as supplied with more recent versions
of MSVC. This is because the exception handler is not marked with the
SAFESEH attribute, and the newer compilers produce the "Safe Exception
Handler Table" that Ivan referred to.

For VS 2005 you need
=================== Enable.asm ======================
.386

.model FLAT

?handle_exception@@YAHPAU_EXCEPTION_RECORD@@PAU_exception_list@@PAU_CONTEXT@@PAX@Z PROTO SYSCALL

.SAFESEH ?handle_exception@@YAHPAU_EXCEPTION_RECORD@@PAU_exception_list@@PAU_CONTEXT@@PAX@Z

_TEXT SEGMENT

enableExceptionHandlerRequired PROC PUBLIC
enableExceptionHandlerRequired ENDP

_TEXT ENDS

END
=================== end enable.asm ================
Then run
ML /safeseh /c enable.asm
and add enable.obj into the compile line.
You also need to prevent incremental linking (or dereference the jmp
target of the function pointer in your code).

Eg
cl /Zi /MD sample.cpp enable.obj /link /incremental:no

However, in the example code you don't need to call RtlUnwind to
get the repeat behaviour - the following simplified code for the
exception handler still produces a loop with or without the command
line argument:

int
handle_exception (EXCEPTION_RECORD *e, exception_list_t *frame,
                  CONTEXT *c, void *dummy)
{
  fputs ("In exception_handler\n", stderr);

  return 0; // = ExceptionContinueExecution
}

I wonder whether it is possible to rewrite the exception handler so
you don't need the 'trick' of pointing to self to make the RtlUnwind
leave the target exception handler still active?

Regards,
Roger.

Corinna Vinschen

Feb 26, 2008, 7:53:39 AM
roge...@gmail.com wrote:

> On Feb 22, 2:51 pm, Corinna Vinschen <cori...@community.nospam> wrote:
>> Ok, here's the self-contained testcase:
>
> Note that this code won't work as supplied with more recent versions
> of MSVC

I built it with the VC++ 2008 Express Edition as well as with VC++ from
the SDK for Windows 2008, and it worked for me to demonstrate the
problem. Using the Express Edition I had to define RtlUnwind myself,
using the SDK VC++ I hadn't. That was the only difference.

Keep in mind that I don't want to have an example which only works when
built with VC++, since the original code in question is not built using
VC++, but GCC. GCC doesn't know anything about the Load Configuration
directory, nor about a safe exception handler table.

> However in the example code you don't need to call RtlUnwind to
> get the repeat behaviour - the following simplified code for the
> exception handler
> still produces a loop with or without the command line argument:
>
> int
> handle_exception (EXCEPTION_RECORD *e, exception_list_t *frame,
>                   CONTEXT *c, void *dummy)
> {
>   fputs ("In exception_handler\n", stderr);
>   return 0; // = ExceptionContinueExecution
> }

Yes, that's right. But if you don't use RtlUnwind, the exception
handling only works for the first exception. Look at this POSIXy
example code:

#include <stdio.h>
#include <signal.h>
#include <setjmp.h>

int i = 0;
jmp_buf j;

void segvhdl (int sig)
{
  fprintf (stderr, "in segvhdl!\n");
  if (++i > 3)
    longjmp (j, 1);
}

int main (int argc, char **argv)
{
  signal (SIGSEGV, segvhdl);
  if (!setjmp (j)) // setjmp takes only the jmp_buf argument
    *(int *) 0 = 4711; // Exception 1, SIGSEGV
  fprintf (stderr, "BACK!\n");
  return 1 / 0; // Exception 2, SIGFPE
}

The signal handler in the application code is called by Cygwin's
exception handler at the first exception. The second exception is not
SIGSEGV but SIGFPE, so it has no signal handler installed, thus Cygwin's
exception handler is supposed to create a core dump file and exit.

Assuming Cygwin has no exception handler loop, if we *do* RtlUnwind,
what happens is that Cygwin's exception handler is called exactly once
for the first exception. When the signal handler returns, the next SEGV
does not call Cygwin's exception handler anymore, but instead the OS
default handler is called:

in segvhdl!
[default exception handler dialog]

If we *don't* RtlUnwind, the first exception results in the desired
loop, but as soon as the second exception occurs, Cygwin's exception
handler is not called anymore, but the default handler:

in segvhdl!
in segvhdl!
in segvhdl!
in segvhdl!
BACK!
[default exception handler dialog]

Only if the exception handler points to itself, and only if we use
RtlUnwind in the exception handler, we get the desired behaviour
on all Windows systems up to Vista SP1, just not on Server 2008 :(
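
For comparison, the Linux-like behaviour that Cygwin is trying to
reproduce (the same kind of signal being caught again and again) can be
sketched in portable POSIX C. This is only an illustration under stated
assumptions, not Cygwin code: it assumes a POSIX system, it uses raise()
as a stand-in for a real fault, and the names recover_point, on_fault and
recover_repeatedly are made up for the example. It uses
sigsetjmp/siglongjmp rather than setjmp/longjmp so that the signal mask
is restored when the handler jumps out.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;            /* hypothetical name */
static volatile sig_atomic_t handler_calls; /* how often on_fault ran */

/* Signal handler: never returns, jumps back to the recovery point. */
static void on_fault (int sig)
{
  (void) sig;
  ++handler_calls;
  /* siglongjmp restores the signal mask saved by sigsetjmp, so the
     same signal can be delivered again later. */
  siglongjmp (recover_point, 1);
}

/* Raise `sig` `rounds` times and recover each time.  Returns the number
   of recoveries, or -1 if the handler could not be installed. */
static int recover_repeatedly (int sig, int rounds)
{
  struct sigaction sa;
  volatile int recovered = 0;
  volatile int i;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_fault;
  sigemptyset (&sa.sa_mask);
  if (sigaction (sig, &sa, NULL) != 0)
    return -1;

  for (i = 0; i < rounds; ++i)
    {
      if (sigsetjmp (recover_point, 1) == 0)
        raise (sig);   /* stand-in for a real fault */
      else
        ++recovered;   /* reached via siglongjmp from on_fault */
    }
  return recovered;
}
```

Calling recover_repeatedly (SIGSEGV, 3) should run the handler three
times and recover three times, which is the "handler stays installed"
behaviour the thread is trying to get from the Windows exception chain.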

> I wonder whether it is possible to rewrite the exception handler so
> you don't need
> the 'trick' of pointing to self to make the RtlUnwind leave the target
> exception
> handler still active?

I would be really, really glad if somebody could point out how to
do that in a generic way, be it with or without RtlUnwind. Every
solution I tried so far (with and without RtlUnwind, vectored
exception handling, SetUnhandledExceptionFilter) didn't work one
way or the other.


Thanks,

Corinna Vinschen

Feb 26, 2008, 8:32:48 AM
Corinna Vinschen wrote:
> Ivan Brugiolo [MSFT] wrote:
>> The proper fix would be for the compiler to generate
>> Safe-Handler-table in the debug directory of the PE executable.
>
> Is there public documentation how a safe-handler table looks like and
> what content it has to have to be accepted by the OS? Since GCC can't
> generate such a table, we would have to generate this table manually
> in the DLL image, if no other solution exists.

I tried that now.

I searched the existing docs and I managed to generate an
IMAGE_LOAD_CONFIG_DIRECTORY32_2 structure in the Cygwin DLL, with all
members set to 0, except for the SEHandlerTable and SEHandlerCount
members. I set the DLL's DataDirectory entry which points to the Load
Configuration accordingly, using a hex editor for now. When examining
the result using `link -dump -all', everything looks correct:

$ link -dump -all C:\cygwin\bin\cygwin1.dll
[...]
OPTIONAL HEADER VALUES
[...]
61000000 image base (61000000 to 612FFFFF)
[...]
4920 [ 40] RVA [size] of Load Configuration Directory
[...]
Section contains the following load config:

00000048 size
0 time date stamp
0.00 Version
0 GlobalFlags Clear
0 GlobalFlags Set
0 Critical Section Default Timeout
0 Decommit Free Block Threshold
0 Decommit Total Free Threshold
00000000 Lock Prefix Table
0 Maximum Allocation Size
0 Virtual Memory Threshold
0 Process Heap Flags
0 Process Affinity Mask
0 CSD Version
0000 Reserved
00000000 Edit list
00000000 Security Cookie
6113A180 Safe Exception Handler Table
2 Safe Exception Handler Count

Safe Exception Handler Table

Address
--------
610037A0 __ZN7_cygtls27handle_threadlist_exceptionEP17_EXCEPTION_RECORDP15_exception_listP8_CONTEXTPv
610364D0 __ZN7_cygtls17handle_exceptionsEP17_EXCEPTION_RECORDP15_exception_listP8_CONTEXTPv

The names are mangled GCC C++ function names. Demangled, the names are

_cygtls::handle_threadlist_exception
_cygtls::handle_exceptions

When I run a Cygwin application which generates an exception, the same
hang using 100% CPU time occurs. If I change the exception chain to
have no loop, our exception handler is *never* called, not even the
first time. Instead, the default handler takes over immediately.

If I'm not doing something very incorrect, then using the Safe Exception
Handler Table is of no help. If I'm doing something incorrect, I'd
need to know what.
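
As a rough illustration of what such a table is for: as far as the
public PE format documentation describes it, the x86 safe handler table
is a sorted array of handler RVAs (addresses relative to the image
base). A lookup over a table of that shape might look like the sketch
below. The names handler_is_listed and sample_rvas are hypothetical, the
sample values are simply the two addresses from the dump above minus the
image base 61000000, and the validation the OS really performs is more
involved and is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Sample table: RVAs of the two handlers from the dump, ascending. */
static const uint32_t sample_rvas[] = { 0x000037A0u, 0x000364D0u };

/* Returns 1 if handler_va is listed in `table`, 0 otherwise.
   `table` holds `count` RVAs in ascending order. */
static int handler_is_listed (uint32_t image_base, const uint32_t *table,
                              size_t count, uint32_t handler_va)
{
  size_t lo = 0, hi = count;
  uint32_t rva;

  if (handler_va < image_base)
    return 0;                     /* address is below the image */
  rva = handler_va - image_base;  /* the table stores RVAs, not VAs */

  while (lo < hi)                 /* binary search over [lo, hi) */
    {
      size_t mid = lo + (hi - lo) / 2;
      if (table[mid] == rva)
        return 1;
      if (table[mid] < rva)
        lo = mid + 1;
      else
        hi = mid;
    }
  return 0;
}
```

Note that the match is exact: an address that is off by even a few
bytes from a listed handler is treated as not listed.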

roge...@gmail.com

Feb 26, 2008, 9:49:12 AM
On Feb 26, 12:53 pm, Corinna Vinschen <cori...@community.nospam>
wrote:

> roger....@gmail.com wrote:
> > On Feb 22, 2:51 pm, Corinna Vinschen <cori...@community.nospam> wrote:
> >> Ok, here's the self-contained testcase:
>
> > Note that this code won't work as supplied with more recent versions
> > of MSVC
>
> I built it with the VC++ 2008 Express Edition as well as with VC++ from
> the SDK for Windows 2008, and it worked for me to demonstrate the
> problem.  Using the Express Edition I had to define RtlUnwind myself,
> using the SDK VC++ I hadn't.  That was the only difference.

Ok, so the exception table stuff must only be in the retail Visual
Studio.

Yup, I understand this is because the RtlUnwind resets the exception
chain top.

> If we *don't* RtlUnwind, the first exception results in the desired
> loop, but as soon as the second exception occurs, Cygwin's exception
> handler is not called anymore, but the default handler:
>
>   in segvhdl!
>   in segvhdl!
>   in segvhdl!
>   in segvhdl!
>   BACK!
>   [default exception handler dialog]

So the question is, what is happening under the covers when the signal
handler returns rather than calling longjmp?
It looks like something is calling RtlUnwind in that case - or
unwinding the exception chain manually - as the exception chain no
longer includes your handler.
Is that gcc's signal handling implementation on Win32?

Roger.

Corinna Vinschen

Feb 26, 2008, 11:59:19 AM
roge...@gmail.com wrote:
> So the question is, what is happening under the covers when the signal
> handler
> returns rather than calling longjmp.
> It looks like something is calling RtlUnwind in that case - or
> unwinding the
> exception chain manually - as the exception chain no longer includes
> your handler.

When the signal handler returns, Cygwin's underlying exception handler
returns with 0. From the Cygwin DLL's perspective, nothing else
happens. And the exception handler is still in the chain, otherwise
the signal handler wouldn't be called 4 times (which means the underlying
Cygwin exception handler is called four times).

When longjmp is called, the CPU state gets
reset to the state it had when setjmp was called. The return
value != 0 results in skipping the SEGV exception. When it hits the
FPE exception, the Cygwin exception handler isn't called anymore.

However, if I restore the old value of fs:0 after the longjmp
call, it works as expected!

So that leads to two questions. One of them is whether the setjmp/longjmp
implementation I'm relying on is missing the segment registers
when storing and restoring the CPU state. The other question is
a question for this newsgroup:

Is it correct to call RtlUnwind in the exception handler or is
that wrong? I'm somewhat puzzled right now.

roge...@gmail.com

Feb 26, 2008, 5:47:34 PM
On 26 Feb, 16:59, Corinna Vinschen <cori...@community.nospam> wrote:

> roger....@gmail.com wrote:
> > So the question is, what is happening under the covers when the signal
> > handler
> > returns rather than calling longjmp.
> > It looks like something is calling RtlUnwind in that case - or
> > unwinding the
> > exception chain manually - as the exception chain no longer includes
> > your handler.
>
> When the signal handler returns, Cygwin's underlying exception handler
> returns with 0.  From the Cygwin's DLL perspective, nothing else
> happens.  And the exception handler is still in the chain, otherwise
> the signal handler wouldn't be called 4 times (which means the underlying
> Cygwin exception handler is called four times).

'post in haste, repent at leisure'

Yup, I meant the other way round.

What is going on now makes more sense to me -- particularly with the
example using longjmp.

However, now I've got access to Cygwin at home, I am curious
about the return case too, as RtlUnwind is called in that case.

I'm inferring that RtlUnwind is being used to return to the 'known
state' of exception disposition. However, as the target of the RtlUnwind
is within cygwin1.dll, possibly that code could reset fs:0 before
returning control?

The longjmp handler would be a different problem -- I assume you're
just picking up the gcc implementation?

BTW have you had any input from anyone from gcc -- this problem is
going to affect their programs as well as Cygwin.

The best solution is obviously to get a change from Microsoft (as this
problem will affect a scarily large number of programs built with
Cygwin and/or gcc).

Regards,
Roger.

Jeffrey Tan[MSFT]

Feb 26, 2008, 9:07:12 PM
Hi Corinna,

Thank you for the detailed information.

Can you try to set the following regkey and see if the issue still repros:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\kernel\DisableExceptionChainValidation = 0x1

Thanks.

Best regards,
Jeffrey Tan
Microsoft Online Community Support

Adrian Tulloch

Feb 27, 2008, 3:44:04 AM
"Corinna Vinschen" <cor...@community.nospam> wrote in message
news:fq1gh7$9ag$1...@perth.hirmke.de...

> However, if I reinstantiate the old value of fs:0 after the longjmp
> call, it works as expected!
>
> So that leads to two questions, one of them is if the sigjmp/longjmp
> implementation I'm relying on is missing the segment registers
> when storing and reverting the CPU state. The other question is
> a question for this newsgroup:
>
> Is it correct to call RtlUnwind in the exception handler or is
> that wrong? I'm somewhat puzzled right now.

My understanding is that, in your case, it is incorrect to call RtlUnwind in
the exception handler. Your exception handler is returning
ExceptionContinueExecution (aka 0). This means that the OS will do all the
work needed to ensure that execution continues at the spot where the
exception was raised (which is basically the behaviour you want for your
signal handlers, isn't it?).

However, if you wanted to handle the exception by transferring control to
another block of code, such as a C++ catch block several frames away from
the top of the stack, you would likely need to do a RtlUnwind to remove any
intervening exception handlers from the exception handling chain. You'd also
have to do a certain amount of other work to ensure that the register
context is set up correctly for the catch block.

However, I don't really understand how RtlUnwind helps hide the problems
with the gcc setjmp/longjmp behaviour. When you leave the signal handler
with a longjmp, is the RtlUnwind code even called? If so, is it called
before the signal handler is entered, or in some other way?

Adrian


Corinna Vinschen

Feb 27, 2008, 5:14:59 AM
Hi Roger,

roge...@gmail.com wrote:
> On 26 Feb, 16:59, Corinna Vinschen <cori...@community.nospam> wrote:
>> When the signal handler returns, Cygwin's underlying exception handler
>> returns with 0.  From the Cygwin's DLL perspective, nothing else
>> happens.  And the exception handler is still in the chain, otherwise
>> the signal handler wouldn't be called 4 times (which means the underlying
>> Cygwin exception handler is called four times).
>
> 'post in haste, repent at leisure'
>
> Yup, I meant the other way round.
>
> What is going on now makes more sense to me -- particularly with the
> example using longjmp.
>
> However, now I've got access to Cygwin at home, I am curious
> about the return case too, as RtlUnwind is called in that case.
>
> I'm inferring that RtlUnwind is being used to return to the 'known
> state'
> of exception disposition. However as the target of the RtlUnwind is
> within cygwin1.dll possibly that code could reset fs:0 before
> returning control?

I finally figured this out and I have a working implementation in
testing which reverts to the old, Linux-like behaviour without the need
for the self-reference in the exception chain.

After more debugging yesterday, I found that the top of the exception
list chain changes as soon as the exception occurs.

When you observe the value of %fs:0, you'll find that, right before the
exception occurs, %fs:0 points to our own exception handler. As soon
as the exception has occurred and our own exception handler is called,
%fs:0 refers to some other exception handler within ntdll. Apparently
that's a safeguard against exceptions in the application exception
handler.

When you call RtlUnwind with the frame pointer pointing to your own
exception chain entry (which is the second parameter to the exception
handler function), %fs:0 is changed to point to this frame.

Now it appears that the calling function (RtlDispatchException?) assumes
that no unwind has taken place if the user space handler returns 0.
Consequently it removes the top level entry (which it assumes is its
own installed handler) from the exception handler stack. Unfortunately
this is our handler, which now simply disappears from the exception
handler stack.

The trick is to make sure that the calling function does not remove our
exception handler entry from the stack. Right now we have two different
patches accomplishing that, we just have to discuss which is the better
one.
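
The sequence described above can be replayed in a small toy model. It
only mirrors the behaviour as observed in this posting, with a plain
linked list standing in for the exception chain and a global pointer
standing in for %fs:0; it is not ntdll's actual code, and all names
(frame, chain_head, dispatch, handler_survives) are invented for the
sketch. The dispatcher pushes a guard frame, calls the handler, and, if
the handler returns 0, pops whatever is on top, assuming it is still the
guard.

```c
#include <stddef.h>

struct frame                      /* stand-in for exception_list */
{
  struct frame *prev;
};

static struct frame *chain_head;  /* stand-in for %fs:0 */

/* Handler that "unwinds" to its own frame before returning 0. */
static int unwinding_handler (struct frame *self)
{
  chain_head = self;              /* effect of RtlUnwind to `self` */
  return 0;                       /* ExceptionContinueExecution */
}

/* Handler that returns 0 without touching the chain. */
static int plain_handler (struct frame *self)
{
  (void) self;
  return 0;
}

/* Modelled dispatcher: push guard, call handler, blindly pop the top. */
static void dispatch (int (*handler) (struct frame *), struct frame *target)
{
  struct frame guard;

  guard.prev = chain_head;
  chain_head = &guard;
  if (handler (target) == 0)
    chain_head = chain_head->prev;
}

/* Returns 1 if our frame is still reachable after one dispatch. */
static int handler_survives (int unwind, int self_reference)
{
  struct frame os_default = { NULL };
  struct frame ours;
  const struct frame *f;
  int steps, found = 0;

  ours.prev = self_reference ? &ours : &os_default;
  chain_head = &ours;
  dispatch (unwind ? unwinding_handler : plain_handler, &ours);

  /* Bounded walk, so a looping chain cannot hang the model. */
  for (f = chain_head, steps = 0; f != NULL && steps < 8;
       f = f->prev, ++steps)
    if (f == &ours)
      found = 1;

  chain_head = NULL;              /* do not keep pointers to locals */
  return found;
}
```

In this model the handler survives without unwinding, is dropped when it
unwinds to its own frame, and survives again when its prev pointer
refers to itself, which matches the three cases discussed in this
thread.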

> The longjmp handler would be a different problem -- I assume you're
> just picking up the gcc implementation?

No, we have our own implementation. Given the above, it is actually
unnecessary to change setjmp/longjmp. The whole fix can take place in
Cygwin's exception handler.

> BTW have you had any input from anyone from gcc -- this problem is
> going to affect their programs as well as Cygwin.

setjmp/longjmp is a processor and OS specific library implementation.
There is no generic GCC implementation afaik. But I may be wrong. I
never looked into any setjmp/longjmp implementation before yesterday.

> The best solution is obviously to get a change from Microsoft (as this
> problem will affect a scarily large number of programs built with
> Cygwin
> and/or gcc)

After I figured out the above, I'm not sure anymore :)

OTOH, even if it doesn't hurt Cygwin anymore, I still think the tight
busy loop in ntdll is a bug in 2008 Server.

Thanks, Roger! Your input has led me in the right direction.

Corinna Vinschen

Feb 27, 2008, 5:33:28 AM
Adrian Tulloch wrote:
> "Corinna Vinschen" <cor...@community.nospam> wrote in message
> news:fq1gh7$9ag$1...@perth.hirmke.de...
>> So that leads to two questions, one of them is if the sigjmp/longjmp
>> implementation I'm relying on is missing the segment registers
>> when storing and reverting the CPU state. The other question is
>> a question for this newsgroup:
>>
>> Is it correct to call RtlUnwind in the exception handler or is
>> that wrong? I'm somewhat puzzled right now.
>
> My understanding is that, in your case, it is incorrect to call RtlUnwind in
> the exception handler. Your exception handler is returning
> ExceptionContinueExecution (aka 0). This means that the OS will do all the
> work needed to ensure that execution continues at the spot where the
> exception was raised (which is basically the behaviour you want for your
> signal handlers, isn't it?).
>
> However, if you wanted to handle the exception by transferring control to
> another block of code, such as a C++ catch block several frames away from
> the top of the stack, you would likely need to do a RtlUnwind to remove any
> intervening exception handlers from the exception handling chain. You'd also
> have to do a certain amount of other work to ensure that the register
> context is set up correctly for the catch block.

Well, I've been stumbling over this for a couple of days now and I'm
constantly switching between "unwinding is correct" and "unwinding is
incorrect". Right now I think unwinding is correct and even
unavoidable. Let me try to explain.

When Cygwin's exception handler gets called, it has several options,
depending on the exception type (which gets transformed into a specific
signal) and the signal setting of the application.

The simple case is that the application has no signal handler installed.
In this case, the exception is fatal; usually Cygwin creates a stack
dump. In this case all other exception handlers should get a chance to
clean up, so unwinding is correct.

What if the application has a signal handler installed? Cygwin's
exception handler does not know if that signal handler will take the
right measures to rectify the problem which led to the exception or not.
It also does not know if the signal handler returns or not. So, for
all it knows, the signal handler will handle the exception, one way
or the other. So unwinding is the right thing to do, to tell the other
exception handlers, "we care".

Does that make sense?
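
The decision described above (translate the exception into a signal,
then look at what the application has installed for that signal) could
be sketched as follows. This is an assumption-laden illustration, not
Cygwin's real table or code: the EX_* constants carry the NTSTATUS
values from the Windows SDK headers under made-up names so the sketch
also compiles elsewhere, the mapping covers only four exceptions, and
treating an unmapped exception as fatal is a choice made for the
example.

```c
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* NTSTATUS values, renamed with an EX_ prefix for this sketch. */
#define EX_ACCESS_VIOLATION     0xC0000005u
#define EX_ILLEGAL_INSTRUCTION  0xC000001Du
#define EX_FLT_DIVIDE_BY_ZERO   0xC000008Eu
#define EX_INT_DIVIDE_BY_ZERO   0xC0000094u

enum outcome { OUTCOME_STACK_DUMP, OUTCOME_CALL_HANDLER };

/* Map an exception code to a signal number, 0 if not mapped here. */
static int signal_for_exception (uint32_t code)
{
  switch (code)
    {
    case EX_ACCESS_VIOLATION:    return SIGSEGV;
    case EX_ILLEGAL_INSTRUCTION: return SIGILL;
    case EX_FLT_DIVIDE_BY_ZERO:
    case EX_INT_DIVIDE_BY_ZERO:  return SIGFPE;
    default:                     return 0;
    }
}

/* Placeholder for an application's signal handler. */
static void sample_handler (int sig)
{
  (void) sig;
}

/* No mapped signal or no handler installed: fatal, dump the stack.
   Otherwise hand the signal to the application's handler. */
static enum outcome decide (uint32_t code, void (*app_handler) (int))
{
  if (signal_for_exception (code) == 0 || app_handler == NULL)
    return OUTCOME_STACK_DUMP;
  return OUTCOME_CALL_HANDLER;
}
```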

> However, I don't really understand how RtlUnwind helps hide the problems
> with the gcc setjmp/longjmp behaviour. When you leave the signal handler
> with a longjmp, is the RtlUnwind code even called? If so, is it called
> before the signal handler is entered, or in some other way?

Yes, RtlUnwind is called before the signal handler is called.

Assuming you don't unwind, if the signal handler function does not
return, but calls longjmp as in my example, the exception handler
installed by ntdll (as I explained in my previous posting) will stay on
top of the exception stack, because ntdll has no chance to remove it
after our exception handler returns. Thus, the next exception will
invariably be handled by the default handler again.

roge...@gmail.com

Feb 27, 2008, 5:53:54 AM
On Feb 27, 10:14 am, Corinna Vinschen <cori...@community.nospam>
wrote:

> I finally figured this out and I have a working implementation in
> testing which reverts to the old, Linux-like behaviour without the need
> for the self-reference in the exception chain.

Good news!

One question that occurs to me is whether there would be any mileage
in putting the "Safe Exception Handler Table" into the CYGWIN dll?

In case you don't know what this does, it prevents any exception
handlers being called other than those listed in the table.
This prevents buffer overrun exploits which can otherwise corrupt the
chain starting at fs:0 and insert their own handler, then cause an
exception.
In Visual C++ many functions have their own structured exception
handlers, as the same mechanism is also used for C++ exceptions and
stack unwinding.

I don't know how prevalent such sorts of buffer overruns are in the
Cygwin world: I believe gcc doesn't use structured exception handling
for its C++ exception handling and stack unwinding [it certainly
handles C++ exceptions faster than VC++ and without calling
RaiseException]; but it would be worth checking whether the exception
chain is susceptible to an overrun attack.

Note that programs built with this table still run on earlier versions
of Windows, just without the checking!

> setjmp/longjmp is a processor and OS specific library implementation.
> There is no generic GCC implementation afaik.  But I may be wrong.  I
> never looked into any setjmp/longjmp implementation before yesterday.

No, I'm probably wrong :-)

Regards,
Roger.

Corinna Vinschen

Feb 27, 2008, 6:15:48 AM
Hi Jeffrey,

Jeffrey Tan[MSFT] wrote:
> Hi Corinna,
>
> Thank you for the detailed information.
>
> Can you try to set the following regkey and see if the issue still repros:
> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\kernel\DisableExceptionChainValidation = 0x1

Well, it's interesting. Take the original code with the exception
stack entry pointing to itself, and a Cygwin DLL with no safe exception
handler table. If I change the above setting in the registry and
reboot, suddenly Cygwin behaves on 2008 as it does on any other Windows.
Our exception handler is called just right.

However, still with DisableExceptionChainValidation = 0x1, as soon as I
add the safe exception handler table as I described in the posting you
replied to, the behaviour changes. Now our exception handler is never
called, but the ntdll default handler instead.

Given that I found a solution which appears to work fine (see my
postings in this thread from today) without the necessity to
self-reference our exception stack entry and without the need to install
a safe handler table, Cygwin will not be affected by this problem
anymore.

Nevertheless, I think the tight busy loop which occurs in
RtlDispatchException in case DisableExceptionChainValidation = 0x0 is a
bug. I'm not sure what this loop is doing, I assume it's walking the
exception list entries without testing for a loop? If so, this should
really be fixed so that at least the default exception handler kicks in
at one point.
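
To illustrate the kind of loop test meant here: a chain walker can
detect a self-referencing or otherwise cyclic list with Floyd's
two-pointer technique before (or while) walking it. The sketch below is
a generic model, not a description of what RtlDispatchException does or
should do internally; the struct and function names are made up, and the
model ends the chain with NULL for simplicity.

```c
#include <stddef.h>

struct chain_entry                 /* stand-in for exception_list */
{
  struct chain_entry *prev;
};

/* Floyd's tortoise-and-hare: returns 1 if following `prev` from `head`
   ever loops back, 0 if the walk reaches the end of the chain. */
static int chain_has_cycle (const struct chain_entry *head)
{
  const struct chain_entry *slow = head;
  const struct chain_entry *fast = head;

  while (fast != NULL && fast->prev != NULL)
    {
      slow = slow->prev;           /* one step */
      fast = fast->prev->prev;     /* two steps */
      if (slow == fast)
        return 1;                  /* pointers met: the chain loops */
    }
  return 0;
}

/* Build the two layouts discussed in this thread and test them. */
static int check_chain (int self_reference)
{
  struct chain_entry os_default = { NULL };
  struct chain_entry ours;

  ours.prev = self_reference ? &ours : &os_default;
  return chain_has_cycle (&ours);
}
```

With such a check, a walker could fall back to a default action instead
of spinning when it meets an entry that points back to itself.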


Thanks,

Corinna Vinschen

Feb 27, 2008, 7:08:45 AM
roge...@gmail.com wrote:
> On Feb 27, 10:14 am, Corinna Vinschen <cori...@community.nospam>
> wrote:
>
>> I finally figured this out and I have a working implementation in
>> testing which reverts to the old, Linux-like behaviour without the need
>> for the self-reference in the exception chain.
>
> Good news!
>
> One question that occurs to me is whether there would be any mileage
> in putting the "Safe Exception Handler Table" into the CYGWIN dll?

Uh, if you read my postings about installing a safe handler table in
Cygwin, you saw that it doesn't work as I (and probably others) had
expected. GNU ld also has no generic way to specify the value for the
data directory so, right now, I have to set the value using nm and
hexedit. For some reason the exception handler doesn't work anymore
after I added the table. There's a very good chance that I'm just
missing something, of course. For instance, does the code always need a
security cookie to work? If so, I could create a cookie, but I found no
documentation which tells how to generate a cookie on the call stack, if
it's not automatically supported by the compiler.

> I believe gcc doesn't use structured exception handling for its C++
> exception handling

Right now the Windows versions of G++ are using SJLJ. There's ongoing
discussion to switch to Dwarf-2 exception handling, but this is a major
step which breaks compatibility with older DLLs using SJLJ.

Adrian Tulloch

Feb 27, 2008, 4:35:13 PM
"Corinna Vinschen" <cor...@community.nospam> wrote in message
news:fq3e9o$u3d$2...@perth.hirmke.de...

> When Cygwin's exception handler gets called, it has several options,
> depending on the exception type (which gets transformed into a specific
> signal) and the signal setting of the application.
>
> The simple case is that the application has no signal handler installed.
> In this case, the exception ends fatal, usually Cygwin creates a stack
> dump. In this case all other exception handlers should get a chance to
> cleanup, so unwinding is correct.

Can't you return ExceptionContinueSearch without doing any unwinding here?
That way the OS will call into the next handler on the chain, giving it a
chance to handle the problem. If anything, I'd have thought that unwinding
may cause problems here. If a lower handler does fix the problem, and wants
execution to continue in exactly the same spot where the exception occurred,
your unwinding may have ended up permanently removing some higher-up
exception handlers from the chain. This is likely to cause problems down the
track.

>
> What if the application has a signal handler installed? Cygwin's
> exception handler does not know if that signal handler will take the
> right measures to rectify the problem which led to the exception or not.
> It also does not know if the signal handler returns or not. So, for
> all it knows, the signal handler will handle the exception, one way
> or the other. So unwinding is the right thing to do to say the other
> exception handlers, "we care".

In another message in this thread, you mentioned that you've got a couple of
patches which fix this problem. Can you point me to them? I feel that we may
be talking at cross purposes here, and seeing those patches may help fix
that.

I'd have thought that, if the signal handler does fix the problem and
returns, returning ExceptionContinueExecution and *not* unwinding makes
everything work, and is doing "the right thing" by the OS. In the situation
where the signal handler doesn't return (is a longjmp the only way this
occurs?), the long jumper is responsible for changing the program state
into a form that's appropriate for the target of the jump. This manipulation
of the program state may or may not involve doing a RtlUnwind, but that's
the decision of the long jumper, not the generic exception handler.

Adrian

roge...@gmail.com

Feb 28, 2008, 5:52:20 AM
On Feb 27, 12:08 pm, Corinna Vinschen <cori...@community.nospam>
wrote:

> Uh, if you read my postings about installing a safe handler table in
> Cygwin, you saw that it doesn't work as I (and probably others) had
> expected.  GNU ld has also no generic way to specify the value for the
> data directory so, right now, I have to set the value using nm and
> hexedit.  For some reason the exception handler doesn't work anymore
> after I added the table.  There's a very good chance that I'm just
> missing something, of course.

I suspect the exception table contains a different address to the
actual exception handler installed - one may be an import entry that
jumps to or calls the other -- but the actual hex addresses must match
exactly for the OS to dispatch the exception. Anyway, as it doesn't
sound like there is any security benefit in the Cygwin world in
attempting to use the table, the precise reason isn't that
interesting :-)

Roger.

Corinna Vinschen

Feb 28, 2008, 6:58:13 AM
roge...@gmail.com wrote:
> On Feb 27, 12:08 pm, Corinna Vinschen <cori...@community.nospam>
> wrote:
>
>> Uh, if you read my postings about installing a safe handler table in
>> Cygwin, you saw that it doesn't work as I (and probably others) had
>> expected.  GNU ld has also no generic way to specify the value for the
>> data directory so, right now, I have to set the value using nm and
>> hexedit.  For some reason the exception handler doesn't work anymore
>> after I added the table.  There's a very good chance that I'm just
>> missing something, of course.
>
> I suspect the exception table contains a different address to the
> actual
> exception handler installed - one may be an import entry that jumps to
> or calls the other -- but the actual hex addresses must match exactly
> for the OS to dispatch the exception.

I checked the result with `link -dump -all' and it really looked fine.
The entry in the DataDirectory was set to the RVA, the size to 0x40.
Link printed the correct content of the load config directory, and the
content of the handler table was resolved correctly, too. See my reply
to one of Ivan's postings: http://tinyurl.com/24fjq5

That's why I'm somewhat puzzled that it didn't work. The only reason
I can think of is the missing security cookie stuff.

Corinna Vinschen

unread,
Feb 28, 2008, 6:48:57 AM2/28/08
to
Adrian Tulloch wrote:
> "Corinna Vinschen" <cor...@community.nospam> wrote in message
> news:fq3e9o$u3d$2...@perth.hirmke.de...
>> The simple case is that the application has no signal handler installed.
>> In this case, the exception ends fatally; usually Cygwin creates a stack
>> dump. In this case all other exception handlers should get a chance to
>> cleanup, so unwinding is correct.
>
> Can't you return ExceptionContinueSearch without doing any unwinding here?
> That way the OS will call into the next handler on the chain, giving it a
> chance to handle the problem. If anything, I'd have thought that unwinding
> may cause problems here.

But that's exactly what we don't want. The Cygwin DLL is a POSIX
emulation layer in Win32 user space trying to be as Linux-compatible on
the API level and in general behaviour as possible. Our exception
handler is supposed to be *the* (final) exception handler for Cygwin
applications.

The next handler in the chain is the OS handler which just terminates
the application and shows the "application has terminated unexpectedly"
GUI. All other potential handlers are higher in the chain.

Usually there are no other handlers, but if there are, the fact that our
exception handler got called shows that these exception handlers don't
know how to handle this exception anyway. Otherwise they wouldn't
have returned ExceptionContinueSearch.
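
The search order described above can be sketched as a simple list walk. This is a simplified model, not the OS dispatcher: the chain ends in NULL here for brevity, the handlers take only an exception code, and all names are hypothetical. Only the two numeric disposition values mirror ExceptionContinueExecution (0) and ExceptionContinueSearch (1).

```c
#include <assert.h>
#include <stddef.h>

enum disposition { CONTINUE_EXECUTION = 0, CONTINUE_SEARCH = 1 };

typedef enum disposition (*handler_fn)(int code);

struct frame {
    struct frame *prev;   /* next older registration */
    handler_fn handler;
};

/* Two sample handlers for illustration. */
enum disposition handler_declines(int code)
{
    (void)code;
    return CONTINUE_SEARCH;
}

enum disposition handler_accepts(int code)
{
    (void)code;
    return CONTINUE_EXECUTION;
}

/* Walks from the newest frame towards the oldest and returns the first
   frame whose handler takes the exception, or NULL if every handler
   declined. */
const struct frame *dispatch(const struct frame *head, int code)
{
    for (const struct frame *f = head; f != NULL; f = f->prev) {
        assert(f->handler != NULL);
        if (f->handler(code) == CONTINUE_EXECUTION)
            return f;
    }
    return NULL;
}
```

In this model an older frame is only reached after every newer frame has declined, which is the situation Cygwin's handler is in when it gets called.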

>> What if the application has a signal handler installed? Cygwin's
>> exception handler does not know if that signal handler will take the
>> right measures to rectify the problem which led to the exception or not.
>> It also does not know if the signal handler returns or not. So, for
>> all it knows, the signal handler will handle the exception, one way
>> or the other. So unwinding is the right thing to do to tell the other
>> exception handlers, "we care".
>
> In another message in this thread, you mentioned that you've got a couple of
> patches which fix this problem. Can you point me to them? I feel that we may
> be talking at cross purposes here, and seeing those patches may help fix
> that.

The final solution we agreed on is not to return from the exception
handler after unwinding, but to return to the point of the exception
immediately:

http://sourceware.org/cgi-bin/cvsweb.cgi/winsup/cygwin/exceptions.cc.diff?cvsroot=uberbaum&r1=1.311&r2=1.312

> I'd have thought that, if the signal handler does fix the problem and
> returns, returning ExceptionContinueExecution and *not* unwinding makes
> everything work, and is doing "the right thing" by the OS. In the situation

As I explained, the exception handler doesn't know what the signal
handler will do. It's in the application which is just "some"
application we don't know anything about.

> where the signal handler doesn't return (is a longjmp the only way this
> occurs?), the long jumper is responsible for changing the program state
> into a form that's appropriate for the target of the jump. This manipulation
> of the program state may or may not involve doing a RtlUnwind, but that's
> the decision of the long jumper, not the generic exception handler.

The signal handler has no access to the exception information given
to the exception handler. It gets only the signal number as parameter.
longjmp just reverts the state of the CPU as stored in the jmp_buf by
the call to setjmp. When setjmp is called there's no information about
any exception available.
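
As a rough illustration of how little the signal handler sees, here is a POSIX-style sketch. It uses sigsetjmp/siglongjmp and raise() to stay portable and well defined; the function and variable names are made up, and this is not Cygwin's implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>

static sigjmp_buf recover_point;
static volatile sig_atomic_t last_signal;

/* The handler receives nothing but the signal number: no exception
   record, no CPU context. */
static void on_fpe(int signo)
{
    last_signal = signo;
    siglongjmp(recover_point, 1);   /* never returns to the fault site */
}

/* Installs the handler, provokes SIGFPE and returns the signal number
   the handler saw, or -1 if the handler could not be installed. */
int run_guarded(void)
{
    if (signal(SIGFPE, on_fpe) == SIG_ERR)
        return -1;

    if (sigsetjmp(recover_point, 1) == 0) {
        raise(SIGFPE);              /* stands in for a real arithmetic fault */
        return 0;                   /* only reached if the handler returned */
    }

    /* Sanity check: we arrived here through the handler. */
    assert(last_signal == SIGFPE);
    return last_signal;
}
```

The jump buffer was filled before any exception existed, so the jump cannot carry exception information either.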

roge...@gmail.com

unread,
Feb 28, 2008, 9:02:16 AM2/28/08
to
On Feb 28, 11:48 am, Corinna Vinschen <cori...@community.nospam>
wrote:
> Adrian Tulloch wrote:
> > "Corinna Vinschen" <cori...@community.nospam> wrote in message
> http://sourceware.org/cgi-bin/cvsweb.cgi/winsup/cygwin/exceptions.cc....
>

Looks like the best possible solution.
The exception chain doesn't have that nasty loop in it, and the code
should work on previous versions of Windows too.

I must admit to getting pretty confused when I first used some of my
own debugging tools on a Cygwin process
and the exception frame walk didn't terminate :-)

Regards,
Roger.

Jochen Kalmbach [MVP]

unread,
Feb 28, 2008, 9:24:55 AM2/28/08
to
Hi Corinna!


> The final solution we agreed on is not to return from the exception
> handler after unwinding, but to return to the point of the exception
> immediately:
>
> http://sourceware.org/cgi-bin/cvsweb.cgi/winsup/cygwin/exceptions.cc.diff?cvsroot=uberbaum&r1=1.311&r2=1.312

=> SetThreadContext (GetCurrentThread (), in);


Just a small note from MSDN:
<quote>Do not try to set the context for a running thread; the results
are unpredictable.</quote>

Also I don't know if setting the context of the current thread is or
will be supported in the future...

Corinna Vinschen

unread,
Feb 28, 2008, 11:34:47 AM2/28/08
to
Jochen Kalmbach [MVP] wrote:
> Hi Corinna!
>
>
>> The final solution we agreed on is not to return from the exception
>> handler after unwinding, but to return to the point of the exception
>> immediately:
>>
>> http://sourceware.org/cgi-bin/cvsweb.cgi/winsup/cygwin/exceptions.cc.diff?cvsroot=uberbaum&r1=1.311&r2=1.312
>
> => SetThreadContext (GetCurrentThread (), in);
>
>
> Just a small note from MSDN:
> <quote>Do not try to set the context for a running thread; the results
> are unpredictable.</quote>

Thanks for the hint. OTOH, I tested it with various exception tests on
every system from NT4SP6 up to 2008 x64 and it worked on all systems.

> Also I don't know if setting the context of the current thread is or
> will be supported in the future...

Oh please, not another backward incompatibility...

But at least, if that ever happens, I know how to do something similar
without calling SetThreadContext ;)

Adrian Tulloch

unread,
Feb 28, 2008, 4:27:15 PM2/28/08
to
"Corinna Vinschen" <cor...@community.nospam> wrote in message
news:fq6739$jok$1...@perth.hirmke.de...

> Adrian Tulloch wrote:
>> "Corinna Vinschen" <cor...@community.nospam> wrote in message
>> news:fq3e9o$u3d$2...@perth.hirmke.de...
>> Can't you return ExceptionContinueSearch without doing any unwinding
>> here?
>> That way the OS will call into the next handler on the chain, giving it a
>> chance to handle the problem. If anything, I'd have thought that
>> unwinding
>> may cause problems here.

[..]


> Usually there are no other handlers, but if so, the fact that our
> exception handler got called shows that these exception handlers don't
> know how to handle this exception anyway. Otherwise they wouldn't
> have returned ExceptionContinueSearch.

Have you looked at what RtlUnwind does? Amongst other things, it calls back
into each of these handlers, setting the EXCEPTION_RECORD's ExceptionFlags
field to 2. This is intended to be used for things such as C++ destructors
and __finally blocks to do their cleanup work. I think this means that it's
very much incorrect to call RtlUnwind in the case where your signal handler
fixes the problem and wants execution to continue. It means that you risk
calling this cleanup code twice.

As a slightly contrived example, let's say that a (cygwin) developer is
implementing a sparse data structure by using the OS's support for guard
pages. Whenever anyone tries to write to this guard page, a
STATUS_GUARD_PAGE_VIOLATION is raised (which you turn into a SIGBUS). The
idea is that the signal handler allocates another guard page, then allows
execution to continue in exactly the place where it left off. Let's also
pretend that an OS API is the unlucky person who tried to write to this
guard page (perhaps you're calling into the Registry APIs, getting them to
write into this sparse buffer). Your call to RtlUnwind means that any OS
__finally blocks or C++ destructors will get called before your signal
handler is entered. Then, when you resume execution to this OS API, its
normal code path will call this cleanup code for a second time. That's bound
to be a bad thing.
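
The double-cleanup hazard can be shown with a toy model. Nothing here calls the real RtlUnwind; the `unwind_before_resume` flag merely stands in for "the handler ran an unwind pass before letting execution continue", and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct resource {
    int open;
    int release_calls;
};

/* Stands in for a __finally block or a C++ destructor. */
static void release(struct resource *r)
{
    r->release_calls++;
    r->open = 0;
}

/* Models an API call that faults midway and is then resumed.
   Returns how many times the cleanup code ran. */
int api_call(struct resource *r, int unwind_before_resume)
{
    assert(r != NULL);
    r->open = 1;
    r->release_calls = 0;

    /* ... the fault happens here ... */
    if (unwind_before_resume)
        release(r);   /* cleanup triggered by the unwind pass */

    /* Execution resumes at the fault site and the normal path
       reaches the same cleanup code. */
    release(r);
    return r->release_calls;
}
```

With the unwind pass the cleanup runs twice for a single call; without it, once.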

Incidentally, I'm told that more recent versions of the OS (e.g. X64) call
RtlUnwindEx to do their exception support. I'm sure they've got a good
reason for doing so. If you do want to stay with your current approach
(instead of fixing setjmp/longjmp), you may want to understand why they do
that and whether it's also appropriate for you as well.


>> In another message in this thread, you mentioned that you've got a couple
>> of
>> patches which fix this problem. Can you point me to them?

> The final solution we agreed on is not to return from the exception
> handler after unwinding, but to return to the point of the exception
> immediately:
>
> http://sourceware.org/cgi-bin/cvsweb.cgi/winsup/cygwin/exceptions.cc.diff?cvsroot=uberbaum&r1=1.311&r2=1.312

I see that you're using SetThreadContext to do this. MSDN states that
calling SetThreadContext in this manner (with a running thread) may cause
unpredictable results. Are you sure it works on the older versions of the OS
you need to support?

Getting back to the original problem, am I correct in thinking that your
setjmp/longjmp implementation doesn't save and restore fs:[0]? If you fix
this, won't the problem go away? I'd have thought you want to fix this
anyway --- if you don't, you'll have problems with longjmps which *aren't*
out of signal handlers. Let's say you're in a callback function which an OS
routine (e.g. EnumWindows or something similar) calls, and you want to
longjmp to somewhere far away, before EnumWindows was even called. I think
your current implementation of longjmp will leave any OS exception handlers
on the chain, causing all sorts of grief as that stack gets re-used. Even
better, if you do fix it, you could use the built-in support to accomplish
exactly what you want, delete a bunch of tricky code, and know that things
will work in future versions of the OS.
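
A portable sketch of the save/restore idea follows. The global `chain_head` is only a stand-in for the per-thread pointer at fs:[0], the wrapper type and functions are hypothetical, and no unwind pass is modelled; the sketch shows nothing more than the chain head being put back when jumping.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

struct exception_list {
    struct exception_list *prev;
    void *handler;
};

/* Stand-in for the chain head the OS keeps at fs:[0]. */
static struct exception_list *chain_head;

/* A jump buffer that also remembers the chain head. */
struct chain_jmp_buf {
    jmp_buf regs;
    struct exception_list *saved_head;
};

/* Restores the saved chain head, then jumps. */
static void chain_longjmp(struct chain_jmp_buf *b, int val)
{
    assert(b != NULL);
    chain_head = b->saved_head;
    longjmp(b->regs, val);
}

/* Models an OS routine that registers a frame and calls back into
   code that jumps away without popping that frame. */
static void callee(struct chain_jmp_buf *b)
{
    struct exception_list frame = { chain_head, NULL };
    chain_head = &frame;     /* push a registration living on this stack */
    chain_longjmp(b, 1);     /* leave without the matching pop */
}

/* Returns 1 if the chain head after the jump equals the one before. */
int head_restored_after_jump(void)
{
    struct chain_jmp_buf b;
    struct exception_list *before = chain_head;

    b.saved_head = chain_head;   /* the "setjmp" half of the save */
    if (setjmp(b.regs) == 0)
        callee(&b);
    return chain_head == before;
}
```

Without the restore in chain_longjmp, `chain_head` would keep pointing at `frame`, which lives in a stack area that is about to be reused.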

Adrian


Adrian Tulloch

unread,
Feb 28, 2008, 5:43:19 PM2/28/08
to
Corinna Vinschen wrote:

> Jochen Kalmbach [MVP] wrote:
>> Just a small note from MSDN:
>> <quote>Do not try to set the context for a running thread; the results
>> are unpredictable.</quote>
>
> Thanks for the hint. OTOH, I tested it with various exception tests on
> every system from NT4SP6 up to 2008 x64 and it worked on all systems.

If you must follow this approach, you may want to look at RtlRestoreContext, at least for
64-bit systems. However, to reiterate what I said in my earlier post, I think you'd be
much better off fixing longjmp instead.

Adrian

Jeffrey Tan[MSFT]

unread,
Feb 29, 2008, 12:38:48 AM2/29/08
to
Hi Corinna,

Thanks for your feedback.

Sorry, there are too many replies in this post, I cannot find the solution
you found. Can you tell me the date/time of the reply that contains your
solution?

I have discussed this with our security team. In Windows Server 2008, when
an exception is raised, the OS walks the EH chain to ensure it terminates
the way it expects. If the chain is corrupt, the process is terminated and
the Windows Error Reporting logic is invoked. Because Cygwin has created a
cycle in the EH chain, this chain validation loops forever.

Anyway, we believe it is still a problem that the OS did not detect the
cycle and terminate the process. The security team has filed an internal
issue to track this for future consideration.
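
For reference, a bounded walk that notices such a cycle can be written with the classic two-pointer (Floyd) technique. This is only a sketch of the general idea, not what the OS does: the list type mirrors the exception_list structure from the start of the thread, the terminator is assumed to be NULL for simplicity, and the function name is made up.

```c
#include <stddef.h>

struct exception_list {
    struct exception_list *prev;
    void *handler;
};

/* Returns 1 if following `prev` from `head` ever revisits a node,
   0 if the chain ends in NULL. */
int chain_has_cycle(const struct exception_list *head)
{
    const struct exception_list *slow = head;
    const struct exception_list *fast = head;

    while (fast != NULL && fast->prev != NULL) {
        slow = slow->prev;          /* one step */
        fast = fast->prev->prev;    /* two steps */
        if (slow == fast)
            return 1;               /* the pointers can only meet inside a loop */
    }
    return 0;
}
```

A single entry whose prev pointer points back to itself, as in Cygwin's setup, is reported on the first iteration.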

Hope this helps.

Corinna Vinschen

unread,
Feb 29, 2008, 7:27:39 AM2/29/08
to
Hi Jeffrey,

Jeffrey Tan[MSFT] wrote:
> Hi Corinna,
>
> Thanks for your feedback.
>
> Sorry, there are too many replies in this post, I cannot find the solution
> you found. Can you tell me the date/time of the reply that contains your
> solution?

I'm referring to this posting:

http://groups.google.com/group/microsoft.public.win32.programmer.kernel/msg/2ac1988a74b35684

The actual solution right now looks like this:

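int exception_handler(EXCEPTION_RECORD *e, exception_list *frame,
                      CONTEXT *c, void *v)
{
  /* Run the unwind pass up to our own frame. */
  RtlUnwind(frame, ip, e, 0);
  ...
  /* Resume at the point of the exception instead of returning. */
  SetThreadContext(GetCurrentThread (), c);
  /*NOTREACHED*/
}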

However, given especially Adrian's response in

http://groups.google.com/group/microsoft.public.win32.programmer.kernel/msg/34c1708c3c5192fd

I've got serious doubts about this solution. While this appears to work on
all NT versions since NT4, it turned out to be broken on Windows 98
(which, while being phased out with the next major release, is still
supported by the current Cygwin release).

So I'm now considering another approach which Adrian suggests and which
I had already tried two days ago but dropped again when I found that I
can solve that entirely within the exception handler. That is, store
fs:0 in the setjmp/longjmp buffer and don't RtlUnwind in the exception
handler if it's going to return with 0.

It's still not *really* clear to me if it's ok to RtlUnwind when the
exception handler returns with 0 or not. A definitive reply from you
about this issue would be nice :}

> I have discussed this with our security team. In Windows Server 2008, when
> an exception is raised, the OS walks the EH chain to ensure it terminates
> the way it expects. If the chain is corrupt, the process is terminated and
> the Windows Error Reporting logic is invoked. Because Cygwin has created a
> cycle in the EH chain, this chain validation loops forever.

That's what I assumed.

> Anyway, we believe it is still a problem that the OS did not detect the
> cycle and terminate the process. The security team has filed an internal
> issue to track this for future consideration.

Sounds good to me. The OS function shouldn't loop endlessly but
recognize a loop in the EH chain. The open question is just, should
the OS walk treat a loop in the EH chain as "corrupted" or as "ok"?

Corinna Vinschen

unread,
Feb 29, 2008, 7:32:09 AM2/29/08
to
Hi Adrian,

Adrian Tulloch wrote:
> [...]


> Getting back to the original problem, am I correct in thinking that your
> setjmp/longjmp implementation doesn't save and restore fs:[0]? If you fix
> this, won't the problem go away? I'd have thought you want to fix this
> anyway --- if you don't, you'll have problems with longjmps which *aren't*
> out of signal handlers. Let's say you're in a callback function which an OS
> routine (e.g. EnumWindows or something similar) calls, and you want to
> longjmp to somewhere far away, before EnumWindows was even called. I think
> your current implementation of longjmp will leave any OS exception handlers
> on the chain, causing all sorts of grief as that stack gets re-used. Even
> better, if you do fix it, you could use the built-in support to accomplish
> exactly what you want, delete a bunch of tricky code, and know that things
> will work in future versions of the OS.

You made me think about this once more. Not unwinding and
storing/restoring %fs:0 in setjmp/longjmp seems to work nicely on all
OSes since Windows 98. As I just wrote in my reply to Jeffrey, I tried
that already two days ago. However, when I found the solution which
seemed to fix this within the exception handler, I dropped this again.
I now recovered that code and, as I said, it appears to work fine.

Now I have to convince the maintainer of this code part in Cygwin that
this is the better approach...


Thanks,

Jeffrey Tan[MSFT]

unread,
Mar 2, 2008, 10:30:14 PM3/2/08
to
Hi Corinna,

Thanks for your feedback.

I agree with Adrian that RtlUnwind() may not be suitable if you want to fix
the exception and resume execution. RtlUnwind() will call all __finally
blocks as expected. If you are curious about the detailed implementation of
RtlUnwind() and OS exception handling, you may review the article below:
"A Crash Course on the Depths of Win32™ Structured Exception Handling"
http://www.microsoft.com/msj/0197/exception/exception.aspx

I believe an infinite loop is definitely a problem in the exception handler
chain. The OS exception dispatcher should end it and report the error.
(It is not ok because there is no ok place to resume execution.)

Thanks.

Corinna Vinschen

unread,
Mar 3, 2008, 6:00:47 AM3/3/08
to
Hi Jeffrey,

Jeffrey Tan[MSFT] wrote:
> Hi Corinna,
>
> Thanks for your feedback.
>
> I agree with Adrian that RtlUnwind() may not be suitable if you want to fix
> the exception and resume execution. RtlUnwind() will call all __finally
> blocks as expected. If you are curious about the detailed implementation of
> RtlUnwind() and OS exception handling, you may review the article below:
> "A Crash Course on the Depths of Win32™ Structured Exception Handling"
> http://www.microsoft.com/msj/0197/exception/exception.aspx
>
> I believe an infinite loop is definitely a problem in the exception handler
> chain. The OS exception dispatcher should end it and report the error.
> (It is not ok because there is no ok place to resume execution.)

Thanks for your helpful reply. Actually, this whole thread has helped a
lot. Thanks to everyone.

Jeffrey Tan[MSFT]

unread,
Mar 4, 2008, 9:28:49 PM3/4/08
to
You are welcome.

If you need further help, please feel free to post, thanks.
