2. If not, how's this change sound:
ECX is not a callee-saved register, so callers assume it gets nuked
anyway. So for LLVM functions, ECX gets a flag indicating whether
unwinding is taking place. At each callsite for "call", check ECX and
bail out if the unwind flag is set. At the callsite for "invoke",
check ECX and jump to the unwind label if ECX is set; otherwise, jump
to the regular return label.
It doesn't add to register pressure since ECX gets clobbered by
function calls anyway. It doesn't access memory for LLVM-to-LLVM
calls. The only overhead to callsites is a conditional branch on a
register value.
Now when calling external functions, this obviously won't work.
Perhaps a thread-local global could be checked only on returns from
external functions. Or perhaps unwinds coming from external functions
just don't get supported for now.
3. Perhaps a pass that lowers unwinds to an EH intrinsic? Would that
map well without adding more overhead than the current setjmp/longjmp
lowering pass?
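The register-flag scheme in item 2 can be modeled in a few lines. This is only a sketch of the control flow, not of code generation: the second element of each returned tuple stands in for the flag in ECX, and the function names (leaf, middle, top) are made up for illustration.

```python
# Model of the proposed scheme: every function returns (value, unwinding),
# where `unwinding` plays the role of the flag that would live in ECX.

def leaf(fail):
    # An "unwind" is modeled as returning with the flag set.
    if fail:
        return None, True
    return 42, False


def middle(fail):
    # Ordinary "call" site: check the flag and bail out if it is set.
    value, unwinding = leaf(fail)
    if unwinding:
        return None, True
    return value + 1, False


def top(fail):
    # "invoke" site: branch to the unwind label if the flag is set,
    # otherwise continue at the regular return label.
    value, unwinding = middle(fail)
    if unwinding:
        return "unwind label"
    return f"normal label: {value}"
```

Every call site pays for one conditional branch whether or not anything unwinds, which is the overhead the proposal accepts.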
_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> ECX is not a callee-saved register, so callers assume it gets nuked
> anyway.
2 problems here at least:
1. ECX is used as parameter passing register in some calling conventions
2. ECX is used as chain holding register for nested functions
PS: What about x86-64?
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
Anyway, unless the callee is required to preserve it in a given
calling convention, that doesn't preclude us using it for a *return*
value. It would be checked after calls return, and wouldn't affect
the use of the register for passing values in before the call is made.
The callee would set it right before return.
2. Does LLVM support nested functions? I must have missed that.
Anyway, I haven't looked too deeply into X86-64, but I was thinking
that a similar scheme with one of its non-callee-saved registers would
work there.
The internal fastcc convention and the Windows fastcall convention off
the top of my head.
> Anyway, unless the callee is required to preserve it in a given
> calling convention, that doesn't preclude us using it for a *return*
> value. It would be checked after calls return, and wouldn't affect
> the use of the register for passing values in before the call is made.
> The callee would set it right before return.
Right, so that sounds okay.
> 2. Does LLVM support nested functions? I must have missed that.
To the extent required to implement the gcc nested functions
extension, yes. The specific relevant behavior here is that if a
parameter is marked with the nest attribute, it gets passed in ECX.
-Eli
In the past there have been suggestions that a good approach would be to
target the libunwind (http://www.nongnu.org/libunwind/) library
interface in a lowering pass. This could provide both low "availability"
overhead and low "use" overhead.
If libunwind had setjmp/longjmp implementations for your platform (I
think they're currently only available in IA64), then it would be
trivial to use a setjmp/longjmp lowering pass and get what you want.
I keep wanting to do this but it always seems to get bumped off of my
critical path.
Luke
> 1. Which ones? I know that Windows uses it for the "this" pointer.
Many :)
1. windows fastcall
2. LLVM's own fastcc
3. arguments marked inreg (consider e.g. gcc's attribute inreg(3), etc).
> 2. Does LLVM support nested functions? I must have missed that.
Nested functions in gcc's sense. They are lowered in a funky way, etc.
http://lwn.net/Articles/252125/
data coming from L1 is only about three times as expensive as data
coming from a register. So putting a register check after *every*
call is probably not going to be profitable, compared to a
thread-local global variable check after every invoke... if they
happen often on a thread, that variable will probably be in cache, and
if they don't happen often, the performance impact will be minimal.
Of course if most methods have variables with destructors, I'll end up
with a check of some kind after almost every (non-nounwind) call
anyway, so a register check would be better. On the other hand,
implementing the register check would seem to require native codegen
changes at callsites as opposed to an IR-modifying pass with a
possible new intrinsic or two.
Anyway, here's my new plan:
1. A thread local global variable, type i8*, initialized to zero.
2. At invoke callsites, right before the invoke, call a native method
(mysetjmp) that:
   a. Saves ESI, EDI, EBX, EBP, ESP to a buffer alloca'd within the
      method containing the invoke site.
   b. Sets EAX to 0.
   c. Returns.
3. The return value of that native method (EAX) is checked, and if
nonzero, branch to unwind label. Otherwise, save the value of the
thread-local-global into the buffer, write the address of that
alloca'd buffer into the thread-local global and make the call.
4. After the call returns, copy the old thread-local-global value out
of the alloca'd buffer back to the thread-local-global.
The unwind instruction will then:
1. Load the thread-local-global value. If it's zero, there's nowhere
to unwind to, so abort.
2. Restore ESI, EDI, EBX, EBP, ESP, and the thread-local-global value
from the buffer.
3. Set EAX to 1.
4. Jump to 2c. (the return instruction for the native method mysetjmp).
The native method will return with all callee-saved registers restored
and a return value in EAX of 1, which will cause the following check
to branch to the unwind label.
Invoke sites only write five callee-saved registers to the stack, and
read/write one pointer to a single thread-local global variable, and
make one direct call. Unwind sites make one direct call, read five
callee-saved registers from the stack (some distance up, so those
memory values might not be warm) and read/write one pointer to a
single thread-local global variable.
The next step would be to replace the mysetjmp call with a new
intrinsic, and then I'd have to save EIP and do an indirect jump to it
at the unwind site instead of jumping to a constant offset within the
native mysetjmp. Making mylongjmp call a new intrinsic will
necessitate no other modifications.
Ciao,
Duncan.
Since unwind doesn't take any operands, is there *any* possible
implementation of unwind that fits with exception handling/invoke?
Implemented as an optional pass, my scheme can go unused when you're
using the g++ front-end or something else that uses __cxa_throw.
Let's see if I understand this...
1. Everywhere inside a "try" block, the C++ front-end emits "invoke"
instructions instead of "call" instructions. Without any
transformations, this "invoke" instruction compiles down to assembly
code that doesn't seem to do anything different from a "call"
instruction. Also, "unwind" compiles down to nothing. However, every
function gets some DWARF info compiled into it by LLVM, and part of it
is information about the invoke site.
2. To throw an exception, call __cxa_allocate_exception to allocate an
exception object, and __cxa_throw to throw it.
3. Every function gets some DWARF info compiled into it by LLVM. The
__cxa_throw function uses it to find the function that issued the
"invoke" and find the "landing pads" and jump to the right landing pad
based on the exception type.
4. The landing pad uses exception-handling intrinsics to match the
exception type and to get the exception object.
The lowerinvoke pass adds SJLJ-based unwinding, which is a separate
mechanism based on GCC sjlj exception handling.
My proposed pass adds a lighter-weight setjmp/longjmp-style unwinding.
How do either of these prevent DWARF exception handling from working?
Would a landing pad expecting to get an exception object from the
exception intrinsics fail to get one in the case of an unwind and
crash?
Did I misunderstand anything I outlined above?
Is the exception-throwing function call expected to become an
intrinsic or an instruction in the future? Will it replace unwind?
(Perhaps I should put all this aside and just have my compiler handle
my invoke/unwind logic instead of trying to use invoke/unwind
instructions.)
On Sat, Jul 18, 2009 at 3:04 PM, Duncan Sands<bald...@free.fr> wrote:
> How do either of these prevent DWARF exception handling from working?
if you throw an exception using your proposed unwind implementation,
then it wouldn't be caught by dwarf catch/cleanup regions (eg: invoke).
> Would a landing pad expecting to get an exception object from the
> exception intrinsics fail to get one in the case of an unwind and
> crash?
The landing pad would never be executed in the first place. This
is rather bad, for example cleanups won't be run.
> (Perhaps I should put all this aside and just have my compiler handle
> my invoke/unwind logic instead of trying to use invoke/unwind
> instructions.)
For the moment that is the best solution I think.
Can I interject something at this point?
Can I suggest that invoke/unwind be renamed DWARF_invoke/DWARF_unwind to
warn the unwary that if they want lightweight exception handling in
their Python/ML/whatever implementation they should use some other method.
PS.
Kenneth, why don't you just use setjmp/longjmp directly?
Or, if you want, I can email you my lightweight versions,
Mark.
> Can I suggest that invoke/unwind be renamed DWARF_invoke/DWARF_unwind to
> warn the unwary that if they want lightweight exception handling in
> their Python/ML/whatever implementation they should use some other method.
probably there should be a switch to choose whether codegen should turn
unwind/invoke into dwarf or setjmp/longjmp style code.
There is, but it happens before codegen and is slow.
-enable-correct-eh-support will translate invoke/unwind into
setjmp/longjmp pairs for the correct behavior. See:
http://llvm.org/docs/Passes.html#lowerinvoke
Nick
--
Nick Johnson
It seems to me that there is an implicit, and undocumented, assumption
that unwinding needs to handle stack-allocated objects.
In languages without stack-allocated objects (i.e. most languages that
support exceptions) there is no need to unwind frame-by-frame; the
unwind simply needs to make a single jump to the invoke instruction and
restore the context (which on x86 is just 6 registers).
>
> There is, but it happens before codegen and is slow.
> -enable-correct-eh-support will translate invoke/unwind into
> setjmp/longjmp pairs for the correct behavior. See:
> http://llvm.org/docs/Passes.html#lowerinvoke
*Begin rant*
It is possible to implement invoke/unwind in such a way that both invoke
*and* unwind are fast, when unwind just unwinds and doesn't perform
any magic behind-the-scenes operations.
After all, isn't it the job of the front-end to insert all the clean-up
code for stack-allocated objects?
Java, C#, Python, and Ruby have destructors (finalizers), but they are
managed by the garbage collector.
C++ is the odd one out, so why do the semantics of an llvm instruction
depend on C++ semantics?
*End rant* ;)
Mark.
Not quite. It's also necessary to execute all the pending POSIX
pthread_cleanup_pop() actions.
> It is possible to implement invoke/unwind in such a way that both invoke
> *and* unwind are fast, when unwind just unwinds and doesn't perform
> any magic behind-the-scenes operations.
I don't think so. Not for pthreads, anyway.
Andrew.
POSIX pthread_cleanup_pop() can only be called directly from C++/C.
C doesn't have exceptions.
So yet again, this is a C++ issue.
>
>> It is possible to implement invoke/unwind in such a way that both invoke
>> *and* unwind are fast, when unwind just unwinds and doesn't perform
>> any magic behind-the-scenes operations.
>
> I don't think so. Not for pthreads, anyway.
This is C++ specific.
No. Java, C#, Ruby and Python all support the finally/ensure block;
C# supports the using( IDisposable x =...) {} construct. Both
constructs require support for a frame-by-frame unwind; as these
constructs can be nested, a single throw may visit many landing pads
(which may come from different compilation units).
It doesn't have anything to do with stack-allocated vs heap-allocated,
but rather with the language's guarantees about exceptions.
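A small Python example makes the point concrete. It is only an illustration with made-up function names: one raise passes through two finally blocks in different functions before it reaches a handler, and the plain call in between has nothing to run.

```python
events = []


def innermost():
    try:
        raise ValueError("boom")
    finally:
        events.append("innermost finally")


def middle():
    # Plain call with no finally: nothing has to run here during the unwind.
    innermost()


def outer():
    try:
        middle()
    finally:
        events.append("outer finally")


def run():
    events.clear()
    try:
        outer()
    except ValueError:
        events.append("caught")
    return list(events)
```

Calling run() records the innermost finally, then the outer finally, then the handler, so a single jump to the handler would skip code the language guarantees will run.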
> It is possible to implement invoke/unwind in such a way that both invoke
> *and* unwind are fast, when unwind just unwinds and doesn't perform
> any magic behind-the-scenes operations.
>
Why? Exceptions are supposed to occur in exceptional situations. In
general, one should try to optimize for the common case, which does
not include invoke/unwind.
One should certainly not slow down a function call which never throws
just because other functions may throw. Paraphrasing Bjarne
Stroustrup, "If you don't use it, you shouldn't pay for it."
Nick
But it does have pthread_exit().
> So yet again, this is a C++ issue.
No, it isn't:
The effect of calling longjmp() or siglongjmp() is undefined if there
have been any calls to pthread_cleanup_push() or pthread_cleanup_pop()
made without the matching call since the jump buffer was filled.
We are not talking about longjmp() or siglongjmp(); we are talking about
invoke/unwind!
I take it you have never used Python ;)
(Python uses exceptions to terminate loops, so it helps if they aren't
too slow)
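For readers who have not seen it, this is roughly what a Python for loop does under the hood: it calls next() on the iterator until StopIteration is raised. The helper names (manual_for, count_up_to) are made up for this sketch.

```python
def manual_for(iterable, body):
    """Hand-written equivalent of `for item in iterable: body(item)`."""
    it = iter(iterable)
    while True:
        try:
            item = next(it)
        except StopIteration:
            # The exception is the normal end-of-loop signal, not an error.
            break
        body(item)


def count_up_to(n):
    """Generator whose exhaustion surfaces as StopIteration in next()."""
    i = 0
    while i < n:
        yield i
        i += 1
```

So every loop over an iterator ends with one exception being raised and caught.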
>
> One should certainly not slow down a function call which never throws
> just because other functions may throw. Paraphrasing Bjarne
> Stroustrup, "If you don't use it, you shouldn't pay for it."
>
And to quote Tim Peters (about Python)
"If the implementation is hard to explain, it's a bad idea."
Please try and get out of the C++ mindset. llvm may be implemented *in*
C++, but it's not implemented just *for* C++ (at least I hope it isn't).
Mark.
That lowerinvoke pass does seem to have room for improvement, then.
A function with stack-allocated objects having non-trivial destructors
in C++ would work the same way. Unwind would skip any functions that
don't need cleanup, go to the cleanup code within functions that do,
and that cleanup code would issue another unwind to continue the
process.
Let's go back a bit. Your claim is that there is no need to unwind
frame-by-frame, an unwind simply needs to make a single jump to an
invoke instruction and restore the context (which in x86 is just 6
registers). (This is, more or less, what longjmp() does.) Duncan Sands
explained to you why that wouldn't work, saying "if you throw an
exception using your proposed unwind implementation, then it wouldn't
be caught by dwarf catch/cleanup regions".
He's right. You can't just jump to the invoke instruction, you must
also pop any cleanups. This is nothing to do with C++, and it has
nothing to do with whether a language has stack-allocated objects.
But don't all functions that require cleanups issue invokes rather
than calls? Unwind or __cxa_throw would just go to the nearest one,
no?
I don't think so: any function in any language that has pthreads bindings
can call pthread_cleanup_push().
Andrew.
If you can make your point without any references to any C/C++ specific
features it might be more convincing ;)
>
> He's right. You can't just jump to the invoke instruction, you must
> also pop any cleanups. This is nothing to do with C++, and it has
> nothing to do with whether a language has stack-allocated objects.
What cleanups?
If the C++ front end leaves all these dead objects on the stack and
insists they are cleaned up promptly, then it is its responsibility to
clean them up.
It shouldn't burden other front-ends.
In a purely heap-allocated language without finalizers, say ML or
Haskell, there is nothing to clean up and a simple jump will suffice.
For languages with finalizers such as Python or Java, a special
finalizer thread can do the cleaning up lazily once the GC has collected
the dead objects.
It does depend on the language semantics, and a lot of languages allow
lazy finalization of objects.
Both Java and Python support "finally" clauses, which are roughly
equivalent in this context.
-Eli
>> Let's go back a bit. Your claim is that there is no need to unwind
>> frame-by-frame, an unwind simply needs to make a single jump to an
>> invoke instruction and restore the context (which in x86 is just 6
>> registers). (This is, more or less, what longjmp() does.) Duncan Sands
>> explained to you why that wouldn't work, saying "if you throw an
>> exception using your proposed unwind implementation, then it wouldn't
>> be caught by dwarf catch/cleanup regions".
>
> If you can make your point without any references to any C/C++ specific
> features it might be more convincing ;)
Well, you seemed to be claiming that cleanups were due to stack-allocated
objects in C++. I have shown that is not the case.
>> He's right. You can't just jump to the invoke instruction, you must
>> also pop any cleanups. This is nothing to do with C++, and it has
>> nothing to do with whether a language has stack-allocated objects.
>
> What cleanups?
The ones pushed by pthread_cleanup_push().
> If the C++ front end leaves all these dead objects on the stack and
> insists they are cleaned up promptly, then it its responsibility to
> clean them up.
> It shouldn't burden other front-ends.
Like I said, it's nothing to do with C++, or with the C++ front end.
> In a purely heap allocated language without finalizers, say ML or
> haskell, there is nothing to clean-up and a simple jump will suffice.
>
> For languages with finalizers such as Python or Java, a special
> finalizer thread can do the cleaning up lazily once the GC has collected
> the dead objects.
[Aside: That's not quite true for Java, where unwinding releases
locks. This could be done by registering a handler at every site
where a lock is acquired, but I don't think that's how it generally
works.]
Maybe there does exist a programming language that never calls (or is
called by) programs in other programming languages and never runs in
an environment where one of its threads may be terminated. In that
case, interoperability of generated code doesn't matter. In the
heterogeneous world of the contemporary OS I'm not sure if that's a
common case.
I did. Recall my mention of Java/C#/Ruby/Python's finally/ensure
blocks, or C#'s using blocks. For proper implementation, these need
multi-level unwinds, as they specify that some code must run, even if
an exception would bail out.
> I take it you have never used Python ;)
> (Python uses exceptions to terminate loops, so it helps if they aren't
> too slow)
I have used Python, and it is slow (sorry). In fact, Python
exceptions are implemented in Python as a second return value, thus
EVERY function, even those which don't throw exceptions, must pay the
price. And just because the Python community does it doesn't mean
it's good programming practice.
>Please try and get out of the C++ mindset, llvm may be implemented *in*
>C++, but its not implemented just *for* C++ (at least I hope it isn't).
That is exactly my argument. Multi-level unwinds are required by MANY
languages.
--
Nick Johnson
I believe both Java and .NET tend to assume that code running outside
their respective runtimes isn't killing any of their threads. Also,
Java makes no attempt to allow exceptions to propagate across JNI
boundaries... incoming C++ exceptions just kill the process, and
outgoing Java exceptions have to be explicitly checked for in C++
code.
A finally clause merely guarantees that it executes whether or not an
exception is thrown. It does not cause finalization of any objects.
Java, Python, etc. do not require instant finalization of objects,
because objects exist on the heap, so they can outlive any stack
references to them.
Unwinding the stack destroys stack-allocated objects so they must be
finalised immediately *before* the frame in which they were allocated is
destroyed. Heap allocated objects merely become unreachable.
Mark.
You have shown no such thing.
>
>>> He's right. You can't just jump to the invoke instruction, you must
>>> also pop any cleanups. This is nothing to do with C++, and it has
>>> nothing to do with whether a language has stack-allocated objects.
>> What cleanups?
>
> The ones pushed by pthread_cleanup_push().
pthread_cleanup_push() only exists in C/C++. It is a C library function
declared in "pthread.h", a C header file.
There are other languages than C/C++.
And some of them are easier to use :)
>
>> If the C++ front end leaves all these dead objects on the stack and
>> insists they are cleaned up promptly, then it its responsibility to
>> clean them up.
>> It shouldn't burden other front-ends.
>
> Like I said, it's nothing to do with C++, or with the C++ front end.
So give me an example that is not related to C/C++.
>
>> In a purely heap allocated language without finalizers, say ML or
>> haskell, there is nothing to clean-up and a simple jump will suffice.
>>
>> For languages with finalizers such as Python or Java, a special
>> finalizer thread can do the cleaning up lazily once the GC has collected
>> the dead objects.
>
> [Aside: That's not quite true for Java, where unwinding releases
> locks. This could be done by registering a handler at every site
> where a lock is acquired, but I don't think that's how it generally
> works.]
But llvm doesn't support that, so a Java front-end must insert code to
do the unlocking. It's separate from unwinding. Why can't the C++
front-end insert code to do its language-specific cleanups?
>
> Maybe there does exist a programming language that never calls (or is
> called by) programs in other programming languages and never runs in
> an environment where one of its threads may be terminated. In that
> case, interoperability of generated code doesn't matter. In the
> heterogeneous world of the contemporary OS I'm not sure if that's a
> common case.
The level of inter-language interoperability you are talking about is
frankly next to impossible.
Java doesn't allow threads to be terminated precisely because of the
sort of problems it causes.
    try:
        do something
        ex = NULL
        goto _final
    catch:
        ex = caught exception
        do something else
    _final:
        do something important
        if ex:
            rethrow ex
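The same lowering can be written as runnable Python. This is only a sketch, with a hypothetical helper name (run_with_final): the "finally" behaviour is built from a plain catch, a saved exception, and a rethrow, without using the finally keyword.

```python
def run_with_final(body, final):
    """Run body(), then final(), then rethrow whatever body() raised."""
    ex = None
    result = None
    try:
        result = body()
    except BaseException as caught:
        # catch: remember the exception instead of handling it
        ex = caught
    # _final: runs on both the normal and the exceptional path
    final()
    if ex is not None:
        raise ex
    return result
```

The unwind therefore stops once at this frame, runs the important code, and then continues outward.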
> blocks, or C#'s using blocks. For proper implementation, these need
> multi-level unwinds, as they specify that some code must run, even if
> an exception would bail-out.
You are confusing stopping the unwinding at *some* levels with stopping
it at *all* levels.
E.g.
    invoke
    call
    call
    invoke
    call
    call
    unwind
Requires two unwinding steps for Java/Python and six steps for C++.
Only languages with stack-allocated objects must stop unwinding for calls.
>
>> I take it you have never used Python ;)
>> (Python uses exceptions to terminate loops, so it helps if they aren't
>> too slow)
>
> I have used python, and it is slow (sorry). In fact, python
Python: write program 1 day, run program 10mins = 1day + 10mins
C++: write program 1 month, run program 30 seconds = 1month + 30 secs
It depends on how you look at it, but we are getting off topic here :)
> Exceptions are implemented in python as a second return value, thus
Only in one particular Python implementation; there are several.
It's not in the language spec.
> EVERY function, even those which don't throw exceptions, must pay the
> price. And just because the python community does it, doesn't mean
> it's good programming practice.
The same could be said of C++. Both languages have their place, and llvm
should support compilers for both (and lots of other languages as well).
>
>> Please try and get out of the C++ mindset, llvm may be implemented *in*
>> C++, but its not implemented just *for* C++ (at least I hope it isn't).
>
> That is exactly my argument. Multi-level unwinds are required by MANY
> languages.
>
Not in the same way as C++, see above.
C++ doesn't need you to stop unwinding at every level. Only at levels
where stack-allocated objects with non-trivial destructors exist. At
those levels, the code generator can turn all calls into invokes.
Other levels can be skipped, since stack-allocated objects without
non-trivial destructors can simply be abandoned as the stack pointer
is bumped past them.
So a function with stack-allocated objects having non-trivial
destructors is pretty much the equivalent of a function with a
finalizer, and can be code-generated as such using invokes instead of
calls. Calls can still be skipped. The difference is that many C++
programs will tend to wind up with more invokes.
> You are confusing stopping the unwinding at *some* levels or at *all*
> levels.
> Eg.
> invoke
> call
> call
> invoke
> call
> call
> unwind
the dwarf unwinder only stops and runs user code at the invokes.
It does restore registers and so forth at every stack frame.
This is extra work done at unwind time that reduces the cost of
making invoke calls by avoiding the need to save a bunch of
context.
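A toy model of that behaviour, applied to the quoted stack: every frame is walked (its registers would be restored), but user code runs only at invoke frames. This is a sketch and not how the real unwinder is structured. The frame kinds "call", "cleanup" (an invoke whose unwind block resumes unwinding) and "catch" (an invoke whose unwind block stops it) are labels invented here.

```python
def unwind(stack):
    """Walk `stack` from the innermost frame outward.

    `stack` lists frames outermost-first; each entry is "call",
    "cleanup" or "catch". Returns (frames_walked, landing_pads_run).
    """
    frames_walked = 0
    landing_pads_run = 0
    for kind in reversed(stack):
        frames_walked += 1      # context is restored for every frame
        if kind == "call":
            continue            # no user code runs at a plain call
        landing_pads_run += 1   # user code runs only at an invoke
        if kind == "catch":
            break               # a handler ends the unwind here
    return frames_walked, landing_pads_run
```

For the quoted example, with both invokes treated as cleanups, this walks six frames and runs two landing pads.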
Ciao,
Duncan.
So who is responsible for (as stated under the invoke description in the
language reference) "ensure that proper cleanup is performed in the case
of either a longjmp or a thrown exception"?
Is it entirely the front-end or is dwarf unwinding doing some extra work
(other than a few unnecessary register restores), like restoring signals,
and calling pthread cleanup routines?
In other words does the dwarf unwinder do nothing other than unwind,
and was all that stuff about pthread_cleanup_pop() actions just a
red herring?
Mark.
And __cxa_throw will still throw a DWARF exception with the landing
pads and the exception object selection and all that jazz?
On Mon, Jul 20, 2009 at 12:12 PM, Duncan Sands<bald...@free.fr> wrote:
> Hi Kenneth,
>
>> Is there anything in the current codebase that maps an "unwind"
>> instruction to a DWARF unwinder? Seeing as how (I think) the DWARF
>> info is already compiled in, this would be a useful thing.
>
> in llvm from svn an "unwind" instruction is mapped to a call to
> _Unwind_Resume. This is no good for throwing a new dwarf exception,
> but it does mean that you can now implement cleanups as:
>
>   invoke XYZ to label %normal unwind label %cleanup
> cleanup:
>   do_stuff
>   unwind
>
> Ciao,
>
> Duncan.
> Is there anything in the current codebase that maps an "unwind"
> instruction to a DWARF unwinder? Seeing as how (I think) the DWARF
> info is already compiled in, this would be a useful thing.
in llvm from svn an "unwind" instruction is mapped to a call to
_Unwind_Resume. This is no good for throwing a new dwarf exception,
but it does mean that you can now implement cleanups as:
  invoke XYZ to label %normal unwind label %cleanup
cleanup:
  do_stuff
  unwind
Ciao,
Duncan.
Yes. You still need to call __cxa_throw (or equivalent) to throw
a new exception, but you can now use unwind to rethrow an exception
from the unwind block of an invoke. That said, I don't think anyone
is making use of it - probably no-one except me even knows about it :)
> And __cxa_throw will still throw a DWARF exception with the landing
> pads and the exception object selection and all that jazz?
The __cxa_throw code is part of the gcc library and outside the control
of the LLVM project, so it works the same as before.
> So who is responsible for (as stated under the invoke description in the
> language reference) "ensure that proper cleanup is performed in the case
> of either a longjmp or a thrown exception"?
the unwinder (however it works) needs to stop at each invoke and run
the code in the unwind block of the invoke. I think it is important
that the default method of code generation for the unwind and invoke
instructions should interact correctly with code compiled with gcc.
This means that the default needs to make use of dwarf unwinding. It
does not prevent having multiple code generation implementations, eg
done using some kind of setjmp/longjmp implementation of unwind/invoke.
Code created using other implementations will only work correctly if the
entire executable, or at least the bits doing exception handling, are
all built with the same compiler/codegen options, but that's ok if it's
not the default.
> Is it entirely the front-end or is dwarf unwinding doing some extra work
> (other than a few unecessary register restores), like restoring signals,
> and calling pthread cleanup routines?
The front-end needs to ensure that there is an appropriate invoke that
runs the pthread cleanup routines in the unwind block.
> In other words does the dwarf unwinder do nothing other than unwind,
> and was all that stuff about pthread_cleanup_pop() actions just a
> red-herring?
I think it was a red-herring, but I don't know anything about
pthread_cleanup_pop.
:)
Working on Unladen Swallow, we've considered whether to try to port
Python to use Dwarf unwinding. It's a fairly simple tradeoff: with
return-value-exceptions you get a small cost in both time (a
predictable branch) and code size at each call site, whether or not
the call site is inside a try, but you get pretty cheap exception
throwing and propagation. With dwarf-exceptions, you have a small
space cost at each invoke (for the dwarf metadata), no time cost at
calls that don't throw, but fairly expensive exceptions when you do
throw them. Because Python throws exceptions for ending loops and
garbage-collecting generators, we expect return-value-exceptions to be
cheaper overall. In languages that throw exceptions more rarely (just
about everyone), dwarf-exceptions should be cheaper. Dwarf-exceptions
will still be worth investigating for Python eventually; we're just
not prioritizing it. Anyone who wants us to prioritize it higher
should bring data. :)
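The time side of this tradeoff reduces to simple arithmetic. The sketch below uses made-up cost parameters in arbitrary units (none of them are measurements) and ignores the code-size and metadata costs mentioned above.

```python
def return_value_cost(calls, throws, check_cost, cheap_throw_cost):
    """Every call pays for a check; each throw is cheap."""
    return calls * check_cost + throws * cheap_throw_cost


def table_based_cost(calls, throws, expensive_throw_cost):
    """Calls that do not throw are free; each throw is expensive."""
    return throws * expensive_throw_cost


def break_even_throw_rate(check_cost, cheap_throw_cost, expensive_throw_cost):
    """Throws per call above which return-value exceptions are cheaper.

    From calls*check + throws*cheap < throws*expensive it follows that
    throws/calls > check / (expensive - cheap).
    """
    return check_cost / (expensive_throw_cost - cheap_throw_cost)
```

A program that throws more often than the break-even rate favours return-value exceptions; one that throws less often favours the table-based approach.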
IronPython's and Jython's experiences are interesting too. IronPython
has had big problems with the cost of exceptions on the .NET platform.
Jython has _not_ had problems in the JVM, which indicates that the two
big virtual machines have made different tradeoffs for this. That
would make me very hesitant to say that either option is obviously
wrong.
Jeffrey
I forgot to mention that for the moment you still need to specify
the personality function using an eh.selector. That's because I
never got around to telling the code generators that if there is
none then it should use the C (not C++) personality.
For avoidance of doubt:
pthread_cleanup_pop() is nothing special, it's just an example of a
cleanup handler. Cleanups are very similar to exception handlers, but
with one small difference: they do some work and then call
_Unwind_Resume() which continues unwinding. The unwinder itself
doesn't know anything about pthread cleanups, it just executes
whatever is at the landing pad.
Andrew.
> pthread_cleanup_pop() is nothing special, it's just an example of a
> cleanup handler. Cleanups are very similar to exception handlers, but
> with one small difference: they do some work and then call
> _Unwind_Resume() which continues unwinding. The unwinder itself
> doesn't know anything about pthread cleanups, it just executes
> whatever is at the landing pad.
cleanups are turned into invoke + (cleanup code) + _Unwind_Resume by
llvm-gcc.
Ciao,
Duncan.
You only need that to throw an exception, not to do a simple unwind, right?
The selector and personality are not for throwing exceptions, they
are for catching them when doing dwarf exception handling.
Ciao,
Duncan.
>> pthread_cleanup_pop() is nothing special, it's just an example of a
>> cleanup handler. Cleanups are very similar to exception handlers, but
>> with one small difference: they do some work and then call
>> _Unwind_Resume() which continues unwinding. The unwinder itself
>> doesn't know anything about pthread cleanups, it just executes
>> whatever is at the landing pad.
>
> cleanups are turned into invoke + (cleanup code) + _Unwind_Resume by
> llvm-gcc.
Yes. Wasn't that obvious? Sorry, I don't understand what point you're
trying to make.
Andrew.
I noticed that some people (not you) seem to think that the dwarf
unwinder knows special things about signals, pthread_cleanup_pop
and whatnot. I was just trying to say that it does not: all this
stuff is covered by the standard exception handling concepts.
Ciao,
Duncan
I for one am glad you made that comment. I've been struggling until
the other day to understand just what the dwarf unwinder did and how
it did it and what it depended on and how it was kicked off, and how
exception objects and signals and pthread_cleanup_pop figured into it.
I started this thread because I saw a nice hole in the form of a
documented instruction that compiled down to nothing, and thought
maybe I could fill it. You beat me to it apparently, and enlightened
me quite a bit in the process.
Another language in this category is Ada. Ada has exceptions more-or-less like
C++, and it has "finalization", which is more-or-less like C++ "destructors"
(although slightly more robust). So the requirements on the run-time system
and generated code are similar.
Ada compilers typically use the same trade-off as C++ -- near-zero cost to
enter an exception-handling region, but high cost to actually raise the
exception.
- Bob
If you want all the details, there's a full description here:
http://www.codesourcery.com/public/cxx-abi/abi-eh.html
It is officially "for Itanium", but it's really target agnostic.
Andrew.
> Another language in this category is Ada. Ada has exceptions more-or-less like
> C++, and it has "finalization", which is more-or-less like C++ "destructors"
> (although slightly more robust). So the requirements on the run-time system
> and generated code are similar.
yes, they are very close. One difference is that in C++ you can throw
an arbitrary object (which may require complicated finalization), while
with Ada you can basically at most throw a string.
Ciao,
Duncan.