early draft of exceptions PDD

Allison Randal

Apr 5, 2006, 6:24:27 PM
to Internals List
In: docs/pdds/clip/pddXX_exceptions.pod

As with the I/O PDD, this isn't a final form, it's just a draft to
seed discussion. What's missing? What's inaccurate? What's accurate
for the current state of Parrot, but is something you always intended
to write out later? What thoughts have you had on how exceptions
should work? All comments, suggestions, and contributions cheerfully
welcomed.

Allison

Bob Rogers

Apr 8, 2006, 10:49:54 PM
to Allison Randal, Internals List
From: Allison Randal <all...@perl.org>
Date: Wed, 5 Apr 2006 15:24:27 -0700

In: docs/pdds/clip/pddXX_exceptions.pod

Allison

Here's what I hope is a contribution.

-- Bob Rogers
http://rgrjr.dyndns.org/

------------------------------------------------------------------------
# Copyright: 2001-2006 The Perl Foundation.
# $Id: pddXX_exceptions.pod 12153 2006-04-09 02:23:27Z rgrjr $

=head1 NAME

docs/pdds/clip/pddXX_exceptions.pod - Parrot Exceptions

. . .

=item *

C<push_eh> creates an exception handler and pushes it onto the control
stack. It takes a label (the location of the exception handler) as its
only argument. [Is this right? Treating exception handlers as label
jumps rather than full subroutines is error-prone.]

They are not "jumps" but continuations, so in a sense they are more
general than subs, which don't have prior state.

. . .

=item *

C<pushaction> pushes a subroutine object onto the control stack. If the
control stack is unwound due to an exception (or C<popmark>, or
subroutine return), the subroutine is invoked with an integer argument:
C<0> means a normal return; C<1> means an exception has been raised.
[Seems like there's lots of room for dangerous collisions here.]

I'm not sure what you mean by "collisions" here, nor why you think they
would be dangerous. Arguably, C<pushaction> is too simplistic; it
doesn't provide for such things as the repeated exit-and-reenter
behavior of coroutines, and there is no mechanism to specify a thunk
that gets called when *entering* a dynamic context . . .

=back

=head1 IMPLEMENTATION

[I'm not convinced the control stack is the right way to handle
exceptions. Most of Parrot is based on the continuation-passing style of
control, shouldn't exceptions be based on it too? See bug #38850.]

Seems to me there isn't any real choice. Exception handlers are part of
the dynamic context, and dynamic contexts nest in such a way as to
behave like a stack. Even pure CPS implementations that want to
maintain dynamic state have to create an explicit stack in a global
variable somewhere.

. . .

Other opcodes respond to an C<errorson> setting to decide whether to
throw an exception or return an error value. C<find_global> throws an
exception (or returns a Null PMC) if the global name requested doesn't
exist. C<find_name> throws an exception (or returns a Null PMC) if the
name requested doesn't exist in a lexical, current, global, or built-in
namespace.

It's a little odd that so few opcodes throw exceptions (these are the
ones that are documented, but a few others throw exceptions internally
even though they aren't documented as doing so). It's worth considering
either expanding the use of exceptions consistently throughout the
opcode set, or eliminating exceptions from the opcode set entirely. The
strategy for error handling should be consistent, whatever it is. [I
like the way C<LexPad>s and the C<errorson> settings provide the option
for exception-based or non-exception-based implementations, rather than
forcing one or the other.]

This have-your-cake-and-eat-it-too (HYCAEIT?) strategy sounds good in
theory, but may be dangerous in practice. Which style of error handling
a given piece of code uses is a static property of the way the code is
written. On the other hand, C<errorson> is dynamic and global. If one
of the modules you use wants to do error handling by checking return
values, but another module doesn't check returns because it expects
errors to be signalled, then no C<errorson> setting will satisfy both,
regardless of how you want to design *your* code.
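
To make the conflict concrete, here is a minimal sketch in Python rather than Parrot code. Everything in it is hypothetical: `ERRORS_ON` stands in for the global C<errorson> setting, `find_global` only imitates the opcode of that name, and `module_a` and `module_b` are invented modules written in the two opposing styles.

```python
# Hypothetical model, not Parrot code: a global flag standing in for errorson.
ERRORS_ON = True

GLOBALS = {"answer": 42}


class LookupFailure(Exception):
    """Stands in for the exception find_global would throw."""


def find_global(name):
    """Throw, or return None (the 'Null PMC'), depending on the global flag."""
    if name in GLOBALS:
        return GLOBALS[name]
    if ERRORS_ON:
        raise LookupFailure(name)
    return None


def module_a(name):
    """Return-value style: checks for None and never installs a handler."""
    value = find_global(name)
    return "default" if value is None else value


def module_b(name):
    """Exception style: assumes any value that comes back is valid."""
    return find_global(name) + 1


def run_both(flag):
    """Call both modules under one global setting and record what happens."""
    global ERRORS_ON
    ERRORS_ON = flag
    outcome = {}
    for label, func in (("a", module_a), ("b", module_b)):
        try:
            outcome[label] = func("missing")
        except Exception as exc:
            outcome[label] = type(exc).__name__
    return outcome
```

With the flag on, `module_a` lets an exception escape that it was never written to expect; with the flag off, `module_b` fails later with an unrelated `TypeError` because it used the null result. Neither setting suits both.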

I personally prefer exception-based error handling, since it scales
better. I have been acting on this when the opportunity arises,
changing internal_exception calls to real_exception when it makes sense,
and when I'm mucking around in that code anyway. (A good example of
this is "No exception to pop", come to think of it.) It is also helpful
to get a backtrace when something fails.

On the other hand, it would be a pain to have to write 10 ops for an
error handler just to catch a slightly unusual situation that could be
handled adequately by testing a special return value. I think each case
needs to be examined individually, but it's a choice of return value OR
throwing an error. IMHO.

=head2 Excerpt

[Excerpt from "Perl 6 and Parrot Essentials" to seed discussion.
Out-of-date in some ways, and in others it was simply speculative.]

Exceptions provide a way of calling a piece of code outside the normal
flow of control. They are mainly used for error reporting or cleanup
tasks, but sometimes exceptions are just a funny way to branch from
one code location to another one.

Exceptions are objects that hold all the information needed to handle
the exception: the error message, the severity and type of the error,
etc. The class of an exception object indicates the kind of exception
it is.

Exception handlers are derived from continuations. They are ordinary
subroutines that follow the Parrot calling conventions, but are never
explicitly called from within user code.

Not quite true; a Continuation is not a Sub, though it can be invoked
like one.

User code pushes an exception
handler onto the control stack with the C<push_eh> opcode. The system
calls the installed exception handler only when an exception is thrown.

    push_eh _handler              # push handler on control stack
    find_global P10, "none"       # may throw exception
    clear_eh                      # pop the handler off the stack
    ...

  _handler:                       # if not, execution continues here
    get_params '(0,0)', P0, S0    # handler is called with (exception, message)
    ...

If the global variable is found, the next statement
(C<clear_eh>) pops the exception handler off the control stack and
normal execution continues. If the C<find_global> call doesn't find
C<none> it throws an exception by passing an exception object to the
exception handler.

The first exception handler in the control stack sees every exception

This is really the last (topmost) exception handler.

thrown. The handler has to examine the exception object and decide
whether it can handle it (or discard it) or whether it should
C<rethrow> the exception to pass it along to an exception handler
deeper in the stack. The C<rethrow> opcode is only valid in exception
handlers. It pushes the exception object back onto the control stack so
Parrot knows to search for the next exception handler in the stack. The

This is not correct; exception objects are never pushed onto the control
stack. And the exception handler itself is popped off the control stack
before it is invoked.

process continues until some exception handler deals with the exception
and returns normally, or until there are no more exception handlers on
the control stack. When the system finds no installed exception handlers
it defaults to a final action, which normally means it prints an
appropriate message and terminates the program.

Currently it also prints a backtrace, which is really nice. Alas, the
backtrace is only from the point of the final rethrow by the oldest
(bottommost) exception handler. This is the greatest weakness with the
current Parrot exception-handling design: By the time you find out that
a given exception is unhandled, the dynamic environment of the C<throw>
has been destroyed by the very process of searching for a willing
handler. This makes it extremely difficult to write a debugger that can
do anything useful about uncaught exceptions.

When the system installs an exception handler, it creates a return
continuation with a snapshot of the current interpreter context. If

This is confusing; I assume you are talking about the Exception_Handler
itself and not a RetContinuation.

the exception handler just returns (that is, if the exception is
cleanly caught) the return continuation restores the control stack
back to its state when the exception handler was called, cleaning up
the exception handler and any other changes that were made in the
process of handling the exception.

Hmm. It seems that an exception is "cleanly caught" only if it is not
rethrown. It is therefore not possible to tell by looking at the
exception itself whether or not it is "cleanly caught" or if it is still
in the process of being signalled.

Exceptions thrown by standard Parrot opcodes (like the one thrown by
C<find_global> above or by the C<throw> opcode) are always resumable,
so when the exception handler function returns normally it continues
execution at the opcode immediately after the one that threw the
exception. Other exceptions at the run-loop level are also generally
resumable.

You seem to want to say that unhandled exceptions are ignored. Is that
correct? If so, I see several problems:

1. What is "the exception handler function" and how is it
distinguished from the function that established the exception handler?
[It sounds like you are expecting the exception handler to behave more
like a closure than a continuation . . . ]

2. The previous paragraph says that if "the exception handler just
returns", that means that "the exception is cleanly caught". Unless you
want to propose a new mechanism, the only way a handler can decline to
handle an exception is by rethrowing it, which precludes the possibility
of resuming.

3. Shouldn't unhandled exceptions either enter the debugger if
interactive, else die? Ignoring the fact that an opcode failed, like
ignoring the fact that anything else failed, seems dangerous . . .

    new P10, Exception            # create new Exception object
    set P10["_message"], "I die"  # set message attribute
    throw P10                     # throw it

Exceptions are designed to work with the Parrot calling conventions.
Since the return addresses of C<bsr> subroutine calls and exception
handlers are both pushed onto the control stack, it's generally a bad
idea to combine the two.

How about replacing this with the following:

. . . exception
handlers are both pushed onto the control stack, care must be taken
to nest them properly, i.e. by removing error handlers established
after C<bsr> before the corresponding C<ret>.

After all, it works as long as the user plays by the rules.

Allison Randal

Apr 18, 2006, 6:07:56 PM
to Bob Rogers, Internals List
On Apr 8, 2006, at 19:49, Bob Rogers wrote:
> . . .
>>
>> =item *
>>
>> C<push_eh> creates an exception handler and pushes it onto the
>> control
>> stack. It takes a label (the location of the exception handler)
>> as its
>> only argument. [Is this right? Treating exception handlers as
>> label
>> jumps rather than full subroutines is error-prone.]
>
> They are not "jumps" but continuations, so in a sense they are more
> general than subs, which don't have prior state.

Right, a continuation taken on the address of a label in the current
compilation unit.

HLL exception handlers on the other hand, are likely to be written as
independent subroutines, much like the current signal handlers in
Perl 5. An exception handler is closer to an event handler than it is
to a return continuation. (The design choice is between having
exception handlers that are complete compilation units, or just code
segments. Both are valid options. And it may be that we want to
support both.)

The "error-prone" comment has to do with control flow. The effect of
the current implementation is that when the interpreter catches an
exception, it dumps control flow at the label that was captured in
the continuation. Any control flow after that is the responsibility
of the developer, and it's easy to get it wrong.

It might be more helpful if the continuation taken was a return
continuation: where to return to if an exception is caught and
successfully handled.


>> =item *
>>
>> C<pushaction> pushes a subroutine object onto the control
>> stack. If the
>> control stack is unwound due to an exception (or C<popmark>, or
>> subroutine return), the subroutine is invoked with an integer
>> argument:
>> C<0> means a normal return; C<1> means an exception has been
>> raised.
>> [Seems like there's lots of room for dangerous collisions here.]
>
> I'm not sure what you mean by "collisions" here, nor why you think
> they
> would be dangerous.

Specifically, because the control stack is used for multiple
different things, it's easy to get into a situation where the thing
you're popping off the stack isn't what you meant to pop off the
stack. It's one of the reasons we aren't using stack-based control
flow through most of Parrot.

> Arguably, C<pushaction> is too simplistic; it
> doesn't provide for such things as the repeated exit-and-reenter
> behavior of coroutines, and there is no mechanism to specify a thunk
> that gets called when *entering* a dynamic context . . .

That too.

>> =back
>>
>> =head1 IMPLEMENTATION
>>
>> [I'm not convinced the control stack is the right way to handle
>> exceptions. Most of Parrot is based on the continuation-passing
>> style of
>> control, shouldn't exceptions be based on it too? See bug #38850.]
>
> Seems to me there isn't any real choice. Exception handlers are
> part of
> the dynamic context, and dynamic contexts nest in such a way as to
> behave like a stack. Even pure CPS implementations that want to
> maintain dynamic state have to create an explicit stack in a global
> variable somewhere.

"dynamic contexts nest in such a way as to behave like a stack" is
true, but not necessarily the same thing as storing all exception
handlers on a single global stack that's also used for primitive
control flow.

Let's take the example of something that recently came up:
asynchronous I/O with exceptions. The current implementation says:
push a global exception handler onto the stack, call the routine that
might throw an exception, then pop the exception handler off the
stack. But with asynchronous I/O, the exception handler is likely to
be popped off the stack long before the async call throws an
exception. Or, if you delay popping off the exception handler until
the async callback is called, then you may have other exception
handlers pushed onto the stack in the meantime (possibly exception
handlers for other async calls).
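
Both failure modes can be shown with a small model. This is a Python sketch under stated assumptions, not Parrot code: `handler_stack` stands in for the single global control stack, `push_eh` and `clear_eh` only imitate the opcodes of those names, and `start_async_read` and `run_event_loop` are invented for illustration.

```python
# Hypothetical model, not Parrot code: one global handler stack plus a queue
# of asynchronous operations whose failures are only reported later.
handler_stack = []
pending = []


def push_eh(name):
    handler_stack.append(name)


def clear_eh():
    return handler_stack.pop()


def start_async_read(label):
    """Queue an operation that will fail when the event loop gets to it."""
    pending.append(label)


def run_event_loop():
    """Deliver each failure to whichever handler is topmost at that moment."""
    deliveries = []
    while pending:
        label = pending.pop(0)
        catcher = handler_stack[-1] if handler_stack else None
        deliveries.append((label, catcher))
    return deliveries


def scenario_pop_early():
    """The handler is popped as soon as the async call returns."""
    handler_stack.clear()
    pending.clear()
    push_eh("eh_read")
    start_async_read("read")
    clear_eh()
    return run_event_loop()


def scenario_pop_late():
    """The handler stays pushed, but another one lands on top of it."""
    handler_stack.clear()
    pending.clear()
    push_eh("eh_read")
    start_async_read("read")
    push_eh("eh_other")
    return run_event_loop()
```

In the first scenario the failure finds no handler at all; in the second it is delivered to a handler that was installed for something else.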

In theory, the return continuation maintains the state of the
caller's control stack, so you can invoke return continuations up the
CPS chain until you reach a dynamic context where the exception is
handled. But where does control flow go after you handle an exception
from an async op?

Maybe we need a non-global equivalent of these options.

> I personally prefer exception-based error handling, since it scales
> better. I have been acting on this when the opportunity arises,
> changing internal_exception calls to real_exception when it makes
> sense,
> and when I'm mucking around in that code anyway. (A good example of
> this is "No exception to pop", come to think of it.) It is also
> helpful
> to get a backtrace when something fails.

Backtracing can be enabled without exceptions.

>> =head2 Excerpt
>>
>> [Excerpt from "Perl 6 and Parrot Essentials" to seed discussion.
>> Out-of-date in some ways, and in others it was simply
>> speculative.]

For everything below this point, keep in mind that the text was
written in 2004.

>> Exceptions provide a way of calling a piece of code outside the
>> normal
>> flow of control. They are mainly used for error reporting or
>> cleanup
>> tasks, but sometimes exceptions are just a funny way to branch
>> from
>> one code location to another one.
>>
>> Exceptions are objects that hold all the information needed to
>> handle
>> the exception: the error message, the severity and type of the
>> error,
>> etc. The class of an exception object indicates the kind of
>> exception
>> it is.
>>
>> Exception handlers are derived from continuations. They are
>> ordinary
>> subroutines that follow the Parrot calling conventions, but are
>> never
>> explicitly called from within user code.
>
> Not quite true; a Continuation is not a Sub, though it can be invoked
> like one.

This is one of the "out-of-date" bits.

>> thrown. The handler has to examine the exception object and decide
>> whether it can handle it (or discard it) or whether it should
>> C<rethrow> the exception to pass it along to an exception handler
>> deeper in the stack. The C<rethrow> opcode is only valid in
>> exception
>> handlers. It pushes the exception object back onto the control
>> stack so
>> Parrot knows to search for the next exception handler in the
>> stack. The
>
> This is not correct; exception objects are never pushed onto the
> control
> stack. And the exception handler itself is popped off the control
> stack
> before it is invoked.

Another out-of-date bit. It was one way we considered implementing it
(and still worth keeping in mind).

>> process continues until some exception handler deals with the
>> exception
>> and returns normally, or until there are no more exception
>> handlers on
>> the control stack. When the system finds no installed exception
>> handlers
>> it defaults to a final action, which normally means it prints an
>> appropriate message and terminates the program.
>
> Currently it also prints a backtrace, which is really nice. Alas, the
> backtrace is only from the point of the final rethrow by the oldest
> (bottommost) exception handler. This is the greatest weakness with
> the
> current Parrot exception-handling design: By the time you find out
> that
> a given exception is unhandled, the dynamic environment of the
> C<throw>
> has been destroyed by the very process of searching for a willing
> handler. This makes it extremely difficult to write a debugger
> that can
> do anything useful about uncaught exceptions.

Exception handler tracing is a useful feature, and is worth adding if
it doesn't cost too much (in terms of implementation complexity,
execution speed, etc).

>> When the system installs an exception handler, it creates a return
>> continuation with a snapshot of the current interpreter
>> context. If
>
> This is confusing; I assume you are talking about the
> Exception_Handler
> itself and not a RetContinuation.

In this context, no. It really meant a return continuation.

>> the exception handler just returns (that is, if the exception is
>> cleanly caught) the return continuation restores the control stack
>> back to its state when the exception handler was called,
>> cleaning up
>> the exception handler and any other changes that were made in the
>> process of handling the exception.
>
> Hmm. It seems that an exception is "cleanly caught" only if it is not
> rethrown. It is therefore not possible to tell by looking at the
> exception itself whether or not it is "cleanly caught" or if it is
> still
> in the process of being signalled.

For the most part, exceptions are likely to be discarded soon after
they're caught (and garbage collected at some point after that). But,
marking exceptions as "caught" may be a cheap way of tracking the
history of how a particular exception was handled. And if we do
decide to have resumable exceptions, that sort of information may be
immediately useful.

>> Exceptions thrown by standard Parrot opcodes (like the one
>> thrown by
>> C<find_global> above or by the C<throw> opcode) are always
>> resumable,
>> so when the exception handler function returns normally it
>> continues
>> execution at the opcode immediately after the one that threw the
>> exception. Other exceptions at the run-loop level are also
>> generally
>> resumable.
>
> You seem to want to say that unhandled exceptions are ignored. Is
> that
> correct? If so, I see several problems:
>
> 1. What is "the exception handler function" and how is it
> distinguished from the function that established the exception
> handler?
> [It sounds like you are expecting the exception handler to behave more
> like a closure than a continuation . . . ]

An "exception handler function" would be an exception handler that is
a complete compilation unit rather than just a code segment inside
some other compilation unit.

> 2. The previous paragraph says that if "the exception handler just
> returns", that means that "the exception is cleanly caught".
> Unless you
> want to propose a new mechanism, the only way a handler can decline to
> handle an exception is by rethrowing it, which precludes the
> possibility
> of resuming.

The current prototype implementation doesn't support resumable
exceptions, it's true. But, resumable exceptions are a useful
feature, and one that we originally planned for Parrot. Before we
throw out the baby with the bath water, we need to first look at what
it will take to build in resumable exceptions. It's possible that an
architecture that supports resumable exceptions may be a better
architecture overall.

> 3. Shouldn't unhandled exceptions either enter the debugger if
> interactive, else die? Ignoring the fact that an opcode failed, like
> ignoring the fact that anything else failed, seems dangerous . . .
>
> new P10, Exception # create new Exception object
> set P10["_message"], "I die" # set message attribute
> throw P10 # throw it

There are different levels of severity in exceptions. Some are
necessarily fatal. Some aren't. For example, some languages treat the
"end of file" condition as a non-fatal exception.

>> Exceptions are designed to work with the Parrot calling
>> conventions.
>> Since the return addresses of C<bsr> subroutine calls and
>> exception
>> handlers are both pushed onto the control stack, it's generally
>> a bad
>> idea to combine the two.
>
> How about replacing this with the following:
>
> . . . exception
> handlers are both pushed onto the control stack, care must be taken
> to nest them properly, i.e. by removing error handlers established
> after C<bsr> before the corresponding C<ret>.
>
> After all, it works as long as the user plays by the rules.

We can define any set of rules for exceptions (or calling
conventions, or any other Parrot subsystem) and expect users to
follow them, but some sets of rules are more prone to user error than
others. Our job as designers and implementors is to examine the
options and choose the set of rules that is most stable, robust,
maintainable, and (as much as possible) user-friendly.

Allison

Bob Rogers

Apr 29, 2006, 11:49:39 PM
to Allison Randal, Internals List
From: Allison Randal <all...@perl.org>
Date: Tue, 18 Apr 2006 15:07:56 -0700

. . .

HLL exception handlers on the other hand, are likely to be written as
independent subroutines, much like the current signal handlers in
Perl 5. An exception handler is closer to an event handler than it is
to a return continuation. (The design choice is between having
exception handlers that are complete compilation units, or just code
segments. Both are valid options. And it may be that we want to
support both.)

I see three possibilities:

1. Compilation units only;

2. Continuations only; and

3. Compilation units that (may) invoke continuations.

The third is an inclusive interpretation of "both" -- I will argue below
that this is the best choice. Presumably, the first two could be
implemented in terms of the third?

The "error-prone" comment has to do with control flow. The effect of
the current implementation is that when the interpreter catches an
exception, it dumps control flow at the label that was captured in
the continuation. Any control flow after that is the responsibility
of the developer, and it's easy to get it wrong.

Seems to me that this is unavoidable. Exceptions are useful mainly
because they allow these drastic changes to normal control flow.
Writers of HLL code are relieved of at least some of this responsibility
by their compiler, but writers of PIR are exposed to the full complexity
of nonlocal transfer of control.

It might be more helpful if the continuation taken was a return
continuation: where to return to if an exception is caught and
successfully handled.

I would tend to agree. But something has to decide which handler gets
to catch the exception, before the continuation is invoked. So that
would mean dividing the current Parrot notion of exception handler into
a tester sub which can be invoked in the dynamic context of the error,
and the actual "what to do" code, which is reached via the continuation.
Reading ahead, this does seem to be what you have in mind; am I right?
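
A rough sketch of that split, in Python rather than PIR and with every name hypothetical: each handler is a pair of a tester, called at the throw site before anything is unwound, and a `resume` callable that stands in for the continuation. This is one possible reading of the idea, not a description of any existing Parrot mechanism.

```python
# Hypothetical model, not Parrot code. Each handler is split in two:
#   tester - called at the throw site, before anything is unwound
#   resume - stands in for the continuation invoked once a tester accepts
handler_stack = []


def push_eh(tester, resume):
    handler_stack.append((tester, resume))


def throw(exc):
    """Choose a handler first, unwind second."""
    for index in range(len(handler_stack) - 1, -1, -1):
        tester, resume = handler_stack[index]
        # The tester sees the complete stack: nothing has been popped yet.
        if tester(exc):
            del handler_stack[index:]  # unwind only now, handler included
            return resume(exc)
    # No taker: the dynamic context of the throw is still intact here,
    # which is the point where a debugger could take over.
    return ("unhandled", exc, len(handler_stack))


def demo(exc):
    """Install two handlers, throw, and report what the testers could see."""
    handler_stack.clear()
    seen = []

    def outer_tester(e):
        seen.append(len(handler_stack))
        return isinstance(e, KeyError)

    def inner_tester(e):
        seen.append(len(handler_stack))
        return isinstance(e, ValueError)

    push_eh(outer_tester, lambda e: "outer caught")
    push_eh(inner_tester, lambda e: "inner caught")
    return throw(exc), seen, len(handler_stack)
```

Because the testers run before any unwinding, an exception that nobody accepts leaves the stack exactly as it was at the throw.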

But is this much of a change really on the table? I had thought that
PIR-visible semantic changes are frowned on these days?

>> =item *
>>
>> C<pushaction> pushes a subroutine object onto the control
>> stack. If the
>> control stack is unwound due to an exception (or C<popmark>, or
>> subroutine return), the subroutine is invoked with an integer
>> argument:
>> C<0> means a normal return; C<1> means an exception has been
>> raised.
>> [Seems like there's lots of room for dangerous collisions here.]
>
> I'm not sure what you mean by "collisions" here, nor why you think
> they would be dangerous.

Specifically, because the control stack is used for multiple
different things, it's easy to get into a situation where the thing
you're popping off the stack isn't what you meant to pop off the
stack. It's one of the reasons we aren't using stack-based control
flow through most of Parrot.

Do you have a specific example of such a situation? For compiled
languages (AFAIK), the features that use the control stack have
well-defined lexical "enter" and "exit" points, which makes it easy for
a compiler to generate correct code; that's the reasoning behind the
"behaves like a stack" argument below. Of course, that's not the case
for hand-written PIR, but the only remedy I can think of -- giving each
dynamic construct its own private stack -- seems like it would add a lot
of complexity for (IMO) an obscure benefit.

> Arguably, C<pushaction> is too simplistic; it
> doesn't provide for such things as the repeated exit-and-reenter
> behavior of coroutines, and there is no mechanism to specify a thunk
> that gets called when *entering* a dynamic context . . .

That too.

I'm working on mods to actions as part of my (long overdue) dynamic
binding implementation proposal [1]. I think we also need a
C<popaction> for consistency, and should probably support "enter"
actions as well as "exit" actions.

Another thing that may need clarification is the environment in which
the action runs. Since actions are kept on the control stack, and since
the current implementation calls them just after they are popped, they
see exactly the dynamic context in effect at C<pushaction> time. This
is true even when throwing to an outer exception handler. For instance,
if A calls B calls C calls D, and A pushed EH1, B pushed EH2, C pushed
action C1, and D throws to EH1 in A, the current implementation calls C1
with both EH1 and EH2 still in scope. I think this is correct;
otherwise, the programmer can't count on the dynamic state of the
cleanup action. One could make a case that both handlers, or at least
EH2, should be popped first, but this seems wrong.
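
The A/B/C/D case can be traced with a toy model. This is a Python sketch, not Parrot code; `control_stack`, `handlers_in_scope`, and `unwind_to` are invented names, and the model assumes only what is described above: handlers and actions share one stack, and an action is called just after it is popped.

```python
# Hypothetical model, not Parrot code: one control stack holding both
# exception handlers and cleanup actions.
control_stack = []


def handlers_in_scope():
    """Names of the handlers still on the stack, oldest first."""
    return [name for kind, name in control_stack if kind == "eh"]


def unwind_to(target):
    """Pop entries down to and including the handler named `target`.

    Each action runs just after it is popped, so it sees the stack as it
    stood when the action was pushed. Returns what each action observed.
    """
    observed = {}
    while control_stack:
        kind, name = control_stack.pop()
        if kind == "action":
            observed[name] = handlers_in_scope()
        elif name == target:
            break
    return observed


def demo():
    control_stack.clear()
    control_stack.append(("eh", "EH1"))     # pushed by A
    control_stack.append(("eh", "EH2"))     # pushed by B
    control_stack.append(("action", "C1"))  # pushed by C
    return unwind_to("EH1")                 # D throws to EH1
```

In this model C1 observes both EH1 and EH2, matching the behavior described for the current implementation.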

>> =head1 IMPLEMENTATION
>>
>> [I'm not convinced the control stack is the right way to handle
>> exceptions. Most of Parrot is based on the continuation-passing
>> style of
>> control, shouldn't exceptions be based on it too? See bug #38850.]
>
> Seems to me there isn't any real choice. Exception handlers are part
> of the dynamic context, and dynamic contexts nest in such a way as to
> behave like a stack. Even pure CPS implementations that want to
> maintain dynamic state have to create an explicit stack in a global
> variable somewhere.

"dynamic contexts nest in such a way as to behave like a stack" is
true, but not necessarily the same thing as storing all exception
handlers on a single global stack that's also used for primitive
control flow.

By "primitive control flow" do you mean C<bsr/ret>? I would agree
that's pretty primitive -- and might be better off with its own stack
(see below). Otherwise, keeping handlers on the same stack with actions
and (some day) temporizations makes it convenient to peel them back in
the right order. Actions need to be executed in the right dynamic
binding and handler context, for one thing.

Let's take the example of something that recently came up:
asynchronous I/O with exceptions. The current implementation says:
push a global exception handler onto the stack, call the routine that
might throw an exception, then pop the exception handler off the
stack. But with asynchronous I/O, the exception handler is likely to
be popped off the stack long before the async call throws an
exception. Or, if you delay popping off the exception handler until
the async callback is called, then you may have other exception
handlers pushed onto the stack in the meantime (possibly exception
handlers for other async calls).

Or the original handler may catch something it wasn't supposed to.
Excellent point.

In theory, the return continuation maintains the state of the
caller's control stack, so you can invoke return continuations up the
CPS chain until you reach a dynamic context where the exception is
handled. But where does control flow go after you handle an exception
from an async op?

My kneejerk reaction is that maybe each asynchronous IO operation
requires its own coroutine (or something very like it) so that the user
can set up a different dynamic state than the main line that spawned it.
Off the top of my head, if you yield to the async coro before it's
ready, nothing happens, but if an exception was pending, the exception
happens in the coro environment.

But this is all very half-baked; the "coro" would look more like a
lightweight thread. And I'm not even certain how one would want an
asynchronous IO API to look, having never played with one (and having
largely ignored the thread).

>> Other opcodes respond to an C<errorson> setting . . .


>
> This have-your-cake-and-eat-it-too (HYCAEIT?) strategy sounds good in
> theory, but may be dangerous in practice. Which style of error handling
> a given piece of code uses is a static property of the way the code is
> written. On the other hand, C<errorson> is dynamic and global. If one
> of the modules you use wants to do error handling by checking return
> values, but another module doesn't check returns because it expects
> errors to be signalled, then no C<errorson> setting will satisfy both,
> regardless of how you want to design *your* code.

Maybe we need a non-global equivalent of these options.

I was trying to argue that any such option that acts globally would be
too much trouble to support. A global option might work if it could be
temporized, but temporizing around method calls to objects that want to
handle errors differently would be tedious and error-prone. It also
requires the programmer to be aware of the "preferred" setting of
C<errorson> for all modules used, which is also error-prone. Enforcing
a single model, or at the very least having it lexically "compiled in,"
seems much more tractable.

> I personally prefer exception-based error handling, since it scales
> better. I have been acting on this when the opportunity arises,
> changing internal_exception calls to real_exception when it makes sense,
> and when I'm mucking around in that code anyway. (A good example of
> this is "No exception to pop", come to think of it.) It is also helpful
> to get a backtrace when something fails.

Backtracing can be enabled without exceptions.

Really? Even where internal_exception is called?

>> =head2 Excerpt
>>
>> [Excerpt from "Perl 6 and Parrot Essentials" to seed discussion.
>> Out-of-date in some ways, and in others it was simply
>> speculative.]

For everything below this point, keep in mind that the text was
written in 2004.

Sorry; I didn't mean to be pedantic.

>> process continues until some exception handler deals with the exception
>> and returns normally, or until there are no more exception handlers on
>> the control stack. When the system finds no installed exception handlers
>> it defaults to a final action, which normally means it prints an
>> appropriate message and terminates the program.
>
> Currently it also prints a backtrace, which is really nice. Alas, the
> backtrace is only from the point of the final rethrow by the oldest
> (bottommost) exception handler. This is the greatest weakness with the
> current Parrot exception-handling design: By the time you find out that
> a given exception is unhandled, the dynamic environment of the C<throw>
> has been destroyed by the very process of searching for a willing
> handler. This makes it extremely difficult to write a debugger that
> can do anything useful about uncaught exceptions.

Exception handler tracing is a useful feature, and is worth adding if
it doesn't cost too much (in terms of implementation complexity,
execution speed, etc).

I would agree, but I wasn't just talking about tracing (and tracing
exceptions, rather than handlers). I was talking about allowing an
interactive debugger, as in "perl -d", to take control at the point
where the uncaught exception is signaled, so that I can figure out why
it wasn't caught. (For the record, I've never actually needed to use
"perl -d" to debug a Perl 5 program with hairy eval/die logic, but I bet
it's no picnic.)

However, as I've already hinted, I think a workable solution is
within reach . . .

>> When the system installs an exception handler, it creates a return
>> continuation with a snapshot of the current interpreter context. If
>
> This is confusing; I assume you are talking about the Exception_Handler
> itself and not a RetContinuation.

In this context, no. It really meant a return continuation.

Hmm. A RetContinuation recycles the leaving context, but in the case of
an exception, we don't know the identity of the leaving context until
the exception is invoked, which makes it hard to decide whether this is
safe/appropriate. So, unless I am still misunderstanding you, I don't
think this works with the current codebase (though it ought to work if
the "lightweight RetContinuation" proposal [2] is ever implemented).

> Hmm. It seems that an exception is "cleanly caught" only if it is not
> rethrown. It is therefore not possible to tell by looking at the
> exception itself whether or not it is "cleanly caught" or if it is still
> in the process of being signalled.

I think I now understand how you mean to do this.

> You seem to want to say that unhandled exceptions are ignored. Is
> that correct? If so, I see several problems:
>
> 1. What is "the exception handler function" and how is it
> distinguished from the function that established the exception handler?
> [It sounds like you are expecting the exception handler to behave more
> like a closure than a continuation . . . ]

An "exception handler function" would be an exception handler that is
a complete compilation unit rather than just a code segment inside
some other compilation unit.

Great; got it.

> 2. The previous paragraph says that if "the exception handler just
> returns", that means that "the exception is cleanly caught". Unless you
> want to propose a new mechanism, the only way a handler can decline to
> handle an exception is by rethrowing it, which precludes the
> possibility of resuming.

I now realize that you *were* proposing a new mechanism (new to me in
any case), using an "exception handler function" that "just returns."
So never mind.

The current prototype implementation doesn't support resumable
exceptions, it's true . . .

"Prototype"?? That implies a lot more flexibility to change the way
Parrot exceptions work than I had thought would be allowed . . .

But, resumable exceptions are a useful feature, and one that we
originally planned for Parrot. Before we throw out the baby with the
bath water, we need to first look at what it will take to build in
resumable exceptions. It's possible that an architecture that
supports resumable exceptions may be a better architecture overall.

I certainly agree that versatile error recovery is a big plus. In fact,
it's one of the things I like about Common Lisp, in which debuggers
typically present a menu of corrective actions for an unhandled error
along with the error message.

But thinking about this has made me realize the nature of my problem
with the following statement:

Exceptions thrown by standard Parrot opcodes (like the one
thrown by C<find_global> above or by the C<throw> opcode) are
always resumable, so when the exception handler function returns
normally it continues execution at the opcode immediately after
the one that threw the exception.

When I think of "resumable" errors, I think of being able to "skip" and
"retry" as the two main possibilities that apply to most situations,
with "substitute some other value" and possibly other corrective actions
as additional possibilities that depend on the operation. Right there,
the handler would need to do more than "just return" in order to select
the right possibility. But these possibilities really apply to
operations that are much higher level than instructions, such as
compiling a file or sending an email. For the most part, there is no
way for an outside agent to determine whether it is appropriate (or even
safe) to skip or retry an opcode; indeed, that may not even be apparent
to the person who wrote the HLL code from which it was compiled.

In other words, I think "resuming" makes sense only in terms of
HLL-programmer-defined concepts. In which case, there may be a whole
slew of restart alternatives that are available in the current dynamic
context, and there need to be mechanisms for finding out what they are,
and invoking a particular one. If you like (and assuming that you don't
think I'm on the wrong track), I can try to design something for Parrot
based on the Common Lisp model [3].
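
As a rough illustration of what "finding out what they are, and invoking a particular one" could look like, here is a small Python sketch. All names (with_restarts, compute_restarts, invoke_restart, parse_entry) are made up for the example, and the on_error callback is a simplified stand-in for a handler that runs at the point of failure; none of this is an existing Parrot or Common Lisp API.

```python
restart_stack = []  # innermost restart is last


class Restart:
    def __init__(self, name, action):
        self.name = name
        self.action = action


class RestartInvoked(Exception):
    """Carries control from invoke_restart back to the binding with_restarts."""

    def __init__(self, restart, args):
        super().__init__(restart.name)
        self.restart = restart
        self.args_for_restart = args


def with_restarts(restarts, body):
    """Make the named restarts available for the dynamic extent of body()."""
    bound = [Restart(name, action) for name, action in restarts.items()]
    restart_stack.extend(bound)
    try:
        return body()
    except RestartInvoked as request:
        if not any(request.restart is r for r in bound):
            raise  # belongs to an outer with_restarts
        return request.restart.action(*request.args_for_restart)
    finally:
        del restart_stack[len(restart_stack) - len(bound):]


def compute_restarts():
    """Names of the restarts currently available, innermost first."""
    return [r.name for r in reversed(restart_stack)]


def invoke_restart(name, *args):
    """Transfer control to the innermost restart with this name."""
    for r in reversed(restart_stack):
        if r.name == name:
            raise RestartInvoked(r, args)
    raise LookupError(f"no restart named {name!r}")


def parse_entry(text, on_error):
    """High-level operation that offers 'skip' and 'use-value' restarts."""
    def body():
        if not text.isdigit():
            on_error(text)          # policy runs here, before any unwinding
            raise ValueError(text)  # the policy declined to pick a restart
        return int(text)
    return with_restarts({"skip": lambda: None, "use-value": lambda v: v}, body)


def parse_all(entries, on_error):
    return [parse_entry(entry, on_error) for entry in entries]
```

For example, parse_all(["1", "x", "3"], lambda text: invoke_restart("use-value", 0)) substitutes 0 for the bad entry. The point of the design is that the restarts are defined by the high-level operation, while the choice among them is made by whoever bound the policy.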

In this vein, it occurs to me that the current design doesn't
specify what other actions a handler is allowed to take. To quote the
relevant paragraph (the "previous paragraph" mentioned above):

When the system installs an exception handler, it creates a return
continuation with a snapshot of the current interpreter context. If
the exception handler just returns (that is, if the exception is
cleanly caught) the return continuation restores the control stack
back to its state when the exception handler was called, cleaning up
the exception handler and any other changes that were made in the
process of handling the exception.

To paraphrase, each exception handler function has an associated
continuation. If the handler "just returns," Parrot invokes the
associated continuation, and the exception is thereby handled. Have I
got this right? If so, how does a handler *decline* to handle the
exception? By rethrowing? And is it acceptable for the handler to take
other action, e.g. by making a non-local exit via some other
continuation? Because, besides being useful in its own right, that is
the logical way for a handler to invoke a restart.

Allow me to propose an answer:

1. When an exception handler function is called during a C<throw>,
the handler is allowed to do pretty much anything, with the caveat that
it is running in the dynamic context of the code that is throwing,
modified temporarily such that the handler itself is not bound, i.e. the
handler can resignal the same condition (or another of the same class)
without invoking itself [4].

2. If the handler returns, then it has declined to handle the
exception, and Parrot goes on to try the next most recently bound
handler.

3. If the handler decides to handle the exception, it does so by
effecting a non-local exit. This could be by calling a continuation,
presumably to return to some point in the context that bound the
handler, by invoking a restart, or by throwing a new exception. It may
also make sense to rethrow the same exception, which (for non-fatal
exceptions) gives older handlers a chance to run first, making the inner
handler in effect a default handler.

4. If no handler takes up the challenge, then do nothing, continuing
after the signaling instruction in an appropriate way. Languages that
want some other behavior (such as "exit(255)" or entering a debugger)
must arrange to wrap the necessary handler around their main program.

Note that the code internal to C<throw> that is invoking the handlers
doesn't even need to know about the continuations that are used; they
would be used directly by the handlers, where presumably they would be
kept in closure variables.
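
The four rules above can be sketched in Python as a model. This is only an illustration under assumptions: handler_stack, call_with_handler, catching, and throw are hypothetical names, and a Python exception (NonLocalExit) stands in for invoking a continuation, since Python has no first-class continuations.

```python
handler_stack = []  # innermost (most recently bound) handler is last


class NonLocalExit(Exception):
    """Stand-in for invoking a continuation held by the binding code."""

    def __init__(self, tag, value):
        super().__init__("non-local exit")
        self.tag = tag
        self.value = value


def call_with_handler(handler, body):
    """Bind handler for the dynamic extent of body()."""
    handler_stack.append(handler)
    try:
        return body()
    finally:
        handler_stack.pop()


def catching(tag, body):
    """Establish an exit point that a handler can transfer control to."""
    try:
        return body()
    except NonLocalExit as exit_request:
        if exit_request.tag is not tag:
            raise
        return exit_request.value


def throw(message):
    """Offer message to each handler, innermost first, before any unwinding."""
    global handler_stack
    saved = handler_stack
    try:
        for index in range(len(saved) - 1, -1, -1):
            # Rule 1: the running handler and anything newer are unbound.
            handler_stack = saved[:index]
            # Rule 2: a handler that returns has declined.
            # Rule 3: a handler that handles never returns here.
            saved[index](message)
    finally:
        handler_stack = saved
    # Rule 4: nobody handled it, so execution continues after the throw.
    return None


def demo():
    log = []
    tag = object()

    def outer_handler(message):
        log.append("outer handles " + message)
        raise NonLocalExit(tag, "recovered")  # handle by exiting non-locally

    def inner_handler(message):
        log.append("inner declines " + message)  # just returning declines

    def inner_body():
        throw("boom")
        log.append("not reached")
        return "normal"

    def body():
        return call_with_handler(inner_handler, inner_body)

    result = catching(tag, lambda: call_with_handler(outer_handler, body))
    return result, log
```

Note that throw itself never looks at the exit tag; the handler closes over it, which mirrors the remark that the continuations would live in closure variables. Because each handler runs while the thrower's frames are still intact, a debugger hooked in as the outermost handler would see the full dynamic environment.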

IMHO, this would be a great improvement; it would solve the debugger
problem discussed above. Also (though I almost hesitate to mention it
[5]), this is compatible with the Common Lisp "condition" system design
of "signaling" [6], though I've left out a few subtleties.

On the down side, it makes it more difficult to mark exceptions as
"handled", since the very act of handling them transfers control to
somewhere else.

> 3. Shouldn't unhandled exceptions either enter the debugger if
> interactive, else die? Ignoring the fact that an opcode failed, like
> ignoring the fact that anything else failed, seems dangerous . . .
>
> new P10, Exception # create new Exception object
> set P10["_message"], "I die" # set message attribute
> throw P10 # throw it

There are different levels of severity in exceptions. Some are
necessarily fatal. Some aren't. For example, some languages treat the
"end of file" condition as a non-fatal exception.

And other languages will require that a fatal (or at least "serious")
exception be signaled. In CL, for example, unhandled EOF errors are
defined in such a way as to enter the debugger by default. Dealing with
this seems to require the following:

1. Define mechanisms for non-fatal exceptions. C<throw> could just
fall through to the next instruction, but it might be useful to have one
op that might return if the error is unhandled and another that never
returns [7], for the sake of code optimization. Then again, maybe this
should depend solely on the exception class.

2. Define a "generic" EOF exception which is non-fatal, and arrange
to signal it when an EOF is detected. If it returns, then the code sets
up the appropriate EOF return value(s).

3. Languages that require a fatal EOF bind a handler around the
dynamic scope of their code that intercepts the generic EOF and signals
the right language-appropriate exception. Such a binding would not be
easy to undo if the "strict EOF language" calls into a "non-strict EOF
language", so it might be better to choose the exception class based on the
HLL from the start.
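
Here is a small Python sketch of the three points above, offered as a model only. The names signal, error, EndOfFile, StrictEOFError, read_line, and strict_language_read are all hypothetical, and for brevity this version does not unbind a handler while it runs.

```python
handlers = []  # innermost handler is last


class EndOfFile(Exception):
    """Generic, non-fatal end-of-file condition."""


class StrictEOFError(Exception):
    """Language-specific fatal error for a 'strict EOF' language."""


def signal(condition):
    """May return: every handler that returns has declined."""
    for handler in reversed(list(handlers)):
        handler(condition)
    return None


def error(condition):
    """Never returns normally: if no handler exits, the condition is raised."""
    signal(condition)
    raise condition


def read_line(lines):
    """Return the next line, or None at EOF if nobody objects."""
    if not lines:
        signal(EndOfFile())
        return None  # the signal returned, so supply the EOF return value
    return lines.pop(0)


def strict_language_read(lines):
    """A 'strict EOF' language wraps its reads in an escalating handler."""
    def escalate(condition):
        if isinstance(condition, EndOfFile):
            error(StrictEOFError("unexpected end of file"))

    handlers.append(escalate)
    try:
        return read_line(lines)
    finally:
        handlers.pop()
```

Calling read_line([]) directly yields None, while strict_language_read([]) raises StrictEOFError. The sketch also shows the drawback mentioned in point 3: any code called inside strict_language_read inherits the escalating handler, whether or not it wants strict EOF behavior.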

Which brings up another issue. The description of C<die> implies
that exception type and severity are separate:

C<die> throws an exception. It takes two arguments, one for the
severity of the exception and one for the type of exception.

Shouldn't the severity be defined by the exception class? Specifically,
by the taxonomy of exception classes?

>> Exceptions are designed to work with the Parrot calling conventions.
>> Since the return addresses of C<bsr> subroutine calls and exception
>> handlers are both pushed onto the control stack, it's generally a bad
>> idea to combine the two.
>
> How about replacing this with the following:
>
> . . . exception
> handlers are both pushed onto the control stack, care must be taken
> to nest them properly, i.e. by removing error handlers established
> after C<bsr> before the corresponding C<ret>.
>
> After all, it works as long as the user plays by the rules.

We can define any set of rules for exceptions (or calling
conventions, or any other Parrot subsystem) and expect users to
follow them, but some sets of rules are more prone to user error than
others. Our job as designers and implementors is to examine the
options and choose the set of rules that is most stable, robust,
maintainable, and (as much as possible) user-friendly.

Allison

All very true. And C<ret> addresses are unique among the denizens of
the control stack in only pertaining to the context that pushed them;
they don't actually affect the dynamic context of called subs. So one
could certainly make a case that each context deserves its own
"stacklet" expressly to contain C<ret> addresses.

Then again, is this worth it? It does make PIR slightly more
"user-friendly" in this regard, but I can't imagine ever needing
C<bsr/ret> in the first place.

Sorry it took me so long to get my thoughts together.

-- Bob

[1] You didn't ask, but there's a draft up at
http://rgrjr.dyndns.org/perl/dynbind-proposal-v2.html .
A key work deadline has passed, so I expect to have more time to
work on it.

[2] See the "RetContinuation promotion, closures, and context leakage"
post of Sat, 04 Feb 2006 13:06:46 -0800
(http://www.mail-archive.com/perl6-i...@perl.org/msg31219.html).

[3] CL calls them "restarts"; see
http://www.lispworks.com/documentation/HyperSpec/Body/09_adb.htm if
you're curious.

[4] This modification of the dynamic state may argue in favor of
putting exception handlers in their own dynamic stack, though.

[5] I mentioned this in a "Re: [RFC] Dynamic binding patch" post on
Tue, 3 Jan 2006 23:43:50 -0500 in response to Larry's reply (post 6
of http://xrl.us/ji2r). But I got warnocked, so I don't know what
Larry (or anyone else) thinks.

[6] http://www.lispworks.com/documentation/HyperSpec/Body/09_ada.htm

[7] FWIW, CL calls these SIGNAL and ERROR, respectively.
