[CVS ci] exceptions-6: signals, catch a SIGFPE (generic platform)

Leopold Toetsch

Jul 11, 2003, 8:04:26 AM

to P6I

More fun to play with.

This is t/op/hacks_5.pasm:
    newsub P0, .Exception_Handler, _handler
    set_eh P0
    div I10, 0
    print "not reached\n"
    end
_handler:
.include "signal.pasm"
    print "catched it\n"
    set I0, P5["_type"]
    neg I0, I0
    ne I0, .SIGFPE, nok
    print "ok\n"
nok:
    end

I know that signals will end up as events at some point, but for now it's a
fun hack.

I still have some questions:
- We seem to need a global (the_exception) with the C<jmp_buf> inside.
What about multiple interpreters?
- When will we check if there are events in the event queue?
- And of course, finally: is the exception handling as it is now in CVS OK?
If yes, cleanup and more platforms should follow, as well as classifying
exception types and switching internal_exception to real exceptions.

Have fun,
leo

Benjamin Goldberg

Jul 11, 2003, 4:07:38 PM

to perl6-i...@perl.org

Leopold Toetsch wrote:
[snip]

> - When will we check if there are events in the event queue?

If we check too often (between each two ops), it will slow things down.

If we don't check often enough, the code might manage to avoid checking
for events entirely.

I would suggest that every flow control op check for events. Or maybe
just every control flow op that goes to earlier instructions.

And of course ops which might block for IO.
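
A minimal sketch of the backward-branch idea, in plain C with a toy
instruction set. The opcodes, toy_interp, check_events() and toy_demo()
are made up for illustration and are not Parrot's; the point is only that
the pending flag is polled when a branch offset is negative, and nowhere
else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes, not Parrot's. */
enum { OP_END, OP_DEC, OP_BRANCH_IF_NZ };

typedef struct {
    long counter;         /* the single "register" of the toy machine */
    int  events_pending;  /* set by whoever enqueues an event */
    int  events_handled;  /* how many times a pending event was handled */
    long checks;          /* how many times the flag was polled */
} toy_interp;

/* Poll the flag; called only from backward branches. */
static void check_events(toy_interp *in)
{
    in->checks++;
    if (in->events_pending) {
        in->events_pending = 0;
        in->events_handled++;
    }
}

/* Run a program; OP_BRANCH_IF_NZ is followed by a pc-relative offset. */
static void toy_run(toy_interp *in, const long *code, size_t len)
{
    size_t pc = 0;
    for (;;) {
        assert(pc < len);
        switch (code[pc]) {
        case OP_END:
            return;
        case OP_DEC:
            in->counter--;
            pc += 1;
            break;
        case OP_BRANCH_IF_NZ: {
            long offset = code[pc + 1];
            if (offset < 0)            /* goes to an earlier instruction */
                check_events(in);
            if (in->counter != 0)
                pc = (size_t)((long)pc + offset);
            else
                pc += 2;
            break;
        }
        default:
            assert(!"unknown opcode");
            return;
        }
    }
}

/* Count down from n (n > 0) with one event already pending. */
static toy_interp toy_demo(long n)
{
    static const long loop[] = { OP_DEC, OP_BRANCH_IF_NZ, -1, OP_END };
    toy_interp in = { n, 1, 0, 0 };
    toy_run(&in, loop, sizeof loop / sizeof loop[0]);
    return in;
}
```

With this layout a loop of n iterations polls the flag n times, which is
the cost that matters for very tight loops; straight-line code never polls
at all.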

--
$a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "$@[$a%6
]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}

Gregor N. Purdy

Jul 11, 2003, 6:28:12 PM

to Benjamin Goldberg, perl6-i...@perl.org

Benjamin --

The trick is to find the cheapest possible way to get conditional
processing to occur if and only if there are events in the event
queue.

I'll only be considering the fast core here for simplicity. But,
if you look at include/parrot/interp_guts.h, the only thing of
interest there is the definition of the DO_OP() macro, which
looks like this:

#define DO_OP(PC,INTERP) \
(PC = ((INTERP->op_func_table)[*PC])(PC,INTERP))

The easiest way to intercept this flow with minimal cost is to
have the mechanism that wants to take over replace the interpreter's
op_func_table with a block of pointers to some Parrot_hijack()
function that conforms to the opfunc prototype. Enqueueing an
event would set the appropriate interpreter's op_func_table to
hijack_op_func_table. Of course, if threads are involved there
are going to be locking issues, and I don't know how cheap the
locking can be made... I'm aware that there are some very cheap
locking approaches available, but I don't have a good feel for
when you can and cannot use them, and how cheap they really are.

The hijack op_func can do whatever it needs to do and then reset
the interpreter's op_func_table to the saved pointer and return
the same PC, so the interpreter will pick up where it left off.

There might be some use for continuations in here, too. Perhaps
the hijack function could save the current state as a continuation
(presumably with the old opfunc table as part of the saved
context), and then it could invoke the event handler with that
continuation as the place to return to...

If something like what I've described is workable, it would mean
we wouldn't have to have an explicit event queue checking policy.
Events would get delivered on the next op dispatch after they
were enqueued. As we handle the last event, we notice the queue
is empty and make sure the normal op_func_table is installed and
normal execution is resumed.
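
As a rough illustration of the table swap, here is a self-contained C
sketch. It is not Parrot code: the interp struct, the three toy opcodes,
hijack(), enqueue_event() and hijack_demo() are all assumptions, and
OP_RAISE merely stands in for an event arriving between two ops. Locking
and signal safety are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

typedef struct interp interp;
typedef const int *(*op_func_t)(const int *pc, interp *in);

/* Hypothetical opcodes; OP_RAISE stands in for an event arriving. */
enum { OP_END, OP_INC, OP_RAISE, NUM_OPS };

struct interp {
    const op_func_t *op_func_table; /* table the dispatcher reads */
    const op_func_t *saved_table;   /* normal table, kept while hijacked */
    int pending;                    /* events waiting in the "queue" */
    int handled;                    /* events delivered so far */
    int acc;                        /* state touched by OP_INC */
};

static void enqueue_event(interp *in);

static const int *op_end(const int *pc, interp *in)
{
    (void)pc; (void)in;
    return NULL;                    /* NULL stops the run loop */
}

static const int *op_inc(const int *pc, interp *in)
{
    in->acc++;
    return pc + 1;
}

static const int *op_raise(const int *pc, interp *in)
{
    enqueue_event(in);
    return pc + 1;
}

/* Every slot of the hijack table points here. */
static const int *hijack(const int *pc, interp *in)
{
    in->handled += in->pending;     /* "deliver" all queued events */
    in->pending = 0;
    in->op_func_table = in->saved_table;
    return pc;                      /* same PC: the real op runs next */
}

static const op_func_t normal_table[NUM_OPS] = { op_end, op_inc, op_raise };
static const op_func_t hijack_table[NUM_OPS] = { hijack, hijack, hijack };

static void enqueue_event(interp *in)
{
    in->pending++;
    in->saved_table = normal_table;
    in->op_func_table = hijack_table;
}

static void run(interp *in, const int *pc)
{
    while (pc) {
        assert(*pc >= 0 && *pc < NUM_OPS);
        pc = in->op_func_table[*pc](pc, in);  /* same shape as DO_OP */
    }
}

static interp hijack_demo(void)
{
    static const int code[] = { OP_INC, OP_RAISE, OP_INC, OP_END };
    interp in = { normal_table, normal_table, 0, 0, 0 };
    run(&in, code);
    return in;
}
```

The dispatch line itself is unchanged, so the normal path pays nothing;
the only extra work happens on the one dispatch after an event is
enqueued.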

This approach does mean that events arriving quickly could DoS
the main line of execution, which could require us to add a bit
more logic to make sure that "thread" is not starved (if that
is important).

Regards,

-- Gregor

On Fri, 2003-07-11 at 16:07, Benjamin Goldberg wrote:
> Leopold Toetsch wrote:
> [snip]
> > - When will we check if there are events in the event queue?
>
> If we check too often (between each two ops), it will slow things down.
>
> If we don't check often enough, the code might manage to avoid checking
> for events entirely.
>
> I would suggest that every flow control op check for events. Or maybe
> just every control flow op that goes to earlier instructions.
>
> And of course ops which might block for IO.
--

Gregor Purdy gre...@focusresearch.com
Focus Research, Inc. http://www.focusresearch.com/

Leopold Toetsch

Jul 13, 2003, 2:41:36 PM

to Gregor N. Purdy, perl6-i...@perl.org

Gregor N. Purdy <gre...@focusresearch.com> wrote:
> Benjamin --

> #define DO_OP(PC,INTERP) \
> (PC = ((INTERP->op_func_table)[*PC])(PC,INTERP))

> The easiest way to intercept this flow with minimal cost is to
> have the mechanism that wants to take over replace the interpreter's
> op_func_table with a block of pointers to some Parrot_hijack()
> function that conforms to the opfunc prototype. Enqueueing an
> event would set the appropriate interpreter's op_func_table to
> hijack_op_func_table.

> On Fri, 2003-07-11 at 16:07, Benjamin Goldberg wrote:
>>
>> I would suggest that every flow control op check for events. Or maybe
>> just every control flow op that goes to earlier instructions.
>>
>> And of course ops which might block for IO.

Combining these ideas should work. The CG cores don't have a func_table
of op functions but a table of labels. This could be switched too,
pointing to different labels with op bodies that check the event queue.
It's still problematic to do this kind of func_table switching in the JIT
core. For best performance the branch opcodes are JITted, so they can't
be switched like the other run cores' func_tables. OTOH the performance
penalty of checking the event queue is only high for very small tight
loops, like the one in mops.pasm.

leo

Leopold Toetsch

Jul 15, 2003, 4:15:57 AM

to Gregor N. Purdy, perl6-i...@perl.org

Gregor N. Purdy <gre...@focusresearch.com> wrote:

> #define DO_OP(PC,INTERP) \
> (PC = ((INTERP->op_func_table)[*PC])(PC,INTERP))

> The easiest way to intercept this flow with minimal cost is to
> have the mechanism that wants to take over replace the interpreter's
> op_func_table with a block of pointers to some Parrot_hijack()
> function that conforms to the opfunc prototype. Enqueueing an
> event would set the appropriate interpreter's op_func_table to
> hijack_op_func_table.

Thinking more about this: Switching the whole op_func_table() or
ops_addr[] (for CG cores) is simpler than, e.g., replacing backward branch
ops. Only the switched core doesn't play nicely here. Having neither an
op_func_table nor an ops_addr[], it would need an explicit
CHECK_EVENTS() in the loop.

Now I have done a very similar thing some time ago:
Subject: [RfC] a scheme for core.ops extending
Date: Wed, 05 Feb 2003 12:28:21 +0100

On the enqueueing of an event, the op_lib is told to switch the
jump_table/ops_addr via a special op_lib->init call. (If the event check
func_table doesn't exist yet, it's constructed on the fly by filling the
func_table with the check_event opcode.)

The check_event opcode now restores the func_table/ops_addr, saves the
context (generates a return continuation) and branches to the event
handler (if any) or handles the event internally. Then it executes the
same opcode again, as is done in the prederef function, which resumes
normal opcode flow.

leo

Jason Gloudon

Jul 15, 2003, 8:12:16 AM

to Leopold Toetsch, Gregor N. Purdy, perl6-i...@perl.org

On Tue, Jul 15, 2003 at 10:15:57AM +0200, Leopold Toetsch wrote:

How is the described scheme supposed to work with JIT generated code ?

--
Jason

Leopold Toetsch

Jul 15, 2003, 8:49:42 AM

to Jason Gloudon, perl6-i...@perl.org

Jason Gloudon <pe...@gloudon.com> wrote:
> On Tue, Jul 15, 2003 at 10:15:57AM +0200, Leopold Toetsch wrote:

> How is the described scheme supposed to work with JIT generated code ?

JIT code would be interspersed with (JITted) CHECK_EVENTS() opcodes.
They would get emitted e.g. at backward branches, as proposed by
Benjamin.

leo

Leopold Toetsch

Jul 17, 2003, 8:19:18 AM

to l...@toetsch.at, perl6-i...@perl.org

Leopold Toetsch <l...@toetsch.at> wrote:
> ... Switching the whole op_func_table() or

> ops_addr[] (for CG cores) is simpler,

I have it running now for the slow and the computed goto core.
The signal handler (interrupt code) switches the op_func_table (ops_addr)
and returns.
Then the next executed instruction is the real event handler, which
restores the op_func_table and then does something appropriate with the
event.

But there is a problem with the prederefed cores. They currently do
something like:

PC = ((op_func_t*) (*PC)) (PC, INTERP); // prederef functions

To be able to switch function tables, this then should become:

PC = ((op_func_t*) (func_table + *PC)) (PC, INTERP);

Thus predereferencing the function pointer would place an offset into the
function table, not an absolute address.

Or is there a better way to do it?
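
The difference between the two dispatch forms above can be shown with a
small C sketch. Everything in it is hypothetical (the cell union, the toy
opcodes, op_check() and prederef_demo()); it is not the prederef core. A
cell holds either an absolute function pointer or an index into whichever
table the interpreter currently points at, and the demo starts with the
event-check table already installed, as if an event had just been
enqueued.

```c
#include <assert.h>
#include <stddef.h>

typedef struct interp interp;
typedef union cell cell;
typedef cell *(*op_func_t)(cell *pc, interp *in);

/* One slot of prederefed code. */
union cell {
    op_func_t fn;     /* absolute function address */
    size_t    index;  /* alternative: offset into a function table */
};

enum { OP_END, OP_INC, NUM_OPS };   /* hypothetical toy opcodes */

struct interp {
    const op_func_t *func_table;    /* table used by indexed dispatch */
    int acc;
    int pending;
    int handled;
};

static cell *op_check(cell *pc, interp *in);

static cell *op_end(cell *pc, interp *in)
{
    (void)pc; (void)in;
    return NULL;
}

static cell *op_inc(cell *pc, interp *in)
{
    in->acc++;
    return pc + 1;
}

static const op_func_t normal_table[NUM_OPS] = { op_end, op_inc };
static const op_func_t check_table[NUM_OPS]  = { op_check, op_check };

/* Handle events, restore the normal table, re-run the same cell. */
static cell *op_check(cell *pc, interp *in)
{
    in->handled += in->pending;
    in->pending = 0;
    in->func_table = normal_table;
    return pc;
}

/* Run three increments with one event pending and the check table set. */
static interp prederef_demo(int use_index)
{
    static const int ops[4] = { OP_INC, OP_INC, OP_INC, OP_END };
    cell code[4];
    interp in = { check_table, 0, 1, 0 };
    cell *pc = code;
    size_t i;

    for (i = 0; i < 4; i++) {
        if (use_index)
            code[i].index = (size_t)ops[i];
        else
            code[i].fn = normal_table[ops[i]];
    }
    while (pc) {
        if (use_index) {
            assert(pc->index < NUM_OPS);
            pc = in.func_table[pc->index](pc, &in);  /* sees the switch */
        } else {
            pc = pc->fn(pc, &in);                    /* never sees it */
        }
    }
    return in;
}
```

With absolute addresses the table switch goes unnoticed and the event
stays pending; with indices it is picked up on the first dispatch, at the
price of loading func_table on every op.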

leo

Sean O'Rourke

Jul 17, 2003, 8:57:32 AM

to Leopold Toetsch, perl6-i...@perl.org

On Thu, 17 Jul 2003, Leopold Toetsch wrote:
> PC = ((op_func_t*) (*PC)) (PC, INTERP); // prederef functions
>
> To be able to switch function tables, this then should become:
>
> PC = ((op_func_t*) (func_table + *PC)) (PC, INTERP);
>
> Thus predereferencing the function pointer would place an offset into the
> function table, not an absolute address.
>
> Or is there a better way to do it?

Replacing the next instruction with a branch to the signal handler
(like adding a breakpoint) out of the question? Of course, if we're
sharing bytecode this is expensive, since you'd have to do something
like this:

    bsr handler
    ...
handler:
    if cur_thread == thread_with_signal goto real_handler
    # replaced instruction
    ret

which penalizes all other bytecode users. I guess it depends how
common we expect signal handling to be.

/s

Leopold Toetsch

Jul 17, 2003, 10:04:22 AM

to Sean O'Rourke, perl6-i...@perl.org

Sean O'Rourke wrote:

>>To be able to switch function tables, this then should become:
>>
>> PC = ((op_func_t*) (func_table + *PC)) (PC, INTERP);
>>

>>Or is there a better way to do it?
>>
>
> Replacing the next instruction with a branch to the signal handler
> (like adding a breakpoint) out of the question?

I don't know how to get the address of the next instruction, i.e. the
"PC" above. Going this way would mean either:
- fill the bytecode segment with the handler opcode function or
- locate the PC on the stack or in registers (like %esi in CGP)
The former seems rather expensive (at least if we heavily use events),
the latter seems to be possible only per platform/compiler(-revision).

> ...Of course, if we're

> sharing bytecode this is expensive, since you'd have to do something
> like this:

The prederefed code can't be shared between threads: Registers are
already absolute locations in the interpreter.

> /s

leo

Sean O'Rourke

Jul 17, 2003, 12:21:43 PM

to Leopold Toetsch, perl6-i...@perl.org

On Thu, 17 Jul 2003, Leopold Toetsch wrote:

> > Replacing the next instruction with a branch to the signal handler
> > (like adding a breakpoint) out of the question?
>
> I don't know, how to get the address of the next instruction i.e. the
> "PC" above. Going this way would either mean:
> - fill the bytecode segment with the handler opcode function or

Yuck.

> - locate the PC on the stack or in registers (like %esi in CGP)
> The former seems rather expensive (at least if we heavily use events),
> the latter seems to be possible only per platform/compiler(-revision).

For non-jit code, the latter seems doable if we can find a way to
force registers back out to memory if necessary (short of declaring
the PC volatile, which would just suck). The Boehm collector uses a
platform-independent setjmp() hack to do this.

For jit code, we know the jit PC reg, so it shouldn't be a problem.

Of course, you should probably take this with a grain of salt with
size inversely proportional to the amount of the solution I've coded.
(An infinite grain in this case...)

/s

Leopold Toetsch

Jul 17, 2003, 3:52:35 PM

to Sean O'Rourke, perl6-i...@perl.org

Sean O'Rourke wrote:

> On Thu, 17 Jul 2003, Leopold Toetsch wrote:
>
>>>Replacing the next instruction with a branch to the signal handler
>>>(like adding a breakpoint) out of the question?
>>>
>>I don't know, how to get the address of the next instruction i.e. the
>>"PC" above.

Thinking more about this: there is no single next instruction for
branches, or rather there are two, and the one that will be taken isn't
known. That seems to be the end of this idea.
Remaining is:
1) save & fill the byte_code with event handler ops.
2) use address relative to op_func_table
3) Or the "official" way: do regular checks.
I estimate 2) to be best for prederefed code. 3) is the only one for JIT.

leo

Jerome Vouillon

Jul 17, 2003, 4:31:23 PM

to Leopold Toetsch, Sean O'Rourke, perl6-i...@perl.org

On Thu, Jul 17, 2003 at 09:52:35PM +0200, Leopold Toetsch wrote:
> Remaining is:
> 1) save & fill the byte_code with event handler ops.
> 2) use address relative to op_func_table
> 3) Or the "official" way: do regular checks.
> I estimate 2) to be best for prederefed code.

I'm not sure about this. With 2), op_func_table needs to be reloaded
from memory for each instruction, which may be somewhat expensive. It
cannot be kept in a register in a portable way.

-- Jerome

Benjamin Goldberg

Jul 17, 2003, 8:40:44 PM

to perl6-i...@perl.org

Sean O'Rourke wrote:
>
> On Thu, 17 Jul 2003, Leopold Toetsch wrote:
> > PC = ((op_func_t*) (*PC)) (PC, INTERP); // prederef functions
> >
> > To be able to switch function tables, this then should become:
> >
> > PC = ((op_func_t*) (func_table + *PC)) (PC, INTERP);
> >
> > Thus predereferencing the function pointer would place an offset into
> > the function table, not an absolute address.
> >
> > Or is there a better way to do it?
>
> Replacing the next instruction with a branch to the signal handler
> (like adding a breakpoint) out of the question?

Not "the next instruction" ... the next *branch* instruction. And only
replace those branch instructions which could be loops.

> Of course, if we're sharing bytecode this is expensive, since you'd
> have to do something like this:
>
> bsr handler
> ...
> handler:
> if cur_thread == thread_with_signal goto real_handler
> # replaced instruction
> ret
>
> which penalizes all other bytecode users. I guess it depends how
> common we expect signal handling to be.

Actually, I'm thinking of something like the following... suppose the
original code is like:

label_foo:
    loop body
branch_address:
    branch label_foo

Add in the following:

e_handler_foo:
    .local PerlHash handlers_with_events
    .local int i_have_an_event
    handlers_with_events = ....
    i_have_an_event = handlers_with_events[cur_thread]
    unless i_have_an_event, label_foo
    bsr dequeue_events_and_handle_them_all
    branch label_foo

And then, when an event occurs, replace "branch label_foo" with "branch
e_handler_foo".

When there are no events queued, for any thread, then we change "branch
e_handler_foo" back into "branch label_foo", for speed.

And we might even be able to do this with JIT! At least, we can if we
can keep track of the addresses of all the "branch label_foo"s (so we
know what to replace), and if we can replace bits of code while the code
is running.

Leopold Toetsch

Jul 18, 2003, 8:19:08 AM

to Benjamin Goldberg, perl6-i...@perl.org

Benjamin Goldberg <ben.go...@hotpop.com> wrote:

> Not "the next instruction" ... the next *branch* instruction. And only
> replace those branch instructions which could be loops.

Works. Many thanks for the input.

I now have this running:

1) Initialization:
- normal core: build an op_func_table filled with opcode #4 [1]
- CG core: build an ops_addr[] filled with this opcode
- prederef cores: build a list of (backward) branch instructions
and the opcode at that offset

2) When an event gets scheduled (signal handler), it calls the running
core with:

    interpreter->op_func_table =
        init_func(interpreter, OPLIB_SET_CHK_EV_FT)->op_func_table;

This replaces, for the normal and CG cores, the op_func_table or the
ops_addr with the one from 1).
For prederefed cores, every branch instruction in the list built
in 1) gets patched with opcode #4.

3) So when the next instruction (normal, CG core) or the branch
instruction (prederefed cores) gets executed, first the op_func_table
or the patched instructions are restored and then the event handler
can be called.

This now works for all cores (except JIT[2]). It doesn't incur any runtime
penalty for an extra check of whether events are due.

[1] This opcode (check_event__) calls the actual event handling code
and returns the same address, i.e. doesn't advance the PC.

[2] We could do the same here, but this needs cache sync for ARM and PPC,
which may or may not be allowed in signal code.
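
To make the prederefed-core part of steps 1) to 3) concrete, here is a
small C model. It is only a sketch under assumptions: the toy opcodes,
pinterp, find_backward_branches(), schedule_event() and patch_demo() are
invented for illustration, and a plain opcode array stands in for
prederefed code. The committed code may look quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes; OP_CHECK_EVENT plays the role of opcode #4. */
enum { OP_END, OP_DEC, OP_BRANCH_IF_NZ, OP_CHECK_EVENT };

#define MAX_SITES 8

typedef struct {
    long  *code;              /* writable opcode stream */
    size_t len;
    size_t site[MAX_SITES];   /* offsets of backward branches */
    long   saved[MAX_SITES];  /* original opcode at each offset */
    size_t nsites;
    long   counter;
    int    pending, handled;
    long   check_ops;         /* times the check opcode actually ran */
} pinterp;

/* Step 1: record every backward branch and its opcode. */
static void find_backward_branches(pinterp *in)
{
    size_t pc = 0;
    in->nsites = 0;
    while (pc < in->len) {
        long op = in->code[pc];
        if (op == OP_BRANCH_IF_NZ && in->code[pc + 1] < 0) {
            assert(in->nsites < MAX_SITES);
            in->site[in->nsites] = pc;
            in->saved[in->nsites] = op;
            in->nsites++;
        }
        pc += (op == OP_BRANCH_IF_NZ) ? 2 : 1;  /* branch has an operand */
    }
}

/* Step 2: an event is scheduled, so patch all recorded branches. */
static void schedule_event(pinterp *in)
{
    size_t i;
    in->pending++;
    for (i = 0; i < in->nsites; i++)
        in->code[in->site[i]] = OP_CHECK_EVENT;
}

/* Step 3: restore the branches, then handle the events. */
static void restore_and_handle(pinterp *in)
{
    size_t i;
    for (i = 0; i < in->nsites; i++)
        in->code[in->site[i]] = in->saved[i];
    in->handled += in->pending;
    in->pending = 0;
    in->check_ops++;
}

static void prun(pinterp *in)
{
    size_t pc = 0;
    for (;;) {
        assert(pc < in->len);
        switch (in->code[pc]) {
        case OP_END:
            return;
        case OP_DEC:
            in->counter--;
            pc += 1;
            break;
        case OP_BRANCH_IF_NZ:
            pc = (in->counter != 0)
                ? (size_t)((long)pc + in->code[pc + 1]) : pc + 2;
            break;
        case OP_CHECK_EVENT:
            restore_and_handle(in);  /* pc unchanged: branch runs next */
            break;
        default:
            assert(!"unknown opcode");
            return;
        }
    }
}

/* Loop n times (n > 0) with one event scheduled before the run. */
static pinterp patch_demo(long n)
{
    static long code[] = { OP_DEC, OP_BRANCH_IF_NZ, -1, OP_END };
    pinterp in = { 0 };
    in.code = code;
    in.len = sizeof code / sizeof code[0];
    in.counter = n;
    find_backward_branches(&in);
    schedule_event(&in);
    prun(&in);
    return in;
}
```

However many iterations the loop runs, the check opcode executes once per
scheduled batch of events, and the loop pays nothing while the queue is
empty.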

Still needs some cleanup ...

leo

Nicholas Clark

Sep 12, 2003, 6:38:09 PM

to Benjamin Goldberg, perl6-i...@perl.org

On Thu, Jul 17, 2003 at 08:40:44PM -0400, Benjamin Goldberg wrote:

> Actually, I'm thinking of something like the following... suppose the
> original code is like:
>
> label_foo:
> loop body
> branch_address:
> branch label_foo
>
> Add in the following:
>
> e_handler_foo:
> .local PerlHash handlers_with_events
> .local int i_have_an_event
> handlers_with_events = ....
> i_have_an_event = handlers_with_events[cur_thread]
> unless i_have_an_event, label_foo
> bsr dequeue_events_and_handle_them_all
> branch label_foo
>
> And then, when an event occurs, replace "branch label_foo" with "branch
> e_handler_foo".
>
> When there are no events queued, for any thread, then we change "branch
> e_handler_foo" back into "branch label_foo", for speed.

Do we need to do this last bit explicitly? Or can we do it lazily -
each time we get called to e_handler when there are no longer events,
we change back that instruction.

Or is this already done this way?

Nicholas Clark

Mark A. Biggar

Sep 12, 2003, 6:58:59 PM

to Nicholas Clark, perl6-i...@perl.org

Some issues related to this scheme:

1) In a highly secure mode, you don't want self-modifying code. We need
a way to lock down code into RO memory when security is important. A
similar problem exists with modifying vtables on the fly.

2) Making that exception safe may be a problem.

3) Is Parrot bytecode supposed to be position independent? Having
all code be PIC makes dynamic loading simpler.

4) As Parrot ops are rather CISC anyway, maybe a special op
that checks for events and takes one of two branches could be used here.

--
ma...@biggar.org
mark.a...@comcast.net

Leopold Toetsch

Sep 13, 2003, 8:09:32 AM

to Nicholas Clark, perl6-i...@perl.org

It's not done in any way. Although I presented this scheme in my talk at
YAPC::EU, it's not committed, since it still lacks some kind of confirmation.

WRT your question: scheduling an event puts the event handler opcode
in place. If there are no events in the queue, no event check opcodes
are executed that could slow down the operation.

When there are multiple events to handle, it doesn't save anything to do
these later, so the event handling code puts the original instruction
back in the stream. At least my proposed scheme works like that.

> Nicholas Clark

leo