For quite some time now we have had the current subroutine and the current continuation in the interpreter context structure. With that in hand, we should be able to generate function tracebacks in the error case. We also need the call chain to optimize register frame recycling.
Whenever a continuation is created, we have to walk up the call chain and mark all return continuations as non-recyclable.
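The marking step can be sketched in a few lines. This is a toy model in Python, not Parrot code: the class `RetContinuation`, its `recyclable` flag and the function `capture_continuation` are names made up for illustration.

```python
class RetContinuation:
    """Hypothetical stand-in for a return continuation and its register frame."""

    def __init__(self, sub_name, caller=None):
        self.sub_name = sub_name
        self.caller = caller        # return continuation of the calling sub, or None
        self.recyclable = True      # frame may be reused once the sub has returned


def capture_continuation(current):
    """Model creating a full continuation: pin every frame up the call chain."""
    frame = current
    while frame is not None:
        frame.recyclable = False
        frame = frame.caller
    return current


main = RetContinuation("main")
foo = RetContinuation("foo", caller=main)
bar = RetContinuation("bar", caller=foo)

# A continuation taken inside foo pins foo and main; bar is not on that chain.
capture_continuation(foo)
```

After the call, `foo` and `main` are no longer recyclable, while `bar` still is.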
Should the traceback object be available as a PMC? What information should be included in the traceback (object)?
At 11:45 AM +0200 10/29/04, Leopold Toetsch wrote:
>We now have since quite a time the current subroutine and the
>current continuation in the interpreter context structure. With that
>at hand, we should now be able to generate function tracebacks in
>error case and we need the call chain too, to optimize register
>frame recycling.
>Whenever a continuation is created, we have to walk up the call
>chain and mark all return continuations as non-recyclable.
>Should the traceback object be available as a PMC?
Nah, I don't think so. There's nothing in the traceback that wouldn't be in the current continuation, so I don't think it's worth bothering. If someone wants to preserve the traceback they can just hold onto a continuation. (Unless we want the traceback to be more static, in which case holding onto it would be more like freezing the current continuation)
Basically we want to be able to walk a continuation chain and get access to everything. For performance reasons (because of people who fall in category 2 (which I'll get to)) we want to do it either with an arbitrary continuation object or with the current continuation without actually instantiating the current continuation.
>What information should be included in the traceback (object)?
Everything. It seems to me that there are two big things people will do with traceback info.
1) Dumps of call chains and state
2) Evil parent environment twiddling
For #1 we'll want to be able to walk up the chain and yank out sub/method names, the object being method-called, the lexical pads in effect, the namespace as it stands, and suchlike stuff. (I'm told that Python has a very nice verbose dump such that when it dumps you get not only the call tree but the variables and their values, which strikes me as good in two ways -- it makes debugging easier and it makes those sorts of errors harder to ignore because the output's so damn big)
For #2, which includes doing things like upvar and other unpleasantness, we'll be taking advantage of the fact that while we can't necessarily touch the actual interpreter/continuation chain, anything we fetch out of it (like, say, the lexical scope) is touchable, so injecting variables into your caller's scope isn't a big deal.
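A minimal sketch of that idea, in Python rather than anything Parrot-specific: the chain links stay fixed, but the pad fetched from a frame is an ordinary mutable object. `Frame`, `pad` and `upvar_set` are hypothetical names used only for this illustration.

```python
class Frame:
    """Hypothetical call frame: the chain is fixed, the pad is a plain dict."""

    def __init__(self, sub_name, caller=None):
        self.sub_name = sub_name
        self.caller = caller
        self.pad = {}               # lexical variables of this sub


def upvar_set(frame, levels, name, value):
    """Set a variable in the pad `levels` frames above `frame` (1 = direct caller)."""
    target = frame
    for _ in range(levels):
        if target.caller is None:
            raise LookupError("call chain is shorter than %d levels" % levels)
        target = target.caller
    target.pad[name] = value


main = Frame("main")
helper = Frame("helper", caller=main)

# helper injects a variable into its caller's lexical scope.
upvar_set(helper, 1, "warnings", False)
```

Only `main.pad` changes here; `helper.pad` stays empty, and asking for more levels than the chain has raises `LookupError`.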
Dunno if we want to have a PMC that has this Evil Knowledge, or if we want to have ops. Part of me leans towards an op or two -- if for no other reason than to force it to be very explicit in any language that does it, which will make people think and make it really tough to do accidentally.
-- Dan
--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                        have teddy bears and even
                                      teddy bears get drunk
Dan Sugalski <d...@sidhe.org> wrote:
> At 11:45 AM +0200 10/29/04, Leopold Toetsch wrote:
>>Should the traceback object be available as a PMC?
> Nah, I don't think so. There's nothing in the traceback that wouldn't
> be in the current continuation, so I don't think it's worth
> bothering.
Ok. This sounds like the traceback object *is* the continuation ;)
I think having methods is ok for that. It's in no way time critical to warrant opcodes.
>>What information should be included in the traceback (object)?
> Everything. It seems to me that there are two big things people will
> do with traceback info.
When, and in which run loop, do we update C<current_pc> [1]? Only when warnings are enabled, or always?
It's not quite clear from the C<invoke Px> opcode, whether we are calling a subroutine, or returning from a subroutine. So the place to update the current program counter of the current call can only be in the Sub PMC. The Sub PMC's invoke() vtable is called with the C<void *next> opcode pointer following the call. [2]
Now that we have both C<invoke> and C<invoke Px>, it would take some code inspection to decide whether the call had one or two opcodes. That's all a bit ugly and time-consuming.
> 1) Dumps of call chains and state
> 2) Evil parent environment twiddling
> For #1 we'll want to be able to walk up the chain and yank out
> sub/method names, the object being method-called, the lexical pads in
> effect, the namespace as it stands, and suchlike stuff.
Ok.
> (I'm told
> that Python has a very nice verbose dump such that when it dumps you
> get not only the call tree but the variables and their values, which
> strikes me as good in two ways -- it makes debugging easier and it
> makes those sorts of errors harder to ignore because the output's so
> damn big)
The traceback that Python prints, e.g. after an exception, shows only the caller locations. But the traceback object itself has some read-only attributes. Among others, C<tb_frame> lets you inspect the call frame's variables.
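In Python that looks roughly like the following. The attributes `tb_frame`, `tb_next`, `tb_lineno`, `f_locals` and `f_code` are the real ones; the functions `inner`, `outer` and `frames_with_locals` are made up for the example.

```python
import sys


def inner(divisor):
    label = "inner local"
    return 1 / divisor


def outer():
    count = 3
    return inner(0)


def frames_with_locals(tb):
    """Walk a traceback, collecting (function name, line number, locals) per frame."""
    result = []
    while tb is not None:
        frame = tb.tb_frame
        result.append((frame.f_code.co_name, tb.tb_lineno, dict(frame.f_locals)))
        tb = tb.tb_next
    return result


try:
    outer()
except ZeroDivisionError:
    report = frames_with_locals(sys.exc_info()[2])

# Outermost frame first, the frame that raised last.
for name, lineno, local_vars in report:
    print(name, lineno, sorted(local_vars))
```

The last two entries are the frames of `outer` and `inner`, with `count` and `divisor` available among their locals.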
> For #2, which includes doing things like upvar and other
> unpleasantness, we'll be taking advantage of the fact that while we
> can't necessarily touch the actual interpreter/continuation chain,
> anything we fetch out of it (like, say, the lexical scope) is
> touchable, so injecting variables into your caller's scope isn't a
> big deal.
What is that needed for? Well, if the lexical scope PMC is available by introspection, you can of course add variables to it, with all the consequences that *might* arise from such hacks.
> Dunno if we want to have a PMC that has this Evil Knowledge, or if we
> want to have ops. Part of me leans towards an op or two -- if for no
> other reason than to force it to be very explicit in any language
> that does it, which will make people think and make it really tough
> to do accidentally.
Methods ought to be explicit enough.
leo
[1] we had C<cur_pc> in the interpreter structure. It's now C<current_pc> in the context.
[2] except for overridden methods, which are called from C
Leopold Toetsch <l...@toetsch.at> wrote:
> Dan Sugalski <d...@sidhe.org> wrote:
>> Basically we want to be able to walk a continuation chain and get
>> access to everything.
> I think having methods is ok for that. It's in no way time critical to
> warrant opcodes.
I've now created two methods "caller" and "continuation" in the Continuation PMC that allow walking the continuation chain. It's slightly different than the previous example as Sub PMCs don't have a context. To get at the previous caller, we have to get the continuation of the continuation.
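The walk can be modelled as below. This is a Python sketch, not the PMC code: only the method names `caller` and `continuation` come from the description above, while the `Sub` and `Continuation` classes, the `pc` attribute and `traceback_lines` are assumptions made for illustration.

```python
class Sub:
    """Hypothetical Sub: it has a name but no context of its own."""

    def __init__(self, name):
        self.name = name


class Continuation:
    """Hypothetical return continuation: who made the call, from where, and what is above it."""

    def __init__(self, caller_sub, pc, parent=None):
        self._caller_sub = caller_sub
        self._parent = parent
        self.pc = pc                # assumed attribute: program counter of the call

    def caller(self):
        return self._caller_sub

    def continuation(self):
        return self._parent


def traceback_lines(cont):
    """Follow the continuation of the continuation until the chain ends."""
    lines = []
    while cont is not None:
        lines.append("called from Sub '%s' pc %d" % (cont.caller().name, cont.pc))
        cont = cont.continuation()
    return lines


from_main = Continuation(Sub("main"), 21)
from_foo = Continuation(Sub("foo"), 75, parent=from_main)
from_bar = Continuation(Sub("bar"), 146, parent=from_foo)

for line in traceback_lines(from_bar):
    print(line)
```

The point of the sketch is that the Sub itself is never asked for its caller; each step up goes through a continuation.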
$ ./parrot call.imc
main foo Bar bar caller: foo Bar foo
called from Sub 'bar' pc 146
called from Sub 'foo' pc 75
called from Sub 'main' pc 21
Bar foo
called from Sub 'main' pc 44
ok
> printerr $P0 # does $P0->vtable->getstring(), which creates
>              # a traceback string, maybe verbosity depending
>              # on debug settings
>OTOH having get_string the whole traceback chain isn't really good for
>overriding it. So maybe it should return just the info for the current
>call.
I was thinking something a bit more primitive. Since we can treat the call chain as an array, we could do:
$S0 = insert_opname_here [0; 'subname'] # Get the current sub name
$S1 = insert_opname_here [1; 'subname'] # Get the caller's sub name
$P1 = insert_opname_here [2; 'pad']     # Get grandparent's pad
We could do the same thing with continuation objects -- access them as an array and pull parts out, which'd work fine.
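For comparison, the same "call chain as a keyed array" access can be sketched in Python. `CallChain` and the field names `subname` and `pad` are hypothetical and mirror the pseudo-ops above; `sys._getframe` is real but is a CPython implementation detail.

```python
import sys


class CallChain:
    """Index the live call chain as [depth, field]; depth 0 is the sub doing the lookup."""

    def __getitem__(self, key):
        depth, field = key
        frame = sys._getframe(depth + 1)    # +1 skips __getitem__ itself
        if field == "subname":
            return frame.f_code.co_name
        if field == "pad":
            return frame.f_locals
        raise KeyError(field)


chain = CallChain()


def grandparent():
    secret = 42
    return parent()


def parent():
    return child()


def child():
    # Current sub name, caller's sub name, and a variable from the grandparent's pad.
    return chain[0, "subname"], chain[1, "subname"], chain[2, "pad"]["secret"]


result = grandparent()
print(result)
```

Nothing is instantiated per frame here; the lookup walks the live chain on demand, which is the property wanted for the current-continuation case.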
>I think having methods is ok for that. It's in no way time critical to
>warrant opcodes.
The one downside to methods is that we need an object, which means that we've got to instantiate the current continuation. Unless the speed hit there's what you were talking about with not time-critical, in which case I can live with that. I'm not sure I'd want to go with methods here, though -- there's a reasonable chance that the method code might mess up some of the environmental info in the traceback.
> >>What information should be included in the traceback (object)?
>> Everything. It seems to me that there are two big things people will
>> do with traceback info.
>When and in which run loop do we update the C<current_pc>[1] ? Only,
>when warnings are enabled or always?
We should always be able to get a sane value out of it. If that means with the JIT that we have to play some interesting games with line number metadata sections or something, well... that's OK. It doesn't have to be cheap, just doable.
> > 1) Dumps of call chains and state
> > 2) Evil parent environment twiddling
> > For #2, which includes doing things like upvar and other
>> unpleasantness, we'll be taking advantage of the fact that while we
>> can't necessarily touch the actual interpreter/continuation chain,
>> anything we fetch out of it (like, say, the lexical scope) is
>> touchable, so injecting variables into your caller's scope isn't a
>> big deal.
>What for is that needed? Well, if the lexical scope PMC is available by
>introspection, you can of course add variables to it, with all the
>consequences that *might* arise from such hacks.
Yep.
There are fairly unusual cases where a subroutine would want to alter the state of its calling sub or method. You might want to conditionally mess around with the caller's warning state or stricture level or something. Should be rare, but I can see doing it. Tcl's got upvar too, so I suppose we really have to do this.
> > Dunno if we want to have a PMC that has this Evil Knowledge, or if we
>> want to have ops. Part of me leans towards an op or two -- if for no
>> other reason than to force it to be very explicit in any language
>> that does it, which will make people think and make it really tough
>> to do accidentally.
>Methods ought to be explicit enough.
Nah. I really, *really* want to force people writing compilers to have to special-case the code. Which, I admit, is a bit petty, but dammit if someone's going to be screwing around with the internals like this then I want to force them to be really explicit about it.
-- Dan
Dan Sugalski <d...@sidhe.org> wrote:
> At 11:16 AM +0100 11/2/04, Leopold Toetsch wrote:
> I was thinking something a bit more primitive. Since we can treat the
> call chain as an array, we could do:
> $S0 = insert_opname_here [0; 'subname'] # Get the current sub name
> $S1 = insert_opname_here [1; 'subname'] # Get the caller's sub name
> $P1 = insert_opname_here [2; 'pad'] # Get grandparent's pad
> We could do the same thing with continuation objects -- access them
> as an array and pull parts out, which'd work fine.
We don't have opcodes with a keyed-on-nothing syntax. So 2nd idea:
>>I think having methods is ok for that. It's in no way time critical to
>>warrant opcodes.
> The one downside to methods is that we need an object, which means
> that we've got to instantiate the current continuation.
Not really. We can hang these methods off the interpreter too.
> ... I'm not sure I'd want to go with
> methods here, though -- there's a reasonable chance that the method
> code might mess up some of the environmental info in the traceback.
That's an argument. OTOH you can't override NCI methods on plain PMCs. The chance of disruption is low.
>>When and in which run loop do we update the C<current_pc>[1] ? Only,
>>when warnings are enabled or always?
> We should always be able to get a sane value out of it. If that means
> with the JIT that we have to play some interesting games with line
> number metadata sections or something, well... that's OK. It doesn't
> have to be cheap, just doable.
Ok. I presume "line number" means both HLL and PIR line numbers. To be able to create a precise traceback, we have to update the program counter in the interpreter context, so that we know what we are executing.
Metadata is basically not the problem. We have that for PASM, although it's not always exact.
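To make the metadata side concrete, here is a minimal sketch of a PC-to-line lookup. It is written in Python and assumes a sorted table of (first pc, source line) pairs; the table contents, `LINE_TABLE` and `line_for_pc` are invented for illustration and say nothing about how Parrot's debug segments are actually laid out.

```python
import bisect

# Hypothetical line-number metadata: (first pc of the range, source line), sorted by pc.
LINE_TABLE = [(0, 1), (21, 4), (44, 5), (75, 9), (146, 14)]
_STARTS = [pc for pc, _ in LINE_TABLE]


def line_for_pc(pc):
    """Return the source line whose instruction range contains pc."""
    if pc < _STARTS[0]:
        raise ValueError("pc %d precedes the first table entry" % pc)
    # Find the last range that starts at or before pc.
    index = bisect.bisect_right(_STARTS, pc) - 1
    return LINE_TABLE[index][1]


print(line_for_pc(21), line_for_pc(30), line_for_pc(146))
```

Every pc from 21 up to 43 maps to the same line, so a lookup like this costs nothing at run time as long as a reasonably current pc is available when the traceback is built.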
What is the granularity of updating the PC? E.g. when we have:
a = func(b, c)
this translates to several PASM instructions. Setting I0..I4 for the call and the return alone takes 10. For a traceback, we'd need only one PC update.
OTOH when we have:
add P0, P1
it could be an overridden method call, which would need updating the PC.
We could of course just say, that's not our problem: the HLL is responsible for emitting