Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Register allocation bug in IMCC? Or misunderstanding on my part?

12 views
Skip to first unread message

Clinton A. Pierce

unread,
Jun 1, 2003, 11:55:13 AM6/1/03
to perl6-i...@perl.org
The following code:

.local PerlArray READDATA
.local PerlHash RESTOREINFO
.sub _main
READDATA = new PerlArray
RESTOREINFO = new PerlHash


call _data
end
.end
.sub _data
push READDATA, 10
ret
.end

Throws an error in the VM of:

push_integer() not implemented in class 'PerlHash'

This is, as far as I can tell, because the same register is used by IMCC
for both the READDATA and RESTOREINFO locals, thus by the time that the sub
_data gets around to being run, READDATA has become a PerlHash (imcc -t):

>PC=0; OP=821 (new_p_ic); ARGS=(P0, 14)
>PC=3; OP=821 (new_p_ic); ARGS=(P0, 15)
>PC=6; OP=752 (bsr_ic); ARGS=(2)
>PC=8; OP=722 (push_p_ic); ARGS=(P0, 10)
>push_integer() not implemented in class 'PerlHash'

This seems to conflict with the example that Leo gave me on the 28th. I'm
not sure if this is a bug or if I've got some whacky misunderstanding of
how things work. I think my intention is clear by the above piece of
code. Suggestions?

As a postscript, changing the code to:

.local PerlArray READDATA
.local PerlHash RESTOREINFO
.sub _main
READDATA = new PerlArray
RESTOREINFO = new PerlHash


push READDATA, 10
end
.end

Produces the desired (and expected) output. READDATA and RESTOREINFO each
get their own register:

>PC=0; OP=821 (new_p_ic); ARGS=(P0, 14)
>PC=3; OP=821 (new_p_ic); ARGS=(P1, 15)
>PC=6; OP=722 (push_p_ic); ARGS=(P0, 10)
>PC=9; OP=0 (end)

This was all done with imcc, cvs updated as of 12:00 Eastern time. No
optimization options selected.


Leopold Toetsch

unread,
Jun 1, 2003, 12:57:19 PM6/1/03
to Clinton A. Pierce, perl6-i...@perl.org
Clinton A. Pierce <cli...@geeksalad.org> wrote:
> The following code:

> push_integer() not implemented in class 'PerlHash'

> This is, as far as I can tell, because the same register is used by IMCC
> for both the READDATA and RESTOREINFO locals, thus by the time that the sub
> _data gets around to being run, READDATA has become a PerlHash (imcc -t):

>>PC=0; OP=821 (new_p_ic); ARGS=(P0, 14)
>>PC=3; OP=821 (new_p_ic); ARGS=(P0, 15)

Yep. Good analysis.

You could tell it a bug. Imcc should follow the code path into the
subroutine. But he problem is: the subroutine isn't known, when _main is
compiled. So IMCC only sees one usage of the Array and one usage of the
Hash amd assigns both to the same register P0.

If you want really use such constructs, you can't put them in different
compilation units, because they are basically one unit.

Much better is to implement some kind of calling conventions.

As soon as you pass the READDATA array to the subroutine, it will have
its own register and the problem is gone.

leo

Clinton A. Pierce

unread,
Jun 1, 2003, 1:45:46 PM6/1/03
to l...@toetsch.at, perl6-i...@perl.org
At 06:57 PM 6/1/2003 +0200, Leopold Toetsch wrote:
>Clinton A. Pierce <cli...@geeksalad.org> wrote:
> > The following code:
>
> > push_integer() not implemented in class 'PerlHash'
>
> > This is, as far as I can tell, because the same register is used by IMCC
> > for both the READDATA and RESTOREINFO locals, thus by the time that the sub
> > _data gets around to being run, READDATA has become a PerlHash (imcc -t):
>
> >>PC=0; OP=821 (new_p_ic); ARGS=(P0, 14)
> >>PC=3; OP=821 (new_p_ic); ARGS=(P0, 15)
>
>Yep. Good analysis.
>
>You could tell it a bug. Imcc should follow the code path into the
>subroutine. But he problem is: the subroutine isn't known, when _main is
>compiled. So IMCC only sees one usage of the Array and one usage of the
>Hash amd assigns both to the same register P0.

Makes sense.


>If you want really use such constructs, you can't put them in different
>compilation units, because they are basically one unit.

So a BASIC IMCC program, with all of the builtin functions it needs, plus
the actual user program itself as one compilation unit? All for the sake
of a few needed globals? Sounds ghastly.

>Much better is to implement some kind of calling conventions.

Except that these represent true global data structures that the entire
BASIC machine needs. They're not really things to be passed around from
place to place and the thought of having a Px register to hold the "state
of BASIC" and passing it (potentially) everywhere seems kinda silly for a
language that isn't really threaded in any incarnation and is ever only
going to need one state.

I've *got* good, sound calling conventions now but there are things (sofar,
only half a dozen*) that exist in an outermost scope that need initialization.

I could turn the whole thing upside-down (subs first, main last) except
that I'd need a token outer main() to call the initialization routine. And
from a quick test IMCC can tell that the execution path jumped down, and
optimizes those registers away anyway. Nevermind.

Further suggestions, given better info?


* Examples of things that are really global in a BASIC machine: current
random number seed, cursor column position, console current character
attributes, last value picked up from a READ/DATA combo (line & position),
structure definition table, and a GOSUB call-stack lookback specifically
for "RETURN x".


Luke Palmer

unread,
Jun 1, 2003, 2:00:04 PM6/1/03
to cli...@geeksalad.org, l...@toetsch.at, perl6-i...@perl.org
> >If you want really use such constructs, you can't put them in different
> >compilation units, because they are basically one unit.
>
> So a BASIC IMCC program, with all of the builtin functions it needs, plus
> the actual user program itself as one compilation unit? All for the sake
> of a few needed globals? Sounds ghastly.

No. Global things shouldn't be in registers in the first place,
because you're using up your computation resources for things that
have a very long lifetime.

> >Much better is to implement some kind of calling conventions.
>
> Except that these represent true global data structures that the entire
> BASIC machine needs. They're not really things to be passed around from
> place to place and the thought of having a Px register to hold the "state
> of BASIC" and passing it (potentially) everywhere seems kinda silly for a
> language that isn't really threaded in any incarnation and is ever only
> going to need one state.

If you use them often... then maybe one or two need a real register,
but I'd still be weary of doing that. Use find_global and its
friends.

> I've *got* good, sound calling conventions now but there are things (sofar,
> only half a dozen*) that exist in an outermost scope that need initialization.
>
> I could turn the whole thing upside-down (subs first, main last) except
> that I'd need a token outer main() to call the initialization routine. And
> from a quick test IMCC can tell that the execution path jumped down, and
> optimizes those registers away anyway. Nevermind.
>
> Further suggestions, given better info?
>
>
> * Examples of things that are really global in a BASIC machine: current
> random number seed, cursor column position, console current character
> attributes, last value picked up from a READ/DATA combo (line & position),
> structure definition table, and a GOSUB call-stack lookback specifically
> for "RETURN x".

Yeah, most of those seem like they can be stored in the global name
table instead of a register. Load them in when you need them. Except
the latter, which seems like you should be using the call stack, if
I'm not misunderstanding.

Luke

Clinton A. Pierce

unread,
Jun 1, 2003, 2:17:30 PM6/1/03
to Luke Palmer, l...@toetsch.at, perl6-i...@perl.org
At 12:00 PM 6/1/2003 -0600, Luke Palmer wrote:
>If you use them often... then maybe one or two need a real register,
>but I'd still be weary of doing that. Use find_global and its
>friends.

Eureka!

Oh, find_global/set_global, where have you been all my life?

Thank you for pointing out the obvious. :)


> > * Examples of things that are really global in a BASIC machine: current
> > random number seed, cursor column position, console current character
> > attributes, last value picked up from a READ/DATA combo (line & position),
> > structure definition table, and a GOSUB call-stack lookback specifically
> > for "RETURN x".
>
>Yeah, most of those seem like they can be stored in the global name
>table instead of a register. Load them in when you need them. Except
>the latter, which seems like you should be using the call stack, if
>I'm not misunderstanding.

I thought so too, but no sane language I know would allow this kind of
nonsense:

10 GOSUB 100
20 PRINT "I get printed too."
50 REM And the stack is empty at this point.
90 END
100 GOSUB 200
110 PRINT "Nope, won't see me"
199 RETURN
200 GOSUB 1000
210 PRINT "Nope, won't see me either."
299 RETURN
1000 PRINT "You *will* see me"
1010 RETURN 10

I think in PASM this would be something like

bsr USERLABEL_100
eq RETURNTO, "10", CONT_10
eq RETURNTO, "", CONT_10
ret
CONT_10:
print "I get printed too"
end
USERLABEL_100:
bsr USERLABEL_200
eq RETURNTO, "100", CONT_100
eq RETURNTO, "", CONT_100
ret
CONT_100:
print "Nope, won't see me"
set RETURNTO, ""
ret
USERLABEL_200:
bsr USERLABEL_1000
eq RETURNTO, "200", CONT_200
eq RETURNTO, "", CONT_200
ret
CONT_200:
print "Nope, won't see me either."
set RETURNTO, ""
ret
USERLABEL_1000:
print "You *will* see me"
set RETURNTO, "10"
ret

Of course, this allows for jumping into this GOSUB/RETURN nonsense anywhere
in the mix and still a RETURN x takes you back to the right place. Oh and
worse? RETURN x can take a variable in some cases. (Literally, x=10,
return x)

Without RETURN x, just plain old GOSUB/RETURN I could let PIR/PASM handle
all by itself with no supervision.


Leopold Toetsch

unread,
Jun 1, 2003, 4:29:52 PM6/1/03
to Clinton A. Pierce, Luke Palmer, perl6-i...@perl.org
Clinton A. Pierce wrote:

> At 12:00 PM 6/1/2003 -0600, Luke Palmer wrote:
>

>> ... Use find_global and its
>
> Eureka!


Yep. globals are globals are globals.


> eq RETURNTO, "10", CONT_10


AFAIK are these "line numbers", why do you want to use strings - but I
really don't know (and will for sure not learn) this "language".


> USERLABEL_100:
> bsr USERLABEL_200
> eq RETURNTO, "100", CONT_100
> eq RETURNTO, "", CONT_100
> ret


But everythings inside seems to be a real sub, so could be a compilation
unit, AFAIK

> Without RETURN x, just plain old GOSUB/RETURN I could let PIR/PASM
> handle all by itself with no supervision.

Or you handle "normal subs" like that and generate special code for
"return x"?

leo


Clinton A. Pierce

unread,
Jun 1, 2003, 10:26:07 PM6/1/03
to Leopold Toetsch, Luke Palmer, perl6-i...@perl.org
At 10:29 PM 6/1/2003 +0200, Leopold Toetsch wrote:
>> eq RETURNTO, "10", CONT_10
>
>
>AFAIK are these "line numbers", why do you want to use strings - but I
>really don't know (and will for sure not learn) this "language".

Because "line numbers" are Ancient Basic for designating branch
destinations in programs (and statement order!). Nowadays they're
optional. They're strings because instead of line numbers, labels can be
used. So a perfectly valid BASIC program can even mix forms.

10 print "Hello world"
20 gosub mysub
30 end
mysub: print "Goodbye, world"
return

Yes, my friend. Continue to use quotes around the word "language" when
describing this mess.

>>Without RETURN x, just plain old GOSUB/RETURN I could let PIR/PASM handle
>>all by itself with no supervision.
>
>Or you handle "normal subs" like that and generate special code for
>"return x"?

I could, except that it's impossible to tell where exactly the exit point
of a "subroutine" (for GOSUB) is in a BASIC program. In fact, all "GOSUB"
destinations could use an unconditional branch to all use a common RETURN
point. You just return when you stumble across a return. Further analysis
of where the "subroutines" are may be impossible with mechanical (or human)
analysis at compile time:

5 rem Valid but seriously spaghettified BASIC example.
7 rem Good luck analyzing this at compile time for entry
9 rem and exit points for the "subroutines".
10 print "Hello, world!"
20 for i = 1 to 3
30 gosub 70
40 next
50 end
70 gosub mysub
80 gosub 110
90 return
100 return 30
110 rem alternate entry point
mysub:
print "Hello, galaxy!"
on int(rnd(2)+1) goto 90, 100

For this, the "Hello, galaxy" subroutine could be entered at two different
points (110, "mysub") and exited at one of two randomly chosen points (90
or 100) at runtime. Furthermore, one of the return points (100) changes
the exit point of the subroutine entered at line 70 (either 90 or 100).

Remember, this is the language that inspired Dijkstra to write "Go To
Statement Considered Harmful" (http://www.acm.org/classics/oct95/). :)

Uri Guttman

unread,
Jun 1, 2003, 11:10:27 PM6/1/03
to Clinton A. Pierce, Leopold Toetsch, Luke Palmer, perl6-i...@perl.org
>>>>> "CAP" == Clinton A Pierce <cli...@geeksalad.org> writes:

CAP> Yes, my friend. Continue to use quotes around the word "language"
CAP> when describing this mess.

>>> Without RETURN x, just plain old GOSUB/RETURN I could let PIR/PASM
>>> handle all by itself with no supervision.
>>
>> Or you handle "normal subs" like that and generate special code for
>> "return x"?

CAP> I could, except that it's impossible to tell where exactly the exit
CAP> point of a "subroutine" (for GOSUB) is in a BASIC program. In fact,
CAP> all "GOSUB" destinations could use an unconditional branch to all use
CAP> a common RETURN point. You just return when you stumble across a
CAP> return. Further analysis of where the "subroutines" are may be
CAP> impossible with mechanical (or human) analysis at compile time:

CAP> 5 rem Valid but seriously spaghettified BASIC example.
CAP> 7 rem Good luck analyzing this at compile time for entry
CAP> 9 rem and exit points for the "subroutines".
CAP> 10 print "Hello, world!"
CAP> 20 for i = 1 to 3
CAP> 30 gosub 70
CAP> 40 next
CAP> 50 end
CAP> 70 gosub mysub
CAP> 80 gosub 110
CAP> 90 return
CAP> 100 return 30
CAP> 110 rem alternate entry point
CAP> mysub:
CAP> print "Hello, galaxy!"
CAP> on int(rnd(2)+1) goto 90, 100

okay, i am going to jump in and try to drown here. :)

why don't you manage your own basic call stack in an array or PerlArray
or something? trying to map that mess of call/return poo onto a proper
compiler with register allocation is going to lose us the services of
leo while he recuperates at the hospital for homicidal BASIC coders.

you will just have to bite the bullet and emulate your own virtual
engine over parrot and not try to map BASIC directly to it. so code a VM
with BASIC flow control. you can probably do something odd like make
each BASIC statement into a parrot sub. then just compile your basic
into a simple flow control IL that just calls parrot subs. the subs all
have access to your need globals (with the obvious
find_global/set_global :) a primary global now is the BASIC VM PC.

so your code gen is much easier too. each statement is codegenned and
you keep their label or sequence numbers in your flow tree and they map
to a parrot sub address (that you generate as you codegen). later to
execute you just scan your own flow tree (that is the basic interpreter
engine), call each parrot sub (BASIC statement)in turn and let parrot
to the real work in each statement. BASIC calls and returns are handled
with munging the global BASIC PC and the global stack.

feel free to tell me this is total cocky-pop. :)

uri

--
Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org

Clinton A. Pierce

unread,
Jun 2, 2003, 10:10:22 AM6/2/03
to Uri Guttman, perl6-i...@perl.org
At 11:10 PM 6/1/2003 -0400, Uri Guttman wrote:
>why don't you manage your own basic call stack in an array or PerlArray
>or something? trying to map that mess of call/return poo onto a proper
>compiler with register allocation is going to lose us the services of
>leo while he recuperates at the hospital for homicidal BASIC coders.

Naw. I'm not homocidal. Suicidal with a masochistic bent, that's all.

And I don't think I've ever asked for anything from the Parrot team other
than opinions on honest-to-goodness bugs or understanding of some area of
how to target this platform. Hope no-one's going to need a pshrink after
this. I only ask odd questions because sometimes I'm dealing with odd
things. Like programs that are essentially one big compilation unit --
almost, and doing odd things with bsr/ret...

>you will just have to bite the bullet and emulate your own virtual
>engine over parrot and not try to map BASIC directly to it. so code a VM
>with BASIC flow control. you can probably do something odd like make
>each BASIC statement into a parrot sub. then just compile your basic
>into a simple flow control IL that just calls parrot subs. the subs all
>have access to your need globals (with the obvious
>find_global/set_global :) a primary global now is the BASIC VM PC.

As far as I know, BASIC only gets weird with flow control in two
places. One is 'RETURN X', another is computed goto 'goto X'. The former
I'm handling be wrapping bsr/ret with a little bit of code (after each bsr,
the code asks "was the ret destined for me or not?"). The latter is
handled by a table inserted into the code mapping the compile-time labels
over to runtime destinations. So that:

branch USERLABEL_100 # Normal goto

USERLABEL_100:
# Expression, etc... result in $S0
branch COMPUTED_GOTO

COMPUTED_GOTO:
eq $S0, "USERLABEL_10", USERLABEL_10
eq $S0, "USERLABEL_20", USERLABEL_20
etc...
print "Label "
print $S0
print " does not exist at line "
print BASICLINE
end

And the COMPUTED_GOTO jumptable gets inserted in the code only if the
program does something stupid like a computed goto. BASIC
compilers/interpreters have always felt free to punish the programmer with
space/speed tradeoffs for programmer laziness.

>feel free to tell me this is total cocky-pop. :)

In the first incarnation (04/2002) the Parrot VM (pre-PerlHashes mind you)
was an inconvenience and BASIC essentially was inside it's own little VM on
top of this odd processor. I'd like to exercise as many the facilities
available to me now for speed and whatnot. This re-targeting from PASM to
PIR is much easier than the original PASM port to begin with.


0 new messages