I've just finished re-writing BASIC's expression evaluator (for speed, and for fun -- I need to get out more) and now am integrating the code back into the runtime PASM. My problem is, I moved a lot of the complexity from the runtime (no longer a stack machine to evaluate) into the compiler but I kept many of the old routines for other reasons. The runtime could get a lot smaller, except now I've got junk code left behind.
So the question is, does anyone know a clever hack to see if a particular branch is now isolated never taken? Things like:
MAIN: do this and that and this bsr FOO ret
UNUSED: this code never reached ret
FOO: put code here and here FALSEPOS: ret
And tell me that UNUSED is unreachable? Or at least it's never branched to? I'd even be willing to sort through those labeled-but-never-branched-to instructions (like FALSEPOS)...
In perl.perl6.internals, you wrote: > So the question is, does anyone know a clever hack to see if a particular > branch is now isolated never taken? Things like:
[ code rewritten ]
MAIN: print "main\n" bsr FOO end
UNUSED: print "never reached\n" ret
FOO: print "in foo\n" FALSEPOS: ret
$ imcc -Op1 -d60 unu.pasm block 2 label UNUSED deleted block 4 label FALSEPOS deleted found dead block 2 ins deleted (dead block) print "never reached\n" ins deleted (dead block) ret main in foo
At 09:11 AM 5/21/2003 +0200, Leopold Toetsch wrote:
>In perl.perl6.internals, you wrote:
> > So the question is, does anyone know a clever hack to see if a particular > > branch is now isolated never taken? Things like:
>[ code rewritten ]
>$ imcc -Op1 -d60 unu.pasm >block 2 label UNUSED deleted >block 4 label FALSEPOS deleted >found dead block 2 > ins deleted (dead block) print "never reached\n" > ins deleted (dead block) ret >main >in foo
Hmm.. There seems to be an issue with this in a non-trivial case. Using a cvs update from today, along with some changes I made with BASIC, here's a demo of what looks like a bug:
Clinton A. Pierce wrote: > At 09:11 AM 5/21/2003 +0200, Leopold Toetsch wrote: > Using a cvs update from today, along with > some changes I made with BASIC, here's a demo of what looks like a bug: >> error:imcc:sort_loops: loop 1 contains first but notlast of outer loop 0 >> C:\projects\perl\parrot\languages\BASIC\compiler> > imcc seems happy to assemble the PASM otherwise, and it even runs fine.
The optimizer has to work a lot on your code because its mainly one big bunc of code (compilation unit). The optimizer of course has some errors too, but finding them in 3290 lines of PASM code is out of my skills.
So please first have a look at some imcc rules, if you want to have imcc optimize that code: - local labels don't have an underscore in front - global labels have - use some form of calling conventions[1], i.e. stack calling conventions:
bsr _PLATFORM_SETUP ...
_PLATFORM_SETUP: saveall ... restoreall ret
[1] the current only one imcc supports ;-)
You might have a look at your one-liner in PASM and what the CFG (control flow graph) looks like: $ ../../imcc/imcc -d10 -Op1 TARG_test.pasm >1 2>&1 $ less 1 /Dumping the CFG
Of course imcc should be able to optimize such a piece of PASM but currently has some flaws WRT subroutines and ret statements.
>>> error:imcc:sort_loops: loop 1 contains first but notlast of outer loop 0
I changed this to a non fatal debug message. The hello world now runs as well as mandel.bas.
One additional note two my last reply: changing suroutine labels to global labels is of course not enough, you should really target PIR (IMCC) code not PASM to get all benefits of optimization. This would probably allow for a much faster expression evaluator by using virtuell registers a la "$I100 = $I101 + $I102".
>>imcc seems happy to assemble the PASM otherwise, and it even runs fine.
>The optimizer has to work a lot on your code because its mainly one big >bunc of code (compilation unit). The optimizer of course has some errors >too, but finding them in 3290 lines of PASM code is out of my skills.
>So please first have a look at some imcc rules, if you want to have imcc >optimize that code: >- local labels don't have an underscore in front >- global labels have >- use some form of calling conventions[1], i.e. stack calling conventions:
> bsr _PLATFORM_SETUP > ...
> _PLATFORM_SETUP: > saveall > ... > restoreall > ret
>[1] the current only one imcc supports ;-)
Okay, I've looked at everything I could gather on IMCC from the source tree and I've got a list of questions before I can IMCC-ize BASIC completely. (It *is* marching in that direction, slowly.) Consider these the questions of a raving clueless lunatic, and calmly point me to documentation if I've missed it somewhere.
[I realize that I horribly over-complicated some aspects of BASIC and it's slower than it needs to be. I'm doing *way* too much at runtime and am trying to solve that. Speed is my motivation for IMCC-izing and simplifying everything. I'd like a 3-ply chess game in BASIC to be playable, dammit.]
* When dealing with an expression that I've determined needs to have intermediate values stored like this (a simplistic example):
1+(2*3+4*5)
I'd naively interpret this as:
2 * 3 -> store result on stack 4 * 5 -> becomes operand 1 restore operand 2 from stack multiply (result becomes operand 1) set operand 2 to "1" add (result becomes operand 1) return operand 1 as the expression result
What's the approved way of implementing this "stack"? Which stack should I use? Right now, the expression parser allocates a PerlArray and just pushes the result. This doesn't get weird, of course, until in the middle of an expression you need to call a function with another scope, in which means I have to get another PerlArray; or when there's functions with multiple arguments. Am I doing this right?
* BASIC sometimes doesn't lend itself to "compilation units" very well. With modern-day Q-BASIC function calls:
function myfunc$(a, b) myfunc$="results!" end function myfunc$(3.14, 9999)
Is this a compilation unit? There's no scope or anything. Nothing to save, restore, or anything. No "one entry" point. Is this going to fly well with IMCC?
* BASIC also has some odd scoping rules. In most programs, everything's global. Within subroutines, everything's private unless it's been declared as shared in the outer scope, or explicitly passed in.
sub mysub print i # My own i, =0 print foo # foo from outer scope end sub function dothis(a) dothis=100*a # a is local, and has z's value end function
shared foo i=50 mysub() z=10 print dothis(z)
So would a BASIC program essentially want to declare *everything* local (given how IMCC works) and just those things that are explicitly declared SHARED would be declared as global in IMCC? Is IMCC going to be happy with that many globals (or that many locals)? Is there anything I can do to make it happier?
* Are BASIC arrays my own problem? (I can't see how it'd be possible for IMCC to handle those in any way.) Right now I'm using a PerlHash/PerlArray structure to manage those, and lookups are handled much in the same way that function calls are performed.
* Is there a nice way to handle mixed-precision operations? Will there be? Or is this just something I need to watch?
.local int a .local float b .local int c a=1 b=3.14 c=a*b end
* What's your advice on when to go through the whole calling convention stuff:
.local int r .arg y .arg x call _foo # (int)foo(x,y) .result r
.sub _foo saveall .local int r .param int a .param int b r=a*b .return r restoreall ret .end
and when to just slug some things onto a stack or into some registers and bsr there. I've got a few examples, your thoughts?
utility functions that the runtime will need (array lookups, etc...) builtin functions that the runtime calls ( int, abs, str$, mid$, etc...) user-defined functions user-defined subroutines
That's it for now. But its enough to get me started. Thank you!
> At 07:35 PM 5/21/2003 +0200, Leopold Toetsch wrote: > >>imcc seems happy to assemble the PASM otherwise, and it even runs fine.
> >The optimizer has to work a lot on your code because its mainly one big > >bunc of code (compilation unit). The optimizer of course has some errors > >too, but finding them in 3290 lines of PASM code is out of my skills.
> >So please first have a look at some imcc rules, if you want to have imcc > >optimize that code: > >- local labels don't have an underscore in front > >- global labels have > >- use some form of calling conventions[1], i.e. stack calling conventions:
> Okay, I've looked at everything I could gather on IMCC from the source tree > and I've got a list of questions before I can IMCC-ize BASIC > completely. (It *is* marching in that direction, slowly.) Consider these > the questions of a raving clueless lunatic, and calmly point me to > documentation if I've missed it somewhere.
I'll answer what I can... don't trust me too much though :-)
> [I realize that I horribly over-complicated some aspects of BASIC and it's > slower than it needs to be. I'm doing *way* too much at runtime and am > trying to solve that. Speed is my motivation for IMCC-izing and > simplifying everything. I'd like a 3-ply chess game in BASIC to be > playable, dammit.]
> * When dealing with an expression that I've determined needs to have > intermediate values stored like this (a simplistic example):
> 1+(2*3+4*5)
> I'd naively interpret this as:
> 2 * 3 -> store result on stack > 4 * 5 -> becomes operand 1 > restore operand 2 from stack > multiply (result becomes operand 1) > set operand 2 to "1" > add (result becomes operand 1) > return operand 1 as the expression result
You're thinking of this as a stack machine -- it's not. It's a register machine.
2 * 3 -> store into $I0 4 * 5 -> store into $I1 $I0 + $I1 -> store into $I2 1 + $I2 -> store into result
Imcc will handle the spilling for you.
> What's the approved way of implementing this "stack"? Which stack should I > use? Right now, the expression parser allocates a PerlArray and just > pushes the result. This doesn't get weird, of course, until in the middle > of an expression you need to call a function with another scope, in which > means I have to get another PerlArray; or when there's functions with > multiple arguments. Am I doing this right?
No. Use Imcc's .arg and .result. They simplify things.
Anyway, I don't think IMCC has a problem with jumping into the middle of a compilation unit. But, you might just have that program as one big compilation unit, since, as you say, there's no scope or anything.
> Is this a compilation unit? There's no scope or anything. Nothing to > save, restore, or anything. No "one entry" point. Is this going to fly > well with IMCC?
> * BASIC also has some odd scoping rules. In most programs, everything's > global. Within subroutines, everything's private unless it's been declared > as shared in the outer scope, or explicitly passed in.
> sub mysub > print i # My own i, =0 > print foo # foo from outer scope > end sub > function dothis(a) > dothis=100*a # a is local, and has z's value > end function
> So would a BASIC program essentially want to declare *everything* local > (given how IMCC works) and just those things that are explicitly declared > SHARED would be declared as global in IMCC? Is IMCC going to be happy with > that many globals (or that many locals)? Is there anything I can do to > make it happier?
I'm not sure. Maybe you could use the C<global> ops for globals, and use IMCC .locals for locals.
> * Are BASIC arrays my own problem? (I can't see how it'd be possible for > IMCC to handle those in any way.) Right now I'm using a PerlHash/PerlArray > structure to manage those, and lookups are handled much in the same way > that function calls are performed.
Think so.
> * Is there a nice way to handle mixed-precision operations? Will there > be? Or is this just something I need to watch?
> .local int a > .local float b > .local int c > a=1 > b=3.14 > c=a*b > end
Well, BASIC is a little bit typed, right? In that, you know whether a particular variable is a string or a number at compile-time. So just add the extra ops for conversion if the operands don't match.
> * What's your advice on when to go through the whole calling convention stuff:
> .local int r > .arg y > .arg x > call _foo # (int)foo(x,y) > .result r
> .sub _foo > saveall > .local int r > .param int a > .param int b > r=a*b > .return r > restoreall > ret > .end
> and when to just slug some things onto a stack or into some registers and > bsr there. I've got a few examples, your thoughts?
> utility functions that the runtime will need (array lookups, etc...) > builtin functions that the runtime calls ( int, abs, str$, mid$, > etc...) > user-defined functions > user-defined subroutines
Hmm, personally, I'd use the imcc .param/.return/etc. for everything. It's not the *real* calling conventions for now, but when imcc gets around to supporting those, you'll have already complied.