
Compile op and building compilers


Dan Sugalski

Sep 20, 2004, 6:46:45 PM
to perl6-i...@perl.org
Okay, this is coming up again and I want to get it nailed down.

Current semantics, as defined:

$Px = compreg $Sy

returns a compiler for language $Sy.

compreg $Sx, $Py

defines $Py as the compiler for language $Sx.

$Px = compile $Py, $Sz

uses compiler $Py (from a compreg call) to compile the source $Sz and
return a sub $Px for it. That sub may then be invoked as need be,
stuffed in a namespace, or whatever.
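
As a rough illustration of the semantics just listed, here is a minimal sketch in Python rather than PIR. It is a toy model, not Parrot's implementation: the dictionary registry, the `compile_source` name, and the `toy_compiler` "language" are all assumptions made for the example.

```python
# Illustrative model only: plain Python, not Parrot's implementation.
_compilers = {}  # language name -> invocable compiler


def compreg(lang, compiler=None):
    """One-argument form finds a compiler; two-argument form registers one.

    This mirrors the same-name, swapped-role behaviour described above.
    """
    if compiler is None:
        return _compilers[lang]
    _compilers[lang] = compiler
    return None


def compile_source(compiler, source):
    """Model of the compile op: invoke the compiler on source, get a sub back."""
    return compiler(source)


def toy_compiler(source):
    """Hypothetical language whose source is a single Python expression."""
    code = compile(source, "<toy>", "eval")
    return lambda: eval(code)


compreg("toy", toy_compiler)                      # compreg $Sx, $Py
sub = compile_source(compreg("toy"), "6 * 7")     # $Px = compile $Py, $Sz
result = sub()                                    # invoke the returned sub
```

The returned `sub` is an ordinary callable, so it can be invoked, stored, or passed around like any other sub.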

That's fine, it works, woohoo and all. (Though I'm not sure using the
same name, compreg, with swapped args for both finding and registering
is a good idea. Or, rather, I think it's a bad idea, my fault, and we
should fix it.)

Now, the issue is how to actually build a compiler. Right now a
compiler is a simple thing -- it's a method hanging off the __invoke
vtable slot of the PMC. I'm not sure I like that, as it seems really,
really hackish. Hacks are inevitable, of course, but it seems a bit
early for that. (We ought to at least wait until we do a beta
freeze...) On the other hand it does make a certain amount of sense
-- it's a compilation subroutine we're getting, so we ought to invoke
it, and I can certainly live with that.

Time to weigh in with opinions, questions, and whatnot. There's not
much reason to JFDI and make the decisions final, so weigh away and
we'll just nail it all down on Wednesday.
--
Dan

--------------------------------------it's like this-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Steve Fink

Sep 23, 2004, 1:29:54 AM
to Dan Sugalski, perl6-i...@perl.org
On Sep-20, Dan Sugalski wrote:
>
> Now, the issue is how to actually build a compiler. Right now a
> compiler is a simple thing -- it's a method hanging off the __invoke
> vtable slot of the PMC. I'm not sure I like that, as it seems really,
> really hackish. Hacks are inevitable, of course, but it seems a bit
> early for that. (We ought to at least wait until we do a beta
> freeze...) On the other hand it does make a certain amount of sense
> -- it's a compilation subroutine we're getting, so we ought to invoke
> it, and I can certainly live with that.
>
> Time to weigh in with opinions, questions, and whatnot. There's not
> much reason to JFDI and make the decisions final, so weigh away and
> we'll just nail it all down on Wednesday.

My preference, as I've stated before, is to leave compilers as
invoke-able PMCs -- and further, I think that compilers will sometimes
be coroutines, or return multiple continuations, or play other such
tricks available via C<invoke> (if appropriate for what they do). That
is easy if you stop treating compilation as something special and
instead just say a compiler is invocable, so that it inherits all of
the PIR syntactic sugar for Subs.
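
To make the "compiler as coroutine" idea concrete, here is a minimal sketch in Python rather than PIR. It is only a model under stated assumptions: the names `plain_compiler` and `incremental_compiler` are hypothetical, and a Python generator stands in for a Parrot coroutine.

```python
# Illustrative model only: a Python generator standing in for a coroutine PMC.

def plain_compiler(source):
    """Ordinary invocable compiler: one source string in, one sub out."""
    code = compile(source, "<plain>", "eval")
    return lambda: eval(code)


def incremental_compiler():
    """Coroutine-style compiler: each resume takes a chunk and yields a sub."""
    sub = None
    while True:
        source = yield sub
        code = compile(source, "<chunk>", "eval")
        # Bind the code object now so each sub keeps its own chunk.
        sub = (lambda c: lambda: eval(c))(code)


co = incremental_compiler()
next(co)                      # run up to the first yield
first = co.send("1 + 1")      # resume with a chunk, get a sub back
second = co.send("10 * 3")    # resume again with another chunk
results = [plain_compiler("2 ** 3")(), first(), second()]
```

Both kinds are driven through the same "call it and get something back" interface, which is the point being argued: nothing compiler-specific is needed beyond invocation.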

On the other hand, that opinion assumes that compilers are used in the
funky ways that I am thinking of, which involves a lot of switching
between languages, using other languages' facilities for implementing
pieces of your language, etc. If Parrot is primarily going to be mixing
languages by having one language call another's libraries, then I can
see some utility in having a separate C<compile> op, even if its only
purpose is to explicitly declare that compilers must take only a single
string and produce a callable PMC, and no more. (Though I wonder if you
might sometimes want to use a filename rather than a
string-containing-enormous-chunk-of-code.) Screwballs like me would then
make our languages compiled via a different mechanism, and we wouldn't
play in the same sandbox as "regular" compilers. However, then we'd need
to decide whether those types of compilers should be registered via
compreg, or whether anything registered via compreg is required to do
something meaningful when invoked with a single string argument
containing code (or whatever C<compile> ends up doing; that's just what
it does now.)

A question: when last we talked about this, you mentioned that you
didn't envision it being useful for compilers to take arguments. I think
you were only talking about configuration, but in any case, what sorts
of mechanisms do you feel are appropriate for setting options, pointing
to libraries or include paths, etc? Also, is Parrot supposed to provide
a rich enough set of core functionality that compilers will never need
to communicate directly with the "host" language? As a simple example,
say you have an embedded language that wants to add a new local
variable. Parrot has pads for this purpose, but what if you need to
specify some sort of rich type information or register it with some
host language-specific registry singleton of some sort? I don't know if
these sorts of things are useful, but they're easily within the scope of
imagination. :-)

Leopold Toetsch

Sep 23, 2004, 4:31:29 AM
to Steve Fink, perl6-i...@perl.org
Steve Fink <st...@fink.com> wrote:
> On Sep-20, Dan Sugalski wrote:
>>
>> Now, the issue is how to actually build a compiler. Right now a
>> compiler is a simple thing -- it's a method hanging off the __invoke
>> vtable slot of the PMC.

First we should unify the return value. We currently have basically two
kinds of compilers:
1) builtin compilers (PASM, PIR), which return an Eval PMC
2) user compilers, which return a subroutine

Both return invocable subs, but terminating the run-loop started by 1)
needs an C<end> opcode. That's more a historical hack than useful.
Following calling conventions, code returned by 1) should also run in the
caller's run-loop and terminate by invoking the return continuation.

OTOH it depends on what granularity we allow for evaled code. If that's
always a closure, it should follow calling conventions. If it's more
fine-grained (statement or expression only) some special treatment might
be in order.

Finally there is some nasty kind of perl5 code, which is currently
covered in a special way with the C<branch_cs> opcode:

# LAB:
# $i++;
# eval("goto LAB if ($i==6)");
# print "$i\n";

Dunno if we can force the use of a continuation with this kind of code.

> [ ... ], then I can
> see some utility in having a separate C<compile> op,

I think a special opcode is ok, all the more since we may want to pass
in some arguments too, or ...

> ... (Though I wonder if you
> might sometimes want to use a filename rather than a
> string-containing-enormous-chunk-of-code).

Yep.

> A question: when last we talked about this, you mentioned that you
> didn't envision it being useful for compilers to take arguments.

I think we first need a scheme for passing on language-specific
command-line arguments, something similar to what e.g. gcc does for
passing arguments to the linker, or:

./parrot --perl6 .. perl6 options .. --parrot .. parrot options -- ..

The HLL-specific part of the command line could be collected in an
$HLL-argv and be passed to the compiler in C<P5>.
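
A minimal sketch of that splitting, in Python and purely as an illustration: the function name `split_hll_args`, the rule that options before any marker belong to parrot, and the sample flags are all assumptions, not an existing parrot interface.

```python
# Illustrative model only: not an existing parrot command-line parser.

def split_hll_args(argv, hll_names=("perl6",)):
    """Split argv into per-section option lists plus the trailing program args.

    A "--<name>" marker switches the current section; a bare "--" ends
    option processing. Options seen before any marker are assumed to
    belong to parrot itself.
    """
    sections = {"parrot": []}
    current = "parrot"
    rest = []
    for i, arg in enumerate(argv):
        if arg == "--":
            rest = argv[i + 1:]
            break
        name = arg[2:] if arg.startswith("--") else None
        if name == "parrot" or name in hll_names:
            current = name
            sections.setdefault(current, [])
        else:
            sections[current].append(arg)
    return sections, rest


# The flags below are placeholders chosen for the example.
sections, rest = split_hll_args(
    ["--perl6", "-w", "-Ilib", "--parrot", "-j", "--", "foo.p6"]
)
hll_argv = sections["perl6"]  # what would be handed to the compiler
```

Here `hll_argv` plays the role of the $HLL-argv mentioned above; how it would actually reach the compiler is left open.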

Further we should have registered file types, so that e.g.

./parrot f.py

does the right thing. There is currently some code duplication in the
startup of imcc/main.c and in code used by the C<load_bytecode> opcode,
with hard-coded compiler selection based on the extensions .pasm, .past,
and everything else meaning .imc.
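
One way to picture registered file types is a single shared lookup table instead of two hard-coded selections. The Python sketch below is only a model: the function names and the compiler labels ("PASM", "PAST", "PIR", "Python") are placeholders for the example, not identifiers from the Parrot source.

```python
# Illustrative model only: a single extension table shared by all callers.
import os

# Mirrors the hard-coded selection described above; labels are placeholders.
_compiler_for_ext = {".pasm": "PASM", ".past": "PAST"}
DEFAULT_COMPILER = "PIR"  # everything else is treated as .imc


def register_file_type(ext, compiler_name):
    """Associate a file extension (with leading dot) with a compiler name."""
    _compiler_for_ext[ext.lower()] = compiler_name


def compiler_for(filename):
    """Pick a compiler name from the file extension, falling back to the default."""
    ext = os.path.splitext(filename)[1].lower()
    return _compiler_for_ext.get(ext, DEFAULT_COMPILER)


before = compiler_for("f.py")        # unregistered: falls back to the default
register_file_type(".py", "Python")
after = compiler_for("f.py")         # now resolves to the registered compiler
```

Both the startup code and a load_bytecode-style loader could consult the same table, which is what would remove the duplication.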

leo
