
subroutines and python status


Michal Wallace

Jul 31, 2003, 7:08:27 AM
to perl6-i...@perl.org

Hey all,

I'm trying to get functions working
in python, and I'm not sure the best way
to do this.

What seems natural to me is to define
subroutines in the middle of the code
as I walk the parse tree:

.sub __main__
    goto endsub
    .sub _f
        print ":(\n"
        ret
    .end
endsub:
    $I0 = addr _f
    end
.end


But of course, this prints a sad face. :/

I've read imcc/docs/parsing.pod, so I know
why it does this... But what's the alternative?

I can store all my subroutine definitions in
a list or something and then dump them out
after the "__main__" routine. Is that the
right approach? It seems strange to me,
but I'm new at this.
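
A minimal sketch of the "store them in a list and dump them out later" idea, written in Python since pirate itself is written in Python. The class name UnitCollector and its methods are hypothetical (they are not part of pirate or imcc); the emitted text only mirrors the snippet above.

```python
class UnitCollector:
    """Hypothetical helper: buffers sub definitions so they can be
    written out after the __main__ compilation unit."""

    def __init__(self):
        self.main_lines = []   # body of __main__
        self.deferred = []     # (name, body_lines) for every sub seen so far

    def emit(self, line):
        # Add one line to the body of __main__.
        self.main_lines.append(line)

    def define_sub(self, name, body_lines):
        # Called while walking the parse tree; nothing is written yet.
        self.deferred.append((name, list(body_lines)))

    def render(self):
        # __main__ goes first, then every deferred sub as its own unit.
        out = [".sub __main__"]
        out += ["    " + line for line in self.main_lines]
        out += ["    end", ".end"]
        for name, body in self.deferred:
            out.append(f".sub {name}")
            out += ["    " + line for line in body]
            out += ["    ret", ".end"]
        return "\n".join(out) + "\n"


collector = UnitCollector()
collector.define_sub("_f", ['print ":)\\n"'])   # seen first in the tree walk
collector.emit("$I0 = addr _f")
text = collector.render()
```

The tree walker can call define_sub wherever it meets a function definition; the order of the output is decided only in render.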

---

Incidentally, I spent all day working on pirate,
and it now generates (and runs!) code for a bunch
of python structures:

- lists, strings, ints
- assignment and multi-assignment ( x,y=1,2 )
- while, for, if/elif/else, break, continue
- math operations (+, -, *, /, %)
- boolean logic (and, or, not)
- comparison operators

It now runs amk's euclid.py perfectly.
Do we have a way to compare the speed vs python? :)

Also, I wrote a pretty-printer for the lists in
parrot, and you can call it (and presumably
other parrot subs) directly from python code:

if 1 > 2:
    _pyprint("one is greater than two...")
print "neat,huh?"

If I could get this subroutine stuff figured
out, you could call functions written in
python, too. :)

http://sixthdev.versionhost.com/viewcvs.cgi/pirate/

Sincerely,

Michal J Wallace
Sabren Enterprises, Inc.
-------------------------------------
contact: mic...@sabren.com
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
--------------------------------------


Brent Royal-Gordon

Jul 31, 2003, 8:55:27 AM
to Michal Wallace, perl6-i...@perl.org
Michal Wallace:

> I can store all my subroutine definitions in
> a list or something and then dump them out
> after the "__main__" routine. Is that the
> right approach? It seems strange to me,
> but I'm new at this.

That seems to be the way to do it, speaking as someone who's working on a
Perl 5-to-PIL converter (using the B optree-introspection modules).

The problem here is that .sub has meanings beyond just "here's a
subroutine". .sub is actually a compilation unit, which complicates things
terribly. You *could* just use branches and labels to create a new sub
without actually making a new compilation unit:

.sub main
    ...
    branch mysub_after
mysub:
    ...
    ret
mysub_after:
    ...
.end

But that makes imcc run slower, so I don't recommend it.

--Brent Dax <br...@brentdax.com>
Perl and Parrot hacker

Leopold Toetsch

Jul 31, 2003, 9:12:03 AM
to Michal Wallace, perl6-i...@perl.org
Michal Wallace wrote:

> Hey all,


> What seems natural to me is to define
> subroutines in the middle of the code
> as I walk the parse tree:


You can do that:
.sub __main__
    bsr _main
    end
.end
.sub _main
    .sub _f
        print ":)\n"
        ret
    .end
    .sub _g
        print ";-)\n"
        ret
    .end
    bsr _f
    bsr _g
    ret
.end

So you just have to emit code that calls your real main at the beginning.
You could also have a look at docs/calling_conventions.pod, which
is currently being implemented. But if you have your code generation for
subs/params/return values in one place, it surely won't be complicated to
switch calling conventions later.


> Incidentally, I spent all day working on pirate,
> and it now generates (and runs!) code for a bunch
> of python structures:

Wow.


> Sincerely,
>
> Michal J Wallace

leo

Luke Palmer

Jul 31, 2003, 11:09:34 AM
to mic...@sabren.com, perl6-i...@perl.org

I think your approach may be fine. You can store them in a list and
dump them at the cost of a little extra memory (not a concern at this
point), but you can also put them inline, so long as you have
something like this at the beginning of the file:

.sub __START__
    call __main__
.end

So it emits that code right away, because it's the first compilation
unit imcc sees.
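
A small sketch of what "emit a stub first" could look like on the generator side. The function name with_entry_stub is made up for illustration; the stub text copies the three lines above, and the two sample units reuse ops that appear elsewhere in this thread.

```python
def with_entry_stub(units, entry="__main__", stub="__START__"):
    """Prepend a tiny first unit that calls the real entry point.

    `units` is a list of already-generated compilation units (strings).
    Because the stub is the first unit in the file, the order of the
    remaining units no longer decides where execution starts.
    """
    stub_unit = f".sub {stub}\n    call {entry}\n.end\n"
    return stub_unit + "".join(units)


program = with_entry_stub([
    '.sub _f\n    print ":)\\n"\n    ret\n.end\n',   # defined before main
    ".sub __main__\n    bsr _f\n    end\n.end\n",
])
```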

> ---
>
> Incidentally, I spent all day working on pirate,
> and it now generates (and runs!) code for a bunch
> of python structures:
>
> - lists, strings, ints
> - assignment and multi-assignment ( x,y=1,2 )
> - while, for, if/elif/else, break, continue
> - math operations (+, -, *, /, %)
> - boolean logic (and, or, not)
> - comparison operators

Very Cool.

> It now runs amk's euclid.py perfectly.
> Do we have a way to compare the speed vs python? :)

We just modify it to repeat 100,000 times or so, and compare that way.
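
For the CPython side of such a comparison, something like the following would do. This is only a sketch: the gcd function is a stand-in for whatever amk's euclid.py actually contains, and the input values are arbitrary.

```python
import timeit


def gcd(a, b):
    # Euclid's algorithm; a stand-in for the contents of euclid.py.
    while b:
        a, b = b, a % b
    return a


def bench(repeat=100_000):
    """Return the seconds CPython needs for `repeat` calls of gcd."""
    return timeit.timeit(lambda: gcd(96, 64), number=repeat)


elapsed = bench(1_000)   # small count here; raise it for a real measurement
```

The Parrot side would be timed separately by wrapping the generated code in the same kind of loop and timing the whole run from the shell.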

Which I did. Parrot comes in about 3x slower than python on euclid.
From looking at the imcc code, though, I think it could be much much
better.

One of my questions is, why do you make so many PerlNums when there
isn't a trace of a floating point number to be found...?

In any case, great work!

Luke

Michal Wallace

Jul 31, 2003, 2:54:47 PM
to Leopold Toetsch, perl6-i...@perl.org
On Thu, 31 Jul 2003, Leopold Toetsch wrote:

> You can do that:
> .sub __main__
> bsr _main
> end
> .end
> .sub _main

...


> So you have just to emit code, to call your real main at the beginning.


Well that worked, and even let me get rid of the
endsub label:

.sub __start__
    call __main__
.end
.sub __main__
    .sub _f
        print ":)"
        ret
    .end
    $I0 = addr _f
    print $I0
    end
.end


That prints ":)", followed by the address,
which is what I want. I can't seem to duplicate
the problem I was having now, but somehow last
night, if I commented out the "addr" line, the
_f sub wouldn't run. In other words, it was like
the addr call was actually invoking the routine.

Maybe I was just tired. :)

> You could also have a look at docs/calling_conventions.pod, which
> currently is being implemented.

Thanks. I hadn't seen the pdd version.
I was going off the other version in imcc/docs
[ in pirate, not in this example :) ]


> > Incidentally, I spent all day working on pirate,
> > and it now generates (and runs!) code for a bunch
> > of python structures:
>
> Wow.

Actually, between imcc and the python compiler
module, it's not nearly as hard as I thought it
would be. So far, I think the parrot version is
actually a lot simpler than the python compiler,
just because imcc is doing so much of the work.

Michal Wallace

Jul 31, 2003, 2:56:39 PM
to Brent Royal-Gordon, perl6-i...@perl.org
On Thu, 31 Jul 2003, Brent Royal-Gordon wrote:

> Michal Wallace:
> > I can store all my subroutine definitions in
> > a list or something and then dump them out
> > after the "__main__" routine.
>

> That seems to be the way to do it, speaking as someone who's working
> on a Perl 5-to-PIL converter (using the B optree-introspection
> modules).

I think I'm going to go ahead and take this
approach. Thanks! :)

Michal Wallace

Jul 31, 2003, 3:10:00 PM
to Luke Palmer, perl6-i...@perl.org
On 31 Jul 2003, Luke Palmer wrote:

> > It now runs amk's euclid.py perfectly.
> > Do we have a way to compare the speed vs python? :)
> We just modify it to repeat 100,000 times or so, and compare that way.

Oh, duh. :)



> Which I did. Parrot comes in about 3x slower than python on euclid.
> From looking at the imcc code, though, I think it could be much much
> better.

No doubt.

> One of my questions is, why do you make so many PerlNums when there
> isn't a trace of a floating point number to be found...?

Because I didn't read the docs that said PerlNum means "float". :)
I'll switch it to PerlInt (or maybe int?) later... It's also using far
more temporary variables than it needs. Right now I'm thinking that no
matter how complicated the expression, it really only needs two extra
registers: the result and a temporary variable, because all the
operators are either unary or binary... I might try that after I get
functions and classes working.
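
One way to check the "two extra registers" hunch is to count the temporaries an expression tree needs. The sketch below uses Sethi-Ullman style labeling, which nobody in this thread mentions; it assumes variables already live in their own registers (so a leaf costs nothing) and that each operator writes its result into a temporary. Under those assumptions, left- or right-leaning chains stay within two temporaries, but a deep, balanced tree can need more. The node classes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    name: str


@dataclass
class UnOp:
    op: str
    operand: object


@dataclass
class BinOp:
    op: str
    left: object
    right: object


def temps_needed(node):
    """Minimum number of temporaries needed to evaluate `node`."""
    if isinstance(node, Leaf):
        return 0                      # value is used in place
    if isinstance(node, UnOp):
        return max(temps_needed(node.operand), 1)
    l = temps_needed(node.left)
    r = temps_needed(node.right)
    # Evaluate the hungrier side first; a tie costs one extra temporary
    # because the first result must be kept while the second is computed.
    return l + 1 if l == r else max(l, r)


def add(a, b):
    return BinOp("+", a, b)


def mul(a, b):
    return BinOp("*", a, b)


v = {n: Leaf(n) for n in "abcdefgh"}
simple = add(v["a"], v["b"])
pair = mul(add(v["a"], v["b"]), add(v["c"], v["d"]))
deep = add(pair, mul(add(v["e"], v["f"]), add(v["g"], v["h"])))
```

With these definitions temps_needed gives 1 for `a + b`, 2 for `(a + b) * (c + d)`, and 3 for the sum of two such products.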

> In any case, great work!

:) thanks!

Luke Palmer

Jul 31, 2003, 3:51:57 PM
to mic...@sabren.com, perl6-i...@perl.org
> > One of my questions is, why do you make so many PerlNums when there
> > isn't a trace of a floating point number to be found...?
>
> Because I didn't read the docs that said PerlNum means "float". :)
> I'll switch it to PerlInt (or maybe int?) later...

Yeah, all your auxiliary data, i.e. the flags you check for control
flow, &c. should be int registers. Python ints should still probably
be pmcs.

> It's also using far more temporary variables than it needs. Right
> now I'm thinking that no matter how complicated the expression, it
> really only needs two extra registers: the result and a temporary
> variable, because all the operators are either unary or binary... I
> might try that after I get functions and classes working.

Indeed. Functionality is the most important thing at the moment; we
can worry about speed later.

You mind submitting a patch to put this in the languages/pirate
directory of the parrot distro? I'd like to stay up to date, and
probably do some work (as, I imagine, would others).

Luke

Leopold Toetsch

Jul 31, 2003, 3:21:40 PM
to Michal Wallace, perl6-i...@perl.org
Michal Wallace <mic...@sabren.com> wrote:
> .sub __start__
> call __main__
> .end
> .sub __main__
> .sub _f
> print ":)"
> ret
> .end
> $I0 = addr _f
> print $I0
> end
> .end


> That prints ":)", followed by the address,

No, can't imagine that:

$ parrot -o- pirate.imc
__start__:
        bsr __main__
_f:
        print ":)"
        ret
__main__:
        set_addr I16, _f
        print I16
        end

leo

Leopold Toetsch

Jul 31, 2003, 5:04:21 PM
to Luke Palmer, perl6-i...@perl.org
Luke Palmer <fibo...@babylonia.flatirons.org> wrote:

> You mind submitting a patch to put this in the languages/pirate

I'd appreciate that very much. Pie-thon, here we come ...

> Luke

leo

Melvin Smith

Jul 31, 2003, 6:31:29 PM
to Michal Wallace, Leopold Toetsch, perl6-i...@perl.org
At 02:54 PM 7/31/2003 -0400, Michal Wallace wrote:
>Actually, between imcc and the python compiler
>module, it's not nearly as hard as I thought it
>would be. So far, I think the parrot version is
>actually a lot simpler than the python compiler,
>just because imcc is doing so much of the work.

Leo and I (and the rest of us) like to hear comments like this
that actually validate the work put into the tools. Although
they have a long way to go, it's a heck of a lot nicer than this
time a year ago.

Thanks again,

-Melvin


Melvin Smith

Jul 31, 2003, 6:33:45 PM
to Luke Palmer, mic...@sabren.com, perl6-i...@perl.org
At 01:51 PM 7/31/2003 -0600, Luke Palmer wrote:
>You mind submitting a patch to put this in the languages/pirate
>directory of the parrot distro? I'd like to stay up to date, and
>probably do some work (as, I imagine, would others).

I'd like to officially complain that "pirate" is a cooler name than
my own "cola" and I haven't figured out what to do about it yet. :)

-Melvin


Joseph Ryan

Aug 1, 2003, 3:07:43 AM
to l...@toetsch.at, perl6-i...@perl.org
Leopold Toetsch wrote:

Speaking of adding new projects to languages, I have a partially complete
JVM->PIR translator. It's complete, with the exception of:

1: The two threading ops aren't translated.
2: I need to translate the core libraries. I'm hoping GNU Classpath
will be of some help here.
3: I'm missing some runtime exceptions, which I just haven't gotten around
to yet.

Other than that, it's pretty complete.

However, the code it generates isn't quite runnable. PASM seems to be
missing a few instructions, specifically add_method and add_attribute.
So, I just made them up. As you can imagine, this causes a few errors. :)
That means that beyond trivial cases, the code is mostly untested.

So, would anyone want this in the tree? Or should I wait until it is better
tested and documented?

You can take a look at it at:

http://jryan.perlmonk.org/images/jirate.tar.gz

Let me know what you think.

- Joe

K Stol

Aug 1, 2003, 12:29:23 PM
to Luke Palmer, mic...@sabren.com, Melvin Smith, perl6-i...@perl.org

Actually, I already named my little project "pirate" (see
http://members.home.nl/joeijoei/parrot), but it's a bit of a dead end
by now (although I learned a lot from it), so I don't mind.

Klaas-Jan

> -Melvin
>
>
>


Leon Brocard

Aug 1, 2003, 4:30:16 AM
to perl6-i...@perl.org
K Stol sent the following bits through the ether:


> Actually, I named my little project "pirate" (s.
> http://members.home.nl/joeijoei/parrot for this) already, but it's a bit of
> a dead end already (although I learnt much of it), so I don't mind.

Quick, we need more parrot jokes...

I don't like things becoming dead-ends. How much work do you think
it'd be to extend it some more and update it to latest Lua? Would it
be worth checking this into parrot CVS?

Leon
--
Leon Brocard.............................http://www.astray.com/
scribot.................................http://www.scribot.com/

... Dang this hobby is expensive!

K Stol

Aug 1, 2003, 1:59:48 PM
to Leon Brocard, perl6-i...@perl.org

----- Original Message -----
From: "Leon Brocard" <ac...@astray.com>
To: <perl6-i...@perl.org>
Sent: Friday, August 01, 2003 1:30 AM
Subject: Re: subroutines and python status

> K Stol sent the following bits through the ether:
>
> > Actually, I named my little project "pirate" (s.
> > http://members.home.nl/joeijoei/parrot for this) already, but it's a bit of
> > a dead end already (although I learnt much of it), so I don't mind.
>
> Quick, we need more parrot jokes...

:-)

>
> I don't like things becoming dead-ends. How much work do you think
> it'd be to extend it some more and update it to latest Lua?

Well, at some point while writing the code generator, I had 2 problems.

1: I needed some Parrot features that weren't working yet, like events (I
need an op to post events or something), so some essential features of the
language couldn't be implemented.
2: I misdesigned the code generator; that is, at the point where I couldn't
start over, it was too late, the code generator was too big already (it was
unmaintainable). But because I had a time schedule, I kept it this way (the
product itself wasn't the most important thing, I was writing an
undergraduate report for the last semester of my education (for the record:
the project served me well, I finished this education))

> Would it
> be worth checking this into parrot CVS?
>

Only if the thing would be working, otherwise it would only be a source of
confusion and frustration.
Now I'm just thinking very hard to decide if I've got enough spare time to
rewrite the code generator....

Klaas-Jan


Dan Sugalski

Aug 1, 2003, 9:50:48 AM
to l...@toetsch.at, Luke Palmer, Michal Wallace, perl6-i...@perl.org

As would I. If you're willing, Michal, we can check it in and get you
CVS repository access.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Michal Wallace

Aug 1, 2003, 10:27:52 AM
to Dan Sugalski, l...@toetsch.at, Luke Palmer, perl6-i...@perl.org
On Fri, 1 Aug 2003, Dan Sugalski wrote:

> At 11:04 PM +0200 7/31/03, Leopold Toetsch wrote:
> >Luke Palmer <fibo...@babylonia.flatirons.org> wrote:
> >> You mind submitting a patch to put this in the languages/pirate
>
> >I'd appreciate that very much. Pie-thon, here we come ...
>
> As would I. If you're willing, Michal, we can check it in and get
> you CVS repository access.


Hey guys,

I'd kind of like to keep it where it is - for now, anyway.

But I'm *more* than happy to give people access to the
repository. I just set up users for you three (I'll
send your logins in a second), and if anyone else
wants access, just drop me a note.

Michal Wallace

Aug 3, 2003, 5:25:10 AM
to K Stol, perl6-i...@perl.org
On Fri, 1 Aug 2003, K Stol wrote:

> > From: "Leon Brocard" <ac...@astray.com>
...
> > I don't like things becoming dead-ends. How much work do you think
> > it'd be to extend it some more and update it to latest Lua?
...
> 2: I misdesigned the code generator; that is, at the point where I
> couldn't start over, it was too late, the code generator was too big
> already (it was unmaintainable). But because I had a time schedule,
> I kept it this way (the product itself wasn't the most important
> thing, I was writing an undergraduate report for the last semester
> of my education (for the record: the project served me well, I
> finished this education))

> > Would it be worth checking this into parrot CVS?
>
> Only if the thing would be working, otherwise it would only be a
> source of confusion and frustration. Now I'm just thinking very
> hard to decide if I've got enough spare time to rewrite the code
> generator....

Hmm. I've only messed around with Lua for a few hours
though, and it was several months ago, but the Lua
language seems to be pretty similar to python.

Really, there's a ton of overlap between the various
"high level" languages that parrot wants to support.
Maybe we could put together a generic code generator
that everyone could use? Obviously, it would have to
be set up so you could override the parts for each
language, but it shouldn't be too terribly hard.

What do you think? Want to try squishing pirate/python
and pirate/lua together? :)

K Stol

Aug 3, 2003, 2:52:11 PM
to Michal Wallace, perl6-i...@perl.org

----- Original Message -----
From: "Michal Wallace" <mic...@sabren.com>
To: "K Stol" <k...@home.nl>
Cc: <perl6-i...@perl.org>
Sent: Sunday, August 03, 2003 2:25 AM
Subject: generic code generator? [was: subroutines and python status]


> On Fri, 1 Aug 2003, K Stol wrote:
>
> > > From: "Leon Brocard" <ac...@astray.com>
> ...
> > > I don't like things becoming dead-ends. How much work do you think
> > > it'd be to extend it some more and update it to latest Lua?
> ...
> > 2: I misdesigned the code generator; that is, at the point where I
> > couldn't start over, it was too late, the code generator was too big
> > already (it was unmaintainable). But because I had a time schedule,
> > I kept it this way (the product itself wasn't the most important
> > thing, I was writing an undergraduate report for the last semester
> > of my education (for the record: the project served me well, I
> > finished this education))
>
> > > Would it be worth checking this into parrot CVS?
> >
> > Only if the thing would be working, otherwise it would only be a
> > source of confusion and frustration. Now I'm just thinking very
> > hard to decide if I've got enough spare time to rewrite the code
> > generator....

At the moment, I'm looking at a new version of Lua. The previous 'pirate'
compiled (well, sort of :-) Lua 4. Lua 5 has some features, such as
coroutines (if I remember correctly) and all kinds of neat stuff, for which
Parrot has built-in support (and it dropped a feature or two from Lua 4). I
think I'll try to create a parser for Lua 5 and recreate a Lua/Parrot
compiler (which should go a lot easier now that I've had time to think about
the errors I made).


>
> Hmm. I've only messed around with Lua for a few hours
> though, and it was several months ago, but the Lua
> language seems to be pretty similar to python.

A few days ago I had a look at Python, and I noticed the similarities, too.
Also the fact that python was noted to be a language that can be used in an
embedded way reminded me of this similarity (Lua was more or less created
for this: extending/embedding).


>
> Really, there's a ton of overlap between the various
> "high level" languages that parrot wants to support.
> Maybe we could put together a generic code generator
> that everyone could use? Obviously, it would have to
> be set up so you could override the parts for each
> language, but it shouldn't be too terribly hard.

Sounds like quite a challenge, but a good idea, and I think worth a try.

>
> What do you think? Want to try squishing pirate/python
> and pirate/lua together? :)

Yeah, I like the idea. Let's try this out.

Klaas-Jan

Michal Wallace

Aug 3, 2003, 6:29:31 AM
to K Stol, perl6-i...@perl.org
On Sun, 3 Aug 2003, K Stol wrote:

> At this moment, I'm looking at a new version of Lua, the previous
> 'pirate' compiled (well, sort of :-) Lua 4 Lua 5 has some features,
> such as coroutines (If I remembered well) and all kinds of neat
> stuff for which Parrot has built-in support (and it dropped some/a
> feature(s) from Lua 4). I think I'll try to create a parser for Lua
> 5, and to recreate a Lua/Parrot compiler (should go a lot easier now
> that I had the time to think about the errors I made).

Cool. :) I'm just now reading through your report.


> > Really, there's a ton of overlap between the various
> > "high level" languages that parrot wants to support.
> > Maybe we could put together a generic code generator
> > that everyone could use? Obviously, it would have to
> > be set up so you could override the parts for each
> > language, but it shouldn't be too terribly hard.
>
> Sounds like quite a challenge, but a good idea, and I think worth a try.
>
> > What do you think? Want to try squishing pirate/python
> > and pirate/lua together? :)
>
> Yeah, I like the idea. Let's try this out.

Great! I figure since you've already got lua 4
working, we can leverage what you've already
got and then just add the new features for
python and lua 5.

If you're still around, want to meet up online real
quick? I'm logged in as sabren in #parrot on
irc.infobot.org

Stephen Thorne

Aug 3, 2003, 7:19:49 PM
to Michal Wallace, perl6-i...@perl.org, K Stol
On Sun, 3 Aug 2003 19:25, Michal Wallace wrote:
> On Fri, 1 Aug 2003, K Stol wrote:
> Really, there's a ton of overlap between the various
> "high level" languages that parrot wants to support.
> Maybe we could put together a generic code generator
> that everyone could use? Obviously, it would have to
> be set up so you could override the parts for each
> language, but it shouldn't be too terribly hard.
>
> What do you think? Want to try squishing pirate/python
> and pirate/lua together? :)

A nice high-level code generator would be in my interests as well, seeing as
I'm currently working on php/parrot and I've got 'hello world' standard imcc
code generation going. I'd really like to be able to save a lot of the
low-level work.

With regards to my own project, would it be appropriate to ask for parrot CVS
access in order to publish the php compiler in the parrot source tree? One of
the files is under the Zend license, being a direct derivation from
zend_language_scanner.y, are there any licensing restrictions about what goes
into perl cvs?

Stephen.

Michal Wallace

Aug 4, 2003, 10:48:49 PM
to K Stol, perl6-i...@perl.org
On Sun, 3 Aug 2003, K Stol wrote:

> > What do you think? Want to try squishing pirate/python
> > and pirate/lua together? :)
>
> Yeah, I like the idea. Let's try this out.


Well, I finished reading your report[1] and
posted some of my (rather unorganized) thoughts
up at [2]

It does seem like there are some snags getting
languages to talk to each other, even with the
calling conventions, but even so, I'm even more
convinced now that a generic, overridable
code-generator is the way to go.

It seems to me that if we want to maximize the
number of languages using it, the generic
compiler shouldn't depend on anything but
C and parrot... But until we get it working,
I'd like to stick to a dynamic language like
python/perl/lua/scheme. And, well, my code's
already in python... :) [though I'd actually
love to try out some lua 5]

What I'm picturing is a template system for
specifying chunks of IMCC. Something like this:


ast generic:
    on @while(test, body):
        % while = gensym("while")
        % endwhile = gensym("endwhile")
        % test = gensym("$I")

        {while}:
            {test} = @expr
            unless {test} goto {endwhile}
            @body
            goto {while}
        {endwhile}:

    on @if(test, elif*, else?):
        ...

ast python(generic):
    on @while(test, body, else?):
        ...


Okay, I don't have a good syntax in mind yet,
the point is it's a template language and you
can subclass/override/extend the template.
Maybe there's no syntax and it just uses
cleanly coded classes in some oo language.
Or perl6 with its grammars and rules. I
don't know.

Once the templates are defined, you pass
the compiler your AST and it walks it and
applies the template.
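
To make the "cleanly coded classes in some oo language" variant concrete, here is a rough Python sketch. Everything in it is an assumption made for illustration: the class names, the tuple-shaped AST, the on_<node> naming scheme, and the exact text emitted (which imitates the template above and has not been run through imcc).

```python
import itertools


class GenericGen:
    """Hypothetical base generator: one on_<kind> method per node type."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.out = []                       # emitted lines, in order

    def gensym(self, prefix):
        return f"{prefix}{next(self._ids)}"

    def gen(self, node):
        # A node is a tuple: (kind, arg1, arg2, ...).
        kind, *args = node
        getattr(self, "on_" + kind)(*args)

    def on_raw(self, text):
        self.out.append(text)               # language-specific escape hatch

    def on_while(self, test, body):
        start, end = self.gensym("while"), self.gensym("endwhile")
        flag = self.gensym("$I")
        self.out += [f"{start}:", f"{flag} = {test}",
                     f"unless {flag} goto {end}"]
        self.gen(body)
        self.out += [f"goto {start}", f"{end}:"]


class PythonGen(GenericGen):
    """Overrides only what differs: Python's while loop may carry an else."""

    def on_while(self, test, body, orelse=None):
        if orelse is None:
            return super().on_while(test, body)
        start, skip = self.gensym("while"), self.gensym("else")
        end, flag = self.gensym("endwhile"), self.gensym("$I")
        self.out += [f"{start}:", f"{flag} = {test}",
                     f"unless {flag} goto {skip}"]
        self.gen(body)
        self.out += [f"goto {start}", f"{skip}:"]
        self.gen(orelse)                    # runs when the test fails
        self.out.append(f"{end}:")          # where a break would jump (not implemented here)


generic = GenericGen()
generic.gen(("while", "x", ("raw", "print x")))

py = PythonGen()
py.gen(("while", "x", ("raw", "print x"), ("raw", "print 0")))
```

A language front end would subclass the generic generator and override only the node handlers whose semantics differ, which is the subclass/override/extend idea without inventing a template syntax.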

So the C api basically is just about building
the tree and saying "generate with this
language file". Then the language designer's
job is just to transform an outside ast
into a generic ast.

Anyway, I'm talking a lot of nonsense. I'd
rather just see what I can do about decoupling
my code generator from python and sharing
it with another language instead.

[1] lua report: http://members.home.nl/joeijoei/parrot/report.pdf
[2] http://pirate.versionhost.com/viewcvs.cgi/pirate/INTEROP?rev=1.1

Stephen Thorne

Aug 4, 2003, 10:55:55 PM
to Michal Wallace, K Stol, perl6-i...@perl.org
On Tue, 5 Aug 2003 12:48, Michal Wallace wrote:
> It does seem like there are some snags getting
> languages to talk to each other, even with the
> calling conventions, but even so, I'm even more
> convinced now that a generic, overridable
> code-generator is the way to go.
>
> It seems to me that if we want to maximize the
> number of languages using it, the generic
> compiler shouldn't depend on anything but
> C and parrot... But until we get it working,
> I'd like to stick to a dynamic language like
> python/perl/lua/scheme. And, well, my code's
> already in python... :) [though I'd actually
> love to try out some lua 5]

I like the concept, but I have a comment about the implementation. For PHP,
and even for Python, it is necessary to do code generation on the fly, for
things like eval() and dynamic imports (php's "include" is always a dynamic
import).

Thus the code generator is best suited to be in a language that can be run
from within the parrot machine, otherwise statements like 'eval()' would not
be possible without binding parrot to a non-portable C library.

I would instead suggest that we pick a suitable 'dynamic' language to write
the code generator in, so it can be self-hosting.
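
CPython itself shows why this matters. The sketch below is only an analogy using Python's built-in compile() and eval(), not Parrot API: the source text does not exist until run time, so the compiler has to be reachable from inside the running program. The helper name run_dynamic is made up.

```python
def run_dynamic(source, variables):
    """Compile and evaluate an expression that only exists at run time."""
    code = compile(source, "<dynamic>", "eval")   # compiler invoked at run time
    # Empty builtins keep the example self-contained and predictable.
    return eval(code, {"__builtins__": {}}, dict(variables))


result = run_dynamic("x * 2 + 1", {"x": 20})
```

A compiler targeting Parrot would need the equivalent of the compile() step available inside the VM, which is the argument for a self-hosting generator.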

Regards,
Stephen Thorne.

Michal Wallace

Aug 4, 2003, 11:09:54 PM
to Stephen Thorne, K Stol, perl6-i...@perl.org
On Tue, 5 Aug 2003, Stephen Thorne wrote:

> > It seems to me that if we want to maximize the
> > number of languages using it, the generic
> > compiler shouldn't depend on anything but
> > C and parrot... But until we get it working,
> > I'd like to stick to a dynamic language like
> > python/perl/lua/scheme. And, well, my code's
> > already in python... :) [though I'd actually
> > love to try out some lua 5]

> I like the concept, but I have a comment about the
> implementation. For PHP, and even for Python, it is necessary to do
> code generation on the fly, for things like eval() and dynamic
> imports (php's "include" is always a dynamic import).

definitely.



> Thus the code generator is best suited to be in a language that can
> be run from within the parrot machine, otherwise statements like
> 'eval()' would not be possible without binding parrot to a
> non-portable C library.
>
> I would instead suggest that we pick a suitable 'dynamic' language
> to write the code generator in, so it can be self-hosting.

Sure. That's why I said stick to C or parrot. I'm not sure
I understand why C wouldn't be portable... (I don't know C
at all but I thought that was the point)?? I'd much rather
use parrot. Where "parrot" means something else compiled
down to parrot, and "something else" means python. :)

Joseph Ryan

Aug 5, 2003, 3:24:51 AM
to Michal Wallace, K Stol, perl6-i...@perl.org
Michal Wallace wrote:

I think that trying to define a new syntax for a new meta-language is a bad
idea. The goal of a GCG (Generic Code Generator) should be to relieve
compiler writers of the responsibility of generating code. Forcing them to
generate different code doesn't help solve the problem. (-:

However, at the risk of sounding lame, what if the GCG syntax was instead
some sort of standard meta-language structure like YAML or XML? As in, the
syntax wouldn't be a syntax at all, but just a dump of the AST with
standardized node names. I think this would have a number of benefits:

1.) Instead of forcing the compiler writer to generate code, the compiler
writer would only have to transform the parse tree into a structure that is
name-consistent with the GCG's standard, and then use any of a number of
existing libraries to dump the tree as YAML/XML.

2.) Since there are more YAML/XML parsers than I can count, implemented in
nearly every modern useful language I can think of, the GCG could be written
in any language without causing a stall on starting the generic code
generation part of the project. (you know, the important part)

3.) It would be possible to handle language-specific nodes by defining some
sort of "raw" node whose value could be raw imcc code.
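
As a rough illustration of points 1 and 3, this is how a front end written in Python might dump its tree with the standard library's xml.etree.ElementTree. The node names (while, test, body, name, raw), the attribute names, and the tuple layout of the tree are invented for the example; no GCG standard exists yet.

```python
import xml.etree.ElementTree as ET


def ast_to_xml(node):
    """Convert a (kind, attrs, children) tuple tree into an XML element.

    String children become the element's text, which is how a "raw"
    node carries literal imcc code through untouched.
    """
    kind, attrs, children = node
    elem = ET.Element(kind, attrs)
    for child in children:
        if isinstance(child, str):
            elem.text = (elem.text or "") + child
        else:
            elem.append(ast_to_xml(child))
    return elem


tree = ("while", {}, [
    ("test", {}, [("name", {"id": "x"}, [])]),
    ("body", {}, [("raw", {"lang": "imcc"}, ['print "hi"\n'])]),
])

xml_text = ET.tostring(ast_to_xml(tree), encoding="unicode")
parsed = ET.fromstring(xml_text)    # what the generic generator would read
```

The generator on the other side only needs an XML parser and a table of node names, regardless of the language the front end was written in.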

Anyways, just a few thoughts. A tool like this would be *very* useful,
I think. Good luck. (-:

- Joe


Dan Sugalski

Aug 5, 2003, 12:07:44 PM
to Michal Wallace, Stephen Thorne, K Stol, perl6-i...@perl.org
At 11:09 PM -0400 8/4/03, Michal Wallace wrote:
>On Tue, 5 Aug 2003, Stephen Thorne wrote:
>
>> Thus the code generator is best suited to be in a language that can
>> be run from within the parrot machine, otherwise statements like
>> 'eval()' would not be possible without binding parrot to a
>> non-portable C library.
>>
>> I would instead suggest that we pick a suitable 'dynamic' language
>> to write the code generator in, so it can be self-hosting.
>
>Sure. That's why I said stick to C or parrot. I'm not sure
>I understand why C wouldn't be portable... (I don't know C
>at all but I thought that was the point)?? I'd much rather
>use parrot. Where "parrot" means something else compiled
>down to parrot, and "something else" means python. :)

The original thought was to use the new perl 6 grammar engine/code to
do this, but I think it'll be a while before that's ready to go.

Rather than invent an entirely new language for this (which is
somewhat problematic) why not go for something already reasonably
well-known? YACC and BNF grammars seem like a good place to start,
especially as most of the languages have some form of grammar defined
for them.

It's only a first step, as then everyone beats the heck out of the
resulting token stream, but it's a place to start. 80-90% of the
result will end up being generically parseable as well--"x + y" will
generate the same code for all languages if x and y are PMCs, for
example, so I'd bet we could have a lot of standard products designed.

K Stol

Aug 5, 2003, 4:53:21 PM
to Joseph Ryan, Michal Wallace, perl6-i...@perl.org

Yeah, I like the template idea.


> >
> >Okay, I don't have a good syntax in mind yet,
> >the point is it's a template language and you
> >can subclass/override/extend the template.
> >Maybe there's no syntax and it just uses
> >cleanly coded classes in some oo language.
> >Or perl6 with its grammars and rules. I
> >don't know.
> >
>
> I think that trying to define a new syntax for a new meta-language is a bad
> idea. The goal of a GCG (Generic Code Generator) should be to relieve
> compiler writers of the responsibility of generating code. Forcing them to
> generate different code doesn't help solve the problem. (-:
>
> However, at the risk of sounding lame, what if the GCG syntax was instead
> some sort of standard meta-language structure like YAML or XML?

XML also has a syntax...sounds just like the template idea to me.

> As in, the syntax wouldn't be a syntax at all, but just a dump of the AST
> with standardized node names. I think this would have a number of benefits:
>
> 1.) Instead of forcing the compiler writer to generate code, the compiler
> writer would only have to transform the parse tree into a structure that is
> name-consistent with the GCG's standard, and then use any of a number of
> existing libraries to dump the tree as YAML/XML.
>
> 2.) Since there are more YAML/XML parsers than I can count, implemented in
> nearly every modern useful language I can think of, the GCG could be written
> in any language without causing a stall on starting the generic code
> generation part of the project. (you know, the important part)

The parsing should not be a problem (well, unless we come up with a really
nasty syntax). Using a tool such as Yacc really helps for fast prototyping.


>
> 3.) It would be possible to handle language-specific nodes by defining
> some sort of "raw" node whose value could be raw imcc code.

This would be really handy.


>
> Anyways, just a few thoughts. A tool like this would be *very* useful,
> I think. Good luck. (-:
>
> - Joe
>

I quickly scanned all messages on this topic, so I really read it fast (and
probably missed some things I guess).
Here are my ideas. I have to think things over well, but here are my quick
thoughts.

I like the ideas of templates. Furthermore, ...

> I think that trying to define a new syntax for a new meta-language is a
> bad idea. The goal of a GCG (Generic Code Generator) should be to
> alleviate the compiler writers of the responsibility of generating code.
> Forcing them to generate different code doesn't help solve the problem. (-:

I don't think the compiler writer has to *generate* this metacode, but it
can be used just like YACC and LEX:
just create a code generator generator based on some input file. The ideal
would of course be a compiler that can be constructed like:

LEX input file -> Lex -> scanner
YACC input file -> Yacc -> parser
TREECC input file -> TreeCC -> AST code
PIRATE[1] input file -> Pirate -> Code generator

But that's for later :-)

[1] I just called this project "pirate" because of the similarly named
compilers for which it could be used (Lua/Python, and this "pirate" is
Taking Over the Ship of Code generation :-P, oh well....)

Well, just got home, so I'll go read it better soon.

Klaas-Jan

Michal Wallace

Aug 5, 2003, 1:35:54 PM8/5/03
to Joseph Ryan, K Stol, perl6-i...@perl.org
On Tue, 5 Aug 2003, Joseph Ryan wrote:

> >Okay, I don't have a good syntax in mind yet,
> >the point is it's a template language and you
> >can subclass/override/extend the template.
> >Maybe there's no syntax and it just uses
> >cleanly coded classes in some oo language.
> >Or perl6 with its grammars and rules. I
> >don't know.
> >

> I think that trying to define a new syntax for a new meta-language
> is a bad idea. The goal of a GCG (Generic Code Generator) should be
> to alleviate the compiler writers of the responsibility of
> generating code. Forcing them to generate different code doesn't
> help solve the problem. (-:

Good point. I don't think I was very clear yesterday. Let
me try again. Let's say you're generating... I dunno.. haskell:

haskell_parser -> ast -> pirate -> parrot_code --> imcc -> pbc
                           ^
                           |
                 parrot_code__templates


So the haskell parser only has to generate a pirate ast structure.
Either that's a very basic string (I like your XML idea) *or*, in
the future, the parser calls pirate tree-building-methods directly.

The tree-building methods are why I was talking about C, but for
now, I don't mind doing a bunch of xml-generation every time we
call eval/exec. (It's probably the simplest thing to do right now
and should be pretty easy for everyone)


> 1.) Instead of forcing the compiler writer to generate code, the
> compiler writer would only have to transform the parse tree into a
> structure that is name-consistent with the GCG's standard, and then
> use any of a number of existing libraries to dump the tree as
> YAML/XML.

I like it! :)
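
To make the dump step concrete, here is a minimal Python sketch. It
assumes the AST is a nested tuple of the form (node_name, child, ...).
The node names ("while", "const", "pass") and the "value" attribute are
made up for illustration, since no GCG naming standard exists yet. It
uses only the standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

def to_element(node):
    """Turn a (name, child, ...) tuple into an ElementTree element."""
    name, *children = node
    elem = ET.Element(name)
    for child in children:
        if isinstance(child, tuple):
            elem.append(to_element(child))   # nested AST node
        else:
            # Scalar payload; this sketch allows one scalar per node.
            elem.set("value", str(child))
    return elem

def dump_xml(node):
    """Serialize the whole tree to an XML string."""
    return ET.tostring(to_element(node), encoding="unicode")

if __name__ == "__main__":
    # Hypothetical tree for the Python statement "while 1: pass".
    print(dump_xml(("while", ("const", 1), ("pass",))))
```

Any front end that can build such a tuple tree (or its equivalent)
could hand the result to the generic generator without emitting any
Parrot code itself.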


> 2.) Since there are more YAML/XML parsers than I can count
> implemented in nearly every modern useful language I can think of, the GCG
> could be generated in any language without causing a stall on
> starting on the generic code generation part of the project. (you
> know, the important part)

Agreed!


> 3.) It would be possible to handle language-specific nodes by
> defining some sort of "raw" node whose value could be raw imcc code.

That's where the templates come in, but since my crack
at a template language last night sucked so bad, I'm
thinking that at least for prototyping I'm going to
use a python class.
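
For reference, the "raw" node from point 3 could be very small if the
generator is class based. The sketch below is only an assumption about
how that might look: the class names Raw and Print, the gen() method
and the generate() helper are all hypothetical, and the emitted lines
are illustrative rather than checked against imcc.

```python
class Raw:
    """Leaf node carrying target code that is passed through untouched."""
    def __init__(self, code):
        self.code = code

    def gen(self, lines):
        lines.extend(self.code.splitlines())

class Print:
    """An ordinary node, generated the usual way (output is illustrative)."""
    def __init__(self, text):
        self.text = text

    def gen(self, lines):
        lines.append(f'print "{self.text}"')

def generate(nodes):
    """Run every node's generator and collect the output lines."""
    lines = []
    for node in nodes:
        node.gen(lines)
    return lines

if __name__ == "__main__":
    program = [Print("hi"), Raw("$I0 = 1\n$I0 = $I0 + 1")]
    print("\n".join(generate(program)))
```

The generic generator never looks inside a Raw node, so a front end can
use it as an escape hatch for anything the standard nodes don't cover.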

So I'm thinking I'm going to try refactoring so that
I can do this:

pypirate something.py > something.xml
cat something.xml | pirate -l python > something.imc
imcc something.imc

or just:

pypirate something.py | pirate -l python | imcc

or just:

pypirate -r something.py

That also means the "pirate" command can be written in
any language we like. Probably eventually that'll be
perl6, but for now (unless someone else wants to
volunteer some code) I plan to work from the code
I have for python.

Michal Wallace

Aug 5, 2003, 1:49:27 PM8/5/03
to Dan Sugalski, Stephen Thorne, K Stol, perl6-i...@perl.org
On Tue, 5 Aug 2003, Dan Sugalski wrote:

> The original thought was to use the new perl 6 grammar engine/code
> to do this, but I think it'll be a while before that's ready to go.

I think perl6 is definitely the way to go, once it's ready.

BTW, what's the deal with Bundle::Perl6? I tried installing it from
cpan yesterday but it wouldn't install with my perl. (I forget why)
Is it meant to be usable today?


> Rather than invent an entirely new language for this (which is
> somewhat problematic) why not go for something already reasonably
> well-known? YACC and BNF grammars seem like a good place to start,
> especially as most of the languages have some form of grammar
> defined for them.

Well the "new language" was just for the imcc code templates.
Now I think the better path is to define a code-generating class
for each ast node.

Then PythonWhileNode can be a subclass of the generic WhileNode.
(Python while statements have an extra "else" block that most
languages don't have)
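
A rough Python sketch of that class layout follows. Everything in it is
an assumption made for illustration: the Emitter helper, the gen()
method, the two hook methods and the label names are invented here, and
the emitted lines are only imcc-like, not verified output. The point is
that the generic WhileNode owns the loop skeleton and PythonWhileNode
overrides two small hooks to add the "else" block, which runs when the
test fails but is skipped by a break.

```python
import itertools

class Emitter:
    """Collects output lines and hands out unique labels."""
    def __init__(self):
        self.lines = []
        self._ids = itertools.count()

    def label(self, prefix):
        return f"{prefix}{next(self._ids)}"

    def emit(self, line):
        self.lines.append(line)

class Const:
    def __init__(self, value):
        self.value = value

    def gen(self, out):
        return str(self.value)   # operand text; a real generator would load a PMC

class Pass:
    def gen(self, out):
        out.emit("    noop")

class WhileNode:
    """Generic while loop: test, body, jump back to the test."""
    def __init__(self, test, body):
        self.test, self.body = test, body

    def gen(self, out):
        start = out.label("while")
        done = out.label("done")              # a 'break' would jump here
        exit_ = self.exit_target(out, done)   # a false test jumps here
        out.emit(f"{start}:")
        out.emit(f"    unless {self.test.gen(out)} goto {exit_}")
        for stmt in self.body:
            stmt.gen(out)
        out.emit(f"    goto {start}")
        self.gen_tail(out, exit_)
        out.emit(f"{done}:")

    def exit_target(self, out, done):
        return done

    def gen_tail(self, out, exit_):
        pass

class PythonWhileNode(WhileNode):
    """Adds Python's optional 'else' block to the generic loop."""
    def __init__(self, test, body, orelse=()):
        super().__init__(test, body)
        self.orelse = list(orelse)

    def exit_target(self, out, done):
        return out.label("else") if self.orelse else done

    def gen_tail(self, out, exit_):
        if self.orelse:
            out.emit(f"{exit_}:")
            for stmt in self.orelse:
                stmt.gen(out)

if __name__ == "__main__":
    out = Emitter()
    PythonWhileNode(Const(1), [Pass()], [Pass()]).gen(out)
    print("\n".join(out.lines))
```

A language without the extra clause would use WhileNode as is, so only
the Python-specific part lives in the subclass.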


> It's only a first step, as then everyone beats the heck out of the
> resulting token stream, but it's a place to start. 80-90% of the
> result will end up being generically parseable as well--"x + y" will
> generate the same code for all languages if x and y are PMCs, for
> example, so I'd bet we could have a lot of standard products
> designed.

Yep. For now, everything pypirate generates *is* pmc based,
since python doesn't have type declarations. Generic pirate
should probably have a "declare" node though. (Especially
if it's to handle perl6, right?)

Dan Sugalski

Aug 7, 2003, 10:04:12 PM8/7/03
to Michal Wallace, Joseph Ryan, K Stol, perl6-i...@perl.org
At 1:35 PM -0400 8/5/03, Michal Wallace wrote:
>On Tue, 5 Aug 2003, Joseph Ryan wrote:
>
>> >Okay, I don't have a good syntax in mind yet,
>> >the point is it's a template language and you
>> >can subclass/override/extend the template.
>> >Maybe there's no syntax and it just uses
>> >cleanly coded classes in some oo language.
>> >Or perl6 with its grammars and rules. I
>> >don't know.
>> >
>
>> I think that trying to define a new syntax for a new meta-language
>> is a bad idea. The goal of a GCG (Generic Code Generator) should be
>> to alleviate the compiler writers of the responsibility of
>> generating code. Forcing them to generate different code doesn't
>> help solve the problem. (-:
>
>
>Good point. I don't think I was very clear yesterday. Let
>me try again. Let's say you're generating... I dunno.. haskell:
>
>
>
> haskell_parser -> ast -> pirate -> parrot_code --> imcc -> pbc
>                            ^
>                            |
>                  parrot_code__templates
>
>
>So the haskell parser only has to generate a pirate ast structure.
>Either that's a very basic string (I like your XML idea) *or*, in
>the future, the parser calls pirate tree-building-methods directly.

I'd much rather go for the AST directly for a number of reasons. Not
the least of which is a deep personal loathing for XML, but putting
that aside it seems sub-optimal to have a binary structure which we
then serialize to text and then deserialize all as part of a single
parse and compile stage. XML has the added disadvantage of needing a
good XML parser, which probably means expat or something like it.
(Yeah, I know, we could restrict ourselves to a subset of XML so we
don't have to have a full parser, but that'll never last--someone'll
expand it to a full parser at some point)

Going for a full-fledged AST builder/flattener meets some of the
long-term goals as well, as it's been planned to be the stage between
the parser and IMCC for ages (pretty much since IMCC first appeared)
so it'd be worthwhile.

We should consider AST transforms as well, since a number of
optimizations are best performed on the AST, and many languages will
want to get hold of the AST before it goes any further and process it
in some way. (This would be how Lisp macros would act, for example)

>> 1.) Instead of forcing the compiler writer to generate code, the
>> compiler writer would only have to transform the parse tree into a
>> structure that is name-consistent with the GCG's standard, and then
>> use any of a number of existing libraries to dump the tree as
>> YAML/XML.
>
>I like it! :)

This'd be a cool thing to be sure, and useful as a human-readable
form. (Though I'd prefer the AST that's frozen into the bytecode be
in a form that's more efficient to deserialize) I'd best go and
update the license terms then, to make sure there aren't any
problems. (Old issues raised by GCC an age ago)

Michal Wallace

Aug 8, 2003, 5:31:30 PM8/8/03
to Dan Sugalski, Joseph Ryan, K Stol, perl6-i...@perl.org
On Thu, 7 Aug 2003, Dan Sugalski wrote:

> > haskell_parser -> ast -> pirate -> parrot_code --> imcc -> pbc
> >                            ^
> >                            |
> >                  parrot_code__templates
> >
> >
> >So the haskell parser only has to generate a pirate ast structure.
> >Either that's a very basic string (I like your XML idea) *or*, in
> >the future, the parser calls pirate tree-building-methods directly.
>
> I'd much rather go for the AST directly for a number of reasons. Not
> the least of which is a deep personal loathing for XML, but putting
> that aside it seems sub-optimal to have a binary structure which we
> then serialize to text and then deserialize all as part of a single
> parse and compile stage. XML has the added disadvantage of needing a
> good XML parser, which probably means expat or something like it.
> (Yeah, I know, we could restrict ourselves to a subset of XML so we
> don't have to have a full parser, but that'll never last--someone'll
> expand it to a full parser at some point)


Klaas-Jan and I spiked out a prototype last night that took
s-expressions. We got this working:

(while (const 1) (pass))

Where While was a class that could be subclassed for PythonWhile
(python has an extra "else" clause in there)

I think everyone agrees that long term, there's no point in
passing around xml or s-expressions because it'll be dog
slow, but for now, for prototyping, passing the flattened
AST in as an s-expression on STDIN means that anyone can
write to it and we don't have to worry about a tree-building
interface until after the code is working.
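
For anyone who wants to feed the prototype, reading that format takes
very little code. The following is a minimal sketch of an s-expression
reader, not the code from the actual prototype: the function name
parse_sexpr and the choice to return nested Python lists (with integers
converted) are assumptions made here.

```python
import re

# A token is a parenthesis or a run of characters that is neither
# whitespace nor a parenthesis.
TOKEN = re.compile(r"[()]|[^\s()]+")

def parse_sexpr(text):
    """Parse one s-expression into nested lists of strings and ints."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing ')'")
            pos += 1                      # consume the ')'
            return items
        if tok == ")":
            raise SyntaxError("unexpected ')'")
        try:
            return int(tok)               # numeric atom
        except ValueError:
            return tok                    # symbol

    tree = read()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after expression")
    return tree

if __name__ == "__main__":
    print(parse_sexpr("(while (const 1) (pass))"))
```

The first element of each list names the node type, so a generator can
dispatch on it to pick the matching class; reading from STDIN is then
just parse_sexpr(sys.stdin.read()).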


> Going for a full-fledged AST builder/flattener meets some of the
> long-term goals as well, as it's been planned to be the stage between
> the parser and IMCC for ages (pretty much since IMCC first appeared)
> so it'd be worthwhile.

I agree. The main reason to hold off *FOR NOW* is bindings. If
everyone wrote their parsers in C, then we could just use that.
But I don't feel like learning C, so... Right now the code is
in python. I suspect eventually this will be written in perl6
or something.


> We should consider AST transforms as well, since a number of
> optimizations are best performed on the AST, and many languages will
> want to get hold of the AST before it goes any further and process it
> in some way. (This would be how Lisp macros would act, for example)

Absolutely. I suspect that for python, I'll be turning list
comprehensions into the corresponding "foreach" nodes before
serializing the tree.
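
As a sketch of what such a transform might look like, assuming a
tuple-based AST where the first element names the node: the node names
("listcomp", "foreach", "append", "block" and so on), their field
order and the "_lc" temporary names are all hypothetical, chosen only
to show the rewrite. A comprehension with one loop and an optional
condition becomes "make an empty list, loop, append".

```python
import itertools

def make_desugarer():
    """Return a function that rewrites listcomp nodes into foreach nodes."""
    ids = itertools.count()

    def desugar(node):
        if not isinstance(node, tuple):
            return node                   # names, constants, None
        # Rewrite children first so nested comprehensions are handled.
        node = tuple(desugar(child) for child in node)
        if node[0] != "listcomp":
            return node
        _, expr, var, seq, cond = node
        tmp = f"_lc{next(ids)}"           # fresh temporary for the result
        step = ("append", ("name", tmp), expr)
        if cond is not None:
            step = ("if", cond, step)
        return ("block",
                ("assign", tmp, ("list",)),
                ("foreach", var, seq, step),
                ("name", tmp))

    return desugar

if __name__ == "__main__":
    # Hypothetical tree for: [x * 2 for x in xs]
    tree = ("listcomp",
            ("mul", ("name", "x"), ("const", 2)),
            "x", ("name", "xs"), None)
    print(make_desugarer()(tree))
```

Because the rewrite happens before serialization, the generic generator
only ever needs to know about the plain loop and append nodes.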


> >> 1.) Instead of forcing the compiler writer to generate code, the
> >> compiler writer would only have to transform the parse tree into a
> >> structure that is name-consistent with the GCG's standard, and then
> >> use any of a number of existing libraries to dump the tree as
> >> YAML/XML.
> >
> >I like it! :)
>
> This'd be a cool thing to be sure, and useful as a human-readable
> form. (Though I'd prefer the AST that's frozen into the bytecode be
> in a form that's more efficient to deserialize) I'd best go and
> update the license terms then, to make sure there aren't any
> problems. (Old issues raised by GCC an age ago)

I'm not sure I follow this.
