Some concern about ChezScheme...


Paulo Matos

Feb 5, 2019, 8:01:20 AM
to Racket Users
Hi all,

Now that I got your attention... :)
Although the title is not purely click-bait, it is motivated by personal
requirements.

Most of us are happy with the move to Chez (actually haven't heard
anyone opposing it), but I would like to point to something I have felt
over the past year and to understand if this is just my feeling or not
from others (maybe Matthew) who have had any experience working with Chez.

I have been, for over a year on and off, trying to port Chez to RISC-V.
My problem here is really understanding the Chez architecture, rather
than the RISC-V one with which I have been working for a few years now
as a consultant.

The whole point of the work was not to get Chez on RISC-V per se but to
get Racket on RISC-V. My initial email to the Chez ML was replied to by
Andy Keep. He replied in great detail on how to create a Chez backend -
I have cleaned up his reply and have been slowly adding to it [1].

That was a great start but from there things started to fall apart.
Further emails with my questions generally went unanswered (of the 4
messages sent, only 1 got a reply) [2].

Then there is some backend rot... I noticed, for example, that there
was no Makefile for a threaded version of arm32, although the backend
file is there, meaning it should be supported. It seems that nobody
ever tried to build it. Then, software floating point is an option in a
backend config file, but if you enable it, bootstrapping doesn't work
because the compiler really expects you to have some floating-point
registers.

Matthew mentions the move to Chez will help maintainability and I am
sure he's right because he has been working with Racket for a long time
but my experience comes from looking at backend files. When you look at
them you end up being forced to look elsewhere, specifically the
cpnanopass.ss file [3]. Well, this file is the stuff of nightmares...
It's over 16000 (sixteen thousand!!!) lines of dense scheme code, whose
comments are not necessarily Chez-Beginner friendly (maybe Alexis wants
to rewrite it? [4]).

So I am a bit concerned about this. I somehow get the feeling that
what's going to happen is that Chez is going to slowly degenerate to a
Racket sub-project, and nobody is going to really use Chez directly.
Therefore this means Matthew et al. will end up maintaining it along
with Racket itself. As far as I understand it, both A. Keep and R.
Dybvig are both at Cisco and Chez is a side-project from which they are
slowly distancing themselves. Chez becoming a sub-project of Racket
might seem far-fetched until you notice Matthew is already the number 4
contributor of Chez since it was open-sourced [5].

The only question I have is really: what do other people feel about
this? Am I making any sense? Have I missed some hidden Chez community
that's working day and night on improving it? Or is Chez's sole current
purpose of existence to support Racket?

[1] https://github.com/LinkiTools/ChezScheme-RISCV/blob/wip-riscv/PORTING.md
[2]
https://groups.google.com/forum/#!searchin/chez-scheme/Paulo$20Matos%7Csort:date
[3] https://github.com/cisco/ChezScheme/blob/master/s/cpnanopass.ss
[4] https://twitter.com/lexi_lambda/status/1092539293791330305
[5] https://github.com/cisco/ChezScheme/graphs/contributors

--
Paulo Matos

Greg Trzeciak

Feb 5, 2019, 1:02:16 PM
to Racket Users


On Tuesday, February 5, 2019 at 2:01:20 PM UTC+1, Paulo Matos wrote:

So I am a bit concerned about this. I somehow get the feeling that
what's going to happen is that Chez is going to slowly degenerate to a
Racket sub-project, and nobody is going to really use Chez directly.


... and the two became one...  and from that day they were called RaChez [read: ruckus]

I could not resist the pun but hope you will get more meaningful responses than mine.
Although if someone is looking for a racket2 alternative name, #lang ruckus seems fitting (given the circumstances).

Matthew Flatt

Feb 5, 2019, 1:05:48 PM
to Paulo Matos, Racket Users
Hi Paulo,

Not to discourage other answers to your call for opinions, but here's
mine.

Granting your point about the structure of the code in Chez Scheme,
everything is relative. I still think Chez Scheme is a better starting
point than the existing Racket implementation as code to reorganize,
document, and improve. Adding a comment or two in the source is likely
a part of that process. :) We're not at a point where it makes sense to
recommend big changes to the organization of the Chez Scheme source,
but that could happen.

I think you're mistaken about Andy and Kent's position on Chez Scheme.
They will certainly not distance themselves from the project any more
than I could conceivably distance myself from Racket. But they have
priorities besides improving something that is probably just about
perfect for their purposes.

We're under no illusion that moving to Chez Scheme will change much
about who contributes to the core Racket implementation. Based on past
experience (e.g., moving `racket/gui` from C++ to Racket), the
contributor rolls will grow, but not radically. I'm delighted that you
have been willing to try a RISC-V port, and even the attempt is a kind
of success and will help things improve through feedback like this.

Personally, while my contributions to Chez Scheme so far have been
modest, I have already factored into my costs the worst-case scenario
of fully maintaining Chez Scheme as used by Racket. Even if that
happens, it still looks like a good deal in the long run.

Matthew
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Neil Van Dyke

Feb 5, 2019, 4:44:04 PM
to Racket Users
I had a related (but different and small) concern about the new
dependency a while ago (this was off-list), but it sounded like that
risk was covered, and also that Matthew has really gotten into the Chez
code.

BTW, sometime around when the move to Chez settles, it would be good if
many people were somewhat familiar with current Racket internals. 
(Personally, I know a little of it, from writing C extensions, and from
occasional intense optimizing, but I should take a fresh look, and the
Chez move seems like a great time.)

BTW, students, especially: if you've not yet been on a few projects in
which you've looked at diffs for every single change (maybe by having
them emailed), it's a learning experience, giving you a better feel for
the system and its evolution and the process, and Racket might be a good
one for that.  Discussions around changes, too, if you have access to
those.  I might soon do this myself, for Racket and/or Chez upstream,
although it feels suspiciously like work. :)

P.S., Yay, RISC-V! :)

Alex Harsanyi

Feb 5, 2019, 5:21:14 PM
to Racket Users

I guess I also have some concerns about the move to Chez, and largely for the same reasons:

* the Chez community is very small, at least when looking at the chez-scheme Google Group and Github activity.  I am glad that I'm not the only one who noticed that.

* the "maintainability" angle is questionable.  I read all the reports and watched the presentation on YouTube, but all I found can be summarized as "Matthew's opinion is that it will be more maintainable"; while his opinion carries a lot of weight, it remains an opinion. Time will tell if Racket-on-Chez will actually be more maintainable.  BTW, in my professional career I have seen a few large code rewrites whose only stated benefit was "improved maintainability", without a clear definition of what that means. After the company spent a lot of time and money, the maintainability improvement did not materialize...  Based on that experience, I remain cautious -- sorry, I have been burnt too many times :-)

* performance-wise, it is clear to me by now that the best I can hope for is for overall performance to remain largely the same when Racket 8.0? is released.  In the current snapshot, the overall performance is worse.  There is a separate google-groups thread about this, and I will try to help out, mostly by providing test cases.

Based on what I have read and seen, my best understanding of Racket-on-Chez is that Racket as a language-research platform will probably benefit from this move, and Racket as a teaching platform will not be affected either way.  What that means for software development using Racket remains to be seen.

I apologize for the negative message...
Alex.

Neil Van Dyke

Feb 5, 2019, 6:47:12 PM
to Racket Users
I appreciate the engineering diligence behind Alex's and Paulo's concerns.

Given the exemplary track record on Racket, I'm comfortable putting
faith in Matthew's assessments (like a trusted engineering colleague,
beyond the quantifiable like interim benchmarks), and there's at least
some obvious goal alignment, and there's also been a good amount of
transparency and conspicuous methodical work on the Chez work.

Thanks to Alex, I just realized that I don't recall the plans for
performance at switchover time, and going from there.  It'd be good to
get an update/reminder on the current thinking, as people have to plan
for long-term.

(My experience with a Racket production server farm is that we often
didn't pick up each new Racket version immediately, and one time I
backported a Racket enhancement across multiple versions when we had to
be especially conservative, but we wanted to keep the option to move to
use the latest version at any time, and we liked to know that we're on a
good track for long-term.)

(BTW, some of us run Racket on ancient Intel Core 2 and older smartphone
ARM, plus have Racket on a beefy new dedicated real-metal compute
server, and we use
"https://www.neilvandyke.org/racket/install-racket-versioned.sh"... so
we will know if anyone tries any funny-stuff! :)

Matt Jadud

Feb 6, 2019, 7:42:29 AM
to Paulo Matos, Racket Users
On Tue, Feb 5, 2019 at 8:01 AM 'Paulo Matos' via Racket Users <racket...@googlegroups.com> wrote:

Matthew mentions the move to Chez will help maintainability and I am
sure he's right because he has been working with Racket for a long time
but my experience comes from looking at backend files. When you look at
them you end up being forced to look elsewhere, specifically the
cpnanopass.ss file [3]. Well, this file is the stuff of nightmares...
It's over 16000 (sixteen thousand!!!) lines of dense scheme code, whose
comments are not necessarily Chez-Beginner friendly (maybe Alexis wants
to rewrite it? [4]).

Interestingly, having been in the classroom* around '98-2000  when some of these nanopass ideas were being developed (or, really, when I think they were really hitting stride in the classroom---I'm sure they were being developed well before), I find [3] to be exceedingly readable. Well, not "exceedingly": I think it would benefit from some breaking apart into separate modules. However, it uses the nanopass framework for specifying a series of well-defined languages, each of which can be checked/tested between pipeline stages. 

Some of the more gnarly code is in the register allocation... which is unsurprising. I do like that I can flip to the end, see the driver for all the passes, and each pass is a separate, match-like specification of a transformation from one language (datatype) to another. Ignoring the fact that there's support code in the file, 16KLOC suggests around 500 lines per pass (at roughly 30 passes, it looks like); 500 lines seems to me to be a manageable unit of code for a single pass of a compiler that should, if written true-to-form, do just one thing per pass. (This is, I suspect, a classic "YMMV" kind of comment.)
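
For readers who haven't seen the style, the shape described above can be sketched in a few lines. This is a purely illustrative Python mock-up (all names invented; the real framework is Scheme and far more capable): each pass is a small, match-like transformation from one explicitly defined intermediate language to the next, and a driver threads the program through them.

```python
from dataclasses import dataclass

# ---- L0, the source language: numbers, addition, negation ----
@dataclass
class Num: value: int
@dataclass
class Add: left: object; right: object
@dataclass
class Neg: expr: object

# ---- L1: like L0, but Neg is removed and Sub is added ----
@dataclass
class Sub: left: object; right: object

def remove_neg(e):
    """Pass 1 (L0 -> L1): only Neg is meaningfully transformed;
    the other forms are just rebuilt with converted children."""
    if isinstance(e, Neg):
        return Sub(Num(0), remove_neg(e.expr))
    if isinstance(e, Add):
        return Add(remove_neg(e.left), remove_neg(e.right))
    return e  # Num passes through unchanged

def fold_constants(e):
    """Pass 2 (L1 -> L1): evaluate operators with literal operands."""
    if isinstance(e, (Add, Sub)):
        l, r = fold_constants(e.left), fold_constants(e.right)
        if isinstance(l, Num) and isinstance(r, Num):
            result = l.value + r.value if isinstance(e, Add) else l.value - r.value
            return Num(result)
        return type(e)(l, r)
    return e

def compile_expr(e):
    """Driver: thread the program through every pass in order."""
    for compiler_pass in (remove_neg, fold_constants):
        e = compiler_pass(e)
    return e

print(compile_expr(Add(Num(1), Neg(Num(2)))))  # Num(value=-1)
```

Here `remove_neg` still spells out the structural recursion for every form; in the actual nanopass framework even that boilerplate is generated from the language definitions, which is what keeps each pass short.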

I can't say that I'm about to step in and join the compiler team (save us all from the thought!). I do think that it's nice to see the idea of a nanopass compiler 1) in production and 2) having the maturity to become part of the production back-end of Racket. If [1] is where some/much of Racket's backend currently lives, I am ecstatic that the backend will be more Scheme (Chez? Racket?) than C/C++.

Cheers,
Matt


* As an aside, one of the few times I remember Kent Dybvig making a "joke" in class was when he introduced the pass "remove complex operands." It was called "remove-complex-opera*." At Indiana, where Opera is a Thing, I think it was particularly funny as an inside joke of sorts. He devolved for a moment into what I can only describe as giggles---but, it was subtle just the same. It brings me a certain amount of joy to see "np-remove-complex-opera*" in [3].

Paulo Matos

Feb 6, 2019, 12:12:52 PM
to racket...@googlegroups.com


On 05/02/2019 19:05, Matthew Flatt wrote:
> Hi Paulo,
>
> Not to discourage other answers to your call for opinions, but here's
> mine.
>
> Granting your point about the structure of the code in Chez Scheme,
> everything is relative. I still think Chez Scheme is a better starting
> point than the existing Racket implementation as code to reorganize,
> document, and improve. Adding a comment or two in the source is likely
> a part of that process. :) We're not at a point where it makes sense to
> recommend big changes to the organization of the Chez Scheme source,
> but that could happen.

Hearing you say this is a great relief. It certainly shows that you
have thought that part of growing and improving Racket is improving Chez
itself. Therefore, part of contributing to Racket might be contributing
to Chez itself.

>
> I think you're mistaken about Andy and Kent's position on Chez Scheme.
> They will certainly not distance themselves from the project any more
> than I could conceivably distance myself from Racket. But they have
> priorities besides improving something that is probably just about
> perfect for their purposes.
>

Sure, I completely understand your point. I am not bashing on Andy and
Kent. My position is that, as a user, I find Racket has a lot more
community support than Chez. Also, part (if not all) of your daily work
seems to
be on Racket. Chez seems to be part of Andy's and Kent's past in the
sense that it doesn't feel like they are actively engaged with it any
longer. Again, this is no bashing of their positions, simply my feeling
as a user of both Racket and Chez. People have different paths, and they
are within their rights to pursue other interests.

> We're under no illusion that moving to Chez Scheme will change much
> about who contributes to the core Racket implementation. Based on past
> experience (e.g., moving `racket/gui` from C++ to Racket), the
> contributor rolls will grow, but not radically. I'm delighted that you
> have been willing to try a RISC-V port, and even the attempt is a kind of
> success and will help things improve through feedback like this.
>

I find the direction that Racket is taking extremely exciting, and your
viewpoint on where you see this going a breath of fresh air.

> Personally, while my contributions to Chez Scheme so far have been
> modest, I have already factored into my costs the worst-case scenario
> of fully maintaining Chez Scheme as used by Racket. Even if that
> happens, it still looks like a good deal in the long run.
>

Thanks. I guess it all makes sense.
--
Paulo Matos

Paulo Matos

Feb 6, 2019, 12:16:18 PM
to racket...@googlegroups.com


On 05/02/2019 22:44, Neil Van Dyke wrote:
> BTW, sometime around when the move to Chez settles, it would be good if
> many people were somewhat familiar with current Racket internals.

That would be absolutely great. I think if there is a small team of
contributors alongside Matthew improving Chez in order to get Racket
running on more archs and faster, it would definitely help further the
Racket cause.

This comment will get me started on writing down somewhere what I know
(and don't know) about the Racket / Chez internals and on doing some
further research.

> P.S., Yay, RISC-V! :)
>

Happy to get some help on the port. :)

--
Paulo Matos

Paulo Matos

Feb 6, 2019, 12:30:31 PM
to racket...@googlegroups.com


On 06/02/2019 13:42, Matt Jadud wrote:
> On Tue, Feb 5, 2019 at 8:01 AM 'Paulo Matos' via Racket Users
> <racket...@googlegroups.com <mailto:racket...@googlegroups.com>>
> wrote:
>
>
> Matthew mentions the move to Chez will help maintainability and I am
> sure he's right because he has been working with Racket for a long time
> but my experience comes from looking at backend files. When you look at
> them you end up being forced to look elsewhere, specifically the
> cpnanopass.ss file [3]. Well, this file is the stuff of nightmares...
> It's over 16000 (sixteen thousand!!!) lines of dense scheme code, whose
> comments are not necessarily Chez-Beginner friendly (maybe Alexis wants
> to rewrite it? [4]).
>
>
> Interestingly, having been in the classroom* around '98-2000  when some
> of these nanopass ideas were being developed (or, really, when I think
> they were really hitting stride in the classroom---I'm sure they were
> being developed well before), I find [3] to be exceedingly readable.
> Well, not "exceedingly": I think it would benefit from some breaking
> apart into separate modules. However, it uses the nanopass framework for
> specifying a series of well-defined languages, each of which can be
> checked/tested between pipeline stages. 
>

I was quite surprised to read these nanopass ideas have been around for
so long. I might have heard of them about half a decade ago at the most.
I actually thought they were pretty recent... always learning...

OK, after reading your comment and skimming through the code, it might
be that my problem is not being totally aware of the details of nanopass
compilation; therefore, when looking at the code, instead of being able
to abstract away portions of it into different functions, I just see a
huge blob of incomprehensible Scheme with absolutely no comments.

> Some of the more gnarly code is in the register allocation... which is
> unsurprising. I do like that I can flip to the end, see the driver for
> all the passes, and each pass is a separate, match-like specification of
> a transformation from one language (datatype) to another. Ignoring the
> fact that there's support code in the file, 16KLOC suggests around 500
> lines per pass (at roughly 30 passes, it looks like); 500 lines seems to
> me to be a manageable unit of code for a single pass of a compiler that
> should, if written true-to-form, do just one thing per pass. (This is,
> I suspect, a classic "YMMV" kind of comment.)
>

I guess a long comment describing some of this at the beginning of the
file would certainly be useful. In any case, as someone who has dealt
with a lot of code, most of it development-tools related, I have never
seen anything like this. It would certainly be a lot clearer if each of the
passes had their own file. For example, in GCC all passes have their own
file and they are amazingly well commented. So if you open a file like
the register renaming pass
(https://github.com/gcc-mirror/gcc/blob/master/gcc/regrename.c) although
it is close to 2000 lines of C, it's pretty readable (assuming you know
how GCC IR works, of course). Also, you know this code is doing a
specific job, instead of doing 'all jobs', as in the case of the
cpnanopass file. But given Matthew's other message, I don't want this to
come across as me whining about the state of Chez, but instead as a call
to action to improve the situation. :)

> I can't say that I'm about to step in and join the compiler team (save
> us all from the thought!). I do think that it's nice to see the idea of a
> nanopass compiler 1) in production and 2) having the maturity to become
> part of the production back-end of Racket. If [1] is where some/much of
> Racket's backend currently lives, I am ecstatic that the backend will be
> more Scheme (Chez? Racket?) than C/C++.
>
> Cheers,
> Matt
>
> [1] https://github.com/racket/racket/blob/master/racket/src/racket/src/compile.c
>

Scheme code is usually denser than C, so I am certainly less scared by
2200 lines of C than I am by 16000 lines of Scheme.

> * As an aside, one of the few times I remember Kent Dybvig making a
> "joke" in class was when he introduced the pass "remove complex
> operands." It was called "remove-complex-opera*." At Indiana, where
> Opera is a Thing, I think it was particularly funny as an inside joke of
> sorts. He devolved for a moment into what I can only describe as
> giggles---but, it was subtle just the same. It brings me a certain
> amount of joy to see "np-remove-complex-opera*" in [3].
>

Matthias Felleisen

Feb 6, 2019, 12:50:26 PM
to Paulo Matos, racket...@googlegroups.com


On Feb 6, 2019, at 12:30 PM, 'Paulo Matos' via Racket Users <racket...@googlegroups.com> wrote:

I was quite surprised to read these nanopass ideas have been around for
so long.


1. The educational idea came first: 

Educational Pearl: A Nanopass Framework for Compiler Education. Journal of Functional Programming, Volume 15, Issue 5, September 2005, pp. 653-667.

https://www.cambridge.org/core/journals/journal-of-functional-programming/article/educational-pearl-a-nanopass-framework-for-compiler-education/1E378B9B451270AF6A155FA0C21C04A3

2. The experience report of applying the idea to a commercial compiler came about 10 years later: 

A Nanopass Framework for Commercial Compiler Development. ICFP ’13, pp. 343-350.


— Matthias



Niklas Larsson

Feb 6, 2019, 1:32:39 PM
to racket...@googlegroups.com
Andy Keep did a presentation on writing a nanopass compiler a couple of years ago: https://www.youtube.com/watch?v=Os7FE3J-U5Q

That and the code on his github were very helpful when I tried to understand the nanopass framework. 

// Niklas

Paulo Matos

Feb 6, 2019, 2:22:54 PM
to Matthias Felleisen, racket...@googlegroups.com
Thanks for the references. That's really useful.

Interestingly, according to Matt, these ideas were already floating around at his university as early as '98?
--
Paulo Matos

Matthias Felleisen

Feb 6, 2019, 2:28:30 PM
to Paulo Matos, racket...@googlegroups.com

My recollection is that Kent taught with this approach because it simplified homeworks for students and graders, and I encouraged him to write it up for the “education pearl” section that I launched for JFP in ’03. It took several years to collect the papers, get them written, and publish them (back then, journals still had “print queues”) — Matthias

Hendrik Boom

Feb 6, 2019, 5:31:05 PM
to racket...@googlegroups.com
> https://dl.acm.org/citation.cfm?id=2500618

* Back in the 60's I was told that a compiler for the IBM 1401 operated
as many, many passes -- maybe 8 or 20 or so? The program would be
processed from one magnetic tape to another many, many times.

Not because the language was so interlocked. But because the machine's
memory was so small that an entire normal pass wouldn't fit in memory.

* And in, I believe, the late 70's or early 80's, the PQCC was in
development as a research tool in optimisation.

It too was structured as many passes over a single data structure
representing the program. One consistent notation was used throughout.
Why? So they could freely experiment with inserting, replacing, and
permuting passes to see what effect they had on the produced object
code.

-- hendrik

Greg Hendershott

Feb 6, 2019, 10:06:10 PM
to Racket Users
> * As an aside, one of the few times I remember Kent Dybvig making a "joke" in class was when he introduced the pass "remove complex operands." It was called "remove-complex-opera*." At Indiana, where Opera is a Thing, I think it was particularly funny as an inside joke of sorts. He devolved for a moment into what I can only describe as giggles---but, it was subtle just the same. It brings me a certain amount of joy to see "np-remove-complex-opera*" in [3].

/me changes all his passwords from "correct horse battery staple" to
"remove complex opera"

George Neuner

Feb 8, 2019, 2:21:52 AM
to racket...@googlegroups.com
>https://dl.acm.org/citation.cfm?id=2500618
>
>— Matthias


The idea that a compiler should be structured as multiple passes each
doing just one clearly defined thing is quite old. I don't have
references, but I recall some of these ideas being floated in the late
80's, early 90's [when I was in school].

Interestingly, LLVM began (circa 2000) with similar notions that the
compiler should be highly modular and composed of many (relatively
simple) passes. Unfortunately, they quickly discovered that, for a C
compiler at least, having too many passes makes the compiler very slow
- even on fast machines. Relatively quickly they started combining
the simple passes to reduce the running time.


George

Matthias Felleisen

Feb 8, 2019, 8:37:37 AM
to George Neuner, racket...@googlegroups.com
I strongly recommend that you read the article(s) to find out how different nanopasses are from the multiple-pass compilers, which probably date back to the late 60s at least. — Matthias

Christopher Lemmer Webber

Feb 8, 2019, 9:21:23 AM
to Matthew Flatt, Paulo Matos, Racket Users
Matthew Flatt writes:

> Personally, while my contributions to Chez Scheme so far have been
> modest, I have already factored into my costs the worst-case scenario
> of fully maintaining Chez Scheme as used by Racket. Even if that
> happens, it still looks like a good deal in the long run.

That's nice to hear (though I hope it doesn't need to happen).
Part of the reason I haven't been too worried is that in conversations
with you it's sounded like the goal of Racket-on-Chez has also been to
allow for *multiple* backends (e.g., I still think a Guile backend would
be interesting some day). That seems like it reduces the risk
substantially.

Jordan Johnson

Feb 8, 2019, 11:14:28 AM
to racket users
On Feb 6, 2019, at 11:28, Matthias Felleisen <matt...@felleisen.org> wrote:
On Feb 6, 2019, at 2:22 PM, 'Paulo Matos' via Racket Users <racket...@googlegroups.com> wrote:
Interestingly according to Matt these ideas were already floating around at his uni as early as 98?
My recollection is that Kent taught with this approach because it simplified homeworks for students and graders and I encouraged him to write it up for the “education pearl” section that I launched for JFP in ’03. 

When I took Kent’s compilers class (which would’ve been around ’98 or ’99), co-taught with Dan Friedman, they were in the early-ish stages of exploring the idea; it might have been the first year of implementation, and I don’t doubt it helped the graders, since the class was fairly large. I can say from a student perspective that it was really helpful for understanding what was going on: each pass had one very specific purpose — a few maybe had two, but almost all passes were straightforward on their own. By the end of the semester we had a Scheme-to-SPARC-ASM compiler written in about 36 passes.

At the time, the only special tooling we had was a form of match that had a shorthand for “recur and bind variable(s) to the result(s)”, so we had to write all the just-copy-the-AST branches of our recursion manually. A big chunk of the work in implementing nanopass was setting up a language-definition framework that’d allow the copying code to be generated.

Cheers,
Jordan

George Neuner

Feb 9, 2019, 12:57:09 PM
to racket...@googlegroups.com
I did read the article and it seems to me that the "new idea" is the
declarative tool generator framework rather than the so-called
"nanopass" approach.

The distinguishing characteristics of "nanopass" are said to be:

(1) the intermediate-language grammars are formally specified and
enforced;
(2) each pass needs to contain traversal code only for forms that
undergo meaningful transformation; and
(3) the intermediate code is represented more efficiently as records


IRs implemented using records/structs go back to the 1960s (if not
earlier).


Formally specified IR grammars go back at least to Algol (1958). I
concede that I am not aware of any (non-academic) compiler that
actually has used this approach: AFAIAA, even the Algol compilers
internally were ad hoc. But the *idea* is not new.

I can recall as a student in the late 80's reading papers about
language translation and compiler implementation using Prolog
[relevant to this in the sense of being declarative programming]. I
don't have cites available, but I was spending a lot of my library
time reading CACM and IEEE ToPL so it probably was in one of those.


I'm not sure what #2 actually refers to. I may be (probably am)
missing something, but it would seem obvious to me that one does not
write a whole lot of unnecessary code.

The article talks about deficiencies noted with various support tools
and drivers that were provided to aid students in implementing
so-called "micropass" compilers, but who wrote those tools? Not the
students. If there was much superfluous code being used or generated,
well whose fault was that?

Aside: I certainly could see it being a reaction to working with Java
where tree walking code has to be contorted to fit into the stupid
multiple dispatch and visitor patterns.


YMMV, (and it will)
George

John Clements

Feb 9, 2019, 1:33:40 PM
to George Neuner, Racket Users
Hmm… I think I disagree. In particular, I think you’re missing the notion of a DSL that allows these intermediate languages to be specified much more concisely by allowing users to write, in essence, “this language is just like that one, except that this node is added and this other one is removed.” I think it’s this feature, and its associated automatic-translation-of-untouched-nodes code, that makes it possible to consider writing a 50-pass parser that would otherwise have about 50 x 10 = 500 “create a node by applying the transformation to the sub-elements” visitor clauses. Right?
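
The point about the DSL's auto-generated clauses can be illustrated with a small sketch (Python, invented names; not the real nanopass DSL): a generic walker supplies the "create a node by applying the transformation to the sub-elements" clause once, so an individual pass writes only the clause for the node it actually changes.

```python
from dataclasses import dataclass, fields, is_dataclass

@dataclass
class Num: value: int
@dataclass
class Add: left: object; right: object
@dataclass
class Mul: left: object; right: object

def walk(node, transform):
    """Generic traversal: give the pass-specific transform first
    shot; if it declines (returns None), rebuild the node from its
    recursively walked children -- the auto-generated 'copy' clause."""
    replaced = transform(node)
    if replaced is not None:
        return replaced
    if not is_dataclass(node):
        return node  # leaves like ints pass through unchanged
    children = {f.name: walk(getattr(node, f.name), transform)
                for f in fields(node)}
    return type(node)(**children)

def double_to_add(node):
    """A pass with only its one interesting clause spelled out:
    strength-reduce (* e 2) into (+ e e)."""
    if isinstance(node, Mul) and node.right == Num(2):
        e = walk(node.left, double_to_add)
        return Add(e, e)
    return None  # decline: let the generic walker copy this node

print(walk(Add(Num(1), Mul(Num(3), Num(2))), double_to_add))
# Add(left=Num(value=1), right=Add(left=Num(value=3), right=Num(value=3)))
```

With ~10 node kinds per language, this is exactly the boilerplate whose elimination makes a 50-pass design tractable: each pass here is one clause, not one clause per node kind.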

In fact, as someone who’s about to teach a compilers class starting in April and who’s almost fatally prone to last-minute pivots, I have to ask: is anyone that you know (o great racket users list) currently using this approach or these tools? Last year I went with what I think of as the Aziz Ghuloum via Ben Lerner approach, starting with a trivial language and widening it gradually. I see now that Ghuloum was actually teaching at IU when he wrote his 2006 Scheme Workshop paper, and that although he cites about fifteen Dybvig papers, the nanopass papers don’t seem to be among them.

Hmm…

John



Benjamin Scott Lerner

Feb 9, 2019, 2:39:48 PM
to John Clements, George Neuner, Racket Users
Credit where it's due: Joe Politz (now at UCSD) came up with the first adaptation of Ghuloum's approach, and I've been riffing on his notes :-)

On Feb 9, 2019 1:34 PM, 'John Clements' via Racket Users <racket...@googlegroups.com> wrote:



 Last year I went with what I think of as the Aziz Ghuloum via Ben Lerner approach, starting with a trivial language and widening it gradually. I see now that Ghuloum was actually teaching at IU when he wrote his 2006 Scheme Workshop paper, and that although he cites about fifteen Dybvig papers, the nanopass papers don’t seem to be among them.

Hmm…

John



ra...@airmail.cc

unread,
Feb 9, 2019, 5:35:50 PM2/9/19
to George Neuner, racket...@googlegroups.com
Could nanopass, at least in theory, fuse multiple (or even all) passes
into one at compile time, to create a very efficient compiler that is
also logically broken down and readable in the source code?

Matthias Felleisen

unread,
Feb 9, 2019, 5:47:32 PM2/9/19
to ra...@airmail.cc, racket users


On Feb 9, 2019, at 5:35 PM, ra...@airmail.cc wrote:

Could nanopass, at least in theory, fuse multiple (or even all) passes into one at compile time, to create a very efficient compiler that is also logically broken down and readable in the source code?


Yes, precisely because the languages ought to be purely declarative and, especially in a Racket setting, could be isolated from the imperative parts of the language. 

No, not as they are currently used in Chez, because they use side effects on occasion to communicate between passes. See the experience report and the upstream URL to the Chez module.

Great research topic, and yes, we looked into it but moved on — Matthias


Hendrik Boom

unread,
Feb 9, 2019, 5:49:16 PM2/9/19
to Racket Users
Just wondering -- What was the original purpose in moving Racket to Chez?

-- hendrik

Alexis King

unread,
Feb 9, 2019, 6:00:42 PM2/9/19
to Hendrik Boom, Racket Users
> On Feb 9, 2019, at 16:49, Hendrik Boom <hen...@topoi.pooq.com> wrote:
>
> Just wondering -- What was the original purpose in moving Racket to Chez?

You probably want to read Matthew’s original email on the subject, from about two years ago:

https://groups.google.com/d/msg/racket-dev/2BV3ElyfF8Y/4RSd3XbECAAJ

Alexis

George Neuner

unread,
Feb 9, 2019, 7:59:15 PM2/9/19
to John Clements, racket users

On 2/9/2019 1:33 PM, 'John Clements' via Racket Users wrote:
> > On Feb 8, 2019, at 15:01, George Neuner <gneu...@comcast.net> wrote:
> >
> >
> > The distinguishing characteristics of "nanopass" are said to be:
> >
> > (1) the intermediate-language grammars are formally specified and
> > enforced;
> > (2) each pass needs to contain traversal code only for forms that
> > undergo meaningful transformation; and
> > (3) the intermediate code is represented more efficiently as records
> >
> >
> > IRs implemented using records/structs go back to the 1960s (if not
> > earlier).
> >
> >
> > Formally specified IR grammars go back at least to Algol (1958). I
> > concede that I am not aware of any (non-academic) compiler that
> > actually has used this approach: AFAIAA, even the Algol compilers
> > internally were ad hoc. But the *idea* is not new.
> >
> > I can recall as a student in the late 80's reading papers about
> > language translation and compiler implementation using Prolog
> > [relevant to this in the sense of being declarative programming]. I
> > don't have cites available, but I was spending a lot of my library
> > time reading CACM and IEEE ToPL so it probably was in one of those.
> >
> >
> > I'm not sure what #2 actually refers to. I may be (probably am)
> > missing something, but it would seem obvious to me that one does not
> > write a whole lot of unnecessary code.
>
>
> Hmm… I think I disagree. In particular, I think you’re missing the notion of a DSL that allows these intermediate languages to be specified much more concisely by allowing users to write, in essence, “this language is just like that one, except that this node is added and this other one is removed.” I think it’s this feature, and its associated automatic-translation-of-untouched-nodes code, that makes it possible to consider writing a 50-pass compiler that would otherwise have about 50 x 10 = 500 “create a node by applying the transformation to the sub-elements” visitor clauses. Right?

I was referring to the development of the compiler itself.  My comments
were directed at the perceived problems of bloat in the "micropass"
compiler infrastructure that "nanopass" is supposed to fix.  ISTM the
providers of the tools must be held accountable for any bloat they cause
... not the ones who use the tools.

In the context of real world [rather than academic] development, my
experience is that most DSLs are not of the form "<X>, plus this bit,
minus that bit", but rather are unique languages having unique semantics
that only are *perceived* to be similar to <X> due to borrowing of
syntax.  Most users of <X> don't really understand its semantics, and
whatever <X>-like DSLs they create will share their flawed understanding.

Real world DSLs - whether compiled or interpreted - often are grown
incrementally in a "micropass" like way.  But most developers today do
not have a CS education and will not be using any kind of compiler
development "framework".  Many even eschew tools like parser generators
because their formal approaches are perceived to be "too complicated". 
Under these circumstances, there will not be much superfluous code
written [or generated] that is not either a false start to be thrown
away, or some kind of unit test.


To be clear, I have no problem with "nanopass" - I think it's a fine
idea.  But I wonder how well it will transition into the real world when
students exposed to the methodology leave school and go to work.  Will
the tools be available for business use and will they be up to the
demands of real world development?  Will they reduce development effort
enough to be adopted outside academia?

Hendrik Boom

unread,
Feb 9, 2019, 10:09:19 PM2/9/19
to Racket Users
Thank you. This is very clear.

-- hendrik

John Clements

unread,
May 27, 2019, 10:56:49 PM5/27/19
to George Neuner, Racket Users
I’m responding to my own message, because (thanks to Andy Keep) I’ve now discovered a big chunk of the answer.

Specifically, it looks like Jeremy Siek’s compilers class includes a textbook written by him and Ryan Newton whose preface appears to answer all of my questions; in particular, they did merge Ghuloum’s approach with nanopasses.

https://iu.instructure.com/courses/1735985

https://github.com/IUCompilerCourse/Essentials-of-Compilation

> On Feb 9, 2019, at 10:33, John Clements <clem...@brinckerhoff.org> wrote:
>
>
>
>> On Feb 8, 2019, at 15:01, George Neuner <gneu...@comcast.net> wrote:
>>
> Hmm… I think I disagree. In particular, I think you’re missing the notion of a DSL that allows these intermediate languages to be specified much more concisely by allowing users to write, in essence, “this language is just like that one, except that this node is added and this other one is removed.” I think it’s this feature, and its associated automatic-translation-of-untouched-nodes code, that makes it possible to consider writing a 50-pass compiler that would otherwise have about 50 x 10 = 500 “create a node by applying the transformation to the sub-elements” visitor clauses. Right?
>
> In fact, as someone who’s about to teach a compilers class starting in April and who’s almost fatally prone to last-minute pivots, I have to ask: is anyone that you know (o great racket users list) currently using this approach or these tools? Last year I went with what I think of as the Aziz Ghuloum via Ben Lerner approach, starting with a trivial language and widening it gradually. I see now that Ghuloum was actually teaching at IU when he wrote his 2006 Scheme Workshop paper, and that although he cites about fifteen Dybvig papers, the nanopass papers don’t seem to be among them.
>
> Hmm…
>
> John
>



Paulo Matos

unread,
May 28, 2019, 4:56:49 AM5/28/19
to racket...@googlegroups.com


On 28/05/2019 04:56, 'John Clements' via Racket Users wrote:
> I’m responding to my own message, because (thanks to Andy Keep) I’ve now discovered a big chunk of the answer.
>
> Specifically, it looks like Jeremy Siek’s compilers class includes a textbook written by him and Ryan Newton whose preface appears to answer all of my questions; in particular, they did merge Ghuloum’s approach with nanopasses.
>
> https://iu.instructure.com/courses/1735985
>
> https://github.com/IUCompilerCourse/Essentials-of-Compilation
>

These are very good references on compilation that I was not aware of.
Thanks for posting.

--
Paulo Matos