Should Rune have an AST?

4 views
Skip to first unread message

Bill Cox

unread,
Jun 4, 2023, 10:23:38 AM6/4/23
to Rune language discussion
I am starting on porting a parser generator I wrote a few years back to Rune.  This will give Rune the ability to support Domain Specific Languages (DSLs), where users can extend Rune's syntax and define the behavior for the new syntax.  I've delayed the decision on whether to have an AST or not as long as possible: a decision is needed now before I can continue working on the bootstrap.

The C Rune compiler parsers directly into the HIR (high-level intermediate representation), using Bison, a free software version of Yacc.This is faster and uses a bit less memory than creating an AST, but has the following problems:
  • The rules were modified to include "headers" all over, where we create an HIR parent object, such as a Function, before parsing the parameter list.
  • Rune code would need to be attached to each production (one | in a rune).  Bison and Yacc do this, but do not support Rune.
  • The syntax file is harder to read and manipulate when the code to be executed for each reduction is scattered throughout the syntax rules.
  • We can complete an AST parser generator faster, because reduce actions written in Rune have to be compiled by the Rune compiler, which would have to be the C Rune compiler.
Some folks might argue that another limitation in skipping the AST step is we lose the ability to apply Rust-like macros to the AST before generating the HIR.  This isn't a significant issue, IMO, because manipulating the HIR is not significantly harder than manipulating the AST, and has the advantage of more readable code.  The HIR can be manipulated in pretty much any way you like, using Rune's "transformers' '.

I am leaning slightly towards building an AST in part because it makes my brain hurt less.  I think in the 1960's and 1970's ASTs were not in favor because they simply could not afford the memory hit in their one-file-at-a tile compilers.  Having an AST for just one module in memory is insignificant compared to the data structures Rune has for compiling a shared library or binary at once.

Should the Rune compiler generate a module's AST, and immediately convert it to HIR?

Bill


Andrew Wilson

unread,
Jun 5, 2023, 9:35:14 AM6/5/23
to Bill Cox, Rune language discussion
Yes, the Rune compiler should generate an AST.  It allows one to break the front-end into multiple passes: each pass becomes much simpler to maintain, and one can add new passes to perform new analyses more easily over time.

In my experience, trying to do everything all at once leads to fragile code that becomes expensive to maintain.

--
You received this message because you are subscribed to the Google Groups "Rune language discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rune-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rune-discuss/CAH9QtQGUOpS0UDJpp6UZyesB%2Bcs8DYRxa1VFz-S-jBaKiSVPWw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


--
Andrew Wilson
Software Engineer, Android TV Eng Prod

Bill Cox

unread,
Jun 12, 2023, 12:52:36 PM6/12/23
to Andrew Wilson, Rune language discussion
IIUC, an AST is just a cleaned up parse tree.  There are a few goobers in the current parse tree, but we might eliminate them by modifying the PEG rules.  For example, right now binary expressions are represented as (expr (<operator keyword> expr)).  It would be nice to have e.g. add(expr expr).  This can be done by listing each operator as a separate rule, rather than having an operatorType sub-rule, and then the actual operator keywords can be auto-removed by making them weak.

Is there anything I'm not thinking of needed, or is can the parse tree be our AST?

Andrew Wilson

unread,
Jun 12, 2023, 12:54:47 PM6/12/23
to Bill Cox, Rune language discussion
Sounds good to me. 

Bill Cox

unread,
Jun 12, 2023, 1:12:46 PM6/12/23
to Andrew Wilson, Rune language discussion
Actually, there does seem to be enough work to make converting the parse tree in to an AST to be worthwhile.  I was not able to modify the rules to trivially convert our PEG rules into left-associative expression trees.  They are currently grouped the wrong way (right associative).

This needs some work...

Andrew Wilson

unread,
Jun 12, 2023, 1:14:17 PM6/12/23
to Bill Cox, Rune language discussion
you know that there is a paper on making Peg parsing accept left-recursive grammars:

Andrew Wilson

unread,
Jun 12, 2023, 1:16:26 PM6/12/23
to Bill Cox, Rune language discussion
Reply all
Reply to author
Forward
0 new messages