Writing a "command-line front-end" for a language


Reuben Thomas

Sep 3, 2022, 7:01:23 AM
to Racket Users
I'm a relative Racket newbie, and I've just enjoyed Beautiful Racket.

I am working on a Racket implementation of a simple assembler (for the Hack VM in the NAND2Tetris course).

I have a partial implementation up and running using #lang lines. I would like to add a more traditional command-line interface, so I can (eventually) say:

hackasm foo.asm

on a file without a #lang line.

My code is available at https://github.com/rrthomas/hackasm

Here's the nub of the problem: I can't work out how to call the language evaluator "manually". I have implemented the language as a dialect, so that the "main.rkt" module is "free" to be used for the command-line interface. (Perhaps this can be fixed too, that would be nice!)

A typical assembler file might start like this:

#lang hackasm/asm
@2
D=A
@3

When I run this file (e.g. in DrRacket), I get some output as expected:

0000000000000010
1110110000010000
0000000000000011

(The assembler outputs ASCII-encoded binary!)

The contents of my main.rkt look like this:

#lang br/quicklang
(require "parser.rkt" "tokenizer.rkt" (submod "asm.rkt" reader))

(module+ main
  (require racket/cmdline)
  (let ((filename
         (command-line
          #:program "hackasm" ;
          #:args (filename)
          filename)))
    (read-syntax filename (open-input-file filename))))

So far, all I've worked out how to do is run the language's read-syntax function (imported from parser.rkt), and thereby return the parsed syntax object as the result.

What I'd like to do is call the evaluator on the parse tree, but after a lot of scratching my head over the Racket documentation and search results, I cannot work out how to do that. I presume the code should look something like:

(eval (??? (read-syntax filename (open-input-file filename))))

where in the end I'm eval-ing a form, and where ??? represents something that turns the syntax object into a runnable module.

Apologies for the length of this post (I was unsure how I could make it shorter), and thanks in advance for any help!

--

Shu-Hung You

Sep 3, 2022, 2:10:22 PM
to Racket Users
---------- Forwarded message ---------
From: Shu-Hung You <shh...@u.northwestern.edu>
Date: Sat, Sep 3, 2022 at 1:03 PM
Subject: Re: [racket-users] Writing a "command-line front-end" for a language
To: Reuben Thomas <r...@sc3d.org>


Running `racket foo.asm` will produce the desired output, so a shell
script that directly passes the arguments to Racket could work.
Otherwise, just use (dynamic-require filename #f) in main.rkt.

At the technical level, foo.asm is in fact an ordinary Racket module,
just like any other .rkt file. Therefore it can be run in the same way
using APIs that require and instantiate modules.
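
As a minimal sketch of the dynamic-require route (not taken from the hackasm repository): the helper name run-file is made up for illustration, and the input file is assumed to contain a module declaration, for example via a #lang line.

```racket
#lang racket/base

;; Hypothetical helper: instantiate the module stored in `filename`,
;; running its body for effect. The file must declare a module
;; (for example, by starting with a #lang line).
(define (run-file filename)
  ;; Converting to a complete path avoids relying on the restricted
  ;; syntax that plain strings must follow to count as module paths.
  (dynamic-require (path->complete-path filename) #f))
```

A main submodule could call (run-file filename) with the file name obtained from command-line, as in the main.rkt quoted earlier in the thread.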

-----------

On a side note, the forum has mostly moved to Discourse
(https://racket.discourse.group/).

Philip McGrath

Sep 3, 2022, 11:13:51 PM
to Racket Users, Reuben Thomas, Shu-Hung You
On Sat, Sep 3, 2022, at 2:09 PM, Shu-Hung You wrote:
> ---------- Forwarded message ---------
> From: Shu-Hung You <shh...@u.northwestern.edu>
> Date: Sat, Sep 3, 2022 at 1:03 PM
> Subject: Re: [racket-users] Writing a "command-line front-end" for a language
> To: Reuben Thomas <r...@sc3d.org>
>
>
> Running `racket foo.asm` will produce the desired output, so a shell
> script that directly passes the arguments to Racket could work.
> Otherwise, just use (dynamic-require filename #f) in main.rkt.
>
> At the technical level, foo.asm is in fact an ordinary Racket module,
> just like any other .rkt file. Therefore it can be run in the same way
> using APIs that require and instantiate modules.
>
> -----------
>
> On a side note, the forum has mostly moved to Discourse
> (https://racket.discourse.group/).
>

This is all correct, and you can also make just `./foo.asm` work: https://docs.racket-lang.org/guide/scripts.html

However, in some cases you might really want a program other than `racket` as the entry point for your language: for instance, maybe you want to have flags for controlling where the output goes. One example of such a program is the `scribble` executable included in the main Racket distribution. The implementation is in <https://github.com/racket/scribble/blob/master/scribble-lib/scribble/run.rkt>, and the associated "info.rkt" file (<https://github.com/racket/scribble/blob/master/scribble-lib/scribble/info.rkt>) arranges for `raco setup` to create a `scribble` to run it. (This example uses the old mzscheme-launcher-names/mzscheme-launcher-libraries instead of the newer racket-launcher-names/racket-launcher-libraries: see documentation at <https://docs.racket-lang.org/raco/setup-info.html#%28idx._%28gentag._18._%28lib._scribblings%2Fraco%2Fraco..scrbl%29%29%29>.)
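
For illustration, an "info.rkt" using the newer field names might look like the sketch below. The launcher name "hackasm" and the library "main.rkt" are assumptions about how the project would be laid out, not something taken from the repository.

```racket
#lang info
;; Hypothetical launcher setup: `raco setup` would create an executable
;; named "hackasm" that runs main.rkt from this collection.
(define racket-launcher-names '("hackasm"))
(define racket-launcher-libraries '("main.rkt"))
```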

A couple additional details:

> On Sat, Sep 3, 2022 at 6:01 AM 'Reuben Thomas' via Racket Users
> <racket...@googlegroups.com> wrote:
>>
>> I have a partial implementation up and running using #lang lines. I would like to add a more traditional command-line interface, so I can (eventually) say:
>>
>> hackasm foo.asm
>>
>> on a file without a #lang line.
>>
>> My code is available at https://github.com/rrthomas/hackasm
>>
>> [...]
>>
>> So far, all I've worked out how to do is run the language's read-syntax function (imported from parser.rkt), and thereby return the parsed syntax object as the result.
>>
>> What I'd like to do is call the evaluator on the parse tree, but after a lot of scratching my head over the Racket documentation and search results, I cannot work out how to do that.

It is possible to use Racket to implement languages that don't use #lang, but you would lose many advantages like IDE support and well-defined separate compilation, and you would need to use some fairly low-level mechanisms. Unless there is a hard requirement, I'd recommend that you just use #lang in your programs. For example, the whole family of languages supported by the `scribble` command-line tool uses #lang. (Indeed, #lang is how the tool can support a whole *family* of languages.)

> On Sat, Sep 3, 2022 at 6:01 AM 'Reuben Thomas' via Racket Users
> <racket...@googlegroups.com> wrote:
>>
>> I have implemented the language as a dialect, so that the "main.rkt" module is "free" to be used for the command-line interface. (Perhaps this can be fixed too, that would be nice!)
>>
>> [...]
>>
>> The contents of my main.rkt looks like this:
>>
>> #lang br/quicklang
>> (require "parser.rkt" "tokenizer.rkt" (submod "asm.rkt" reader))
>>
>> (module+ main
>> (require racket/cmdline)
>> (let ((filename
>> (command-line
>> #:program "hackasm" ;
>> #:args (filename)
>> filename)))
>> (read-syntax filename (open-input-file filename))))
>>

There are many possible ways to organize this: to some extent it's a matter of how you expect your language and cli to be used, and to some extent it's a matter of taste. I wouldn't consider your current organization "wrong", necessarily. But, if you'd like hackasm to be a multi-purpose entry point, one way to do that would be:

1. Move "expander.rkt" to "main.rkt"

2. Add a reader submodule like the one in "asm.rkt", but using just hackasm where you currently have hackasm/expander in the result of read-syntax. Optionally, you might consider using syntax/module-reader (Guide: <https://docs.racket-lang.org/guide/syntax_module-reader.html> Reference: <https://docs.racket-lang.org/syntax/reader-helpers.html#%28mod-path._syntax%2Fmodule-reader%29>) with the #:whole-body-readers? option for your reader submodule.

3. Add a main submodule like the one in the current version of "main.rkt", but implemented more like scribble/run. You might want to write it as `(module* main racket/base ...)` unless you want to implement the command line tool in your hackasm language.

If you do that, `(require hackasm)`, `(module foo hackasm)` and `#lang hackasm` will all access your language, and running the file will run your command-line tool.

Personally, I might instead combine "expander.rkt" and "asm.rkt" into "main.rkt", but put the cli tool in "cli.rkt". The name you give the tool via racket-launcher-names doesn't have to match the file name of the implementation, and I might find it easier to experiment with the language in DrRacket if the tool were separate. But it all depends on what seems most useful to you and your users.

-Philip

Reuben Thomas

Sep 4, 2022, 5:21:04 AM
to shh...@u.northwestern.edu, Racket Users
On Sat, 3 Sept 2022 at 19:10, Shu-Hung You <shh...@u.northwestern.edu> wrote:
Running `racket foo.asm` will produce the desired output, so a shell
script that directly passes the arguments to Racket could work.
Otherwise, just use (dynamic-require filename #f) in main.rkt.

Thanks for helping!

Don't both of these methods require a #lang line in the input file? That's not part of the assembly format, so I want to be able to specify the language in the main module. Indeed, when I try it with a file with a #lang line, dynamic-require works; when I remove that line, I get an error about a missing module declaration (no surprise). I can see an obvious workaround, namely to slurp the file and prepend a module declaration before dynamic-requiring it, but that's ugly.

So it seems that in fact what I want is to call something like dynamic-require with a module object. But I'm not sure what to call or how to get one of those: read-syntax returns a syntax object, not a module, while I don't (yet) know how to apply my expander's #%module-begin to it to obtain a module.

At the technical level, foo.asm is in fact an ordinary Racket module,
just like any other .rkt file. Therefore it can be run in the same way
using APIs that require and instantiate modules.

Right! That's what I've obviously not fully understood yet.

(Thanks for the side note about moving to Discourse; it's a while since I've been active on the list!)

--

Reuben Thomas

Sep 4, 2022, 5:31:39 AM
to Philip McGrath, Racket Users, Shu-Hung You
On Sun, 4 Sept 2022 at 04:13, Philip McGrath <phi...@philipmcgrath.com> wrote:

However, in some cases you might really want a program other than `racket` as the entry point for your language: for instance, maybe you want to have flags for controlling where the output goes. One example of such a program is the `scribble` executable included in the main Racket distribution. The implementation is in <https://github.com/racket/scribble/blob/master/scribble-lib/scribble/run.rkt>, and the associated "info.rkt" file (<https://github.com/racket/scribble/blob/master/scribble-lib/scribble/info.rkt>) arranges for `raco setup` to create a `scribble` to run it. (This example uses the old mzscheme-launcher-names/mzscheme-launcher-libraries instead of the newer racket-launcher-names/racket-launcher-libraries: see documentation at <https://docs.racket-lang.org/raco/setup-info.html#%28idx._%28gentag._18._%28lib._scribblings%2Fraco%2Fraco..scrbl%29%29%29>.)

Thanks for the pointer.
 
It is possible to use Racket to implement languages that don't use #lang, but you would lose many advantages like IDE support and well-defined separate compilation, and you would need to use some fairly low-level mechanisms. Unless there is a hard requirement, I'd recommend that you just use #lang in your programs.

I'm trying to write a standalone assembler (nothing to do with Racket), so I'm happy to lose this advantage!

There are many possible ways to organize this

Thanks for this, that's exactly what I was after.

--

Reuben Thomas

Sep 4, 2022, 10:00:38 AM
to Philip McGrath, Racket Users, Shu-Hung You
OK, I've had another look, and I still can't see how to do this, so I would appreciate a hint. I have updated my repo to add a launcher script, but again this only works with files that have a #lang line. As I said before, I have worked out how to parse a file without a #lang line (just pass it to my language's read-syntax), but I can't work out how to turn that into a module and execute it. I guess I need something like dynamic-require that takes a syntax object and an expander module or #%module-begin macro as arguments?

--

Philip McGrath

Sep 4, 2022, 2:39:25 PM
to Reuben Thomas, Racket Users, Shu-Hung You
On Sun, Sep 4, 2022, at 10:00 AM, Reuben Thomas wrote:
On Sun, 4 Sept 2022 at 10:31, Reuben Thomas <r...@sc3d.org> wrote:
On Sun, 4 Sept 2022 at 04:13, Philip McGrath <phi...@philipmcgrath.com> wrote:

However, in some cases you might really want a program other than `racket` as the entry point for your language: for instance, maybe you want to have flags for controlling where the output goes. One example of such a program is the `scribble` executable included in the main Racket distribution. The implementation is in <https://github.com/racket/scribble/blob/master/scribble-lib/scribble/run.rkt>, and the associated "info.rkt" file (<https://github.com/racket/scribble/blob/master/scribble-lib/scribble/info.rkt>) arranges for `raco setup` to create a `scribble` to run it. (This example uses the old mzscheme-launcher-names/mzscheme-launcher-libraries instead of the newer racket-launcher-names/racket-launcher-libraries: see documentation at <https://docs.racket-lang.org/raco/setup-info.html#%28idx._%28gentag._18._%28lib._scribblings%2Fraco%2Fraco..scrbl%29%29%29>.)

Thanks for the pointer.
 
It is possible to use Racket to implement languages that don't use #lang, but you would lose many advantages like IDE support and well-defined separate compilation, and you would need to use some fairly low-level mechanisms. Unless there is a hard requirement, I'd recommend that you just use #lang in your programs.

I'm trying to write a standalone assembler (nothing to do with Racket), so I'm happy to lose this advantage!

You may indeed want a tool that supports files without #lang if you are working with an existing language and there isn't a way to make the #lang line acceptable to its existing grammar. That's about the only situation I can think of. Almost always, things will work better in the long run if you can make #lang work somehow, even for languages that have "nothing to do with Racket". For example:
  • Zuo programs start with #lang, even though the primary implementation of `#lang zuo/kernel` is written in C and runs without Racket. (IIUC Sam Tobin-Hochstadt has an almost-complete implementation of zuo/kernel in Racket, but the C implementation is needed to build Racket itself.) See <https://docs.racket-lang.org/zuo/>.
  • The `jsonic` and BASIC languages from Beautiful Racket have fairly little to do with Racket, but #lang is the basis for the syntax coloring and indentation mechanisms introduced in <https://beautifulracket.com/jsonic-2/drracket-integration.html>. Despite the title of the chapter, this isn't limited to DrRacket: you also get editor support for your language in Emacs' racket-mode, VSCode, and other clients of the Language Server Protocol.
    (Small caveat: I have not actually read Beautiful Racket, just looked at it admiringly, recommended it to others, and wished MB had written it a year earlier than he did.)

OK, I've had another look, and I still can't see how to do this, so I would appreciate a hint. I have updated my repo to add a launcher script, but again this only works with files that have a #lang line. As I said before, I have worked out how to parse a file without a #lang line (just pass it to my language's read-syntax), but I can't work out how to turn that into a module and execute it. I guess I need something like dynamic-require that takes a syntax object and an expander module or #%module-begin macro as arguments?

For running a file without a #lang, you will need to use `eval`. Probably the right example to follow is the implementation of the plt-r6rs command-line tool (part of the main Racket distribution; source at <https://github.com/racket/r6rs/blob/master/r6rs-lib/r6rs/run.rkt>), which uses `get-module-code` from <https://docs.racket-lang.org/syntax/module-helpers.html#%28def._%28%28lib._syntax%2Fmodcode..rkt%29._get-module-code%29%29> and similar functions to control low-level evaluation and compilation. See more details about plt-r6rs at <https://docs.racket-lang.org/r6rs/Running_Top-Level_Programs.html> and in its --help output.

A different way to use eval would be to use `define-namespace-anchor` in the module that provides your hackasm language, explicitly read-syntax the given file, and evaluate the expressions with current-namespace parameterized to a namespace created using namespace-anchor->namespace.
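
A minimal, self-contained sketch of the namespace-anchor approach, under some loud assumptions: a-instruction is a hypothetical stand-in for whatever the hackasm module actually defines, run-forms is a made-up helper, and the input is read with plain `read` from a string rather than with the language's own read-syntax from a file.

```racket
#lang racket/base

;; Hypothetical stand-in for the assembler: encode an A-instruction
;; as 16 ASCII-encoded bits.
(define (a-instruction n)
  (define bits (number->string n 2))
  (string-append (make-string (- 16 (string-length bits)) #\0) bits))

;; The anchor lets code evaluated later see this module's definitions.
(define-namespace-anchor anchor)
(define ns (namespace-anchor->namespace anchor))

;; Read every form from the port and evaluate it in the anchored namespace.
(define (run-forms in)
  (parameterize ([current-namespace ns])
    (for ([form (in-port read in)])
      (displayln (eval form)))))

(run-forms (open-input-string "(a-instruction 2) (a-instruction 3)"))
```

Running this prints 0000000000000010 and 0000000000000011 on separate lines. In a real tool the forms would come from the language's reader, and the evaluated expressions would be whatever the parse tree contains.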

Yet another option would be to split your files in two: have files in your language without #lang and separate files in a wrapper #lang that uses include-at/relative-to/reader <https://docs.racket-lang.org/reference/include.html> to get the contents of the non-#lang files. A simple approach would have a wrapper file for every #lang-less file, and it could automatically include a file of the same name but a different extension. More elaborate wrapper languages would also be possible: it could work like a sort of linker specification. I mention this because it can be a useful technique sometimes, though I don't think it's what you want. For an example of how this sort of thing can be useful, see `#lang s-exp srfi/provider`, which simplifies making SRFI libraries available under several different names, for compatibility: it's implemented in <https://github.com/racket/srfi/blob/master/srfi-lite-lib/srfi/provider.rkt> and used e.g. in <https://github.com/racket/srfi/blob/master/srfi-lite-lib/srfi/1.rkt> and <https://github.com/racket/srfi/blob/master/srfi-lib/srfi/%253a63/arrays.rkt>.

Hope this helps!

-Philip

Shu-Hung You

Sep 4, 2022, 9:20:31 PM
to Reuben Thomas, Racket Users
On Sun, Sep 4, 2022 at 4:21 AM Reuben Thomas <r...@sc3d.org> wrote:
>
> On Sat, 3 Sept 2022 at 19:10, Shu-Hung You <shh...@u.northwestern.edu> wrote:
>>
>> Running `racket foo.asm` will produce the desired output, so a shell
>> script that directly passes the arguments to Racket could work.
>> Otherwise, just use (dynamic-require filename #f) in main.rkt.
>
>
> Thanks for helping!
>
> Don't both of these methods require a #lang line in the input file? That's not part of the assembly format, so I want to be able to specify the language in the main module. Indeed, when I try it with a file with a #lang line, dynamic-require works; when I remove that line, I get an error about a missing module declaration (no surprise). I can see an obvious workaround, namely to slurp the file and prepend a module declaration before dynamic-requiring it, but that's ugly.
>
> So it seems that in fact what I want is to call something like dynamic-require with a module object. But I'm not sure what to call or how to get one of those: read-syntax returns a syntax object, not a module, while I don't (yet) know how to apply my expander's #%module-begin to it to obtain a module.
>

Okay, if you want to bypass the #lang protocol entirely, here is the
needed code. As you have expected, it uses eval and then calls
dynamic-require.

diff --git a/asm.rkt b/asm.rkt
index f2f1e89..4d024d8 100644
--- a/asm.rkt
+++ b/asm.rkt
@@ -6,6 +6,7 @@

 (define (read-syntax path port)
   (define parse-tree (parse path (make-tokenizer port path)))
-  (strip-bindings
-   #`(module hackasm-mod hackasm/expander
-       #,parse-tree)))
+  (datum->syntax
+   #f
+   `(,#'module hackasm-mod hackasm/expander
+     ,parse-tree)))
diff --git a/main.rkt b/main.rkt
index 9f2af0b..9cccf22 100644
--- a/main.rkt
+++ b/main.rkt
@@ -8,4 +8,6 @@
           #:program "hackasm" ; FIXME: get name from project
           #:args (filename)
           filename)))
-    (dynamic-require filename #f)))
\ No newline at end of file
+    (parameterize ([current-namespace (make-base-empty-namespace)])
+      (eval (read-syntax filename (open-input-file filename)))
+      (dynamic-require '(quote hackasm-mod) #f))))

There are two technical details. The eval function takes pretty much
everything --- plain values, syntax objects, or just S-expressions,
etc. For eval, the difference between syntax objects and S-expressions
is that syntax objects carry binding information with them, therefore
eval can correctly run the code without the risk of misinterpreting
identifiers. The syntax object that your read-syntax returns is almost
runnable, so I use eval to evaluate the resulting module form (i.e.
#'(module hackasm-mod hackasm/expander ...)). This will declare a
module called ‘hackasm-mod’ together with its code in the current
namespace's module registry. Subsequently, dynamic-require
instantiates the module ‘hackasm-mod’ to run its body. The reference
https://docs.racket-lang.org/reference/module.html#%28form._%28%28quote._~23~25kernel%29._module%29%29
specifies what evaluating a module form results in (search for
"evaluation"). In dynamic-require, the module path (quote hackasm-mod)
refers to the module declared with the name ‘hackasm-mod’. In the more
common cases, module paths would be complete file paths. The page
https://docs.racket-lang.org/guide/module-paths.html explains the
syntax of module paths.
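
To see the two steps in isolation, here is a minimal sketch that does not depend on the hackasm repository. The module name demo-mod, its body, and the helper run-module-form are made up for illustration, and racket/base stands in for hackasm/expander as the module's language.

```racket
#lang racket/base

;; Build a module form by hand. Only the `module` identifier carries a
;; binding (the one visible in this file); everything else is a plain
;; datum. `demo-mod` and its body are hypothetical.
(define demo-form
  (datum->syntax
   #f
   `(,#'module demo-mod racket/base
      (displayln "hello from demo-mod"))))

;; Declare the module in a fresh namespace, then instantiate it.
(define (run-module-form form)
  (parameterize ([current-namespace (make-base-empty-namespace)])
    (eval form)                        ; declares demo-mod in the registry
    (dynamic-require ''demo-mod #f)))  ; instantiates it, running the body

(run-module-form demo-form)
```

Here ''demo-mod is just shorthand for '(quote demo-mod), the same kind of module path used in the patch.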

In the original read-syntax, I suppose strip-bindings removes all
binding information associated with the given syntax objects.
Consequently, eval would fail to interpret the resulting module form
because the "module" identifier in it is unbound and thus has no
meaning. To fix the issue, I explicitly use the #'module identifier
found in asm.rkt (which is the Racket binding of ‘module’, brought
into context by #lang br/quicklang). Then datum->syntax turns the
entire list into a syntax object with no binding information.
Equivalently, instead of changing read-syntax you can manually fix the
#'module identifier before `eval` using the low-level APIs syntax-e
and datum->syntax.

Reuben Thomas

Sep 8, 2022, 9:03:35 AM
to Philip McGrath, Racket Users, Shu-Hung You
On Sun, 4 Sept 2022 at 19:39, Philip McGrath <phi...@philipmcgrath.com> wrote:

You may indeed want a tool that supports files without #lang if you are working with an existing language and there isn't a way to make the #lang line acceptable to its existing grammar.

That's exactly it!
 
Despite the title of the chapter, this isn't limited to DrRacket: you also get editor support for your language in Emacs' racket-mode, VSCode, and other clients of the Language Server Protocol.

This is very cool! I didn't know until now. In particular, pleasing to this Emacs user!
  • (Small caveat: I have not actually read Beautiful Racket, just looked at it admiringly, recommended it to others, and wished MB had written it a year earlier than he did.)
I think I've already said I found it a good introduction; I will reiterate that recommendation!

Thanks very much for the further advice.

--

Reuben Thomas

Sep 8, 2022, 9:04:19 AM
to shh...@u.northwestern.edu, Racket Users
On Mon, 5 Sept 2022 at 02:20, Shu-Hung You <shh...@u.northwestern.edu> wrote:

Okay, if you want to bypass the #lang protocol entirely, here is the
needed code. As you have expected, it uses eval and then calls
dynamic-require.

Thanks very much for this code and detailed explanation, that was a great help.

--