Details on in-editor macroexpansion in DrRacket?


Colin Fleming

May 14, 2018, 12:43:15 AM
to racke...@googlegroups.com
Hi all,

I work on Cursive, which is an IDE for Clojure code based on IntelliJ. I've spoken to several of you at various conferences over the years.

I'm interested in understanding better how the macroexpansion to support editor functionality in DrRacket works. By contrast, Cursive doesn't expand macros during editing and relies on an extension API to teach it about macro semantics. It's not ideal, but it does have some advantages - it's safe and it's fast, and you can add support for some really crazy things that macros do. However currently it requires me to add support for popular macros, although I do have plans to open source that part so that users will be able to add support for either third-party macros that they use, or macros that they've developed themselves.

My understanding is that using macroexpansion in the way that DrRacket does requires a fairly deep integration between the macroexpander, the macros themselves and the IDE - is that a fair statement?

I guess that things like source location for forms which are carried from the unexpanded source to the macroexpansion are more or less automatic. How does this work for synthesised forms which are either composed from forms in the original source, or don't exist in the original source at all? For example, playing around with DrRacket I can see that after (struct node ([left : Tree] [right : Tree])), hovering over node-left or node-right will indicate that the node part comes from the struct definition, and the left/right part will come from the fields. Are there annotations specifying partial ranges within symbols to allow this?

I know there are features for tagging forms with information that will appear in tooltips and the like, e.g. Typed Racket showing type information in the editor. Are there other facilities for communicating with the user?

How well does all this work for macros which aren't written with DrRacket in mind?

And finally, of course - is there some documentation about all this I could look at? Is this all implemented with syntax object properties?

Thanks for any and all help!

Cheers,
Colin


Alex Knauth

May 14, 2018, 11:31:56 AM
to Colin Fleming, Racket Dev

On May 14, 2018, at 12:42 AM, Colin Fleming <colin.ma...@gmail.com> wrote:

> Hi all,

> I work on Cursive, which is an IDE for Clojure code based on IntelliJ. I've spoken to several of you at various conferences over the years.

> I'm interested in understanding better how the macroexpansion to support editor functionality in DrRacket works. By contrast, Cursive doesn't expand macros during editing and relies on an extension API to teach it about macro semantics. It's not ideal, but it does have some advantages - it's safe and it's fast, and you can add support for some really crazy things that macros do. However currently it requires me to add support for popular macros, although I do have plans to open source that part so that users will be able to add support for either third-party macros that they use, or macros that they've developed themselves.

> My understanding is that using macroexpansion in the way that DrRacket does requires a fairly deep integration between the macroexpander, the macros themselves and the IDE - is that a fair statement?

> I guess that things like source location for forms which are carried from the unexpanded source to the macroexpansion are more or less automatic. How does this work for synthesised forms which are either composed from forms in the original source, or don't exist in the original source at all? For example, playing around with DrRacket I can see that after (struct node ([left : Tree] [right : Tree])), hovering over node-left or node-right will indicate that the node part comes from the struct definition, and the left/right part will come from the fields. Are there annotations specifying partial ranges within symbols to allow this?

Since `struct` creates a new name that wasn't in the source, but built from parts of the source, it uses the syntax property `'sub-range-binders` to communicate this to DrRacket [1].

However, if a macro doesn't "make up" names like struct does, it doesn't need to worry about this.
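As a rough illustration of what attaching `'sub-range-binders` can look like, here is a minimal sketch. The macro name `define/hyphen` and the names it defines are hypothetical, and the six-element vectors (made-up identifier, start offset, length, source identifier, start offset, length) are my reading of the property's format rather than anything taken from `struct` itself.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; Hypothetical macro: (define/hyphen a b rhs) defines the made-up name `a-b`.
(define-syntax (define/hyphen stx)
  (syntax-parse stx
    [(_ id1:id id2:id rhs:expr)
     (define first-part (symbol->string (syntax-e #'id1)))
     (define second-part (symbol->string (syntax-e #'id2)))
     (define first-len (string-length first-part))
     (define second-len (string-length second-part))
     ;; Build the new identifier with the lexical context of `id1`.
     (define hyphenated-id
       (datum->syntax
        #'id1
        (string->symbol (string-append first-part "-" second-part))))
     (syntax-property
      #`(define #,hyphenated-id rhs)
      'sub-range-binders
      (list
       ;; The first half of the new name comes from `id1` ...
       (vector (syntax-local-introduce hyphenated-id) 0 first-len
               (syntax-local-introduce #'id1) 0 first-len)
       ;; ... and the part after the hyphen comes from `id2`.
       (vector (syntax-local-introduce hyphenated-id) (add1 first-len) second-len
               (syntax-local-introduce #'id2) 0 second-len)))]))

(define/hyphen big foo 11)
big-foo
```

The `syntax-local-introduce` calls are there so that the identifiers stored in the property look like they came from the original source rather than from the macro.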

> I know there are features for tagging forms with information that will appear in tooltips and the like, e.g. Typed Racket showing type information in the editor. Are there other facilities for communicating with the user?

These tooltips are controlled by the syntax property `'mouse-over-tooltips` [1].
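For a concrete feel, here is a small sketch of a macro that attaches a tooltip to one of its subforms. The macro name `with-tooltip` is hypothetical, and the vector layout (a syntax object that identifies the file, 0-based start and end editor positions, and the tooltip string) is my understanding of what Check Syntax expects, so treat it as an assumption to check against the documentation.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; Hypothetical macro: (with-tooltip "text" e) behaves like `e`, but asks
;; Check Syntax to show "text" when the mouse is over `e`.
(define-syntax (with-tooltip stx)
  (syntax-parse stx
    [(_ msg:str e:expr)
     (define pos (syntax-position #'e))
     (define span (syntax-span #'e))
     (cond
       [(and pos span)
        ;; Tooltip ranges are assumed to be 0-based editor positions,
        ;; while `syntax-position` counts from 1.
        (let ([start (sub1 pos)])
          (syntax-property #'e
                           'mouse-over-tooltips
                           (vector #'e start (+ start span) (syntax-e #'msg))))]
       ;; No source location available: nothing to annotate.
       [else #'e])]))

(with-tooltip "the answer" (+ 40 2))
```

The tooltip itself only shows up inside DrRacket; run from the command line, the form simply evaluates its expression.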

> How well does all this work for macros which aren't written with DrRacket in mind?

It depends on whether they make up or introduce names that weren't originally there in the source. If a macro only produces definitions or expressions using identifiers it was given as an input, the macro generally doesn't need to be written with DrRacket in mind.

> And finally, of course - is there some documentation about all this I could look at? Is this all implemented with syntax object properties?

Yes, the advanced features that work for macros that make up names or display tooltips are communicated to DrRacket through syntax properties like `'sub-range-binders` and `'mouse-over-tooltips`. These properties are documented here:

Daniel Feltey

May 15, 2018, 1:04:24 PM
to Alex Knauth, Colin Fleming, Racket Dev
Hi Colin,

Cursive looks like a really cool project. DrRacket would definitely
benefit from better support for the sort of structural editing that
Cursive enables.


> My understanding is that using macroexpansion in the way that
> DrRacket does requires a fairly deep integration between the
> macroexpander, the macros themselves and the IDE - is that a
> fair statement?

I don't think they are as deeply integrated as they might appear
to be.  DrRacket and the macro expander have certainly evolved
together over 25 years of development and the macro expander has
had to support more features to enable some of the tools that we
use every day in DrRacket, but today the two are developed mostly
independently.

DrRacket supports background expansion of Racket programs and
tools like Check Syntax by expanding programs on a separate
core (using places[1]) and by analyzing the resulting fully
expanded program. Check Syntax in particular looks at the
identifiers that show up in a fully expanded program and builds
data structures that associate the definitions and uses of
variables that will later be used to draw the binding arrows that
you see when you hover over identifiers in DrRacket. As Alex
mentioned, Check Syntax also uses the `mouse-over-tooltips`
syntax property in order to show tooltips in DrRacket. Check
Syntax also uses variable binding structure for a number of
other purposes, including showing relevant documentation based on
the binding of an identifier, opening the file where a given
identifier is defined, and supporting simple refactorings such as
variable renaming, to name a few.
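To give a feel for what "analyzing the fully expanded program" means, here is a minimal sketch that expands a tiny expression and lists the identifiers left in the result, together with what the expander knows about their bindings. This is not how Check Syntax is implemented; the helper names `fully-expand` and `identifiers` are made up for the example.

```racket
#lang racket/base

;; A namespace with racket/base attached, used only for expansion.
(define ns (make-base-namespace))

;; Fully expand a quoted top-level form.
(define (fully-expand datum)
  (parameterize ([current-namespace ns])
    (expand (datum->syntax #f datum))))

;; Collect every identifier in a syntax object, in order of appearance.
(define (identifiers stx)
  (define e (if (syntax? stx) (syntax-e stx) stx))
  (cond
    [(symbol? e) (list stx)]
    [(pair? e) (append (identifiers (car e)) (identifiers (cdr e)))]
    [else '()]))

(define expanded (fully-expand '(let ([x 1]) (add1 x))))

;; Print the core forms the program expanded into.
(syntax->datum expanded)

;; Print each remaining identifier with its binding information.
(for ([id (in-list (identifiers expanded))])
  (printf "~a: ~a\n" (syntax-e id) (identifier-binding id)))
```

A tool can use `identifier-binding` results like these to tell local variables apart from module-level imports, which is the kind of information needed to connect definitions and uses.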


> How well does all this work for macros which aren't written
> with DrRacket in mind?

For the most part, things just work. That is, if you implement a
macro that introduces a new identifier, then DrRacket is usually
able to associate the uses of that identifier with the form that
introduces it. When this doesn't work, DrRacket provides hooks
for macro implementers to cooperate with Check Syntax via syntax
properties like `disappeared-use` and `disappeared-binding` which
Check Syntax uses to draw binding arrows for identifiers that may
not exist in the fully expanded program. As Alex explained, you
can also use the `sub-range-binders` syntax property to show the
connections between identifiers constructed only partially from
those that appear in a macro invocation as the `struct` macro
does.
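As a small sketch of the `disappeared-use` hook, consider a macro whose expansion only mentions its argument as quoted data, so no variable reference survives expansion. The macro name `name-of` is hypothetical; the property value is assumed to be an identifier or a tree of identifiers.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; Hypothetical macro: (name-of x) evaluates to the symbol 'x.
(define-syntax (name-of stx)
  (syntax-parse stx
    [(_ x:id)
     ;; The expansion contains `x` only inside `quote`, so the reference
     ;; is recorded by hand for Check Syntax.
     (syntax-property #'(quote x)
                      'disappeared-use
                      (list (syntax-local-introduce #'x)))]))

(define counter 0)
(name-of counter)
```

With the property attached, Check Syntax has enough information to treat `counter` inside `(name-of counter)` as a use of the definition above it.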

Here is a small macro that demonstrates that most of the time
things just work in DrRacket:

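```
#lang racket

(require (for-syntax syntax/parse syntax/transformer))

;; A single-binding letrec: `x` is visible in `expr` as well as in `body`.
(define-syntax (my-letrec stx)
  (syntax-parse stx
    [(my-letrec ([x expr]) body)
     ;; `temp` holds the value; it contains #f until `expr` is evaluated.
     #`(let ([temp (box #f)])
         ;; Uses of `x` expand to (unbox temp); (set! x v) updates the box.
         (let-syntax ([x (make-variable-like-transformer
                               #'(unbox temp)
                               #'(lambda (v) (set-box! temp v)))])
           (set-box! temp expr)
           body))]))


(my-letrec ([fact (λ (x) (if (zero? x) 1 (* x (fact (sub1 x)))))])
  (fact 5))
```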

When this program is fully expanded there are no more references
to the identifier `fact` in the program, but DrRacket is still
able to correctly draw arrows between the binding of fact and its
uses in the body of the `my-letrec` form. What happens in this
program is that the references to `fact` end up stored in the
`origin` syntax property wherever the original program referred
to `fact`. The macro expander stores the original syntax that a
piece of syntax expanded from in this property, and Check Syntax
uses this information to reconstruct the binding structure of the
original program without needing any help from the developer of
the macro. There are cases where this doesn't quite work out for
more complicated macros, and in those cases manually attaching
the `disappeared-use` and `disappeared-binding` properties allows
programmers to specify the correct binding structure in a way
that Check Syntax can use.
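The `origin` property can be inspected directly. The sketch below covers only the simplest case: it expands a use of `when`, which is a macro in `racket/base`, and then reads `origin` off the result. The helper name `origin-names` is made up for the example.

```racket
#lang racket/base

(define ns (make-base-namespace))

;; `when` is a macro, so it no longer appears in the expanded code.
(define expanded
  (parameterize ([current-namespace ns])
    (expand (datum->syntax #f '(when #t 'done)))))

;; Flatten a property value (a tree of cons pairs) into a list of symbols.
(define (origin-names v)
  (cond
    [(identifier? v) (list (syntax-e v))]
    [(pair? v) (append (origin-names (car v)) (origin-names (cdr v)))]
    [else '()]))

;; The expanded code, which is built from core forms only.
(syntax->datum expanded)

;; The names of the macros whose expansion produced the outermost form.
(origin-names (syntax-property expanded 'origin))
```

Even though `when` is gone from the expanded code, its identifier is still reachable through `origin`, which is the kind of trail that lets a tool connect expanded code back to the source.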


> And finally, of course - is there some documentation about all
> this I could look at? Is this all implemented with syntax
> object properties?

Check Syntax is described a little bit here[2], including the
set of syntax properties that it uses to draw arrows.
Currently, syntax properties are the best tool we have
for communicating between programs and tools that analyze them,
and to the best of my knowledge most tools of this style use
syntax properties to pass along static program information to be
analyzed.

This strategy enables a lot of cool features. For example, we
wrote a paper[3] a couple of years ago that shows how syntax
properties can be used to implement simple refactorings for
languages implemented in Racket, and David Christiansen's very
cool Todo List[4] tool uses a similar technique.

I hope this answers some of your questions, but feel free to
reach out if you have any other questions.

Dan Feltey

[1]: http://docs.racket-lang.org/reference/places.html?q=places
[2]: http://docs.racket-lang.org/tools/Check_Syntax.html?q=check%20syntax
[3]: http://eecs.northwestern.edu/u/daniel.feltey/papers/languages-the-racket-way.pdf
[4]: https://docs.racket-lang.org/todo-list/index.html


Colin Fleming

May 15, 2018, 10:49:50 PM
to Daniel Feltey, Alex Knauth, Racket Dev
Thanks Dan and Alex! I understand this a lot better now. Thanks for the explanation about how the automatic tracking of disappearing forms works too, Dan; I was curious about that. I don't fully understand it yet, but I'll sit down with the macro debugger and experiment, since I'm keen to see how the debugger works too.

That paper is also very interesting - I've skimmed it but I'll definitely come back and read it carefully.

Unfortunately macroexpansion in Clojure is much more primitive than in Racket, so these sorts of things are not possible. It's basically a Common Lisp-style defmacro system, so there's no tracking of previous forms or anything like that. Most (but not all) Clojure forms do support metadata which could be used in a similar way to syntax properties, but all the bookkeeping is manual and Clojure macros are notorious for throwing it away. For example, a simple macro which just expands its body using syntax-quote will lose all the metadata that was present on the original form by default. The end result is that for any non-trivial macro to work with Cursive it would have to be written with that support in mind, and sadly Cursive is not as ubiquitous as DrRacket.

I do have a couple of followup questions. Is performance or memory use ever a problem when doing this repeatedly in an interactive setting? When the user is editing a really large file (I assume Racket has some!), is it responsive enough? Macroexpansion can also result in a lot of code, especially since it seems like the Racket expander is essentially tracking all previous forms through the expansion process. Is memory use ever an issue?

Also, does DrRacket do any caching, for example of data structures calculated from the expansions of modules? Say I use your `(require (for-syntax syntax/parse syntax/transformer))` in two separate files - is DrRacket able to re-use the analysis of the required modules across those two files?

Oh, and re: the structural editing, one thing which is new in Cursive is the new version of parinfer, which infers parentheses based on indentation. It's a really nice solution for maintaining parens, especially for new users who don't want to learn 30 paredit commands. You can see a demo of it here - try indenting and outdenting forms to see how it works. It has some tricky corners, but mostly it works very intuitively.

Thanks,
Colin


Daniel Feltey

May 17, 2018, 11:16:16 AM
to Colin Fleming, Alex Knauth, Racket Dev
Hi Colin,

> I do have a couple of followup questions. Is performance or
> memory use ever a problem when doing this repeatedly in an
> interactive setting? When the user is editing a really large
> file (I assume Racket has some!), is it responsive enough?
> Macroexpansion can also result in a lot of code, especially
> since it seems like the Racket expander is essentially tracking
> all previous forms through the expansion process. Is memory use
> ever an issue?

Performance can definitely be an issue, especially since
expanding Racket programs can involve executing arbitrary code.
I decided to run an experiment to see how long it takes for
DrRacket to send a program to the other place for expansion and
Check Syntax to run its analysis. As a likely lower bound, it
takes about 50 milliseconds to expand and analyze the program
that contains nothing more than `#lang racket/base`. On a small
50-line program I wrote recently to generate some random data, the
whole process took around 200ms. On a larger, ~3000-line file
that heavily uses Racket's (macro-implemented) class system, the
whole process took around 8 seconds.
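For anyone who wants to try a similar measurement outside DrRacket, here is a rough sketch that times only the expansion step using `time-apply`. It is not the experiment described above: it leaves out the place communication and the Check Syntax analysis, and the `sample` module is a made-up stand-in for a real file.

```racket
#lang racket/base

(define ns (make-base-namespace))

;; A stand-in program; a real measurement would read the file being edited.
(define program
  '(module sample racket/base
     (define (square x) (* x x))
     (square 12)))

;; `time-apply` returns the list of results plus CPU, real, and GC time
;; in milliseconds.
(define-values (results cpu-ms real-ms gc-ms)
  (time-apply
   (lambda ()
     (parameterize ([current-namespace ns])
       (expand (datum->syntax #f program))))
   '()))

(printf "expansion took ~a ms (real), ~a ms (cpu)\n" real-ms cpu-ms)
```

Numbers from a sketch like this will vary a lot with the machine and with which libraries the program requires, so they are only useful for comparing files against each other.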

Once this analysis is complete, actually drawing the arrows
between definitions and uses is fairly responsive. The
unfortunate thing is that every edit to the file within DrRacket
invalidates the results of Check Syntax and the program must be
expanded and analyzed anew. I have some ideas about how to make
Check Syntax continue to work as programs are being edited so
that users don't have to spend time waiting for Check Syntax to
complete to see binding arrows and use its other tools, but I
haven't had time to explore them yet.

Regarding memory, DrRacket is known to use a fairly large amount
of memory. While I've been writing this email, Activity Monitor on
my laptop has been reporting that DrRacket is using about 1.25GB
of memory. It's a bit harder to determine how background
expansion and Check Syntax affect this number, but when I ran the
previous experiment, memory use went up by about 10-20MB while
the program was expanding.



> Also, does DrRacket do any caching, for example of data
> structures calculated from the expansions of modules? Say I use
> your `(require (for-syntax syntax/parse syntax/transformer))`
> in two separate files - is DrRacket able to re-use the analysis
> of the required modules across those two files?

When DrRacket analyzes the program from my last email to generate
binding arrows and other information it doesn't actually look at
the `syntax/parse` or `syntax/transformer` files at all. It
determines the files and libraries where identifiers come from
using lexical scope and information attached to syntax objects by
the macroexpander. More generally, DrRacket doesn't currently
reuse any analyses across files. It will save compiled files that
it uses for running programs to avoid unnecessary recompilation,
but not much else. The main mechanism we have to avoid work is
saving bytecode files, but most of the time spent in compilation
is in macro expansion and the bytecode files have their macros
already expanded away. DrRacket does take advantage of this by
keeping its own cache of bytecode files that are generated by
Check Syntax. For example, if we have two files `a.rkt` and
`b.rkt` open in DrRacket and each requires the same file `c.rkt`,
then when Check Syntax runs on `a.rkt` it will generate a
compiled version of `c.rkt` as a dependency, which will be reused
when Check Syntax runs on `b.rkt`.


I hope this helps
Dan

