Pattern: reusing the same name in macro-generated definitions


zeRusski

Apr 4, 2019, 3:58:48 PM
to Racket Users
While reading rackunit source I stumbled on a pattern that I can't figure out. Why the heck does it work? Condensed to its essence it amounts to introducing indirection with a macro-generated define:

#lang racket
(require (for-syntax syntax/parse)
         syntax/parse/define)
 

(define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
  (begin
    (define (foo-impl formal ...) body ...)
    (define-syntax (name stx)
      (syntax-parse stx
        [(_ . args) #'(foo-impl . args)]
        [_:id #'(λ args (apply foo-impl args))]))))
 

(define-foo (bar op a b) (op a b))
(define-foo (baz op a b) (op a b))
;; Why am I not getting this error?
;; --------------------------------
; module: identifier already defined
;   at: foo-impl


See that foo-impl there? The same name is being reused every time the define-foo macro is called. I would've expected Racket to shout at me that I'm attempting to redefine something, but it doesn't, and magically it works. Why? In Elisp or Clojure, say, I would've gensymed that symbol. What am I missing? Does Racket perform some clever renaming to preserve hygiene or something? Could someone please help me reason through this?


PS: I also left that second clause in because it just dawned on me how cool it is that we can use macros in identifier position :)

Ben Greenman

Apr 4, 2019, 4:02:58 PM
to zeRusski, Racket Users
Racket's macros are hygienic. They'll gensym for you.

zeRusski

Apr 4, 2019, 4:45:49 PM
to Racket Users
I know in principle, but on occasion I fail to understand the implications. Let me think aloud. I don't have to be perfectly accurate, maybe just about right. Hygiene here means that every symbol, e.g. the arguments my macro receives, carries its "environment" with it. There exists some oracle which can tell when two symbols refer to the same thing, probably by checking environments somehow. Since I just typed that foo-impl there in the template, it must be getting some fresh tag or "environment" attached to it to avoid capturing something with the same name defined at the macro call site, right? Ok. How is the define before foo-impl special, then? We both know the "define" I mean. Or is the newly attached "environment" in fact not empty, and comes enriched with a bunch of Racket stuff? How do I reason about when it's safe to just type a name and when it isn't? In fact, here: I just defined a foo-impl outside. If I now remove the macro-defined foo-impl, the code will still work correctly and grab the outer definition. So define inside a template refers to the usual define, but foo-impl doesn't? Why?

(define (foo-impl op a b) (op a b))

(define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
     ... same ...

(define-foo (bar op a b) (op a b))
(define-foo (baz op a b) (op a b))
(bar + 1 2)
;; => 3

Greg Hendershott

Apr 4, 2019, 5:28:32 PM
to zeRusski, Racket Users
If I understand correctly, the fourth paragraph here is relevant?

https://docs.racket-lang.org/reference/syntax-model.html#%28part._transformer-model%29

So, `foo-impl` is a binding introduced by the macro and gets that
macro invocation's fresh macro-introduction scope.

Whereas for example `name` is syntax coming from outside the macro,
and doing `(define-foo (blerg ___) ___)` twice would be an error due
to redefining `blerg`.

zeRusski

Apr 5, 2019, 6:23:56 AM
to Racket Users

> If I understand correctly, the fourth paragraph here is relevant?
>
> https://docs.racket-lang.org/reference/syntax-model.html#%28part._transformer-model%29


I dreaded someone pointing me there. I read it a year ago, took a lot of head scratching and careful reading before I convinced myself that I'd grokked it. Both the vocabulary used and apparently my understanding dissipated after a year. Had to read it again :)
 
> So, `foo-impl` is a binding introduced by the macro and gets that
> macro invocation's fresh macro-introduction scope.
>
> Whereas for example `name` is syntax coming from outside the macro,
> and doing `(define-foo (blerg ___) ___)` twice would be an error due
> to redefining `blerg`.

Ok. Let's see if I can explain away all mysteries by carefully following the syntax and expansion model. Someone please read it through and poke holes. In the process, if I'm not mistaken, we are going to discover an error in the Scopes section of the docs.

Step aside everyone, I'm putting my Matthew hat on. Here goes nothing! 

We are trying to answer the following questions about this piece of code (I annotated some identifiers with indexes [in sq brackets]):

(define[1] (foo-impl[1] op a b) (op a b))

(define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
  (begin
    (define[2] (foo-impl[2] formal ...) body ...)
    (define-syntax (name stx)
         .....
      (foo-impl[3] . args)
         .....)))

(define-foo (bar op a b) (op a b))[4]
(define-foo (baz op a b) (op a b))[5]
(bar + 1 2)


Q1: My original question was why the two call sites [4] and [5] do not complain about redefinition of /foo-impl/. After all, every time the transformer is invoked it generates a definition of the same identifier /foo-impl/, which I can easily see in the macro expansion at the relevant call site. However, if I were to type the /foo-impl/ definitions by hand at top level or module level, Racket would yell at me. Why do the two cases look about the same (i.e. end up producing visually the same code) but provoke different reactions from the compiler?

Suppose the transformer at [4] did its job and we are now expanding the code it produced, that is the binding [2] it introduced and the use of /foo-impl/ at [3]. Generated [2] will have at least two scopes: one from the macro definition site, i.e. /define-foo/, the other is the fresh macro-introduction scope. When the reference to /foo-impl/ at [3] gets resolved we'll be looking for bindings of /foo-impl/ whose scope sets are subsets of the reference's scope set, that is of the identifier at [3]. In fact we find two such bindings: [1] and [2]. Which one do we choose? We choose the one whose scope set is a superset of every other candidate's. Here [2] has at least one extra scope (the macro-intro scope) compared to [1], so we use [2].

Now, why is there no ambiguity as to which /foo-impl/ to use when we expand and eval [5]? Well, we'll go through the same motions, but there will be one extra /foo-impl/ binding generated at [4], so we'll have to choose from a total of 3 bindings when resolving any /foo-impl/ ref at [5]. And again we choose the [2] that is the result of the expansion of [5]. Why? Well, it'll have the fresh macro-intro scope from the expansion of [5], which differs from the macro-intro scope at [4], so there is no ambiguity between the two generated bindings at [4] and [5]. For any ref to /foo-impl/ generated by [5] we reject the /foo-impl/ binding generated at [4] because its scope set isn't a subset of that ref's scope set. To choose between [1] and generated [2] we use the same logic as in the previous paragraph: [2] wins because its scope set is bigger.

So the three bindings of the identifier /foo-impl/ that the code above introduces at [1], [4] and [5] (the latter two generated from the template at [2]) are not at all the same, at least not in Racket's syntax model. Identifiers aren't merely compared by name; their scope sets have the final say in how identifiers are resolved.
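
For anyone who wants to poke at this, here is a minimal sketch of the three bindings living side by side in one module. It reuses the macro from the first post; the bodies of bar and baz and the 'outer result are made up for illustration.

```racket
#lang racket
(require (for-syntax syntax/parse)
         syntax/parse/define)

;; Binding [1]: an ordinary module-level definition (illustrative body).
(define (foo-impl . args) 'outer)

(define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
  (begin
    ;; Binding [2]: each expansion gives this foo-impl a fresh
    ;; macro-introduction scope, so the definitions do not collide.
    (define (foo-impl formal ...) body ...)
    (define-syntax (name stx)
      (syntax-parse stx
        [(_ . args) #'(foo-impl . args)]
        [_:id #'(λ args (apply foo-impl args))]))))

(define-foo (bar a b) (+ a b)) ; call site [4]
(define-foo (baz a b) (* a b)) ; call site [5]

(bar 2 3)                 ; => 5, uses the foo-impl generated at [4]
(baz 2 3)                 ; => 6, uses the foo-impl generated at [5]
(foo-impl 2 3)            ; => 'outer, only binding [1] is visible here
(map bar '(1 2) '(10 20)) ; => '(11 22), bar in identifier position
```

The reference typed at module level carries no macro-introduction scope, so neither generated binding is a candidate for it.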

Q2: My macro introduces a new binding for /foo-impl/ at [2]. How is that identifier different from the /define/ identifier at [2]? That is to ask why the /define/ at [2] is bound as we expect to the /define/ in Racket, while any reference to /foo-impl/ in the subsequent template code refers to the binding at [2].

The part about any /foo-impl/ ref in the template resolving to the binding at [2] we already answered in Q1. Other identifiers, e.g. /define/ at [2], are resolved the same way. This particular /define/ resolves using the scopes of the macro definition site, i.e. that of /define-foo/, where /define/ is the usual Racket one.

Q3: If we were to remove /foo-impl/ binding at [2] any template code would happily refer to /foo-impl/ binding at [1]. How so?

Again Q1 kinda answers that. Put simply there are fewer /foo-impl/ bindings to choose from, and the one at [1] happens to have the scope set that is a subset of any use in the template e.g. at [3].

Q4: If instead of [5] I were to just copy paste [4], would Racket yell about attempt to redefine /bar/? Why? (this is inspired by Greg's comment about defining /blerg/).

Yes it would. The /bar/ id in both expansions of /define-foo/ would have the exact same scope set (from the macro use site) and wouldn't have a macro-intro scope to disambiguate. That's because we passed the /bar/ id to the macro as a syntax object (technically, each /bar/ would get a fresh macro-intro scope on the way "into" the transformer, but it gets removed on the way "out", i.e. in the code generated by the transformer; conversely, any identifier in the macro template starts with no macro-intro scope but ends up with one in the generated code. The docs refer to this process as "flipping" the macro-intro scopes).
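
A rough way to check Q4 without breaking your own module is to declare a throwaway module in a fresh namespace and catch the syntax error. This is only a sketch: scratch-module and expansion-error are hypothetical helpers made up for this post, and the exact error text is whatever your Racket version prints.

```racket
#lang racket/base

;; Hypothetical helper: wrap some define-foo uses in a quoted module that
;; also contains the macro from this thread.
(define (scratch-module . uses)
  `(module scratch racket
     (require (for-syntax syntax/parse) syntax/parse/define)
     (define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
       (begin
         (define (foo-impl formal ...) body ...)
         (define-syntax (name stx)
           (syntax-parse stx
             [(_ . args) #'(foo-impl . args)]
             [_:id #'(λ args (apply foo-impl args))]))))
     ,@uses))

;; Hypothetical helper: declare the module in a fresh namespace and return
;; the syntax-error message, or #f if it expands without complaint.
(define (expansion-error mod-form)
  (parameterize ([current-namespace (make-base-namespace)])
    (with-handlers ([exn:fail:syntax? exn-message])
      (eval mod-form)
      #f)))

;; [4] and [5] as written: two different names, no error expected.
(define different-names
  (expansion-error (scratch-module '(define-foo (bar op a b) (op a b))
                                   '(define-foo (baz op a b) (op a b)))))

;; [4] pasted twice: bar comes from the use site both times, so the
;; second definition should be rejected.
(define same-name
  (expansion-error (scratch-module '(define-foo (bar op a b) (op a b))
                                   '(define-foo (bar op a b) (op a b)))))
```

If the reasoning above holds, different-names is #f and same-name is a string holding the redefinition error.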


In conclusion. Racket macro system requires some careful thought, looking at macro expansion will only take you so far. It took a lot out of me to think things through and put em in writing. Ben was right. Indeed, Racket macros are hygienic but I guess saying that it'll "gensym" for you is basically waving a lot of details away. Because instead of renaming we use these scope sets (glorified tokens or tags, really) and syntax objects carry those sets with them and can borrow, erase, lend them, I guess Racket can encode really bizarre scoping rules when you so desire (or you don't and just screwed up). It is certainly rich and expressive. But I wonder if there are shortcuts one could take to quickly reason in situations like these? Ones that would give the right answer 99% of time without deliberating about scope sets n all. Maybe it just gets easier every time you do it.



Now about that error in the docs. The Scopes section says:

An identifier refers to a particular binding when the reference’s symbol and the identifier’s symbol are the same, and when the reference’s scope set is a subset of the binding’s scope set.

Should probably read:

An identifier refers to a particular binding when the reference’s symbol and the identifier’s symbol are the same, and when the reference’s scope set is a superset of the binding’s scope set.

or perhaps equivalently

An identifier refers to a particular binding when the reference’s symbol and the identifier’s symbol are the same, and when the binding’s scope set is a subset of the reference’s scope set

Unless I'm mistaken that much should be obvious from the examples and amounts to the fact that as we go deeper (in nesting) into the code tree the number of scopes attached to identifiers can only grow, therefore it follows that any reference to something "previously" defined would have "more" scopes not fewer compared to its potential bindings. (caveat: strictly this may not be always true cause a macro transformer could get fancy and borrow another identifier's scope for whatever it generates, but whatever). 

Also the next sentence about ref resolution kinda hints at the correct wording:

For a given identifier, multiple bindings may have scope sets that are subsets of the identifier’s; in that case, the identifier refers to the binding whose set is a superset of all others; if no such binding exists, the reference is ambiguous
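
To sanity-check the direction of the subset test, here is a toy model of the rule in that sentence. It is emphatically not how the expander is implemented: toy-binding, resolve and the scope tags ('module, 'intro-4, 'use-bar and so on) are all invented for illustration.

```racket
#lang racket

;; Toy model only: a binding is a name, a set of scope tags, and a label.
(struct toy-binding (name scopes label))

;; Candidates are bindings with the same name whose scope set is a SUBSET
;; of the reference's scope set. The winner is the candidate whose scope
;; set is a superset of every other candidate's.
(define (resolve name ref-scopes bindings)
  (define candidates
    (filter (λ (b) (and (eq? (toy-binding-name b) name)
                        (subset? (toy-binding-scopes b) ref-scopes)))
            bindings))
  (define winner
    (findf (λ (c)
             (andmap (λ (d) (subset? (toy-binding-scopes d)
                                     (toy-binding-scopes c)))
                     candidates))
           candidates))
  (cond [(null? candidates) 'unbound]
        [winner (toy-binding-label winner)]
        [else 'ambiguous]))

(define bindings
  (list (toy-binding 'foo-impl (seteq 'module) 'outer-1)
        (toy-binding 'foo-impl (seteq 'module 'intro-4) 'generated-at-4)
        (toy-binding 'foo-impl (seteq 'module 'intro-5) 'generated-at-5)))

;; A reference produced by using bar: it carries the scope of expansion [4].
(define ref-in-bar (resolve 'foo-impl (seteq 'module 'intro-4 'use-bar) bindings))
;; A reference produced by using baz: it carries the scope of expansion [5].
(define ref-in-baz (resolve 'foo-impl (seteq 'module 'intro-5 'use-baz) bindings))
;; A reference typed directly at module level.
(define ref-at-module (resolve 'foo-impl (seteq 'module) bindings))
;; A contrived reference carrying both intro scopes: no single winner.
(define ref-with-both (resolve 'foo-impl (seteq 'module 'intro-4 'intro-5) bindings))
```

In this model ref-in-bar is 'generated-at-4, ref-in-baz is 'generated-at-5, ref-at-module is 'outer-1 and ref-with-both is 'ambiguous. Flipping the subset? arguments in the candidate filter (the wording I'm complaining about) would make the module-level reference see all three bindings.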

Was that convincing? Do I misunderstand how scope sets work after all? Do I need to PR?

Thanks


Jens Axel Søgaard

Apr 5, 2019, 7:35:51 AM
to zeRusski, Racket Users
On Thu, Apr 4, 2019 at 21:58, zeRusski <vladile...@gmail.com> wrote:

> (define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
>   (begin
>     (define (foo-impl formal ...) body ...)
>     (define-syntax (name stx)
>       (syntax-parse stx
>         [(_ . args) #'(foo-impl . args)]
>         [_:id #'(λ args (apply foo-impl args))]))))
>
> (define-foo (bar op a b) (op a b))
> (define-foo (baz op a b) (op a b))
> ;; Why am I not getting this error?
> ;; --------------------------------
> ; module: identifier already defined
> ;   at: foo-impl
>
> See that foo-impl there? The same name is being reused every time the define-foo macro is called.
> I would've expected Racket to shout at me that I'm attempting to redefine something, but it doesn't and magically it works.
> Why?

A simple model to keep in your head:
  Each macro keeps a count, i,  of how many times it has been applied.
  Each time the output of a macro contains a definition of a name not present in the output it appends _i to the name.

Thus

  (define-foo (bar op a b) (op a b))
will define   foo-impl_1   and
  (define-foo (baz op a b) (op a b))
will define   foo-impl_2   respectively.

Simple models do not explain all situations, but this one does handle simple situations.
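
As a rough illustration of this model (not what the expander literally produces, since it tracks scopes rather than renaming), the two uses behave as if the suffixed names had been written out by hand. The suffixed names below are illustrative.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; First use of define-foo, with i = 1 appended by hand.
(define (foo-impl_1 op a b) (op a b))
(define-syntax (bar stx)
  (syntax-parse stx
    [(_ . args) #'(foo-impl_1 . args)]
    [_:id #'(λ args (apply foo-impl_1 args))]))

;; Second use of define-foo, with i = 2 appended by hand.
(define (foo-impl_2 op a b) (op a b))
(define-syntax (baz stx)
  (syntax-parse stx
    [(_ . args) #'(foo-impl_2 . args)]
    [_:id #'(λ args (apply foo-impl_2 args))]))

(bar + 1 2) ; => 3
(baz * 3 4) ; => 12
```

The names bar and baz come from the macro input, so they get no suffix, which is why reusing one of them twice is still an error.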

/Jens Axel



 

Hendrik Boom

Apr 5, 2019, 7:52:15 AM
to Racket Users
On Fri, Apr 05, 2019 at 01:35:37PM +0200, Jens Axel Søgaard wrote:
> On Thu, Apr 4, 2019 at 21:58, zeRusski <vladile...@gmail.com> wrote:
>
> > (define-simple-macro (define-foo (name:id formal:id ...) body:expr ...)
> >   (begin
> >     (define (foo-impl formal ...) body ...)
> >     (define-syntax (name stx)
> >       (syntax-parse stx
> >         [(_ . args) #'(foo-impl . args)]
> >         [_:id #'(λ args (apply foo-impl args))]))))
> >
> > (define-foo (bar op a b) (op a b))
> > (define-foo (baz op a b) (op a b))
> > ;; Why am I not getting this error?
> > ;; --------------------------------
> > ; module: identifier already defined
> > ;   at: foo-impl
> >
> > See that *foo-impl* there? The same name is being reused every time the
> > *define-foo* macro is called.
> > I would've expected Racket to shout at me that I'm attempting to redefine
> > something, but it doesn't and magically it works.
> > Why?
>
> A simple model to keep in your head:
> Each macro keeps a count, i, of how many times it has been applied.
> Each time the output of a macro contains a definition of a name not
> present in the output it appends _i to the name.

Do you mean not present in the *input*?

-- hendrik

>
> Thus
> (define-foo (bar op a b) (op a b))
> will define foo-impl_1 and
> (define-foo (baz op a b) (op a b))
> will define foo-impl_2 respectively.
>
> Simple models do not explain all situations, but this one does handle simple
> situations.
>
> /Jens Axel
>

Jens Axel Søgaard

Apr 5, 2019, 7:53:40 AM
to Racket Users
Yes!
--
Jens Axel Søgaard

zeRusski

Apr 5, 2019, 11:02:55 AM
to Racket Users
> A simple model to keep in your head:
>   Each macro keeps a count, i, of how many times it has been applied.
>   Each time the output of a macro contains a definition of a name not present in the input it appends _i to the name.

Ha, I get it. That's a good little heuristic. I'm keeping it. Thanks. That index is a simplified model of that whole "fresh macro-introduction scope" business. I like it.

Matthew Butterick

Apr 5, 2019, 11:16:07 AM
to zeRusski, Racket Users

On Apr 5, 2019, at 3:23 AM, zeRusski <vladile...@gmail.com> wrote:

> Now about that error in the docs:

I was skeptical, but I think you're right:

"More generally, we can define binding based on subsets: A reference’s binding is found as one whose set of scopes is a subset of the reference’s own scopes (in addition to having the same symbolic name)."


PS I didn't really get scopes until I saw Matthew Flatt's visual explanation using colors to represent scopes:
