Trouble writing unhygienic macro


Jonathan Simpson

May 27, 2019, 22:19:03
to Racket Users
Hi all,

I'm writing a macro which needs to break hygiene to introduce a function definition. Here is my current hygienic -- and useless -- version:

(define-syntax (named-query stx)
  (syntax-case stx (name-line)
    [(_ (name-line (_ 0) (_ "name") magic-name))
     (with-syntax ([name (string->symbol (syntax->datum #'magic-name))])
       #'(define name
           (lambda () (void))))]
    [(_ (name-line (_ 0) (_ "name") magic-name) . rst)
     (with-syntax ([name (string->symbol (syntax->datum #'magic-name))]
                   [modified-rst (cons (datum->syntax #'rst always-true-line) #'rst)])
       #'(define name
           (lambda () (query . modified-rst))))]))

The macro invocation will look something like this:

(named-query
  (name-line (offset 0) (name-type "name") "tga-image"))

I know I need to use datum->syntax to create an unhygienic function identifier. But I'm getting hung up on the pattern variable 'magic-name'. How the heck can I use it to do this? First, I convert it from a string to a symbol, but then what? I've made various attempts but I can't figure out a way to use it inside the template and do what I need to with it. My attempts to use with-syntax* also fail with errors like "modified-rst: unbound identifier in module (in phase 1, transformer environment)".

I think my general problem is that I don't understand when and how I can use pattern variables. In this case I think I need to create a datum from the pattern variable so I can run datum->syntax on it. But I haven't been able to figure out how to do that in a template. For instance, this also fails to create an identifier I can use outside of the macro:

(with-syntax ([name (datum->syntax #'magic-name (string->symbol (syntax->datum #'magic-name)))])

I really appreciate any help.

-- Jonathan

Greg Hendershott

May 27, 2019, 22:54:50
to Jonathan Simpson, Racket Users
If users of your `named-query` macro will supply the name as an
identifier -- an unquoted symbol like some-name in this example:

(named-query (name-line (_ 0) (_ "name") some-name))

Then what your macro needs to do with the pattern variable is... just
use it -- as is -- in the template. (It is already a piece of syntax
that could be a valid identifier. You're all set.)


If the idea is that users will supply the name as a string like
"some-name", then yes your macro would need to do the

(string->symbol (syntax->datum #'magic-name))

thing you already have -- but *also* convert that result back to syntax:

(datum->syntax #'magic-name
               (string->symbol (syntax->datum #'magic-name)))


p.s. That (datum->syntax _ (string->symbol (syntax->datum _))) triplet
has an equivalent handy shortcut -- `format-id`:

(format-id #'magic-name "~a" #'magic-name)


p.p.s. I had similar questions before and wrote this:

<https://www.greghendershott.com/fear-of-macros/pattern-matching.html>

Jonathan Simpson

May 27, 2019, 23:28:31
to Racket Users
Thanks for the quick response. I wouldn't have gotten as far as I have so far without your 'Fear of Macros' page, so thanks for that as well!

On Monday, May 27, 2019 at 10:54:50 PM UTC-4, Greg Hendershott wrote:
> If users of your `named-query` macro will supply the name as an
> identifier -- an unquoted symbol like some-name in this example:
>
>   (named-query (name-line (_ 0) (_ "name") some-name))
>
> Then what your macro needs to do with the pattern variable is... just
> use it -- as is -- in the template. (It is already a piece of syntax
> that could be a valid identifier. You're all set.)



This macro is part of the expansion of a language. There is a form that defines a function and another form to call one. I may be missing something, but I didn't think I could use the name as valid syntax since this is the form that is creating it. If passing a symbol in would work then I could potentially change my lexer to do that instead of a string.
 
> If the idea is that users will supply the name as a string like
> "some-name", then yes your macro would need to do the
>
>   (string->symbol (syntax->datum #'magic-name))
>
> thing you already have -- but *also* convert that result back to syntax:
>
>   (datum->syntax #'magic-name
>                  (string->symbol (syntax->datum #'magic-name)))


I tried this in my with-syntax and for some reason the identifier was still not visible to the code that is calling it, which is in the same module. Any ideas? These macros are part of a language and aren't used directly from normal racket code. Perhaps there is some added complexity there?

-- Jonathan

Jonathan Simpson

May 27, 2019, 23:44:05
to Racket Users
In case this helps, here is the output from the macro stepper for a sample macro invocation:

(named-query
        (name-line (offset 0) (name-type "name") "always-true")
        (level)
        (line (offset 0) (type (default "default")) (test (truetest "x"))))

-> 

(define:24 always-true
         (lambda:24 ()
           (query:24
            (line (offset 0) (type (default "default")) (test (truetest "x")))
            (level)
            (line (offset 0) (type (default "default")) (test (truetest "x"))))))

->

(define:25 always-true
         (lambda:25 ()
           (query:24
            (line (offset 0) (type (default "default")) (test (truetest "x")))
            (level)
            (line (offset 0) (type (default "default")) (test (truetest "x"))))))

->

(define-values:26 (always-true)
         (lambda:25 ()
           (query:24
            (line (offset 0) (type (default "default")) (test (truetest "x")))
            (level)
            (line (offset 0) (type (default "default")) (test (truetest "x"))))))

'query' is another macro, but hopefully not relevant to my problem.

This is using the version of my code where I added the call to datum->syntax after converting the string to a symbol. With this output, should 'always-true' be visible to other code in the same module?

-- Jonathan

Greg Hendershott

May 28, 2019, 00:41:43
to Jonathan Simpson, Racket Users
It seemed like most of your question was about creating the name
identifier for the `define`. I focused on (and hopefully answered) that
part. But I didn't pick up on what you said the error message was:

>> attempts to use with-syntax* also fail with errors like "modified-rst:
>> unbound identifier in module (in phase 1, transformer environment)".

So, this part isn't about `magic-name`. It's about `modified-rst`
in your second clause:

(define-syntax (named-query stx)
  (syntax-case stx (name-line)
    [(_ (name-line (_ 0) (_ "name") magic-name))
     (with-syntax ([name (string->symbol (syntax->datum #'magic-name))])
       #'(define name
           (lambda () (void))))]
    [(_ (name-line (_ 0) (_ "name") magic-name) . rst)
     (with-syntax ([name (string->symbol (syntax->datum #'magic-name))]
                   [modified-rst (cons (datum->syntax #'rst always-true-line) #'rst)])
       #'(define name
           (lambda () (query . modified-rst))))]))

A few things:

1. The invocation you mentioned:

>> The macro invocation will look something like this:
>>
>> (named-query
>> (name-line (offset 0) (name-type "name") "tga-image"))

doesn't seem to match that second clause? So I'm not sure how that
invocation is giving you that error message. Is it actually some other
invocation example?

2. I don't see where `always-true-line` comes from. Where is that
defined? What kind of values will it have?

3. Could you say more about what you're trying to do here?

`(cons (datum->syntax #'rst always-true-line) #'rst)`

Alexis King

May 28, 2019, 01:45:35
to Jonathan Simpson, Racket Users
> On May 27, 2019, at 22:28, Jonathan Simpson <jjsi...@gmail.com> wrote:
>
> I may be missing something, but I didn't think I could use the name as valid syntax since this is the form that is creating it. If passing a symbol in would work then I could potentially change my lexer to do that instead of a string.

I might not be understanding your question properly, but I think the answer is: yes, if your macro were to accept an identifier in the `magic-name` position instead of a string, then you could put that identifier in the expansion and it would “just work.” See this macro, for example:[^1]

    #lang racket
    (require (for-syntax syntax/parse))

    (define-syntax (macro-that-defines-a-name stx)
      (syntax-parse stx
        [(_ some-name)
         #'(define some-name 42)]))

Hygiene ensures that `some-name` will be bound wherever the identifier itself comes from, so it will be bound in whatever context uses `macro-that-defines-a-name`:

    > (macro-that-defines-a-name my-name)
    > my-name
    42

If you really want to pass a string to your macro instead of an identifier, you can use what you have in combination with `datum->syntax` to create a new identifier that “copies” scoping information from some piece of input syntax. You currently have this:

    (string->symbol (syntax->datum #'magic-name))

...and that will produce a symbol, but it won’t have any scope information associated with it (since `syntax->datum` threw it all away).[^2]

To copy scoping information from the input, you can add a use of `datum->syntax`, supplying the “source” of the copy operation as the first argument:

    (let ([magic-name-sym (string->symbol (syntax->datum #'magic-name))])
      (with-syntax ([magic-name-id
                     (datum->syntax #'magic-name magic-name-sym)])
        ....))

This will create an identifier with the scopes I think you want.

Alexis

[^1]: You should use `syntax-parse` instead of `syntax-case`. There’s nothing wrong with using `syntax-case` per se, but `syntax-parse` is just better on all axes (except perhaps compilation time, but you know what they say about premature optimization). It will provide significantly better syntax error messages even if you stick purely to the `syntax-case` subset.

Also, with `syntax-parse`, you can improve upon the above examples somewhat. You can write a pattern binding of the shape `name:id` or `name:str` to restrict matching to an identifier or string, respectively, and syntax errors will be improved accordingly. Also, you can replace the uses of `let` and `with-syntax` with `syntax-parse`’s built-in `#:do` and `#:with` directives, which are used like this:

    (syntax-parse stx
      [(_ name:str)
       #:do [(define name-sym (string->symbol (syntax->datum #'name)))]
       #:with name-id (datum->syntax #'name name-sym)
       #'(define name-id 42)])

[^2]: By default, when given a value that isn’t a syntax object, `with-syntax` automatically coerces it to one, copying whatever scoping information is present on the `with-syntax` form itself, i.e. the macro scopes. This is different from explicitly writing

    (with-syntax ([foo (datum->syntax #f 'bar)])
      ....)

since that will bind `foo` to a syntax object with no scopes at all. In contrast, writing `(with-syntax ([foo 'bar]) ....)` is equivalent to writing `(with-syntax ([foo #'bar]) ....)`.

As a final note, `syntax-parse`’s `#:with` form is slightly different from `with-syntax` in this respect, as it really will use `(datum->syntax #f ....)` instead of trying to guess at what scopes you might have wanted.
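
The difference between the implicit coercion and an explicit `datum->syntax` can be observed directly. The following is only a sketch: `compare-scopes` is a hypothetical helper invented for illustration, and it assumes the comparison is made with `bound-identifier=?` while the transformer is running.

```racket
#lang racket

;; Hypothetical helper macro, for illustration only. It turns the string `s`
;; into an identifier in two ways and reports whether each result would bind
;; the same as the identifier `id` written at the use site.
(define-syntax (compare-scopes stx)
  (syntax-case stx ()
    [(_ s id)
     (let* ([sym (string->symbol (syntax-e #'s))]
            ;; Bare symbol: with-syntax coerces it to syntax using the
            ;; lexical context of this macro's own source, not the caller's.
            [coerced (with-syntax ([n sym]) #'n)]
            ;; Explicit copy of the scopes on the caller's string literal.
            [copied (datum->syntax #'s sym)])
       #`(list #,(bound-identifier=? coerced #'id)
               #,(bound-identifier=? copied #'id)))]))

;; `tga-image` is never evaluated here; it is only inspected as syntax.
(define scope-check (compare-scopes "tga-image" tga-image))
```

Under these assumptions `scope-check` should be `'(#f #t)`: only the identifier built with `datum->syntax #'s` carries the same scopes as an identifier written by the caller.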

Sorawee Porncharoenwase

May 28, 2019, 01:58:39
to Alexis King, Jonathan Simpson, Racket Users
On Mon, May 27, 2019 at 7:54 PM Greg Hendershott <rac...@greghendershott.com> wrote:
> p.s. That (datum->syntax _ (string->symbol (syntax->datum _))) triplet
> has an equivalent handy shortcut -- `format-id`:
>
>   (format-id #'magic-name "~a" #'magic-name)

This doesn't work in the string literal case because an argument of `format-id` must be an identifier (or symbol, or string, but not a syntax object of a string literal).
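
A minimal sketch of one way to keep using `format-id` when the name arrives as a string literal, assuming the string is first unwrapped with `syntax-e`. The macro name `define-from-string` is hypothetical and not from the original code.

```racket
#lang racket
(require (for-syntax racket/syntax)) ; provides format-id at phase 1

;; Hypothetical macro, for illustration only: defines a variable whose name
;; is given as a string literal.
(define-syntax (define-from-string stx)
  (syntax-case stx ()
    [(_ magic-name value)
     ;; Fender: only match when magic-name really is a string literal.
     (string? (syntax-e #'magic-name))
     ;; Pass the plain string (not the syntax object) as the format argument;
     ;; the first argument supplies the lexical context for the new identifier.
     (with-syntax ([name (format-id #'magic-name "~a" (syntax-e #'magic-name))])
       #'(define name value))]))

(define-from-string "tga-image" 42)
```

After this, `tga-image` is bound in the module that used the macro, because the new identifier takes its lexical context from the caller's string literal.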

On Mon, May 27, 2019 at 10:45 PM Alexis King <lexi....@gmail.com> wrote:
> As a final note, `syntax-parse`’s `#:with` form is slightly different from `with-syntax` in this respect, as it really will use `(datum->syntax #f ....)` instead of trying to guess at what scopes you might have wanted.

Aha! I didn't understand why sometimes `with-syntax` works while `#:with` doesn't until now. Thanks so much for this.
 

Jonathan Simpson

May 28, 2019, 09:45:08
to Racket Users
Sorry. It is probably better to ignore the part about the second syntax-case clause for now. I didn't provide the necessary background to make sense of it. Once I get past my current problem it probably won't be relevant anyway. I'm currently hitting the unbound identifier error in both clauses so if I fix the first one I should be able to fix the second one. And the first one is the simpler case.

Jonathan Simpson

May 28, 2019, 10:11:59
to Racket Users

On Tuesday, May 28, 2019 at 1:45:35 AM UTC-4, Alexis King wrote:
>> On May 27, 2019, at 22:28, Jonathan Simpson <jjsi...@gmail.com> wrote:
>>
>> I may be missing something, but I didn't think I could use the name as valid syntax since this is the form that is creating it. If passing a symbol in would work then I could potentially change my lexer to do that instead of a string.
>
> I might not be understanding your question properly, but I think the answer is: yes, if your macro were to accept an identifier in the `magic-name` position instead of a string, then you could put that identifier in the expansion and it would “just work.” See this macro, for example:[^1]
>
>     #lang racket
>     (require (for-syntax syntax/parse))
>
>     (define-syntax (macro-that-defines-a-name stx)
>       (syntax-parse stx
>         [(_ some-name)
>          #'(define some-name 42)]))
>
> Hygiene ensures that `some-name` will be bound wherever the identifier itself comes from, so it will be bound in whatever context uses `macro-that-defines-a-name`:
>
>     > (macro-that-defines-a-name my-name)
>     > my-name
>     42


I will try this approach. Thinking about this has given me some insight that may explain my problem.

Both the function definition and function calls are created by similar-looking macros which pass strings as the function name. I've now taken steps to break hygiene in the defining macro, but the calling macro just converts the string to a symbol. Its invocation looks something like this:

(query
  (line (offset 0) (type "use") "tga-image"))
 
Perhaps they aren't referring to the same binding. Maybe I need a datum->syntax in this macro as well. I don't have access to the code at the moment, so I can't try it, but does this make sense?

If this is the case then I can probably fix it by modifying the other macro as well, but modifying the lexer to emit symbols instead of strings seems like the best approach.

-- Jonathan
 

Matthew Butterick

May 28, 2019, 22:26:38
to Jonathan Simpson, Racket Users

On May 28, 2019, at 7:11 AM, Jonathan Simpson <jjsi...@gmail.com> wrote:

> Both the function definition and function calls are created by similar-looking macros which pass strings as the function name. I've now taken steps to break hygiene in the defining macro, but the calling macro just converts the string to a symbol. Its invocation looks something like this:
>
> (query
>   (line (offset 0) (type "use") "tga-image"))
>
> Perhaps they aren't referring to the same binding. Maybe I need a datum->syntax in this macro as well. I don't have access to the code at the moment, so I can't try it, but does this make sense?
>
> If this is the case then I can probably fix it by modifying the other macro as well, but modifying the lexer to emit symbols instead of strings seems like the best approach.


Here's an example of two macros that both make an identifier out of "tga-image": the first defines a function called `tga-image`, and the second calls it. Notice that the same syntax-context-switching fandango is needed in both cases to ensure that both `tga-image` ids are placed inside the same syntax context, so that the first one binds the second. (As someone pointed out earlier, you could also use `format-id` for this, as long as you pass it the string itself rather than the syntax object wrapping it.)


#lang racket

(define-syntax (definer-macro stx)
  (syntax-case stx ()
    [(_ magic-name)
     (with-syntax ([name (datum->syntax #'magic-name (string->symbol (syntax->datum #'magic-name)))])
       #'(define (name x) x))]))

(definer-macro "tga-image")

(define-syntax (caller-macro stx)
  (syntax-case stx ()
    [(_ magic-name arg)
     (with-syntax ([name (datum->syntax #'magic-name (string->symbol (syntax->datum #'magic-name)))])
       #'(name arg))]))

(caller-macro "tga-image" 42)

Jonathan Simpson

May 28, 2019, 22:51:52
to Racket Users
Yes, this did the trick! I needed to add datum->syntax to the identifiers in BOTH macros. In my initial attempt I only added it to the defining macro.

More importantly, I now understand why I needed to do it this way. I plan to eventually refactor it to pass the identifier in directly instead of as a string, which will hopefully simplify things. This will work for now though.
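
A rough sketch of what that refactoring might look like, assuming the lexer emits identifiers instead of strings. It reuses the `definer-macro` and `caller-macro` names from the example above and the `syntax-parse` `name:id` annotation suggested earlier in the thread; it is not the actual language implementation.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; The name arrives as an identifier, so it already carries the caller's
;; scopes and can be dropped into the template unchanged.
(define-syntax (definer-macro stx)
  (syntax-parse stx
    [(_ name:id)
     #'(define (name x) x)]))

;; The calling side needs no datum->syntax either: the identifier at the
;; use site refers to the binding created by definer-macro.
(define-syntax (caller-macro stx)
  (syntax-parse stx
    [(_ name:id arg:expr)
     #'(name arg)]))

(definer-macro tga-image)
(define result (caller-macro tga-image 42))
```

Here both macros stay fully hygienic, and `result` is the value returned by calling `tga-image` on 42.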

Thanks to everyone. I'm continually impressed by the helpfulness and patience of this community.

-- Jonathan