Simple macro issues

Simon Haines

Sep 9, 2019, 1:12:12 AM
to Racket Users
I'm trying to write a macro that will turn a list of hex literals into a byte string.

(hex a b c 1 2 3) ; #"\n\v\f\1\2\3"

After many hours I have finally come up with this:

#lang racket
(define-syntax hex
  (syntax-rules ()
    [(_ num ...)
     (bytes
      (let ([e (syntax-e #'num)])
        (if (number? e) num
            (string->number (symbol->string e) 16))) ...)]))

(hex a b c 1 2 3)

Of course there are many issues with checking the parameters etc. My problem is this generates "a: unbound identifier in: a" because the arguments are evaluated? If I remove the last line it works in the REPL OK.

I suspect this is a small matter of my phases being mixed up, or a misunderstanding of when macros can be defined and used, or just outright ignorance on my part. I couldn't find any clues in the many, many references and tutorials I have read. I want to master them but I loathe creating macros, they always make me feel like an idiot, and I hope Racket2 simplifies them somehow.

Thanks for any help sorting this one out.

Sorawee Porncharoenwase

Sep 9, 2019, 1:31:29 AM
to Simon Haines, Racket Users

This works for me:

#lang racket

(define (hex:char x)
  (if (number? x)
      x
      (string->number (symbol->string x) 16)))

(define-syntax-rule (hex num ...) (bytes (hex:char (quote num)) ...))

(hex a b c 1 2 3) ; #"\n\v\f\1\2\3"

It’s almost always a mistake to use functions that manipulate syntax objects (syntax-e, etc.) inside syntax-rules, because syntax-rules doesn’t give you access to the syntax object: the template is simply copied into the expansion. If you do want to manipulate syntax objects, use syntax-case instead. In your case, however, the problem is easy enough that you don’t need to directly manipulate syntax objects.
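
For comparison, here is a minimal sketch of what a syntax-case version might look like. It is not from the original message: it assumes every argument should be read as hexadecimal, and the name `hex/case` is made up for illustration. The conversion happens at expansion time, so an argument that is not valid hex only fails later, when `bytes` receives `#f`.

```racket
#lang racket

;; Hypothetical name; a sketch, not the macro from the thread.
(define-syntax (hex/case stx)
  (syntax-case stx ()
    [(_ num ...)
     ;; Here #'(num ...) really is a syntax object, available at expansion time.
     (let ([vals (for/list ([n (syntax->list #'(num ...))])
                   ;; Print the datum (symbol or number) and re-read it in base 16.
                   (string->number (format "~a" (syntax->datum n)) 16))])
       ;; Splice the computed numbers into a call to bytes.
       #`(bytes #,@vals))]))

(hex/case a b c 1 2 3) ; #"\n\v\f\1\2\3"
```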


--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/6d096642-b770-4655-acc5-08b36e176554%40googlegroups.com.

Simon Haines

Sep 9, 2019, 1:47:19 AM
to Racket Users
Thanks Sorawee,

So what is 'num' inside define-syntax-rule if not a syntax object? And why did my earlier attempt create a macro that tried to evaluate its arguments? In other words, what are the steps I need to take, or the realisations I need to make, to work back from "a: unbound identifier in: a" to a solution like you provided? Thanks again.



Daniel Prager

Sep 9, 2019, 2:25:55 AM
to Simon Haines, Racket Users
Hi Simon

I think you'll find that the if expression is misguided, since it treats numeric literals as decimal rather than hexadecimal.

Happily, this simplifies the solution.

#lang racket

(define-syntax-rule (hex h ...)
  (bytes (string->number (~a (quote h)) 16) ...))

(hex a b c 1 2 3 41 42 43) ; #"\n\v\f\1\2\3ABC"


Dan

Simon Haines

Sep 9, 2019, 2:42:02 AM
to Racket Users
Thanks Dan, I am learning a fair bit today: a quoted number evaluates to its unquoted value, and the racket/format templates are available outside format (I had forgotten that).

I'm still trying to figure out why my original, terrible macro behaved the way it did, but I suspect I'll never know. I would have wasted a lot of time on that awful construct. I appreciate your help, many thanks.

Daniel Prager

Sep 9, 2019, 3:25:07 AM
to Simon Haines, Racket Users
Hi Simon

I only use macros sparingly, and sympathise with your struggles to develop macro-fu.

Some simple macros can be written quite simply using define-syntax-rule/s and aren't that much more complex than writing functions.

To milk a bit more from this example, here's a similarly themed function:

#lang racket

(define (hex/f hs)
  (apply bytes
         (for/list ([h hs])
           (string->number (~a h) 16))))

(hex/f '(a b c 1 2 3 41 42 43)) ; #"\n\v\f\1\2\3ABC"



The (marginal) advantage of the macro, of course, is that we can omit the quote from the function call.

Dan

Philip McGrath

Sep 9, 2019, 4:40:27 AM
to Simon Haines, Racket Users
On Mon, Sep 9, 2019 at 2:42 AM Simon Haines <simon....@con-amalgamate.net> wrote:
> I'm still trying to figure out why my original, terrible macro behaved the way it did, but I suspect I'll never know. I would have wasted a lot of time on that awful construct. I appreciate your help, many thanks.

A great way to understand how your macros are (mis)behaving is to use the macro stepper in DrRacket to walk through an expansion.

Here's your original macro again, with a slightly smaller example that produces the same error ("a: unbound identifier in: a"):
#lang racket
(define-syntax hex
  (syntax-rules ()
    [(_ num ...)
     (bytes
      (let ([e (syntax-e #'num)])
        (if (number? e) num
            (string->number (symbol->string e) 16))) ...)]))
(hex a)

The macro stepper shows that `(hex a)` expands into this:
(bytes
  (let ([e (syntax-e #'a)])
    (if (number? e)
        a
        (string->number (symbol->string e) 16))))

Hopefully that makes some of the issues clear, starting with the use of `a` in the "then" branch of your `if` expression, which is indeed an unbound identifier.

In case you don't already know, you can write literal numbers in hex notation in Racket, so `(bytes #xa #xb #xc #x1 #x2 #x3 #x41 #x42 #x43)` evaluates to `#"\n\v\f\1\2\3ABC"`.

I strongly endorse using `syntax-parse` for writing macros, which gives you good error checking and many other benefits. Here is a version of your macro that expands to a literal byte-string, rather than an expression that will create a byte-string at run-time:
#lang racket

(require (for-syntax syntax/parse))

(define-for-syntax (int->hex n)
  ;; treats n as though it had been written in hex
  (let loop ([n n]
             [place 0]
             [acc 0])
    (cond
      [(= 0 n)
       acc]
      [else
       (define-values [q r]
         (quotient/remainder n 10))
       (loop q (add1 place) (+ acc (* r (expt 16 place))))])))
             

(define-syntax (hex stx)
  (define-syntax-class hex-byte
    #:description "hexadecimal byte"
    #:attributes [n]
    (pattern :exact-nonnegative-integer
             #:attr n (int->hex (syntax-e this-syntax))
             #:fail-when (and (not (byte? (attribute n))) this-syntax)
             "not a byte? when interpreted as hexadecimal")
    (pattern :id
             #:attr n (string->number (symbol->string
                                       (syntax-e this-syntax))
                                      16)
             #:fail-when (and (not (attribute n)) this-syntax)
             "not a hexadecimal number"
             #:fail-when (and (not (byte? (attribute n))) this-syntax)
             "hexadecimal number is not a byte?"))
  (syntax-parse stx
    [(_ :hex-byte ...)
     #`(quote #,(apply bytes (attribute n)))]))


(hex a b c 1 2 3 41 42 43) ; #"\n\v\f\1\2\3ABC"

-Philip

Simon Haines

Sep 9, 2019, 8:17:04 PM
to Racket Users
Thanks Philip for providing a very thorough example. There is much to digest in there, and some novel ideas I didn't know about (attaching values as attributes to syntax).

What wasted a lot of time for me is that, despite the macroexpander's results, the macro works as expected in the REPL. If you paste my original macro into DrRacket, run it, then type '(hex a)' into the REPL you get the expected result. In this case, '(expand (hex a))' doesn't help. This is possibly due to something like a combination of phases, environments and top-level bindings, but I couldn't figure it out.

In any case, thanks again. I will learn from your example ideas that I hope will help future me.

Philip McGrath

Sep 10, 2019, 12:41:05 AM
to Simon Haines, Racket Users
On Mon, Sep 9, 2019 at 8:17 PM Simon Haines <simon....@con-amalgamate.net> wrote:
> What wasted a lot of time for me is that, despite the macroexpander's results, the macro works as expected in the REPL. If you paste my original macro into DrRacket, run it, then type '(hex a)' into the REPL you get the expected result. In this case, '(expand (hex a))' doesn't help. This is possibly due to something like a combination of phases, environments and top-level bindings, but I couldn't figure it out.

This is a rather unpleasant pitfall of the REPL. If you try to evaluate the expression `(if #f some-unbound-identifier 1)`, you will see that it evaluates to `1` in the REPL but raises an unbound identifier error in a module. At the REPL, `some-unbound-identifier` refers to a top-level variable, and such references are allowed in order to support, for example, forward references to identifiers that will be defined in a subsequent interaction, or interactive re-definition of variables.
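
The difference can be reproduced from inside a module by evaluating the same expression in a fresh namespace, which behaves like the top level. This is only a sketch: it assumes `make-base-namespace` is an acceptable stand-in for the REPL, and the names `ns`, `top-result`, and `module-fails?` are illustrative.

```racket
#lang racket

;; A fresh namespace stands in for the REPL's top level (an assumption).
(define ns (make-base-namespace))

;; At the top level the unbound identifier is compiled as a top-level
;; variable reference, and the untaken branch is never evaluated.
(define top-result
  (eval '(if #f some-unbound-identifier 1) ns))

;; Inside a module the same reference is rejected when the module is expanded.
(define module-fails?
  (with-handlers ([exn:fail:syntax? (λ (e) #t)])
    (eval '(module m racket/base
             (if #f some-unbound-identifier 1))
          ns)
    #f))

top-result     ; 1
module-fails?  ; #t
```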
 
However, I have to admit that I hadn't realized before that references to undefined variables are allowed even without a lambda delaying evaluation, as long as they are in an untaken branch of a conditional. I have to say I don't like it. This certainly falls under the category of "the top level is hopeless" [1], but maybe we can do better in Racket2.

[1] See for example https://gist.github.com/samth/3083053, though that list isn't even up-to-date.

-Philip

Simon Haines

Sep 10, 2019, 1:19:40 AM
to Racket Users
Thanks again Philip for taking the time to reply.


> This is a rather unpleasant pitfall of the REPL. If you try to evaluate the expression `(if #f some-unbound-identifier 1)`, you will see that it evaluates to `1` in the REPL but raises an unbound identifier error in a module. At the REPL, `some-unbound-identifier` refers to a top-level variable, and such references are allowed in order to support, for example, forward references to identifiers that will be defined in a subsequent interaction, or interactive re-definition of variables.

When entering '(hex a b c 1 2 3)' into the REPL, I don't think the symbols 'a', 'b' and 'c' are undefined or forward-references as they appear in the taken branch of the conditional. Maybe syntax objects are different coming from the REPL than a module, somehow resulting in the macro working as expected?

Anyway, I take your point the top level has issues. I will be far more wary of it in future.

Philip McGrath

Sep 10, 2019, 2:30:29 AM
to Simon Haines, Racket Users
On Tue, Sep 10, 2019 at 1:19 AM Simon Haines <simon....@con-amalgamate.net> wrote:
> When entering '(hex a b c 1 2 3)' into the REPL, I don't think the symbols 'a', 'b' and 'c' are undefined or forward-references as they appear in the taken branch of the conditional. Maybe syntax objects are different coming from the REPL than a module, somehow resulting in the macro working as expected?

When you mention symbols being defined or undefined, which is terminology that other Lisp and Scheme languages use, I think there can be some confusion between a few different concepts. In Racket, we reserve the word "symbol" for a type of value, like a number, string, or boolean. So, in the expression `(λ (f) (f f))`, there are no symbols: we call `λ` and `f` "identifiers," which are the parts of the syntax of a program that might have bindings. Of course you know about `quote`, which can produce a symbol from literal program text in expressions like `(quote a)` or, more often, `'a`.

Lisp programs that manipulate other programs, most particularly macros, traditionally used symbols as the representation for identifiers, but it turns out that a symbol doesn't carry as much information as you want, so Racket uses "syntax objects," where a syntax object combines a datum with lexical context (a set of scopes), source-location information, and other properties. By analogy to `quote`, the expression `(syntax a)` or `#'a` produces a syntax object, and in particular a syntax object representing an identifier. (Sometimes we say "identifier" when we really mean "a syntax object representing an identifier," which can be a bit confusing, but then `identifier?` is probably a better function name than `syntax-object-representing-identifier?`.)
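
The distinction can be checked directly at run time. The following is a small illustrative sketch rather than anything from the original discussion; it only uses predicates and accessors from `racket/base`.

```racket
#lang racket

(symbol? 'a)             ; #t: quote produces a symbol
(identifier? #'a)        ; #t: syntax produces a syntax object for an identifier
(symbol? #'a)            ; #f: the syntax object is not itself a symbol
(eq? (syntax-e #'a) 'a)  ; #t: syntax-e unwraps it to the same symbol
(syntax-line #'a)        ; source line (or #f): information a bare symbol lacks
```

None of these expressions looks up a binding for `a`, which is why they run in a module where `a` is not defined.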

If you consider the expanded form of `(hex a)`:
(bytes
  (let ([e (syntax-e #'a)])
    (if (number? e)
        a
        (string->number (symbol->string e) 16))))
 
The first sub-expression to be evaluated is `#'a`, which produces a syntax object representing an identifier. This syntax object value is then passed to the procedure `syntax-e`, which extracts the symbol value.

The key point here is that this doesn't involve a reference to the identifier `a`: it could be written equivalently as `'a` (and should be, if you want your macro to work this way), but we can also write it as `(string->symbol "a")`, which makes it extra clear that the binding of `a`, or rather the lack thereof, isn't consulted in evaluating this expression. So we could re-write the expansion of `(hex a)` as:
(bytes
  (let ([e (string->symbol "a")])
    (if (number? e)
        a
        (string->number (symbol->string e) 16))))
 
This makes it clear that the only reference to the identifier `a` is in the then branch of the conditional, which isn't taken. Probably, in your original macro, you wanted to write `e` there instead of the template variable `num`.
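
Applying both suggestions (quote the argument instead of calling `syntax-e`, and use `e` in the then branch) gives a minimal repair of the original macro. This is a sketch of that repair only: it keeps the original behaviour of treating numeric literals as decimal, so `41` would yield byte 41 rather than `#x41`.

```racket
#lang racket

(define-syntax hex
  (syntax-rules ()
    [(_ num ...)
     (bytes
      ;; 'num is a plain datum at run time: a symbol or a number.
      (let ([e 'num])
        (if (number? e)
            e   ; use e, not the template variable num
            (string->number (symbol->string e) 16))) ...)]))

(hex a b c 1 2 3) ; #"\n\v\f\1\2\3"
```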

-Philip

Simon Haines

Sep 12, 2019, 3:08:53 AM
to Racket Users
Thanks again Philip, this is making sense now. I appreciate your help with this.
Regards, Simon.

