Dye Packs in Racket Scheme (?!)

Eddy Sputz

Sep 30, 2012, 7:31:42 PM
Note: some of what I discuss is specific to Racket but it just illustrates the attitude among Scheme people and programming language "theorists" in general.

I can't stand Scheme's attitude toward macros. Take this Common Lisp code:

(defun blah (x)
  (+ x 1))

(defmacro foo (y)
  `(format t "~a" ,(blah y)))


The tick mark is one of Lisp/Scheme's greatest features. It is like "quote" but allows for interpolation of calculated values (which are preceded by a comma). The ",@" operator splices any calculated list into the list which is quoted. (Note to self: it would be nice if there was a string version of backquote, comma, at-comma etc.) So if I use the macro foo as in:

(foo 12)


It will expand to

(format t "~a" 13)


And then gets evaluated. This is powerful because it means that any Lisp code can be used to build the macro results.
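
To make the notation concrete, here is a minimal sketch of backquote, comma and comma-at on plain lists. It is written in Racket, where the three operators behave as described above; the names n, rest-args, with-unquote and with-splice are made up for illustration and do not come from the code in this post.

```racket
#lang racket
;; Illustrative values only; none of these names appear in the post.
(define n 12)
(define rest-args '(a b c))

;; Comma evaluates the expression that follows it and inserts the value
;; into the otherwise quoted list.
(define with-unquote `(format t "~a" ,(+ n 1)))

;; Comma-at evaluates to a list and splices its elements in place.
(define with-splice `(list ,@rest-args d))
```

Here with-unquote holds the list (format t "~a" 13) and with-splice holds (list a b c d). Both are just data until something such as a macro expander treats them as code.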

Let's try it with SBCL (Common Lisp):

CL-USER> (foo 12)
13


Perfect, it worked. Okay, how about Scheme? Now Scheme really, really wants you to use "hygienic" macros.

Racket has a throwback that Neanderthals such as myself can use.

From the docs:

(require mzlib/defmacro)

syntax

(define-macro id expr)

(define-macro (id . formals) body ...+)

syntax

(defmacro id formals body ...+)


formals = (id ...)
        | id
        | (id ...+ . id)

Defines a (non-hygienic) macro id through a procedure that manipulates
S-expressions, as opposed to syntax objects.

Okay let's try it:

(require mzlib/defmacro)

(define (blah x)
  (+ x 1))

(define-macro (foo . xs)
  `(printf "~a\n" ,(blah (car xs))))


And now ask Racket to expand

> (foo 12)
. . blah: undefined;
cannot reference undefined identifier


Huh? blah is defined. Right above foo. Just what the hell is going on here? Oh I guess I should have read the next paragraph

"In all cases, the procedure is generated in the transformer environment, not the normal
environment."


Okay so this "transformer environment" doesn't contain my definitions. Clicking on the "transformer environment" link doesn't help...

"bindings in phase level 1 constitute the transformer environment. Phase level -1 corresponds to the run time of a different module for which the enclosing module is imported for use at phase level 1 (relative to the importing module); bindings in phase level -1 constitute the template environment. The label phase level does not correspond to any execution time; it is used to track bindings (e.g., to identifiers within documentation) without implying an execution dependency."

Okay whatever. Look guys, speaking in your own dense, jargon-infested verbiage doesn't make what you're saying any more correct than something some other idiot has to say. So I guess that the upshot is that I can't use my own code in a macro. Wait ... I've got it: maybe making blah local to foo will work.

(define-macro (foo . xs)
  (define (blah x)
    (+ x 1))
  `(printf "~a\n" ,(blah (car xs))))


And Racket says

> (foo 12)
13

Okay maybe I can live with that.
Oh wait, what if I have an established code base that I want to call from my macro. But what if I just do something like:

(define (blah x)
  (+ x 1))

(define-macro (foo . xs)
  (let ((bar blah))
    `(printf "~a\n" ,(bar (car xs)))))

No cigar:

> (foo 1)
. . blah: undefined;
cannot reference undefined identifier

Okay okay what about:

(define (blah x)
  (+ x 1))

(define-macro (foo . xs)
  `(printf "~a\n" ,((cadr xs) (car xs))))

Not even close:

> (foo 1 blah)
. . application: not a procedure;
expected a procedure that can be applied to arguments
given: 'blah
arguments...:
1

Oh right, I forgot that macros don't evaluate their arguments, so the name "blah" just gets passed in. Okay, so if I want to use my own code in a macro it has to be defined local to the macro. That could be a lot of copy-pasting. There just doesn't seem to be any way around it. Snooping around, I found this gem in the Racket manual:

"A syntax object combines a simpler Racket value, such as a symbol or pair, with lexical information about bindings, source-location information, syntax properties, and tamper status. In particular, an identifier is represented as a syntax object that combines a symbol with lexical and other information."

"Tamper status", are they serious? Jesus, nobody can be that paranoid, can they? I mean, these are friggin' macros we're talking about, not the Mona Lisa or some piece of equipment vital to national security. And then I found this:

"The tamper status of a syntax object is either tainted, armed, or clean:
A tainted identifier is rejected by the macro expander for use as either a binding or expression. If a syntax object is tainted, then any syntax object in the result of (syntax-e stx) is tainted, and datum->syntax with stx as its first argument produces a tainted syntax object.
Other derived operations, such as pattern matching in syntax-case, also taint syntax objects when extracting them from a tainted syntax object.
An armed syntax object has a set of dye packs, which creates taints if the armed syntax object is used without first disarming the dye packs. In particular, if a syntax object is armed, syntax-e, datum->syntax, quote-syntax, and derived operations effectively treat the syntax object as tainted. The macro expander, in contrast, disarms dye packs before pulling apart syntax objects.
Each dye pack, which is added to a syntax object with the syntax-arm function, is keyed by an inspector. A dye pack can be disarmed using syntax-disarm with an inspector that is the same as or a superior of the dye pack’s inspector.
A clean syntax object has no immediate taints or dye packs, although it may contain syntax objects that are tainted or armed.

Taints cannot be removed, and attempting to arm a syntax object that is already tainted has no effect on the resulting syntax object."


Holy shit. These guys take their macros seriously. "syntax-arm", "dye packs" ?!?! No wonder I couldn't sneak in my no-account "blah" past these security freaks. For hilarious reading I direct you to: http://docs.racket-lang.org/reference/stxcerts.html?q=define-macro#(tech._tamper._statu)

--------------

Eddie

Nils M Holm

Oct 1, 2012, 2:11:52 AM
Eddy Sputz <eddie...@gmail.com> wrote:
> Note: some of what I discuss is specific to Racket but it just
> illustrates the attitude among Scheme people and programming language
> "theorists" in general.

Have you tried plain old SYNTAX-RULES?

> (define (blah x) (+ 1 x))

> (define-syntax foo
    (syntax-rules ()
      ((_ x . xs)
       (format #t "~A~%" (blah x)))))

> (foo 12)
13

Works in every post-R4RS Scheme without any additional mumbo jumbo
and is pretty intuitive to grasp, IMO.

That being said, SYNTAX-RULES can be a real pain in some cases
where Common Lisp macros simply shine.

Regarding documentation: yes, there is some work to be done, for sure.

--
Nils M Holm < n m h @ t 3 x . o r g > www.t3x.org

Eli Barzilay

Oct 1, 2012, 6:05:51 AM
Meta-note: I'm possibly being over-optimistic here, but I'll assume
that you are actually interested in hearing explanations etc. (If
not, please indicate so by redundant sarcasm and/or cheap attempts at
offensiveness.)


Eddy Sputz <eddie...@gmail.com> writes:
>
> Note: some of what I discuss is specific to Racket but it just
> illustrates the attitude among Scheme people and programming
> language "theorists" in general.

At least in the Racket case, you should keep in mind that the macro
system (as well as other related parts like modules etc) is being
driven by practical needs first, not by obscure theories.


> The [back-]tick mark is one of Lisp/Scheme's greatest features.

[Unrelated side-note: this feature exists in many languages now, often
with strings, sometimes in a more sexpr-like way.]


> (Note to self: it would be nice if there was a string version of
> backquote, comma, at-comma etc.)

http://barzilay.org/misc/scribble-reader.pdf


> [...] Okay how about Scheme. Now Scheme really, really wants you to
> use "hygenic" macros.

You're mixing 2.5 independent concepts: hygiene and phase-separation,
and my guess is that you do the usual conflation of hygiene with the
simple `syntax-rules' kind of rewrite-rule based macros. WRT the
latter point, you can use `define-syntax' to define simple functions
as you would in CL, modulo the fact that we use a different data type
instead of plain sexprs:

http://blog.racket-lang.org/2011/04/writing-syntax-case-macros.html
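
A minimal sketch of that idea, assuming a recent Racket and `#lang racket': the transformer below is an ordinary function that receives the whole macro call as a syntax value, drops down to sexprs with `syntax->datum', and converts the result back with `datum->syntax'. Passing `stx' as the first argument gives the result the lexical context of the call site, so this is deliberately unhygienic, much like a CL macro. The local names `args' and `n' are illustrative only.

```racket
#lang racket
;; `foo' is a plain function from syntax value to syntax value.
(define-syntax (foo stx)
  ;; Convert the call, e.g. (foo 12), into an sexpr and drop the macro name.
  (define args (cdr (syntax->datum stx)))
  ;; Any expansion-time computation can happen here.
  (define n (add1 (car args)))
  ;; Build the result as an sexpr, then turn it back into syntax using the
  ;; lexical context of the original call.
  (datum->syntax stx `(printf "~a\n" ,n)))
```

With this definition, (foo 12) expands to (printf "~a\n" 13).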

The `define-macro' compatibility hack, roughly speaking, takes the
input syntax, converts it into sexprs, applies the transformation
function, then uses a hash table that was constructed in the first
step to guess how to convert the resulting sexpr back into a syntax
value. It's obviously not a perfect process, since downgrading the
data to sexprs is lossy.


> Okay let's try it:
>
> (require mzlib/defmacro)
>
> (define (blah x)
> (+ x 1))
>
> (define-macro (foo . xs)
> `(printf "~a\n" ,(blah (car xs))))

Here you're getting to the phase separation feature. The quick-quick
version is that code that lives in the syntax world is unrelated to
code that lives in the runtime world. There's a bunch of ways to
specify that code goes into the syntax world, two popular ones:

1. Wrap the code in a `begin-for-syntax':

(begin-for-syntax (define (blah x) (+ x 1)))

and your code now works.

2. Another option when you have a whole bunch of code that lives in
its own module is to use (require (for-syntax "blah.rkt")).
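
Put together, the first option might look like the sketch below, assuming the mzlib/defmacro library is installed (newer Racket releases document the same macro under compatibility/defmacro). It is the code from the original post with only the definition of `blah' moved into `begin-for-syntax'.

```racket
#lang racket
(require mzlib/defmacro)

;; Define blah in the transformer environment (phase 1), which is where
;; the body of a define-macro runs.
(begin-for-syntax
  (define (blah x) (+ x 1)))

;; Unchanged from the original post.
(define-macro (foo . xs)
  `(printf "~a\n" ,(blah (car xs))))

;; Expands to (printf "~a\n" 13).
(foo 12)
```

Note that `blah' is now visible only to macro bodies; calling (blah 1) from ordinary runtime code in the same module is an unbound-identifier error.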

In the first case you obviously get no `blah' bound in the runtime
code. In the second, you can do that by requiring the same code in
both levels *but* the two levels are still unrelated. For example:

-> (module blah racket
     (define c 0)
     (define (blah) (set! c (add1 c)) c)
     (provide blah))
-> (require mzlib/defmacro 'blah (for-syntax 'blah))
-> (list (blah) (blah) (blah) (blah) (blah))
'(1 2 3 4 5)
-> (define-macro (foo) (blah))
-> (list (foo) (foo) (foo))
'(1 2 3)
-> (list (blah) (blah) (blah) (blah) (blah))
'(6 7 8 9 10)

Now, before you get all excited about the wonderful feature you have
to give up, it is useful to read this:

http://fare.livejournal.com/146698.html

and realize that phase separation is directly resolving such issues.

Side note: as I said above, this has nothing to do with hygiene.

Side note #2: if you read the above and still think that you can't
live without this feature then CL is the right choice for you.


> "bindings in phase level 1 constitute the transformer
> environment. [...]"
>
> Okay whatever. Look guys, speaking in your own dense,
> jargon-infested verbiage

Yes, you're looking at a *reference* page labeled "Syntax Model"... so
you should expect some dense verbiage. If you want a more readable
description, then have a look at the guide instead, specifically:

http://docs.racket-lang.org/guide/stx-phases.html


> doesn't make what you're saying any more correct than something some
> other idiot has to say.

(Correctness doesn't enter into it. It is a description of what
Racket is doing.)


> "Tamper status" are they serious? Jesus nobody can be that paranoid
> can they? I mean, these are friggin' macros we're not talking about
> not the Mona Lisa or some piece of equipment vital to national
> security.

This feature refers to maintaining scope and visibility of bindings,
and it is definitely not something that newbies should deal with. If
you really want to know, the idea is roughly like this: when you
define a module, you can choose which identifiers are provided and
which are not. (Note that from a CL package point of view, this is
already a bad idea so if you don't want to allow people to have
private bindings you'd better stop reading this.) The thing is that
you might have a macro that expands into such a private binding. For
example:

(module foo racket
  (define c 0)
  (define (c!) (set! c (add1 c)) c)
  (define-syntax-rule (counted-eval expr)
    (begin (printf "#~a ~s\n" (c!) 'expr) expr))
  (provide counted-eval))

The author of this module wants to have each use of the macro get
counted properly, without other changes to the internal data --
otherwise more bindings would be provided. But still, since the macro
is available, you can use `expand' manually, get some resulting code
that has the internal identifier and pull it out (no, that's not how
you'd pull something out in practice unless you're a dentist, I'm just
trying to keep it clear):

-> (counted-eval (+ 1 2))
#1 (+ 1 2)
3
-> (cadddr (syntax->list (cadr (syntax->list (expand #'(counted-eval (+ 1 2)))))))
#<syntax:1:132 (#%app c!)>

then use it directly:

-> (eval (cadddr (syntax->list (cadr (syntax->list (expand #'(counted-eval (+ 1 2))))))))
; #%app: cannot use identifier tainted by macro transformation

So this is where those "taints" come into the game, and prevent you
from doing that kind of thing.

(Again, this is *not* something that CL can do or tries to do or
considers a good thing.)


> Holy shit. These guys take their macros seriously.

Yes -- and the above example is clearly not showing any great need for
such seriousness. But bear in mind that large parts of Racket are
written in Racket, and that core code relies on such protection to
hide some dangerous implementation details which could otherwise break
internal invariants, which can lead to segfaults.

And some obvious side-notes:

1. To repeat this yet again, feel free to scream bloody murder on such
   a horrendous feature. If you want that, then how about I make that
   easy and give you a few choices so you can just choose one:
   a. If you do this kind of trick you deserve what you get.
   b. Programmers should be allowed to shoot their own feet.
   c. ... thieves and bums use C++ and nice people use CLOS.

2. Yes, we have an FFI, so you can very easily find other, more
straightforward ways to segfault. But OTOH, there is also a way to
drop the trust level when you load some untrusted code.

3. You started with the "theorists", but note that this is a very
practical issue. I run a homework submission server that evaluates
submission code, and I know that students can hack my machine
through it. There are many other cases where sandboxing is a very
desired feature, and AFAICT, this kind of sandboxing isn't possible
in CL without something external like running a separate restricted
process.

--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!

Pascal J. Bourguignon

Oct 1, 2012, 6:56:25 AM
Eddy Sputz <eddie...@gmail.com> writes:

> Note: some of what I discuss is specific to Racket but it just illustrates the attitude among Scheme people and programming language "theorists" in general.
>
> I can't stand Scheme's attitude toward macros. Take this Common Lisp code:
>
> (defun blah (x)
> (+ x 1))
>
> (defmacro foo (y)
> `(format t "~a" ,(blah y)))

This code is not conforming. It will give the same error in most CL
implementations as in Scheme (try compile-file). To make it
conforming, you have to write it as:


(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun blah (x)
    (+ x 1)))

(defmacro foo (y)
  `(format t "~a" ,(blah y)))


--
__Pascal Bourguignon__ http://www.informatimago.com/
A bad day in () is better than a good day in {}.