How to write dirty R5RS macros

ol...@pobox.com

unread,

Mar 27, 2002, 3:53:45 PM3/27/02

to

The topic of this article is writing R5RS macros that are
observationally equivalent to referentially opaque macros. These
macros capture user identifiers and allow their own identifiers being
captured by closest lexical bindings. In other words, we will write
macros that do the things that the R5RS macro system has sought to
prevent. This article will also show how to 're-define' a lambda form
while still keeping the original 'lambda' available. The redefined
lambda acts as if it was infected by a virus, which propagates through
the lambda's body infecting other lambdas in turn. Although we will be
re-defining all binding forms, we will preserve the semantics of
Scheme as given in R5RS.

* Referential transparency of macros
* Petrofsky's extraction
* Towards the full referential opaqueness: a mylet form
* Achieving the full referential opaqueness: redefining all binding forms

* Referential transparency of macros

The referential transparency of macros means that all identifiers in
the macro code (other than pattern variables) refer to the bindings
that existed when the macro was defined rather to the bindings that
exist at the point of invocation. The transparency has two facets. The
first one demands that bindings introduced by macros should not
capture free identifiers in macro arguments.

Let's consider the following macro:

(define-syntax m1
(syntax-rules ()
((m1 body) (let ((i 10)) body))))

A form (m1 body) expands into (let ((i 10)) body). Naive, referentially
opaque expansion of (m1 (* 1 i)) would have produced
(let ((i 10)) (* 1 i))
The binding of 'i' introduced by the macro would have captured the
free variable 'i' occurring in the 'body'. The referential
transparency and hygiene prevent this capture through systematic
renaming of introduced identifiers. Therefore,

(let ((i 1))
(m1 (* 1 i)))

actually expands to

(let ((i~2 1))
(let ((i~5 10)) (* 1 i~2))

and gives the result 1. The identifier i~2 is different from i~5: we
will call them of different _colors_.

The second facet of the referential transparency demands that free
identifiers in a macro expansion should not be captured by bindings
that textually appear after the macro definition. If a macro expansion
introduces a free identifier, the identifier refers to the binding
occurrence which was in effect when the macro was defined. E.g.

(define foo 1)

(define-syntax mm
(syntax-rules ()
((mm) foo)))

The following form

(let ((foo 2))
(mm))

expands into
(let ((foo~1 2))
foo)

and yields 1 when evaluated. The local 'let' binds 'foo' of a
different color, and therefore, does not capture 'foo' introduced by
macro mm. If foo was not defined when macro mm was introduced, (mm)
will always lead to an "unbound identifier" error.

We will now show how to write R5RS macros that violate both aspects of
referential transparency. The main idea has been introduced by Al
Petrofsky in an article
How to write seemingly unhygienic macros using syntax-rules
http://groups.google.com/groups?selm=87oflzcdwt.fsf%40radish.petrofsky.org
Date: 2001-11-19 01:23:33 PST

In that article, Al Petrofsky has shown how to circumvent the weak
form of the first facet of referential transparency. This article
generalizes Petrofsky's idea to attack the full referential transparency.

* Petrofsky's extraction

For completeness and reference we will derive the Petrofsky technique
first. The derivation below differs from the one in Petrofsky's
article in insignificant details: I didn't remember the details and
therefore wrote the macros below by following the general idea.

Petrofsky's article deals with a weak form of the referential
transparency: capturing a free identifier in the user code that has
not been rebound. That is, we'd like to write a macro m1 so that
(m1 10 body) expands into
(let ((i 10)) body)
such that the binding of 'i' captures possible occurrences of 'i' in
the 'body'. We assume that there are no other bindings of 'i' in the
scope of '(m1 10 body)', or 'i' was defined early in the global scope
and not re-defined since. This assumption is the distinction between
the 'weak' referential opaqueness and the true one.

Developing even weakly referentially-opaque macros is challenging. We
cannot just write

(define-syntax m1
(syntax-rules ()
((_ val body) (let ((i val)) body))))

because
(m1 10 (* 1 i))
will expand into
(let ((i~5 10)) (* 1 i))
where 'i' in (* 1 i) refers to the top-level binding of 'i' or
remains undefined. However, we can write

(define-syntax m1-i
(syntax-rules ()
((_ i val body) (let ((i val)) body))))

that is, specifically pass the variable to capture. In that case,
(m1-i i 10 (* 1 i))
expands into
(let ((i 10)) (* 1 i))

and the capture occurs. Hence to circumvent the hygiene in the weak
sense, we only need to find a way to convert an invocation of m1 into
an invocation of m1-i. Macro m1-i requires the explicit specification
of the symbol to capture -- which we can get by extracting the symbol
'i' (_together_ with its color) from the argument of m1. That's the
essence and the elegance of the Petrofsky's idea. Once we have the
rightly colored occurrence of 'i', we can use it in the binding form and
effect the capture.

The extraction of the colored identifiers from a form is done by the
following macro extract:

; Extract a colored identifier from a form
; extract SYMB BODY CONT
; BODY is a form that may contain an occurrence of an identifier that
; refers to the same binding occurrence as SYMB, perhaps with a different
; color. CONT is a form of the shape (K-HEAD K-IDL . K-ARGS)
; where K-IDL are K-ARGS are S-expressions representing lists or the
; empty list.
; The extract macro expands into
; (K-HEAD (extr-id . K-IDL) . K-ARGS)
; where extr-id is the extracted colored identifier. If symbol SYMB does
; not occur in BODY at all, extr-id is identical to SYMB.

(define-syntax extract
(syntax-rules ()
((_ symb body _cont)
(letrec-syntax
((tr
(syntax-rules (symb)
((_ x symb tail (cont-head symb-l . cont-args))
(cont-head (x . symb-l) . cont-args)) ; symb has occurred
((_ d (x . y) tail cont) ; if body is a composite form,
(tr x x (y . tail) cont)) ; look inside
((_ d #(x . y) tail cont)
(tr x x (y . tail) cont))
((_ d1 d2 () (cont-head symb-l . cont-args))
(cont-head (symb . symb-l) . cont-args)) ; symb does not occur
((_ d1 d2 (x . y) cont)
(tr x x y cont)))))
(tr body body () _cont)))))

The extract macro is the workhorse of the circumvention strategies.
We also need a macro that extracts several identifiers, extract*:

; Extract several colored identifiers from a form
; extract* SYMB-L BODY CONT
; where SYMB-L is the list of symbols to extract, and BODY and CONT
; has the same meaning as in extract, see above.
;
; The extract* macro expands into
; (K-HEAD (extr-id-l . K-IDL) . K-ARGS)
; where extr-id-l is the list of extracted colored identifiers. The extraction
; itself is performed by the macro extract.

(define-syntax extract*
(syntax-rules ()
((_ (symb) body cont) ; only one symbol: use extract to do the job
(extract symb body cont))
((_ _symbs _body _cont)
(letrec-syntax
((ex-aux ; extract symbol-by-symbol
(syntax-rules ()
((_ found-symbs () body cont)
(reverse () found-symbs cont))
((_ found-symbs (symb . symb-others) body cont)
(extract symb body
(ex-aux found-symbs symb-others body cont)))
))
(reverse ; reverse the list of extracted symbols
(syntax-rules () ; to match the order of SYMB-L
((_ res () (cont-head () . cont-args))
(cont-head res . cont-args))
((_ res (x . tail) cont)
(reverse (x . res) tail cont)))))
(ex-aux () _symbs _body _cont)))))

Now we can write a macro

(define-syntax m1-dirty-v1
(syntax-rules ()
((_ _val _body)
(let-syntax
((cont
(syntax-rules ()
((_ (symb) val body) (let ((symb val)) body) ))))
(extract i _body (cont () _val _body))))))

Then
(m1-dirty-v1 10 (* 1 i))
expands into
(let ((i~11 10)) (* 1 i~11))
and evaluates to 10, as expected.

The m1-dirty-v1 macro seems to do the job, but it has a flaw. It does
not nest:

(m1-dirty-v1 10
(m1-dirty-v1 20 (* 1 i)))

expands into
(let ((i~16 10)) (let ((i~17~25~28 20)) (* 1 i~16)))

and evaluates to 10 rather than to 20 as we might have expected. The
nested invocation of m1-dirty-v1 violates our assumption of no
bindings of 'i' in the scope of m1-dirty-v1 that was not present at
the time m1-dirty-v1 was defined. The outer invocation of m1-dirty-v1
creates a binding for 'i' -- which violates the weak referential
transparency assumption. Al Petrofsky has also shown how to overcome
the problem. If we bind 'i' and still want m1-dirty-v1 to work as
expected, we need to re-define m1-dirty-v1 in the scope of the new
binding to 'i'. Thus we need a macro that re-defines itself in its own
expansion. We however face a problem:

(m1-dirty-v1 10
(m1-dirty-v1 20 (* 1 i)))

If the outer invocation of m1-dirty-v1 re-defines itself, this
redefinition has to capture the inner invocation of m1-dirty-v1. This
is the same problem of the weak referential opaqueness -- which we
know how to solve, by extracting the colored identifier m1-dirty-v1
from outer macro's body. We need to extract two identifiers: 'i' and
'm1-dirty-v1'. We arrive at the following macro:

; A macro that re-defines itself in its expansion
; m1-dirty-v2 val body
; expands into
; (let ((i val)) body)
; and also re-defines itself in the scope of body.

(define-syntax m1-dirty-v2
(syntax-rules ()
((_ _val _body)
(letrec-syntax
((doit ; it's the continuation from extract*
(syntax-rules () ; myself-symb i-symb are colored ids extracted
((_ (myself-symb i-symb) val body) ; from the 'body'
(let ((i-symb val)) ; first bind the symbol i
(let-syntax ; now re-define oneself
((myself-symb
(syntax-rules ()
((_ val__ body__)
(extract* (myself-symb i-symb) body__
(doit () val__ body__))))))
body))))))
(extract* (m1-dirty-v2 i) _body
(doit () _val _body))))))

Therefore

(m1-dirty-v2 10
(m1-dirty-v2 20 (* 1 i)))

now expands to
(let ((i~26 10)) (let ((i~52 20)) (* 1 i~52)))

and evaluates to 20.

The macro m1-dirty-v2 is still weakly referentially transparent. If we
evaluate

(let ((i 1))
(m1-dirty-v2 10 (* 1 i)))

we obtain
(let ((i 1)) (let ((i~3~22~29 10)) (* 1 i)))
which evaluates to 1 rather than 10.

We can still build a completely referentially opaque m1-dirty
macro. This operation is easier to illustrate on a simpler example,
which breaks the second facet of the referential transparency.

* Towards the full referential opaqueness: a mylet form

In this section, we will try to break the second facet of the
referential transparency: write a macro that allows free identifiers
in its expansion being captured by the closest lexical binding. In
other words, we want to write such a macro mm

(define-syntax mm
(syntax-rules ()
((mm) foo)))

so that the following form
(let ((foo 2))
(mm))
evaluates to 2.

As the first step, let us aim at the referentially opaque capture in
the following form:

(mylet ((foo 2))
(mm))

Where 'mylet' is a new binding form -- which eventually expands to the
regular let. We can use the freedom of implementing mylet to remove
the roadblock to referential opaqueness. The trick is to re-define
dirty macros whenever we add a new binding. This will guarantee the
capture of the latest binding.

Since we want to generate and re-generate macro mm, we need a
generator:

; Macro: make-mm NAME SYMB BODY
; In the scope of BODY, define a macro NAME that expands into a symbol SYMB

(define-syntax make-mm
(syntax-rules ()
((_ name symb body)
(let-syntax
((name
(syntax-rules ()
((_) symb))))
body))))

; (mylet ((var init)) body)
; expands into
; (let ((var init)) body')
; where body' is the body wrapped in the re-definitions of mylet and macro mm.

(define-syntax mylet
(syntax-rules ()
((_ ((_var _init)) _body)
(letrec-syntax
((doit ; The continuation from extract*
(syntax-rules () ; mylet-symb, etc. are extracted from body
((_ (mylet-symb mm-symb foo-symb) ((var init)) body)
(let ((var init)) ; bind the 'var' first
(make-mm mm-symb foo-symb ; now re-generate the macro mm
(letrec-syntax
((mylet-symb ; and re-define myself
(syntax-rules ()
((_ ((var_ init_)) body_)
(extract* (mylet-symb mm-symb foo-symb) (var_ body_)
(doit () ((var_ init_)) body_))))))
body)))
))))
(extract* (mylet mm foo) (_var _body)
(doit () ((_var _init)) _body))))))

Now
(define foo 1)
(mylet ((x 1)) (list (mm) x))
evaluates to (1 1), as expected. (mm) expands to 'foo' defined in the
global environment. However,

(mylet ((foo 2)) (list (mm) foo))
expands to
((lambda (foo~39) (list foo~39 foo~39)) 2)
and evaluates to (2 2). Hence (mm) expanded to foo which was captured
by the local binding! Because mylet constantly re-generates itself, it
nests:
(mylet ((foo 3)) (mylet ((foo 4)) (mylet ((foo 5)) (list foo (mm)))))

expands to

((lambda
(foo~55)
((lambda
(foo~100)
((lambda (foo~142) (list foo~142 foo~142)) 5))
4))
3)

and evaluates to (5 5). To further verify that (mm) captures the lexical
binding, we evaluate

(mylet ((foo 3))
(mylet ((thunk (lambda () (mm))))
(mylet ((foo 4)) (list foo (mm) (thunk)))))

which gives us (4 4 3). The invocation (mm) within the closure 'thunk'
captured foo that was lexically visible at that time, which was 3.

* Achieving the full referential opaqueness: redefining all binding forms

The previous section showed that a a non-hygienic capture is possible
if we can introduce our own binding forms. We can achieve the same
result with the regular binding forms let, let*, letrec and lambda --
if we just re-define them. To be precise we need to 'overload' only
one, fundamental binding form: the lambda itself. The other binders
are defined in terms of lambda.

This redefinition is done by a macro defile, which defiles its
body. It is a bit long, therefore, it is moved to the appendix. We
will consider a few excerpts from that macro's code.

The first notable piece is the following:

(letrec-syntax
...
(lambda-native ; capture the native lambda
(syntax-rules ()
((_ . args) (lambda . args))))

This fragment does what it looks like: it captures the native lambda,
which is needed to effect bindings.

(letrec-syntax
...
((let-symb ; R5RS definition of let, with no ...
(syntax-rules () ; and no named-let, for simplicity
((_ "0" vars inits () . body)
((lambda-symb vars . body) . inits))
((_ "0" vars inits ((var init) . other) . body)
(let-symb "0" (var . vars) (init . inits) other . body))
((_ bindings . body)
(let-symb "0" () () bindings . body))))

This is the _standard_ definition of let (save named let, which is
dropped for simplicity). We can't use a ... pattern because of the
deep nesting of this macro. We redefine 'let' to use our redefined
lambda. Finally, the overloaded lambda:

(letrec
...
(lambda-symb ; re-defined, infected lambda
(syntax-rules ()
((_ _vars _body)
(letrec-syntax
((doit
(syntax-rules ()
((_ (mylet-symb mylet*-symb mylambda-symb
mymm-symb myfoo-symb) vars body)
(lambda-native vars
(make-mm mymm-symb myfoo-symb
(do-defile ; proliferate in the body
(mylet-symb mylet*-symb mylambda-symb
mymm-symb myfoo-symb)
body)))))))
(extract* (let-symb let*-symb lambda-symb mm-symb foo-symb)
(_vars _body)
(doit () _vars _body))))))

We're relying on the lambda-native to create bindings. Right after
that, we redefine all our macros. The infected lambda spreads from one
binding to another.

Let us see how it works:

(defile
(let ((foo 2)) (list (mm) foo))
)

expands into
((lambda (foo~186) (list foo~186 foo~186)) 2)
and predictably evaluates to (2 2). All the infected lambdas are gone:
the expansion result is the regular Scheme code.

Furthermore,
(defile
(let ((foo 2))
(let ((foo 3) (bar (list (mm) foo)))
(list foo (mm) bar)))
)
evaluates to (3 3 (2 2)). The scoping rules of let are respected.

Finally,
(defile
(let* ((foo 2)
(i 3)
(foo 4)
(ft (lambda () (mm))) ; will capture binding of foo to 4
(foo 5)
(ft1 (lambda (foo) (mm))) ; will capture the arg of ft1
(foo 6))
(list foo (mm) (ft) (ft1 7) '(mm)))
)

evaluates to the expected (6 6 4 7 (mm)).
All the examples run with Bigloo 2.4b interpreter and the compiler.

We must point out that the defiled examples behave as if (mm), unless
quoted, were just the variable foo. In other words, as if mm were
defined as a *non-hygienic*, referentially opaque macro
(define-macro (mm) foo)

We have thus demonstrated an observational equivalent of a dirty,
referentially opaque macro, exclusively developed with R5RS, hygienic
macros. The guarantees of R5RS macros can be circumvented.

Appendix.
; This macro defiles its body.
; It re-defines let-forms and the lambda, and defines
; a non-hygienic macro 'mm'. Whenever any binding is introduced,
; the let-forms, the lambdas and 'mm' are redefined.
; The redefined lambda acts as if it were infected by a virus, which
; keeps spreading within lambda's body to infect other lambda's there.
;
; In the following macro, we should have redefined letrec and the named let.
; We omit those forms for simplicity.

(define-syntax defile
(syntax-rules ()
((_ dbody)
(letrec-syntax
((do-defile
(syntax-rules () ; all the overloaded symbols
((_ (let-symb let*-symb lambda-symb mm-symb foo-symb)
body-to-defile)
(letrec-syntax
((let-symb ; R5RS definition of let, with no ...
(syntax-rules () ; and no named-let, for simplicity
((_ "0" vars inits () . body)
((lambda-symb vars . body) . inits))
((_ "0" vars inits ((var init) . other) . body)
(let-symb "0" (var . vars) (init . inits) other
. body))
((_ bindings . body)
(let-symb "0" () () bindings . body))))

(let*-symb ; R5RS definition of let*
(syntax-rules ()
((_ () . body) (let-symb () . body))
((_ (binding . bindings) . body)
(let-symb (binding) (let*-symb bindings . body)))))

(lambda-symb ; re-defined, infected lambda
(syntax-rules ()
((_ _vars _body)
(letrec-syntax
((doit
(syntax-rules ()
((_ (mylet-symb mylet*-symb mylambda-symb
mymm-symb myfoo-symb) vars body)
(lambda-native vars
(make-mm mymm-symb myfoo-symb
(do-defile ; proliferate in the body
(mylet-symb mylet*-symb mylambda-symb
mymm-symb myfoo-symb)
body)))))))
(extract* (let-symb let*-symb lambda-symb
mm-symb foo-symb)
(_vars _body)
(doit () _vars _body))))))

(lambda-native ; capture the native lambda
(syntax-rules ()
((_ . args) (lambda . args))))
)

body-to-defile)))))

(extract* (let let* lambda mm foo) dbody
(do-defile () dbody))
))))

David Golden

unread,

Mar 27, 2002, 5:03:03 PM3/27/02

to

ol...@pobox.com wrote:

> The topic of this article is writing R5RS macros that are
> observationally equivalent to referentially opaque macros. These
> macros capture user identifiers and allow their own identifiers being
> captured by closest lexical bindings. In other words, we will write
> macros that do the things that the R5RS macro system has sought to
> prevent. This article will also show how to 're-define' a lambda form
> while still keeping the original 'lambda' available. The redefined
> lambda acts as if it was infected by a virus, which propagates through
> the lambda's body infecting other lambdas in turn. Although we will be
> re-defining all binding forms, we will preserve the semantics of
> Scheme as given in R5RS.
>

Spectacular stuff !

--
Don't eat yellow snow.

Al Petrofsky

unread,

Mar 27, 2002, 8:00:39 PM3/27/02

to

ol...@pobox.com (ol...@pobox.com) writes:

> We have thus demonstrated an observational equivalent of a dirty,
> referentially opaque macro, exclusively developed with R5RS, hygienic
> macros. The guarantees of R5RS macros can be circumvented.

I don't think you can quite put it that way. The guarantees of r5rs
macros are with respect to the r5rs binding forms. What you have
shown, rather, is a set of binding forms, developed on top of r5rs,
which create bindings that are not transparent to the variable
references of a particular macro.

That's what I was jokingly suggesting earlier this month in article
<874rjvb...@radish.petrofsky.org>, and I'm impressed to see you
actually did it. I like how the aptly-named defile macro uses
identifier extraction to set up the initial bindings of lambda, let,
etc.. That's a little simpler than how I did things in my November
article, and it nicely avoids the problem of redefining top-level
lambda.

Pondering the possibilities, I think it would be straightforward to
add do, letrec, and let-syntax to the binding forms that defile
supports. Letrec-syntax would be trickier, but possible, and internal
define would be impossible (that is, impossible if you want to support
macros that expand into internal definitions). The only other
non-top-level r5rs binding construct is syntax-rules, which binds
pattern variables. Although the concept of referential transparency
is meaningful for pattern variables, it is hard to conceive of a macro
that would want to insert an identifier reference that would be
captured by a pattern variable binding. I think that as a result of
the fact that the region of a pattern variable is a template rather
than an expression, it would work out that the macro would have to do
its own identifier extractions, and defile would not need to do
anything special with syntax-rules.

-al

Al Petrofsky

unread,

Mar 28, 2002, 7:50:21 PM3/28/02

to

A couple notes after fiddling with your code a bit:

ol...@pobox.com (ol...@pobox.com) writes:
> (define-syntax extract
...

> ((_ d #(x . y) tail cont)

...

> All the examples run with Bigloo 2.4b interpreter and the compiler.

Does this bogus dotted vector really work in Bigloo? I know some
readers accept #(a . (b c)) as a synonym for #(a b c), but I can't
imagine how bigloo represents #(x . y) after reading it. (Perhaps it
is silently converted to #(x)? That is, (x . y) gets passed to
a version of list->vector that ignores the list's last cdr. If so,
you should probably report that as a Bigloo bug.)

> (letrec-syntax
> ...
> (lambda-native ; capture the native lambda
> (syntax-rules ()
> ((_ . args) (lambda . args))))
>
> This fragment does what it looks like: it captures the native lambda,
> which is needed to effect bindings.

This is not actually necessary. In the region of lambda-native, the
non-native lambda is named lambda-symb, so there is no conflict. You
can drop this clause from the letrec-syntax bindings and replace the
one call to lambda-native with a call to lambda. (At some point in
the expansion, lambda-sym will be lambda, but thanks to referential
transparency, you needn't worry about that.)

-al

ol...@pobox.com

unread,

Mar 28, 2002, 9:27:21 PM3/28/02

to

Al Petrofsky <a...@petrofsky.org> wrote in message news:<87g02l7...@radish.petrofsky.org>...

> ol...@pobox.com (ol...@pobox.com) writes:
>
> > We have thus demonstrated an observational equivalent of a dirty,
> > referentially opaque macro, exclusively developed with R5RS, hygienic
> > macros. The guarantees of R5RS macros can be circumvented.
>
> I don't think you can quite put it that way. The guarantees of r5rs
> macros are with respect to the r5rs binding forms.

Do the perverted lambdas introduced in the previous article satisfy
R5RS? A brief check of R5RS seems to answer affirmatively. R5RS
Section 4.1.4 describes the semantics of a lambda form at
run-time. The defiled lambda seems to have exactly the same semantics
(because it macro-expands into the native lambda). Since a lambda is
not a macro (in most of the Schemes), R5RS is silent on how lambdas
should be treated by a macro expander. The defiled let and let*
binding forms should comply with R5RS as well, because these forms
relate to the overloaded lambda as the original let and let* forms
relate to the original lambda. R5RS Section 4.3 does not outlaw the
construction in the previous article, because the 'defile' macro is
truly a R5RS macro.

If a Scheme system with defiled lambda and other binding forms
nevertheless complies with R5RS, it casts doubt on the assertion that
R5RS macros are referentially transparent.

Thomas Bushnell, BSG

unread,

Mar 28, 2002, 11:45:41 PM3/28/02

to

ol...@pobox.com (ol...@pobox.com) writes:

> If a Scheme system with defiled lambda and other binding forms
> nevertheless complies with R5RS, it casts doubt on the assertion that
> R5RS macros are referentially transparent.

Hrm, I'm inclined to a holistic view of the standard. Such defiled
lambdas, because they allow the violation of referential transparency,
are not actually standard compliant.

Perhaps I'm missing some subltety, however... enlighten me if so.

Sander Vesik

unread,

Mar 29, 2002, 10:07:52 AM3/29/02

to

Thomas Bushnell, BSG <tb+u...@becket.net> wrote:
> ol...@pobox.com (ol...@pobox.com) writes:
>
>> If a Scheme system with defiled lambda and other binding forms
>> nevertheless complies with R5RS, it casts doubt on the assertion that
>> R5RS macros are referentially transparent.
>
> Hrm, I'm inclined to a holistic view of the standard. Such defiled
> lambdas, because they allow the violation of referential transparency,
> are not actually standard compliant.

*IF* these are written using nothing but R5RS compliant constructs and
not relying on any undefined behaviour behaving in a certain way, are
definately standard compliant. In fact, an application of such is
probably testing the compliance of various implementations 8-)

>
> Perhaps I'm missing some subltety, however... enlighten me if so.
>

That R5RS facilities are more capable than people usualy realise and thus
some things commonly asserted about them need to necessarily be true?

--
Sander

+++ Out of cheese error +++

Al Petrofsky

unread,

Mar 29, 2002, 3:41:23 PM3/29/02

to

ol...@pobox.com (ol...@pobox.com) writes:
> Al Petrofsky <a...@petrofsky.org> wrote in message news:<87g02l7...@radish.petrofsky.org>...
> > ol...@pobox.com (ol...@pobox.com) writes:
> >
> > > We have thus demonstrated an observational equivalent of a
> > > dirty, referentially opaque macro, exclusively developed with
> > > R5RS, hygienic macros. The guarantees of R5RS macros can be
> > > circumvented.
> >
> > I don't think you can quite put it that way. The guarantees of
> > r5rs macros are with respect to the r5rs binding forms.
>
> Do the perverted lambdas introduced in the previous article satisfy

> R5RS? [...] R5RS is silent on how lambdas should be treated by a
> macro expander.

The r5rs 4.3 discussion of the referential transparency of
syntax-rules transformers states:

* If a macro transformer inserts a free reference to an identifier,
the reference refers to the binding that was visible where the
transformer was specified, regardless of any local bindings that
may surround the use of the macro.

In other words, the local bindings introduced by lambda and the other
r5rs binding forms must be transparent to free identifier references
inserted by syntax-rules expansions.

> If a Scheme system with defiled lambda and other binding forms
> nevertheless complies with R5RS, it casts doubt on the assertion that
> R5RS macros are referentially transparent.

SICP shows how to implement a call-by-name scheme system on top of
r5rs scheme. Does that cast doubt on the assertion that r5rs scheme
is call-by-value?

-al

ol...@pobox.com

unread,

Mar 31, 2002, 5:47:29 PM3/31/02

to

I have verified that the defiled R5RS macros work in S48 (after you
remove the #(a . b) pattern from the macro expand, see below). All
examples pass, although it takes a second. The ,expand command in S48
can nicely show the result of the macro expansion.

Al Petrofsky <a...@petrofsky.org> wrote in message news:<878z8c6...@radish.petrofsky.org>...

> > (define-syntax extract ...
> > ((_ d #(x . y) tail cont) ...
> > All the examples run with Bigloo 2.4b interpreter and the compiler.
> Does this bogus dotted vector really work in Bigloo?

I'm sorry to say that I never tested that case. I have never come
across a practical example of vector patterns. Bigloo accepts #(a
. b) pattern, but it doesn't work as I expected. SCM and S48 do not
even accept #(a . b). SCM and S48 do take correct vector patterns,
e.g., #(a b), but they fail to match against them. Thus vector
patterns in SCM and S48 don't seem to work, at least for me.

> > (letrec-syntax
> > ...
> > (lambda-native ; capture the native lambda
> > (syntax-rules ()
> > ((_ . args) (lambda . args))))
> >
> > This fragment does what it looks like: it captures the native lambda,
> > which is needed to effect bindings.
> This is not actually necessary.

I'm afraid this is necessary -- at least for S48. If you omit that
declaration,
,expand (defile (let ((foo 2)) (list (mm) foo)))
yields
'((#(>> #(>> #(>> let let 1673) #{Generated #{Generated doit 1635}
1650} 1654) #{Generated #{Generated #{Generated lambda-native 1635}
1650} 1654} 1674) (foo) (list foo foo)) 2)

However, with the unmodified defile as given in the article,
,expand (defile (let ((foo 2)) (list (mm) foo)))
prints
'((lambda (foo) (list foo foo)) 2)

a neat, albeit perverted result.

Still, your point is valid: defile does use referential transparency.
That's the reason lambda-native redefinition works at all. It's
rather ironic that an attack on the referential transparency itself
relies on some form of the referential transparency.

Al Petrofsky

unread,

Mar 31, 2002, 10:15:43 PM3/31/02

to ol...@pobox.com

ol...@pobox.com (ol...@pobox.com) writes:
> Al Petrofsky <a...@petrofsky.org> wrote in message news:<878z8c6...@radish.petrofsky.org>...
> > > (define-syntax extract ...
> > > ((_ d #(x . y) tail cont) ...
> > > All the examples run with Bigloo 2.4b interpreter and the compiler.
> > Does this bogus dotted vector really work in Bigloo?
>
> I'm sorry to say that I never tested that case. I have never come
> across a practical example of vector patterns.

The very practical example, for which vector patterns were added to
the spec between r4rs and r5rs, is quasiquote.

> SCM and S48 do take correct vector patterns, e.g., #(a b), but they
> fail to match against them.

It works for me in scm 5d2:

(require 'macro)
(let-syntax ((m (syntax-rules () ((_ #(a b)) (+ a b)))))
(m #(1 2)))
=> 3

It's possible S48 implements the r4rs spec, in which case vector
patterns are matched according to equal?, and the following example
will still work:

(let-syntax ((m (syntax-rules () ((_ #(a b)) (+ 1 2)))))
(m #(a b)))
=> 3

> > > This fragment does what it looks like: it captures the native lambda,
> > > which is needed to effect bindings.
> > This is not actually necessary.
>
> I'm afraid this is necessary -- at least for S48. If you omit that
> declaration,
> ,expand (defile (let ((foo 2)) (list (mm) foo)))
> yields
> '((#(>> #(>> #(>> let let 1673) #{Generated #{Generated doit 1635}
> 1650} 1654) #{Generated #{Generated #{Generated lambda-native 1635}
> 1650} 1654} 1674) (foo) (list foo foo)) 2)

From the presence of lambda-native in the output, I presume you missed
the second half of my prescription:

> > You can drop this clause from the letrec-syntax bindings and
> > replace the one call to lambda-native with a call to lambda.

-al

Joe Marshall

unread,

Mar 29, 2002, 7:06:00 AM3/29/02

to

"Thomas Bushnell, BSG" <tb+u...@becket.net> wrote in message
news:87663gy...@becket.becket.net...

> ol...@pobox.com (ol...@pobox.com) writes:
>
> > If a Scheme system with defiled lambda and other binding forms
> > nevertheless complies with R5RS, it casts doubt on the assertion that
> > R5RS macros are referentially transparent.
>
> Hrm, I'm inclined to a holistic view of the standard. Such defiled
> lambdas, because they allow the violation of referential transparency,
> are not actually standard compliant.

Since they are implicitly allowed by the standard as is, then
they are standard compliant. So either the standard is incorrect
if it asserts that R5RS macros cannot be made ``referentially opaque''
(Oleg's view) or the standard is not standard compliant (Thomas
Bushnell, BSG's view). I think you are throwing out the baby
with the bathwater.

Whatever the case, I'm pleased that one can subvert the
hygiene of the pattern matching. This means that you do not
need to refer to an `underlying implementation' for those
cases where you don't want hygiene.

Now if someone could show how to go from non-hygienic
substitution to where you could execute arbitrary code
during macro expansion (ala DEFMACRO), then an
underlying implementation would not be necessary at all.

~jrm

ol...@pobox.com

unread,

Apr 1, 2002, 4:02:29 PM4/1/02

to

Al Petrofsky <a...@petrofsky.org> wrote in message news:<871ye36...@radish.petrofsky.org>...

> The r5rs 4.3 discussion of the referential transparency of
> syntax-rules transformers states:
> * If a macro transformer inserts a free reference to an identifier,
> the reference refers to the binding that was visible where the
> transformer was specified, regardless of any local bindings that
> may surround the use of the macro.

This paragraph does apply as it is to the defiled macros. In the code,

(define foo 1)

(defile
(let ((foo 2)) (list (mm) foo))
)

the identifier 'foo' inserted in the expansion of the macro mm indeed
refers to the binding of 'foo' that was visible when macro 'mm' was
defined. The twist is that the definition of the macro 'mm' happened
right after the local binding of foo.

The macro 'defile' is a R5RS macro, in full compliance with R5RS.
The 'defile' macro shadows syntactic bindings and binds 'lambda',
'let', 'let*' in a local scope. That is not prohibited by R5RS.
R5RS states that there are no reserved keywords, and syntactic
bindings may shadow variable bindings and other syntactic bindings.

However, the result of the 'defile' expansion is observationally
equivalent to the expansion of a non-hygienic macro and is apparently
referentially opaque. That's the paradox: I go on following the rules
and end up seemingly contradicting them.

On one hand, I should not be surprised. The implementations of
lambda-calculator and of the Turing machine as R5RS macros prove that
the syntax-rules macros are Turing complete. Therefore, it is
potentially possible to use R5RS macros to build another macro system
that lacks the referential transparency guarantees.

This explanation leaves some sense of dissatisfaction. If I look at an
isolated peace of code, e.g.,

(let ((foo 2)) (list (mm) foo))

and know that macro mm expands into a reference to a variable foo
(which is also globally defined), I still can't tell which binding of
variable 'foo' will be captured. I cannot apply the referentially
transparency guarantee of R5RS because I don't know off hand if 'let'
or 'lambda' have been re-defined in _particular_ ways. To be sure
about the binding of 'foo', I have to analyze all the program up to
that point, check out definitions and re-definitions of all syntactic
keywords. Essentially I have to macro-expand a form in order to make
a judgment about its identifier bindings. The macro-expansion process
however, being Turing complete, may go on for a long time and may even
fail to terminate. This means that many simple judgments about
identifier bindings cannot be decided upon within bounded or
exponential etc. time -- cannot be mechanically decided, in general.

This conclusion poses another question. The R5RS macro system, being
severely constrained in expressiveness, is still powerful enough to
implement its own "negation", so to speak. If this behavior is deemed
acceptable, what is the reason then to keep the constraints? Why not
to add guards and other syntactic sugar of far more pleasant
syntax-case macros? If we have nothing to lose, we can at least enjoy
it.

David Rush

unread,

Apr 3, 2002, 6:24:24 AM4/3/02

to

ol...@pobox.com (ol...@pobox.com) writes:
> On one hand, I should not be surprised. The implementations of
> lambda-calculator and of the Turing machine as R5RS macros prove that
> the syntax-rules macros are Turing complete. Therefore, it is
> potentially possible to use R5RS macros to build another macro system
> that lacks the referential transparency guarantees.

Yes. Isn't it amazing how hard it is to avoid Turing-completeness?
Your demonstration of macro-CPS raises some significant issues
w/rt the status of syntax-rules macros in R5RS.

> The R5RS macro system, being
> severely constrained in expressiveness, is still powerful enough to
> implement its own "negation", so to speak. If this behavior is deemed
> acceptable, what is the reason then to keep the constraints? Why not
> to add guards and other syntactic sugar of far more pleasant
> syntax-case macros? If we have nothing to lose, we can at least enjoy
> it.

As far as I'm concerned, all that's needed is for someone to turn the
syntax-case specification into a SRFI. Then, if there ever is an R6RS,
it can leave the macro system specification in the SRFI world, where
it belongs.

david rush
--
Censorship may be useful for preservation of morality, but can never
be so for its restoration.
-- Jean-Jacques Rousseau