Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

syntax-rules

1,271 views
Skip to first unread message

Golubev I. N.

unread,
Feb 28, 2002, 6:51:44 AM2/28/02
to
Widely known descriptions of `syntax-rules' syntax in r5rs and "The
Scheme Programming Language, 2nd Edition" seem unsufficient to me.
They do not tell explicitly just how and when syntax expander
determines that "literal identifier is inserted as a bound
identitifer" so that it must be "renamed to prevent inadvertent
captures of free identifiers".

For example, how r5rs ensures that when `letrec' is defined as in
r5rs,

(letrec ((a init1) (b init2)) body ...)

is expanded to `let' with 2 additional *distinct* temporary variables,
not into form including erroneous

(let ((newtemp init1) (newtemp init2)) ...)

And how does it ensure that even when `newtemp' is defined in scope of
`letrec' definition, it does not get into `letrec' expansion unless it
is present in input form?

Please reply by mail.

Al Petrofsky

unread,
Mar 3, 2002, 8:03:45 PM3/3/02
to Golubev I. N.
g...@mo.msk.ru (Golubev I. N.) writes:

> Widely known descriptions of `syntax-rules' syntax in r5rs and "The
> Scheme Programming Language, 2nd Edition" seem unsufficient to me.
> They do not tell explicitly just how and when syntax expander
> determines that "literal identifier is inserted as a bound
> identitifer" so that it must be "renamed to prevent inadvertent
> captures of free identifiers".

That's because it's a difficult thing to explicitly tell.
Syntax-rules is quite easy to use and sufficiently powerful for many
tasks without a deep understanding of its exact semantics. It is
intended to more often create a correct macro from the input of a
naive user than does a traditional lisp macro system. If you have
read r5rs and understand it well enough to write a cond macro
(including =>), and you're too foolhardy to realize that you don't
really want to know enough detail about syntax-rules to truly
understand the letrec macro, then read on...

======================================================================

The r5rs macro system documentation is hard to fully understand, and
is in some aspects incomplete. Compared to the other features in
r5rs, the macros were based on more recent research, and were added to
the report at a point when the rnrs revision process was stalled. I
think perhaps the only people who read and deeply understood the
syntax-rules specification and contributed to its revision were people
who had experience in writing a lexically-scoped macro system and were
so familiar with certain aspects of such systems that they failed to
notice that these aspects weren't adequately defined in the
specification. At the other extreme, some of the issues with macro
uses that expand into macro transformers were not yet fully understood
by anyone, and this also led to deficiencies in the specification.

As a result, I think it is virtually impossible to determine the exact
intended semantics solely from r5rs. So the short answer to your
questions is: instead of trying to figure out why it follows from the
syntax-rules specification that the letrec macro works, try to figure
out what the syntax-rules specification must have been trying to say,
given that it must have intended to specify a semantics under which
the letrec macro works.

Below I will try to address some of the weaknesses of the spec. This
article might not be any more accessible than r5rs, but I hope to
illuminate some of the trickier aspects of lexically-scoped macros
that r5rs barely addresses.

First note that the definition of a "literal identifier" in a pattern
or template is "any identifier in the pattern or template that is not
a pattern variable or ellipsis". However, the semantics of literal
identifiers in the templates are unrelated to the semantics of literal
identifiers in the patterns.

Here is the key paragraph in R5RS:

Identifiers that appear in the template but are not pattern
variables or the identifier ... are inserted into the output as
literal identifiers. If a literal identifier is inserted as a free
identifier then it refers to the binding of that identifier within
whose scope the instance of syntax-rules appears. If a literal
identifier is inserted as a bound identifier then it is in effect


renamed to prevent inadvertent captures of free identifiers.

It doesn't really make sense to say "is inserted as a free identifier"
or "is inserted as a bound identifier", because when the identifier is
inserted, it is neither free nor bound. In fact, it may be
subsequently replicated, and in the final resulting expression, some
of the copies could be free uses, some bound uses, and still other
copies could become bindings (which are distinct from bound uses) and
literals. For example:

(let ((a 1))
(letrec-syntax
((foo (syntax-rules ()
((_ b)
(bar a b))))
(bar (syntax-rules ()
((_ c d)
(cons c (let ((c 3))
(list d c 'c)))))))
(let ((a 2))
(foo a))))

=> (1 2 3 a)

When the use of foo is transcribed, the a in the template (bar a b) is
inserted into the output as a literal identifier. When the output is
then transcribed according to bar, this identifier is replicated and
used to replace the four occurrences of the pattern variable c in the
template (cons c (let ((c 3)) (list d c 'c))). The completely
macro-expanded version of the expression looks like this:

(let ((a 1))
(let ((a 2))
(cons a (let ((a 3))
(list a a 'a)))))
=> (1 2 3 a)

Why doesn't this evaluate to (2 3 3 a)? Because macros introduce new
aspects to lexical scoping, such that the meaning of an identifer in a
macro-expanded expression depends not only on the name of the
identifier used and on which bindings' regions enclose it, but also on
when the identifier was inserted, and on which bindings' regions
enclose the macro that did the insertion. Specifically, two
identifiers are only considered to be entirely equivalent if they have
the same name, and either they were both part of the original program
text, or they were inserted by the same template transcription and the
corresponding literal identifiers in the template were entirely
equivalent. If an identifier use is in the region of one or more
bindings for entirely equivalent identifiers, then the identifier
refers to the innermost such binding. If there is no such binding,
then an identifier use refers to whatever binding is referred to by
the corresponding literal identifier in the template whose
transcription inserted it. An identifier use is an error if no such
binding exists. When we get to macro uses that expand into macro
transformers, it will become important that these rules are recursive.

Note that I said "inserted by the same template transcription" rather
than "inserted by the same template": each time a template that
contains a literal identifier is transcribed, it inserts a new
distinct copy of that identifier.

Now let's walk through the expansion of that example, and indicate the
uniqueness of a macro-inserted identifier by suffixing its name with
#n, where n is the serial number of the template transcription that
inserted the identifier. This will make it easy to determine if two
identifiers are exactly equivalent.

(let ((a 1))
(letrec-syntax
((foo (syntax-rules ()
((_ b)
(bar a b))))
(bar (syntax-rules ()
((_ c d)
(cons c (let ((c 3))
(list d c (quote c))))))))
(let ((a 2))
(foo a))))

The use of foo at the bottom is in the region of the letrec-syntax
binding for foo, and the foo identifiers exactly match (neither was
inserted by a template transcription), so the foo transformer is
applied, the pattern variable b is matched to a, and the template (bar
a b) is transcribed: the pattern variable b is replaced with the part
of the input that matched it, a, and the literal identifiers bar and a
are replaced by freshly-inserted identifiers with the same names but
new insertion serial numbers. That gives us (bar#1 a#1 a). The whole
expression is now:

(let ((a 1))
(letrec-syntax
((foo (syntax-rules ()
((_ b)
(bar a b))))
(bar (syntax-rules ()
((_ c d)
(cons c (let ((c 3))
(list d c 'c)))))))
(let ((a 2))
(bar#1 a#1 a))))

The expression in the last line has a use of bar#1 that is not in the
region of any binding for bar#1. We therefore go to the template that
inserted bar#1 (i.e. the template used in transcription serial number
one), which is (bar a b), and look for a binding for bar whose region
encloses the template. We find the letrec-syntax binding for bar, so
that transformer is applied, the pattern variables c and d are matched
to a#1 and a respectively, and the template is transcribed:

(let ((a 1))
(letrec-syntax
((foo (syntax-rules ()
((_ b)
(bar a b))))
(bar (syntax-rules ()
((_ c d)
(cons c (let ((c 3))
(list d c 'c)))))))
(let ((a 2))
(cons#2 a#1 (let#2 ((a#1 3))
(list#2 a a#1 (quote#2 a#1)))))))

Now we have a use of cons#2 with no binding of cons#2. We therefore
go to the template of transcription number two, and from there the
nearest binding of cons is the ordinary top-level one. Similarly,
let#2, list#2, and quote#2 all refer to ordinary let, list, and quote.
(We'll assume that let is a builtin rather than a macro.) The use of
a#1 as the first argument to cons is not in the region of any binding
for a#1, so we go to the template of transcription number one, which
was (bar a b), and look for a binding for a whose region encloses the
template. Notice that the program contains two bindings for a, and
the cons expression is in the regions of both of them, but the
template is only in the region of the outer one, so that is the one
that is used, and its value, 1, is passed as the first argument to
cons. The use of a in the last line is inside the regions of two
bindings for a, and the innermost one holds the value 2, so 2 is
passed as the first argument to list. The a#1 refers to the only
binding for a#1, which contains the value 3. The quoted identifier
a#1 generates the symbol a, because the #1 is not part of the name of
the identifier, it is just extra information about the identifier that
we happen to be writing down next to its name.


Are you still with me? Anyone? Anyone?


Okay, now we get to the tricky part: macro uses that expand into macro
transformers. This means that some identifiers will have been
inserted by templates that were inserted by other templates, and the
rules for finding their bindings will need to be applied recursively.
Consider:

(let ((x 1))
(let-syntax
((foo (syntax-rules ()
((_ y) (let-syntax
((bar (syntax-rules ()
((_) (let ((x 2)) y)))))
(bar))))))
(foo x)))

The first step is straghtforward:

(let ((x 1))
(let-syntax
((foo (syntax-rules ()
((_ y) (let-syntax
((bar (syntax-rules ()
((_) (let ((x 2)) y)))))
(bar))))))
(let-syntax#1
((bar#1 (syntax-rules#1 ()
((_#1) (let#1 ((x#1 2)) x)))))
(bar#1))))

There is no binding for let-syntax#1, so we look for a binding of
let-syntax at the site of the template of the first macro
transcription. The nearest (and only) binding is the top-level
builtin. Next we expand the bar#1 macro use. That's interesting
because we have a template that contains two non-equivalent literal
identifiers named x: plain x and x#1.

(let ((x 1))
(let-syntax
((foo (syntax-rules ()
((_ y) (let-syntax
((bar (syntax-rules ()
((_) (let ((x 2)) y)))))
(bar))))))
(let-syntax#1
((bar#1 (syntax-rules#1 ()
((_#1) (let#1 ((x#1 2)) x)))))
(let#1#2 ((x#1#2 2)) x#2))))

Now it takes two steps to resolve the identifier let#1#2 back to the
ordinary top-level let. (We first look for a binding of let#1#2
starting from its use, then for a binding of let#1 starting from the
template used in macro transcription number two, and finally we look
for a binding of let starting from the template used in macro
transcription number one.) Lastly, we look for a binding for the x#2
at the end of the expression. There is none for x#2, so we go to the
template of transcription number two (the bar#1 template) and look for
a binding for x. We find the binding with value 1, and that is the
value of this whole expression.

Some scheme implementations get this example wrong and return 2
because they compare identifiers only by comparing their names and the
times of their final insertion, so the binding for x#1#2 captures the
use of x#2. This is wrong because x#1 and x are clearly distinct, and
there is no justification for a second "renaming" to cause them to
become equal again, when the whole purpose of identifiers being "in
effect renamed" is "to prevent inadvertent captures".


Here's another problem case:

(let ((x 1))
(let-syntax
((foo (syntax-rules ()
((_ y) (let-syntax
((bar (syntax-rules ()
((_ x) y))))
(bar 2))))))
(foo x)))

After transcribing the (foo x) macro use, we get:

(let ((x 1))
(let-syntax
((foo (syntax-rules ()
((_ y) (let-syntax
((bar (syntax-rules ()
((_ x) y))))
(bar 2))))))
(let-syntax#1
((bar#1 (syntax-rules#1 ()
((_#1 x#1) x))))
(bar#1 2))))

Now we have to answer this question: in the rule ((_#1 x#1) x), is the
x in the template a pattern variable, or a literal identifier? I
believe it should be considered distinct from the x#1 in the pattern,
and that it is therefore a literal identifier (and the value of the
above expression is 2), but r5rs is unclear about this, because it
says that renaming applies only to "a binding for an identifier
(variable or keyword)" (r5rs 4.3). Possibly the word "variable" here
was meant to include "pattern variable", but nowhere else does the
report use the word "binding" to refer to a pattern variable binding.


The biggest shortcoming of r5rs syntax-rules is the inability to use
an ellipsis (the identifier "...") in a syntax-rules form inside of a
template, because the outer syntax-rules insists on treating it
specially. The amusing thing about this is that it is essentially a
form of unhygiene, and could easily have been avoided by having the
identifier for the ellipsis be passed in as a parameter, rather than
implicitly reserving "...". So the syntax-rules design loses as a
result of failing to observe the very sort of constraint that it
imposes on anyone trying to write a block-with-implicitly-bound-break
macro.


Lest I sound like I am a detractor, let me say that despite the flaws
and the hairiness when you look too closely, r5rs macros were an
amazing leap forward for the writing of simple macros, and the great
majority of macros are simple. They're also a leap forward in
providing a painful but powerful way for writing complex macros,
without the scheme system having to deal with the anomalies of a
"low-level" macro system.


For further reading, you may wish to find some of the papers
referenced in r5rs. Look at library.readscheme.org for some of these,
and for some more recent papers. In particular, read the papers about
syntax-case and the chez scheme module system to learn about the most
significant advances in scheme syntax since syntax-rules. There are
also several good bits of macro knowledge in the scheme faq at
schemers.org. To fully come to grips with the tricky details, there
is no substitute for writing a simple implementation of eval,
null-environment, and scheme-report-environment in scheme.

If this article actually leaves you wanting more of the same, you
could try doing a search for petrofsky or oleg and "syntax-rules" on
groups.google.com, which will retrieve many simple and complex
examples, and a few explorations so mind-numbingly esoteric that they
are must-haves in any "Syntax-Rules Syllabus for the Criminally
Insane".

-al

Golubev I. N.

unread,
Mar 4, 2002, 2:40:14 PM3/4/02
to
Thanks a lot.

> Syntax-rules is sufficiently powerful for many tasks without a deep


> understanding of its exact semantics.

`letrec' is not such a task.

> If you're too foolhardy to realize that you don't


> really want to know enough detail about syntax-rules to truly
> understand the letrec macro,

I am not. But I wanted to corroborate my suspicions that one has to
be.

> I think perhaps the only people who read and deeply understood the
> syntax-rules specification and contributed to its revision were
> people who had experience in writing a lexically-scoped macro system
> and were so familiar with certain aspects of such systems that they
> failed to notice that these aspects weren't adequately defined in
> the specification.

I feel the same.

> despite the flaws and the hairiness when you look too closely, r5rs
> macros were an amazing leap forward for the writing of simple
> macros

Agree.

> In particular, read the papers about
> syntax-case and the chez scheme module system to learn about the most
> significant advances in scheme syntax since syntax-rules.

I decided to do that eventually, at least for "Syntactic Abstraction
in Scheme" '93, if I would find nothing better. After all, it
describes design of system which many syntax-rules implementations are
based on (via portable syntax-case implementation).

> There are
> also several good bits of macro knowledge in the scheme faq at
> schemers.org.

I have already checked them out. Having to clarify what "followed by
n instances of `...'" really means (which is done there) is another
good example of things that are missing in specification but necessary
to understand it. Amusingly, despite all this the same faq calls r5rs
"high-quality".

Antti Karttunen

unread,
Mar 4, 2002, 9:47:32 PM3/4/02
to
In article <87it8db...@radish.petrofsky.org>,
Al Petrofsky <a...@petrofsky.org> wrote:
...

>
>Are you still with me? Anyone? Anyone?

No, and I don't try to be at 4:38 am.
But I guess this is a good occassion to insert a naive question between:

How do I recreate the effect of dirty macro basque given
in defmac1.scm using only define-syntax ?
That is, I need the free reference to B inside the macro body/template
to refer to the closest lexically enclosing binding of B
(as when using define-macro in the guile example below),
not to the global definition valid at the time of defining
the macro, as when using define-syntax in the scsh example below.

I guess I will use scsh in my project, but unfortunately
it has only define-syntax, not define-macro.


looper:/home/karttu/macs(11)% guile
guile> (load "defmac1.scm")
guile> (test_it 'humpty)
(humpty . hummingbird)
guile> (exit)
looper:/home/karttu/macs(12)% scsh
Welcome to scsh 0.6.1 (Combinatorial Algorithms)
Type ,? for help.
> (load "defsyn2.scm")
defsyn2.scm
> (test_it 'humpty)
'(humpty . dumpty)
> (exit)
looper:/home/karttu/macs(13)% more defmac1.scm

(define B '((humpty . dumpty) (tilleri . talleri) (pilleri . hilleri)))

(define-macro (basque arg) `(assq ,arg B))


(define (test_it W)
(let ((B '((tilleri . tyllero) (humpty . hummingbird) (pilleri . pallero))))
(basque W)
)
)

looper:/home/karttu/macs(14)% more defsyn2.scm

(define B '((humpty . dumpty) (tilleri . talleri) (pilleri . hilleri)))

(define-syntax basque
(syntax-rules ()
((basque arg)
(assq arg B)
)
)
)


(define (test_it W)
(let ((B '((tilleri . tyllero) (humpty . hummingbird) (pilleri . pallero))))
(basque W)
)
)

looper:/home/karttu/macs(15)%


Terveisin,

Antti Karttunen

Al Petrofsky

unread,
Mar 5, 2002, 12:10:34 AM3/5/02
to
kar...@walrus.megabaud.fi (Antti Karttunen) writes:

> How do I recreate the effect of dirty macro basque given in
> defmac1.scm using only define-syntax ? That is, I need the free
> reference to B inside the macro body/template to refer to the

> closest lexically enclosing binding of B [in the macro use]

You don't. There are devious ways to write unhygienic macros with
syntax-rules, but you can't break referential transparency. You'll
probably have to pass B as an argument to basque. If you provide some
detail about why you don't want to do that, then perhaps someone can
propose a solution appropriate to the situation.

> (define B '((humpty . dumpty) (tilleri . talleri) (pilleri . hilleri)))
>
> (define-syntax basque
> (syntax-rules ()
> ((basque arg)
> (assq arg B))))
>

> (define (test_it W)
> (let ((B '((tilleri . tyllero) (humpty . hummingbird) (pilleri . pallero))))
> (basque W)))

The sick and fragile solution would be to redefine let so that it
pseudo-unhygienicly binds basque to a macro that refers to the B that
let is binding.

-al

ol...@pobox.com

unread,
Mar 5, 2002, 4:46:54 PM3/5/02
to
Al Petrofsky <a...@petrofsky.org> wrote a great article in message
news:<87it8db...@radish.petrofsky.org>...

That's a very interesting and detailed article!

I'd like to mention two somewhat subtle pitfalls we have to be aware
of when writing R5RS macros. I have fallen into both these
pitfalls. Hopefully my example will save someone from doing the
same. The examples are a bit simplified for presentation; yet, the
pitfalls are very real.

The first pitfall concerns with lexical scoping and shadowing. Pattern
variables don't shadow!

Let's consider a simple function:

(define foo-f
(lambda (x)
(let ((id (lambda (x) x)))
(id (+ 1 x)))))
(display (foo-f 4)) ; prints 5
(newline)

The argument of 'foo-f' is named x; the argument of 'id' is also named
'x'. There is no confusion as the latter shadows the former. Suppose
we want to re-write foo-f as a macro (e.g., for efficiency). We can do
that literally:

(define-syntax foo-m
(syntax-rules ()
((_ x)
(let-syntax
((id
(syntax-rules ()
((_ x) x))))
(id (+ x 1))))))

(display (foo-m 4))
(newline)

oops, we've got an error:
"Use of macro does not match definition -- ((id~1 (+~1 4 1)))"

Let's try to manually expand (foo-m 4) to see what's going on. On the
first iteration, (foo-m 4) expands into
(let-syntax
((id
(syntax-rules ()
((_ 4) 4))))
(id (+ 4 1)))

Pattern variable 'x' matched the number 4. Therefore, all instances of
'x' in the template are replaced by 4. Absolutely all instances! When
the macroexpander gets to (id (+ 4 1)) it finds out that the
expression doesn't match the only given pattern (_ 4). Therefore, the
macro-expander reports an error. Getting an error is the best possible
case. In the worst case, no error is generated but the result of
the macro-expansion is just bizarre.

To fix the problem, we should always make sure that all unrelated
pattern variables have unique names:

(define-syntax foo-m-fixed
(syntax-rules ()
((_ x)
(let-syntax
((id
(syntax-rules ()
((_ y) y))))
(id (+ x 1))))))

(display (foo-m-fixed 4)) ; prints 5

Pattern variables don't shadow each other!


The second pitfall deals with passing parameters to a submacro. You
can do it through arguments, or through the "environment". Better make
this _exclusive_ OR. Do not mix the two techniques! The pitfall has
been covered in the original article by Al Petrofsky. I'll repeat it
again on a simpler example.

(define (bar-f x y)
(let ((helper
(lambda (u) (+ x u))))
(helper y)))
(display (bar-f 4 1)) ; prints 5

The body of a 'subfunction' helper sums the values of two variables: x
and u. The value of 'x' is taken from the lexical environment that
encloses the 'helper' -- namely, the lexical environment of bar-f's
body. The value of 'u' is passed to the helper as an argument. Suppose
we want to re-write bar-f as a macro:

(define-syntax bar-m
(syntax-rules ()
((_ x y)
(let-syntax
((helper
(syntax-rules ()
((_ u) (+ x u)))))
(helper y)))))

(display (bar-m 4 1)) ; prints 5

It works. Where's the pitfall then?


Let us consider a very similar macro:

(define-syntax bar-m2
(syntax-rules ()
((_ var body)
(let-syntax
((helper
(syntax-rules ()
((_ bdy) (lambda (var) bdy)))))
(helper body)))))

(display ((bar-m2 z (+ z 1)) 4))
(newline)

When we evaluate the display form we get a run-time error: "Unbound
variable -- z". Al Petrofsky explained in great detail why this
happens. To fix the problem, we have to pass the parameters to the
submacro 'helper' either through the "environment":

(define-syntax bar-m1
(syntax-rules ()
((_ var body)
(let-syntax
((helper
(syntax-rules ()
((_) (lambda (var) body)))))
(helper)))))

(display ((bar-m1 z (+ z 1)) 4)) ; prints 5

or through the arguments:

(define-syntax bar-m3
(syntax-rules ()
((_ var body)
(let-syntax
((helper
(syntax-rules ()
((_ vvar bbody) (lambda (vvar) bbody)))))
(helper var body)))))

(display ((bar-m3 z (+ z 1)) 4)) ; prints 5

Again, don't mix the ways of passing parameters to a submacro!

If you're mindful of these pitfalls and use CPS/tail-calls, you will
find that -- surprisingly -- writing R5RS macros looks a lot like
writing regular tail-recursive Scheme procedures.

David Golden

unread,
Mar 5, 2002, 5:15:37 PM3/5/02
to
ol...@pobox.com wrote:

> I'd like to mention two somewhat subtle pitfalls we have to be aware
> of when writing R5RS macros. I have fallen into both these
> pitfalls. Hopefully my example will save someone from doing the
> same. The examples are a bit simplified for presentation; yet, the
> pitfalls are very real.
>


Just out of interest, given your excellent XML framework, have you
ever wondered about the similarities, if any, between a syntax-rules
definition and an XML schema ?? i.e. the similarity of validating
an XML document against a schema, and checking whether a particular
expression would be matched by a particular syntax-rules definition ?

Or am I talking utter nonsense again?


--
Don't eat yellow snow.

ol...@pobox.com

unread,
Mar 5, 2002, 9:29:34 PM3/5/02
to
David Golden <qnivq....@bprnaserr.arg> wrote in message news:<dsbh8.6337$D6.1...@news.iol.ie>...
> ... and checking whether a particular expression would be matched by a
> particular syntax-rules definition

I believe that the problem is:
- given the definition of a syntax transformer
(define-syntax foo (syntax-rules ...))
- and a macro-application
(foo ...)
determine if the macro-application terminates and yields the expanded
code, terminates with an error, or fails to terminate. Am I right?

The similarity between R5RS macros and (tail-recursive) Scheme
procedures implies that R5RS macros might be Turing-complete. This fact
has been constructively proven by a poster on this newsgroup a year or
so ago. If I'm right about the formulation of your problem, it is
equivalent then to the Halting problem...

BTW, I read that XSLT is also shown to be Turing-complete. Therefore,
it is just as difficult to decide if an application of a stylesheet to
an XML document will yield anything....

Antti Karttunen

unread,
Mar 6, 2002, 9:00:00 AM3/6/02
to
In article <f73a7491.02030...@posting.google.com>,
RT Happe <rth...@web.de> wrote:
>Michael Sperber [Mr. Preprocessor] writes:
>> >>>>> "Brian" == Brian D Carlstrom <b...@zurich.ai.mit.edu> writes:
>
>> Brian> define-syntax does not have to be used with syntax-rules. You can
>> Brian> provide a functon to do explicit s-exp based conversion.
>
>> There's documentation at
>>
>> ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/prop/exrename.ps.gz
>>
>> which applies to Scheme 48/scsh as well, with the exception of the
>> TRANSFORMER keywords, which is omitted here.
>
>There are three simple examples in scsh-0.6.1/scheme/misc/doodl.scm
>(the macros SETTER, DEFINE-SETTER, and BIND). Basically, you rename
>rather than gensym, and compare names with the dedicated comparison
>function rather than with Scheme's EQ? or something.
>
> (define-syntax foo
> ;; transforms EXP --> EXP'
> (lambda (exp rename eq?) ...))
>
>rthappe

Thanks everybody, now I have advanced a bit.

% scsh
Welcome to scsh 0.6.1 (Combinatorial Algorithms)
Type ,? for help.

> ,config ,load convform.scm
convform.scm
> ,open convform
Load structure convform (y/n)? y
[convform[((for-syntax #f))]
]
> (define-convform list 1 2 3)
(#{Name define-convform} list 1 2 3)
'(1 2 3)
>

where the contents of convform.scm is:

(define-structure convform
(export (define-convform :syntax))
(open scheme)
(for-syntax
(open scheme)
(begin
(define (define-convform-expander arg1 arg2 arg3)
(write arg1) (newline) ; What we have eaten?
(cdr arg1) ; Just something now, and avoid infinite recursion.
)
)
)
(begin (define-syntax define-convform define-convform-expander))
)

Now, the define-structure stuff above is written solely by the
trial-and-error method, as I couldn't find any kind of documentation
about the (for-syntax <clause>) part of define-structure -form,
neither from www.google.com nor groups.google.com.


Next I would like to know whether a macro defined
in such way as a transformer function can define more
similar dirty macros by calling define-syntax
at its expansion time? And if so, should those invocations
(of macro define-convform) occur inside some of the
structure- (i.e. Scheme48/scsh-module) -definitions?

Or should I use guile-scsh instead, with presumably has
the guile's old-fashioned define-macros?

What I have in mind is a simple language for defining
sequential conversion scripts that execute a set
of predefined conversion forms, which have been defined
like:

(define-convform (FETCH_LATEST_FROM_XYZZY (O output))
(run (lynx -source ,(assq XYZZY_URL *DEFS*)) (> ,output))
)


(define-convform (NUKE_CRS (I infile) (O outfile))
(run (tr -d '\015') (< ,infile) (> ,outfile))
)


and that macro define-convform would do two things:

A) define a function

(define (NUKE_CRS_fun *STEP* *DEFS* *IO_BINDINGS* infile outfile)
(let ((infile (assq infile *IO_BINDINGS*))
(outfile (some_code_for_creating_new_temporary_file_name
outfile and_destructively_adding_it_to *IO_BINDINGS*))
)
(some_catcher_and_thrower_for_possibly_ruined_execution_here?
(run (tr -d `\015`) (< ,infile) (> ,outfile))
)
)
)

B) define a macro called NUKE_CRS

which would expand something like (NUKE_CRS (I file1) (O file2))
to call: (NUKE_CRS_fun *STEP* *DEFS* *IO_BINDINGS* 'file1 'file2)
and these all invocations would occur
inside some begin-form or such with *DEFS* and *IO_BINDINGS*
bound as appropriate association-lists.

So the resulting "scripts" look a bit like definitions of
a Prolog-clause with many subclauses, with their instantiated (here input
files marked with (I ...)) and uninstantiated (here output
files marked with (O ...)) variables, and the *IO_BINDINGS*
list keeping track which filenames are already instantiated
to temporary file names like tmp20020306.1, tmp20020306.2, etc.
But it also feels a bit like an assembly program
MOV #item,@dest
ADD (R1)+,@dest
etc.
with similar breakpoint and single-step capabilities
allowed as the most debuggers have.

But this syntax is still in development. Now somebody
asks why I don't use the pipe-notation of scsh,
for which the reason is that there are also conversion forms
which have more than one input and output, and I especially
need the single-stepping facility for debugging and all
the intermediate results of conversions saved (as files)
in case something goes bad in the later steps.


Yours,

Antti Karttunen

Michael Sperber [Mr. Preprocessor]

unread,
Mar 6, 2002, 9:25:15 AM3/6/02
to Antti Karttunen
>>>>> "Antti" == Antti Karttunen <kar...@walrus.megabaud.fi> writes:

Antti> Now, the define-structure stuff above is written solely by the
Antti> trial-and-error method, as I couldn't find any kind of documentation
Antti> about the (for-syntax <clause>) part of define-structure -form,
Antti> neither from www.google.com nor groups.google.com.

Look in the Scheme 48 manual shipped with scsh 0.6.1, near the end of
Section 4.8 in the chapter on Modules. Admittely, it's perhaps not
that easy to find.

--
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

Rolf-Thomas Happe

unread,
Mar 6, 2002, 11:03:37 AM3/6/02
to
Antti Karttunen writes:
> Next I would like to know whether a macro defined
> in such way as a transformer function can define more
> similar dirty macros by calling define-syntax
> at its expansion time? And if so, should those invocations
> (of macro define-convform) occur inside some of the
> structure- (i.e. Scheme48/scsh-module) -definitions?

You may find the little Doedel package listed at
http://www.scsh.net/resources/markup.html
moderately helpful, in particular the config file
doedules.scm. You'll find low-level macros, some using
helper functions, macros defining macros, plus the
corresponding structure definitions. (The package
contains also an experimental trashy part misusing
SYNTAX-RULES to extend rather than to define syntax.)



> Or should I use guile-scsh instead, with presumably has
> the guile's old-fashioned define-macros?

The hair you notice comes with Scheme 48's refined module
system. Macros and modules don't team up easily, by their
very nature: both macro expanders and the expanded code
may refer to objects by name, but the module system
restricts visibility and forces you to make (part of)
the structure of the program explicit. That is to say,
I wouldn't give up on structured programming and stick
with s48/scsh.

rthappe


David Golden

unread,
Mar 6, 2002, 4:49:06 PM3/6/02
to
ol...@pobox.com wrote:

> determine if the macro-application terminates and yields the expanded
> code, terminates with an error, or fails to terminate. Am I right?

Almost certainly :-) .

> If I'm right about the formulation of your problem, it is
> equivalent then to the Halting problem...

Eep.

I was simply wondering whether parts of the XML Schema spec could be
considered in some sense analagous to syntax-rules, but that puts
it in a whole new light anyway.

That then raises a question in my tiny mind - how easy is it to write an
XML Schema with which it is undecidable whether a given document is valid
against that schema ?

> BTW, I read that XSLT is also shown to be Turing-complete. Therefore,
> it is just as difficult to decide if an application of a stylesheet to
> an XML document will yield anything....

Cool, that's quite twisted....

joe_english

unread,
Mar 6, 2002, 7:10:38 PM3/6/02
to
David Golden wrote:

>That then raises a question in my tiny mind - how easy is it to write an
>XML Schema with which it is undecidable whether a given document is valid
>against that schema ?

I won't swear to this, since reading the W3C XML Schema
spec makes my head hurt, but there is a decision procedure
for W3C XML Schema validity. The REC is based on (a restricted
subset of) tree-regular grammars, plus a whole bunch of other
stuff piled on top. If nothing in the pile of other stuff
is Turing-complete (I suspect not), then W3C XML Schema validators
always halt.


--Joe English

Jim Bender

unread,
Mar 6, 2002, 8:46:26 PM3/6/02
to
On a slightly tangential point:

I have thought that the more syntax-case like "match" being used in Kent
Dybvig and Dan Friedman's classes at Indiana (see
http://www.cs.indiana.edu/l/www/classes/c311/res/index.html) would be a
*very* interesting mechanism to transform XML in Scheme.

My bibliography sites (http://readscheme.org) heavily use XML, which is then
transformed into HTML. I want to be able to reformulate these hand-written
(and more procedural) transformations into something like their "match".
Perhaps by the summertime this will happen.

For those who are interested, the Scheme bibliography is based on a pair of
XML documents: an XML version of a bibtex Scheme bibliography, and a
separate XML file which describes the "structure" of the site. These are
available (at least temporarily) at:
- http://celtic.benderweb.net/xml/scheme_biblio.sbib.xml
- http://celtic.benderweb.net/xml/scheme-bib.site.xml

Jim Bender
http://library.readscheme.org -- the Bibliography of Scheme-related Research
http://haskell.readscheme.org -- the Online Bibliography of Haskell Research
http://readscheme.org

"David Golden" <qnivq....@bprnaserr.arg> wrote in message

news:m9wh8.6414$D6.1...@news.iol.ie...

MJ Ray

unread,
Mar 7, 2002, 6:49:14 AM3/7/02
to
David Golden <qnivq....@bprnaserr.arg> wrote:
>> BTW, I read that XSLT is also shown to be Turing-complete. Therefore,
>> it is just as difficult to decide if an application of a stylesheet to
>> an XML document will yield anything....
> Cool, that's quite twisted....

Sorry if this was already posted, but http://www.unidex.com/turing/

--
MJR ,----------------------------------------------------
| Q. Do you need a net-based application developing,
| or advice and training about web technology?
| A. I suggest you try http://www.luminas.co.uk/

ol...@pobox.com

unread,
Mar 7, 2002, 9:46:47 PM3/7/02
to
kar...@walrus.megabaud.fi (Antti Karttunen) wrote in message news:<a657d0$o97$1...@walrus.megabaud.fi>...
Actually, R5RS macros are more powerful than they may appear. The
present article implements your example using only hygienic macros.

> (define-convform (NUKE_CRS (I infile) (O outfile))
> (run (tr -d '\015') (< ,infile) (> ,outfile))
> )
> and that macro define-convform would do two things:

> A) define a function

> (define (NUKE_CRS_fun *STEP* *DEFS* *IO_BINDINGS* infile outfile)
> (let ((infile (assq infile *IO_BINDINGS*))
> (outfile (some_code_for_creating_new_temporary_file_name
> outfile and_destructively_adding_it_to *IO_BINDINGS*))
> )
> (some_catcher_and_thrower_for_possibly_ruined_execution_here?
> (run (tr -d `\015`) (< ,infile) (> ,outfile))
> )
> )
> )

> B) define a macro called NUKE_CRS
> which would expand something like (NUKE_CRS (I file1) (O file2))
> to call: (NUKE_CRS_fun *STEP* *DEFS* *IO_BINDINGS* 'file1 'file2)
> and these all invocations would occur
> inside some begin-form or such with *DEFS* and *IO_BINDINGS*
> bound as appropriate association-lists.

You don't seem to need to define any macros defining macros. Most of
the things can be done with procedures.

; An assoc list of (TAG . conv-proc)
; where conv-proc is a procedure
; STEP DEFS IO-BINDINGS FILE-NAME FILE-NAME ...
; where FILE-NAME is a symbol.

(define *convforms* '())

Now,

(define-convform (NUKE_CRS (I infile) (O outfile))
(run (tr -d "\015") (< ,infile) (> ,outfile))
)

will expand into the following:

(begin
(set! *convforms*
(cons
(cons 'NUKE_CRS
(lambda (step defs io-bindings infile outfile)
(let ((file1 (locate file1 io-bindings)))
(let ((file2 (store! file2 io-bindings)))
(run (tr -d '\015') (< ,file1) (> ,file2))))))
*convforms*))
(define (NUKE_CRS file1 file2)
(cond
((assq 'NUKE_CRS *convforms*) =>
(lambda (ass)
((cdr ass) *STEP* *DEFS* *IO-BINDINGS* file1 file2)))
(else (error "can't find procedure: " 'NUKE_CRS))))
)

To be more precise, the lambda form associated with NUKE_CRS has
actually the form:


(lambda (step defs io-bindings infile outfile)
((lambda (file1 file2)
(run (tr -d '\015') (< ,file1) (> ,file2)))
(locate infile io-bindings #t)
(locate outfile io-bindings #f)))

where (locate file io-bindings input-file?) locates the file in
io-bindings (if it is an input file). Otherwise, it creates the file
and destructively modifies the bindings.

The only non-trivial part is converting


(run (tr -d '\015') (< ,infile) (> ,outfile))

into
(lambda (file1 file2)
(run (tr -d '\015') (< ,file1) (> ,file2)))
That is, renaming of variables in the body of define-convform. Well,
it all can be done with hygienic macros. Given the code in appendix,

(define-convform (NUKE_CRS (I infile) (O outfile))
(run (tr -d "\\015") (< ,infile) (> ,outfile))
)

indeed expands into:

(begin (set! *convforms*
(cons (cons 'NUKE_CRS
(lambda
(step~10~11
defs~10~11
io-bindings~10~11
infile~11
outfile~11)
((lambda
(fname~1~6~13 fname~1~5~13)


(run (tr -d "\\015")

(< ,fname~1~6~13)
(> ,fname~1~5~13)))
(locate-file io-bindings~10~11 infile~11 #t)
(locate-file io-bindings~10~11 outfile~11 #f))))
*convforms*))
(define
(NUKE_CRS fname~1~6~20 fname~1~5~20)
((lambda
(temp~21~23)
(if temp~21~23
((lambda
(ass~19~24)
((cdr ass~19~24)
*STEP*
*DEFS*
*IO-BINDINGS*
fname~1~6~20
fname~1~5~20))
temp~21~23)
(begin (error "can't find procedure: " 'NUKE_CRS))))
(assq 'NUKE_CRS *convforms*))))

Note how all macroexpander-generated names match each other in all the
right places.

Given the dummy definitions for the 'run' macro and other procedures,
(NUKE_CRS (I myinfile1) (O myofile2))

indeed runs and prints the correct result. The complete code follows:

; An assoc list of (TAG . conv-proc)
; where conv-proc is a procedure
; STEP DEFS IO-BINDINGS FILE-NAME FILE-NAME ...
; where FILE-NAME is a symbol.

(define *convforms* '())

; This is for debugging purposes only
(define-syntax run
(syntax-rules ()
((_ . forms) (begin
(display "Runnning: ") (display (quasiquote forms)) (newline)))
))

; the main dispatch macro
(define-syntax define-convform
(syntax-rules ()
((_ (_tag . file-args) _body)
(letrec-syntax
((partition-file-args ; partition into ifiles and ofiles args
(syntax-rules (I O)
((_ () ifiles ofiles all-files tag body)
(make-arglist () () (ifiles ofiles) all-files tag body))
((_ ((I file) . other-args) ifiles ofiles all-files tag body)
(partition-file-args other-args
(file . ifiles) ofiles (file . all-files)
tag body))
((_ ((O file) . other-args) ifiles ofiles all-files tag body)
(partition-file-args other-args
ifiles (file . ofiles) (file . all-files)
tag body)))
)
(make-arglist ; generate tempnames for each of the arglist
(syntax-rules ()
((_ arglist argtemps iofiles () tag body)
(gen arglist argtemps iofiles tag body))
((_ arglist argtemps iofiles (file . files) tag body)
(make-arglist (file . arglist) (fname . argtemps)
iofiles files tag body))))
(gen
(syntax-rules ()
((_ arglist argtemps iofiles tag body)
(begin
(set! *convforms*
(cons
(cons (quote tag)
(gen-body arglist argtemps iofiles body))
*convforms*))
(gen-def argtemps tag)))))
)
(partition-file-args file-args () () () _tag _body)))))

; The renamer:
; any mentioning of one of the arglist in the body is replaced with argtemp
(define-syntax gen-body
(syntax-rules ()
((_ arglist _argtemps iofiles body)
(gen-outer arglist iofiles
(let-syntax
((ren (syntax-rules ()
((_ argtemps arglist) (lambda argtemps body)))))
(ren _argtemps _argtemps))))))

; generate the outer conv procedure, which is to be associated with
; with the tag
(define-syntax gen-outer
(syntax-rules ()
((_ (file ...) (ifiles ofiles) inner-lambda)
(lambda (step defs io-bindings file ...)
(inner-lambda
(locate-file io-bindings file (?memq? file ifiles)) ...)))))

(define-syntax ?memq?
(syntax-rules ()
((_ x xs)
(letrec-syntax
((so?
(syntax-rules (x)
((_ x . others) #t)
((_) #f)
((_ y . others)
(so? . others)))))
(so? . xs)))))

(define-syntax gen-def
(syntax-rules ()
((_ (file ...) tag)
(define (tag file ...)
(cond
((assq (quote tag) *convforms*) =>
(lambda (ass)
((cdr ass) *STEP* *DEFS* *IO-BINDINGS* file ...)))
(else (error "can't find procedure: " (quote tag))))))))

(define-syntax I
(syntax-rules ()
((_ sym) (quote sym))))

(define-syntax O
(syntax-rules ()
((_ sym) (quote sym))))

; A sample definition


(define-convform (NUKE_CRS (I infile) (O outfile))
(run (tr -d "\\015") (< ,infile) (> ,outfile))
)

; The following is a set of stubs to make the example run

(define (locate-file io-bindings file input?)
(if input?
(cond
((assq file io-bindings) => cdr)
(else (error "unknown input file: " file-symb #f))))
(store! file io-bindings))

(define *STEP* #f)
(define *DEFS* '())
(define *IO-BINDINGS* '((#f . #f) ; header
(myinfile1 . "infile-1")))

(define (store! file io-bindings)
(let ((file-name (symbol->string file)))
(set-cdr! io-bindings (cons (cons file file-name) (cdr io-bindings)))
file-name))

(NUKE_CRS (I myinfile1) (O myofile2)) ; and it surely runs
; it prints: Runnning: ((tr -d \015) (< myinfile1) (> myofile2))

0 new messages