Macros?

Herbert Snorrason

Jan 29, 2006, 1:53:25 PM
to Perl6
Perl6 will have macros. Good. Cool. But, sadly, that seems to be close
to the most specific thing anyone says about the subject. There is
some further discussion in Apocalypse & Exegesis 6, but nothing in the
Synopsis.

Now, considering that macros are a language feature and that the
Synopses are presented as the language spec, I wonder if there
shouldn't be something there. Exactly what, though, I won't pretend to
know. :)
--
To show weakness is to lose;
toughness means to rule.
- "Glas und Tränen", Megaherz

Yuval Kogman

Jan 29, 2006, 2:03:33 PM
to Herbert Snorrason, Perl6
On Sun, Jan 29, 2006 at 18:53:25 +0000, Herbert Snorrason wrote:
> Perl6 will have macros. Good. Cool. But, sadly, that seems to be close
> to the most specific thing anyone says about the subject. There is
> some further discussion in Apocalypse & Exegesis 6, but nothing in the
> Synopsis.
>
> Now, considering that macros are a language feature and that the
> Synopses are presented as the language spec, I wonder if there
> shouldn't be something there. Exactly what, though, I won't pretend to
> know. :)

Basically the plan is that when an internal AST language is decided
upon, the macros will be able to get either the source code text, or
an AST.

Then the macros have to emit butchered text, or butchered ASTs.

Aside from that they are normal perl 6 subroutines, that simply get
invoked during compile time instead of during runtime.

The AST language (maybe it'll be PIL based) is not yet final, so
there's not much to say.

--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me climbs a brick wall with his fingers: neeyah!

Luke Palmer

Jan 29, 2006, 3:13:44 PM
to Herbert Snorrason, Perl6
On 1/29/06, Yuval Kogman <nothi...@woobling.org> wrote:
> Aside from that they are normal perl 6 subroutines, that simply get
> invoked during compile time instead of during runtime.

With one extra "feature". By default (my preference) or with a trait,
parameters can get passed in as ASTs instead of real values:

    macro debug ($var) {
        qq[print '$var.text() = ' ~ $var.text()]
    }
    debug($foo);
    # expands to
    print '$foo = ' ~ $foo;

We would also like a quasiquoting mechanism, so we don't have to rely on
string concatenation, and we don't have to construct parse trees by
hand. It's sort of a happy medium. But that is as yet unspecced.

Luke
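
For readers more at home in Python, the debug example above can be approximated by a function that receives its argument as source text and returns new source text. This is only a sketch of the idea, not Perl 6 semantics: `debug_macro` is a hypothetical name, and Python's `exec` stands in for the compiler reparsing the expansion.

```python
import ast

def debug_macro(arg_source: str) -> str:
    """Hypothetical 'macro': takes the argument's source text and
    returns new source text to be parsed in its place."""
    # Reject anything that is not a single well-formed expression.
    ast.parse(arg_source, mode="eval")
    # Use the text twice: once quoted as a label, once as live code.
    label = repr(arg_source + " = ")
    return f"print({label} + str({arg_source}))"

expansion = debug_macro("foo")
# expansion is the string: print('foo = ' + str(foo))

# "Reparse" and run the expansion in a scope where foo exists.
exec(expansion, {"foo": 42})   # prints: foo = 42
```

The function runs before the generated code does, which mirrors the compile-time versus runtime split described above.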

Herbert Snorrason

Jan 29, 2006, 3:29:43 PM
to Perl6
On 29/01/06, Yuval Kogman <nothi...@woobling.org> wrote:
> Basically the plan is that when an internal AST language is decided
> upon, the macros will be able to get either the source code text, or
> an AST.
Two things. First, if the AST path is taken, doesn't that mean that
the AST representation has to be compatible between implementations
(assuming there'll be more than one)? Secondly, there's ease of use.
ASTs are, at least from what I've seen, pretty verbose. Aren't we
trying to make things easy for the programmer? With source text, doing
manipulations by hand can be a bother, so that's no solution either...

Maybe I'm spoiled by the idea of s-expressions, though. But I get the
impression that lispy macros are where the idea comes from...

Yuval Kogman

Jan 29, 2006, 4:15:08 PM
to Herbert Snorrason, Perl6
On Sun, Jan 29, 2006 at 20:29:43 +0000, Herbert Snorrason wrote:
> On 29/01/06, Yuval Kogman <nothi...@woobling.org> wrote:
> > Basically the plan is that when an internal AST language is decided
> > upon, the macros will be able to get either the source code text, or
> > an AST.
> Two things. First, if the AST path is taken, doesn't that mean that
> the AST representation has to be compatible between implementations
> (assuming there'll be more than one)?

Yes.

> Secondly, there's ease of use. ASTs are, at least from what I've
> seen, pretty verbose. Aren't we trying to make things easy for the
> programmer? With source text, doing manipulations by hand can be a
> bother, so that's no solution either...
>
> Maybe I'm spoiled by the idea of s-expressions, though. But I get
> the impression that lispy macros are where the idea comes from...

Well, the aim is to get something as nice as lisp macros. Hopefully
the AST will be easy enough to chew with the tools provided in the
language.

Remember, however, that this is not a parse tree, and is thus
somewhat simpler.

BTW, do we also support parse trees?

--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me sushi-spin-kicks : neeyah!!!!!!!!!!!!!!!!!!!!

Larry Wall

Feb 2, 2006, 6:29:02 PM
to Perl6

S06 now sez:

+=head2 Macros

+Macros are functions or operators that are called by the compiler as
+soon as their arguments are parsed (if not sooner). The syntactic
+effect of a macro declaration or importation is always lexically scoped,
+even if the name of the macro is visible elsewhere. As with ordinary operators,
+macros may be classified by their grammatical category. For a given grammatical
+category, a default parsing rule or set of rules is used, but those rules
+that have not yet been "used" by the time the macro keyword or token is seen
+can be replaced by use of "is parsed" trait. (This means, for instance, that
+an infix operator can change the parse rules for its right operand but not its
+left operand.)
+
+A macro is called as if it were a method on the current match object returned
+from the grammar rule being reduced.
+
+Macros may return either a string to be reparsed, or a syntax tree
+that needs no further parsing. The textual form is handy, but the
+syntax tree form is generally preferred because it allows the parser
+and debugger to give better error messages. Textual substitution
+on the other hand tends to yield error messages that are opaque to
+the user. Syntax trees are also better in general because they are
+reversible, so things like syntax highlighters can get back to the
+original language and know which parts of the derived program come
+from which parts of the user's view of the program.
+
+In aid of returning syntax tree, Perl provides a "quasiquoting" mechanism
+using the keyword "code", followed by a block intended to represent an AST:
+
+ return code { say $a };
+
+[Conjecture: Other keywords are possible if we have more than one AST type.]
+
+Within a quasiquote, variable and function names resolve first of all
+according to the lexical scope of the macro definition, and if unrecognized in
+that scope, are assumed to be bound from the scope of the macro call
+each time it is called. If they cannot be bound from the scope of
+the macro call, a compile-time exception is thrown.
+
+Variables that resolve from the lexical scope of the macro definition
+will be inserted appropriately depending on the type of the variable,
+which may be either a syntax tree or a string. (Again, syntax tree
+is preferred.) The case is similar to that of a macro called from
+within the quasiquote, insofar as reparsing only happens with the
+string version of interpolation, except that such a reparse happens at
+macro call time rather than macro definition time, so its result cannot
+change the parser's expectations about what follows the interpolated variable.
+
+Hence, while the quasiquote itself is being parsed, the syntactic
+interpolation of a variable into the quasiquote always results in
+the expectation of an operator following the variable. (You must
+use a call to a submacro if you want to expect something else.)
+Of course, the macro definition as a whole can expect whatever it
+likes afterwards, according to its syntactic category. (Generally,
+a term expects a following postfix or infix operator, and an operator
+expects a following term or prefix operator.)
+
+In case of name ambiguity, prefix with C<COMPILING::> to indicate a name in
+the compiling scope, and C<OUTER::> to indicate a name in the macro definition's
+scope.
+
+[Conjecture: Due to these dwimmy scoping rules, there is no need of
+a special "unquote" construct as in Scheme et al.]

Larry
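
The two return forms in the S06 text above (a string to be reparsed, or a syntax tree that needs no further parsing) can be illustrated with Python's `ast` module. This is an analogy only: `say_macro_text`, `say_macro_tree` and `expand` are hypothetical names, and Python's AST is not the Perl 6 one.

```python
import ast

def say_macro_text(name: str) -> str:
    # Text form: the caller must parse the result again.
    return f"print({name})"

def say_macro_tree(name: str) -> ast.Expr:
    # Tree form: the node is built directly, so no reparse is needed.
    call = ast.Call(
        func=ast.Name(id="print", ctx=ast.Load()),
        args=[ast.Name(id=name, ctx=ast.Load())],
        keywords=[],
    )
    return ast.Expr(value=call)

def expand(result) -> ast.Module:
    """Accept either form and produce a compilable module."""
    if isinstance(result, str):
        return ast.parse(result)  # reparse the string
    module = ast.Module(body=[result], type_ignores=[])
    return ast.fix_missing_locations(module)

text_module = expand(say_macro_text("a"))
tree_module = expand(say_macro_tree("a"))

# Both routes give the same tree once source positions are ignored
# (ast.dump leaves positions out by default).
same = ast.dump(text_module) == ast.dump(tree_module)
```

The tree route never passes through a parser, which is why a tool can in principle map each node back to the code that produced it.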

Brad Bowman

Feb 4, 2006, 8:32:08 PM
to Perl6

Hi,

I've read and reread the macro explanation but I'm still not entirely
clear on a number of things. The questions and thoughts below are based
on my (mis)understanding.

On 03/02/06 02:05, Larry Wall wrote:
> Macros are functions or operators that are called by the compiler as
> soon as their arguments are parsed (if not sooner). The syntactic
> effect of a macro declaration or importation is always lexically
> scoped, even if the name of the macro is visible elsewhere.

And presumably they can be lexically unimported, or whatever the verb is
for what "no" does.

> As with
> ordinary operators, macros may be classified by their grammatical
> category. For a given grammatical category, a default parsing rule or
> set of rules is used, but those rules that have not yet been "used"
> by the time the macro keyword or token is seen can be replaced by
> use of "is parsed" trait. (This means, for instance, that an infix
> operator can change the parse rules for its right operand but not
> its left operand.)
>
> In the absence of a signature to the contrary, a macro is called as
> if it were a method on the current match object returned from the
> grammar rule being reduced; that is, all the current parse information
> is available by treating C<self> as if it were a C<$/> object.

Is this a :keepall match object?
Or is the Perl6 grammar conserving by default?
(The "Syntax trees [...] are reversible" suggests so)
Or is this one of the "signature to the contrary" possibilities?

> [Conjecture: alternate representations may be available if arguments
> are declared with particular AST types.]
>
> Macros may return either a string to be reparsed, or a syntax tree
> that needs no further parsing. The textual form is handy, but the
> syntax tree form is generally preferred because it allows the parser
> and debugger to give better error messages. Textual substitution
> on the other hand tends to yield error messages that are opaque to
> the user. Syntax trees are also better in general because they are
> reversible, so things like syntax highlighters can get back to the
> original language and know which parts of the derived program come
> from which parts of the user's view of the program.
>
> In aid of returning syntax tree, Perl provides a "quasiquoting"
> mechanism using the keyword "CODE", followed by a block intended to
> represent an AST:
>
>     return CODE { say $a };

I guess the string form is C<eval "CODE { $str }">

If CODE may enclose arbitrary source text of whatever DSL people invent,
alternate braces would probably be useful. Either q()-like, HERE-doc
or pod's C<< >> nesting style.

> [Conjecture: Other keywords are possible if we have more than one
> AST type.]

Ocaml and camlp4 are probably a good source of ideas for quasiquoting.
I've only perused the documentation, has one actually used Ocaml here?
See: http://caml.inria.fr/pub/docs/tutorial-camlp4/tutorial004.html

Rather than misrepresenting Ocaml with my sketchy understanding,
I'll just mention some possibly interesting features:

Specific expander rules from the grammar can be used, <:rulename< ... >>

They have a C -> AST expander. I can imagine a SQL -> AST expander
would find some use in Perl. I don't think the same AST type is used but
that's just a guess.

Two of the "p"s in p4 stand for pretty-printer, which is the AST->source
conversion. In addition to aiding debugging and reformatting, it allows
interconversion between different syntaxes (sp?). Ocaml comes with two
grammars, one is backwards compatible and the other has jettisoned
the baggage.

> Within a quasiquote, variable and function names resolve first of
> all according to the lexical scope of the macro definition, and if
> unrecognized in that scope, are assumed to be bound from the scope
> of the macro call each time it is called. If they cannot be bound
> from the scope of the macro call, a compile-time exception is thrown.
>
> Variables that resolve from the lexical scope of the macro definition
> will be inserted appropriately depending on the type of the variable,
> which may be either a syntax tree or a string. (Again, syntax tree
> is preferred.) The case is similar to that of a macro called from
> within the quasiquote, insofar as reparsing only happens with the
> string version of interpolation, except that such a reparse happens
> at macro call time rather than macro definition time, so its result
> cannot change the parser's expectations about what follows the
> interpolated variable.

Is there any cpp-like protection against self-referential expansions
when using the string returning form?

The last S06 sentence above overflowed my mental stack, so I'm unsure whether
self-referential expansions are somehow impossible.

> Hence, while the quasiquote itself is being parsed, the syntactic
> interpolation of a variable into the quasiquote always results in
> the expectation of an operator following the variable. (You must
> use a call to a submacro if you want to expect something else.)
> Of course, the macro definition as a whole can expect whatever it
> likes afterwards, according to its syntactic category. (Generally,
> a term expects a following postfix or infix operator, and an operator
> expects a following term or prefix operator.)

Do @arrays of ASTs interpolate/splice?

Lisp needs ,@ (comma-at) to do splatty interpolation, that is remove the
outer pair of parens. Depending on what the ASTs look like and how they
splice together, such a form may or may not be necessary.
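
To make the splicing question concrete, here is a rough Python sketch of flattening a list of statement trees into an enclosing body, which is the job `,@` does in Lisp. It uses Python's `ast` module as a stand-in; the `PLACEHOLDER` marker is an invented convention, not something from S06.

```python
import ast

# A list of two statement nodes, standing in for an array of ASTs.
stmts = ast.parse("x = 1\ny = 2").body

# A template whose body contains one placeholder statement.
template = ast.parse("def f():\n    PLACEHOLDER\n    return x + y")
func = template.body[0]

# Locate the placeholder: a bare name used as an expression statement.
index = next(
    i for i, s in enumerate(func.body)
    if isinstance(s, ast.Expr)
    and isinstance(s.value, ast.Name)
    and s.value.id == "PLACEHOLDER"
)

# Splice: the single placeholder is replaced by the list's elements,
# flattened into the surrounding body rather than nested as one node.
func.body[index:index + 1] = stmts

ast.fix_missing_locations(template)
namespace = {}
exec(compile(template, "<splice>", "exec"), namespace)
result = namespace["f"]()   # 3
```

Whether such flattening needs its own syntax depends on whether a list of trees can stand where a single tree is expected; in Python's AST a statement body is already a list, so slice assignment is enough.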

> In case of name ambiguity, prefix with C<COMPILING::> to indicate a
> name in the compiling scope, and anything else (such as C<OUTER::>)
> to indicate a name in the macro definition's scope, since that's the
> default. In particular, any variable declared within the quasiquote
> block is assumed to scope to the quasiquote; to scope the declaration
> to the macro call's scope, you must say
>
> my COMPILING::<$foo> = 123;
> env COMPILING::<@bar> = ();
> our COMPILING::<%baz>;
>
> or some such if you wish to force the compiler to install the variable
> into the symbol table being constructed by the macro call.

"COMPILING" here means the scope in which the macro is being expanded, rather
than the scope in which the macro itself is being compiled, is that correct?

Perhaps a twigil would be clearer? Such huffmanization is probably
undeserved and would be seen as encouraging promiscuous lexical intercourse...


What are the variable visibility rules when interpolating in quasiquotes?
Does a variable unbound in a spliced AST bind to one in the enclosing
quasiquote?

The consequences of this when inserting an AST from a parsed parameter
need to be considered. If the enclosing quote's variables are visible
then an unintended binding may occur.


> [Conjecture: Due to these dwimmy scoping rules, there is no need of
> a special "unquote" construct as in Scheme et al.]

No gensym shenanigans either. The scoping rules seem to be hygienic,
no unintended variable leaking. Unintended variable capture seems unlikely
too, only if you forget to declare a variable with the macro declaration
and coincidently declare the same variable in the macro use scope will
everything go haywire.
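
For contrast, this is roughly what the capture problem and the gensym workaround look like in a purely textual expander, sketched in Python. The names `swap_text`, `swap_gensym` and `gensym` are made up for illustration, and `exec` stands in for compiling the expansion in the caller's scope.

```python
import itertools

def swap_text(a: str, b: str) -> str:
    # Unhygienic: the helper name 'tmp' is fixed in the expansion.
    return f"tmp = {a}; {a} = {b}; {b} = tmp"

_ids = itertools.count()

def gensym() -> str:
    # Hypothetical helper: a fresh name unlikely to clash with user code.
    return f"_g{next(_ids)}"

def swap_gensym(a: str, b: str) -> str:
    t = gensym()
    return f"{t} = {a}; {a} = {b}; {b} = {t}"

# The caller happens to have a variable called 'tmp'.
captured = {"tmp": 1, "y": 2}
exec(swap_text("tmp", "y"), captured)
# Expansion was: tmp = tmp; tmp = y; y = tmp
# The swap silently fails: tmp and y both hold 2.

safe = {"tmp": 1, "y": 2}
exec(swap_gensym("tmp", "y"), safe)
# With a generated helper name the swap works: tmp is 2, y is 1.
```

Scoping rules that resolve names in the macro definition's scope first would make the generated name unnecessary, since the helper variable could not collide with the caller's.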


Brad

--
That one's own district is unsophisticated and unpolished is a great
treasure. Imitating another style is simply a sham.
-- Hagakure http://bereft.net/hagakure/

Larry Wall

Feb 6, 2006, 5:04:33 PM
to Perl6
On Sun, Feb 05, 2006 at 02:32:08AM +0100, Brad Bowman wrote:
:
: Hi,
:
: I've read and reread the macro explanation but I'm still not entirely
: clear on a number of things. The questions and thoughts below are based
: on my (mis)understanding.
:
: On 03/02/06 02:05, Larry Wall wrote:
: > Macros are functions or operators that are called by the compiler as
: > soon as their arguments are parsed (if not sooner). The syntactic
: > effect of a macro declaration or importation is always lexically
: > scoped, even if the name of the macro is visible elsewhere.
:
: And presumably they can be lexically unimported, or whatever the verb is
: for what "no" does.

Presumably. At least its grammatical effect must be unimportable, even
if the name isn't. Which we could do, since we've divorced the grammatical
effect from name visibility. Nevertheless, the easiest thing might just
be to hide the name, or rather the lexical alias of the name, if the
existence of the lexical alias is what controls the lexical scoping of
the grammatical effect.

: > As with
: > ordinary operators, macros may be classified by their grammatical
: > category. For a given grammatical category, a default parsing rule or
: > set of rules is used, but those rules that have not yet been "used"
: > by the time the macro keyword or token is seen can be replaced by
: > use of "is parsed" trait. (This means, for instance, that an infix
: > operator can change the parse rules for its right operand but not
: > its left operand.)
: >
: > In the absence of a signature to the contrary, a macro is called as
: > if it were a method on the current match object returned from the
: > grammar rule being reduced; that is, all the current parse information
: > is available by treating C<self> as if it were a C<$/> object.
:
: Is this a :keepall match object?
: Or is the Perl6 grammar conserving by default?
: (The "Syntax trees [...] are reversible" suggests so)
: Or is this one of the "signature to the contrary" possibilities?

It feels to me like something that wants to be controlled by a very
large context, such as which debugger/IDE you're working under, if any.
Maybe that's one of those "signature to the contrary" things. I dunno.

: > [Conjecture: alternate representations may be available if arguments
: > are declared with particular AST types.]
: >
: > Macros may return either a string to be reparsed, or a syntax tree
: > that needs no further parsing. The textual form is handy, but the
: > syntax tree form is generally preferred because it allows the parser
: > and debugger to give better error messages. Textual substitution
: > on the other hand tends to yield error messages that are opaque to
: > the user. Syntax trees are also better in general because they are
: > reversible, so things like syntax highlighters can get back to the
: > original language and know which parts of the derived program come
: > from which parts of the user's view of the program.
: >
: > In aid of returning syntax tree, Perl provides a "quasiquoting"
: > mechanism using the keyword "CODE", followed by a block intended to
: > represent an AST:
: >
: > return CODE { say $a };
:
: I guess the string form is C<eval "CODE { $str }">

Seems like that would bind variables differently, unless we took steps
for it not to. I was thinking that string macros would have no binding
to the macro definition's lexical scope. But then I'm not sure what
that could desugar to.

: If CODE may enclose arbitrary source text of whatever DSL people invent,
: alternate braces would probably be useful. Either q()-like, HERE-doc
: or pod's C<< >> nesting style.

Any CODE-like macro could choose its own delimiter policy. Arguably we
could go with q:code or some such instead, and I considered this,
but it seemed to me that if you're parsing something that the user
is thinking of primarily as generic Perl code, it ought to look more
like a code block and less like a string.

: > [Conjecture: Other keywords are possible if we have more than one
: > AST type.]
:
: Ocaml and camlp4 are probably a good source of ideas for quasiquoting.
: I've only perused the documentation, has one actually used Ocaml here?

Not this one.

: See: http://caml.inria.fr/pub/docs/tutorial-camlp4/tutorial004.html

In my copious free time... :-)

: Rather than misrepresenting Ocaml with my sketchy understanding,
: I'll just mention some possibly interesting features:
:
: Specific expander rules from the grammar can be used, <:rulename< ... >>

Our rules are all just subs in disguise, so I'm sure we could do something
similar, modulo syntactic sugar.

: They have a C -> AST expander. I can imagine a SQL -> AST expander
: would find some use in Perl. I don't think the same AST type is used but
: that's just a guess.

At this point I'm not so interested in specific mappings, but I'm sure
everyone will have their favorites.

: Two of the "p"s in p4 stand for pretty-printer, which is the AST->source
: conversion. In addition to aiding debugging and reformatting, it allows
: interconversion between different syntaxes (sp?). Ocaml comes with two
: grammars, one is backwards compatible and the other has jettisoned
: the baggage.

Pugs would like to perform similar tricks.

: > Within a quasiquote, variable and function names resolve first of
: > all according to the lexical scope of the macro definition, and if
: > unrecognized in that scope, are assumed to be bound from the scope
: > of the macro call each time it is called. If they cannot be bound
: > from the scope of the macro call, a compile-time exception is thrown.
: >
: > Variables that resolve from the lexical scope of the macro definition
: > will be inserted appropriately depending on the type of the variable,
: > which may be either a syntax tree or a string. (Again, syntax tree
: > is preferred.) The case is similar to that of a macro called from
: > within the quasiquote, insofar as reparsing only happens with the
: > string version of interpolation, except that such a reparse happens
: > at macro call time rather than macro definition time, so its result
: > cannot change the parser's expectations about what follows the
: > interpolated variable.
:
: Is there any cpp-like protection against self-referential expansions
: when using the string returning form?

Not currently.

: The last S06 sentence above overflowed my mental stack, so I'm unsure
: whether self-referential expansions are somehow impossible.

That wasn't the intent. I was merely trying to straighten out what
grammatical category the parser would be looking for after an interpolation.
Basically, an auto-unquoted variable always expects an operator after it,
whereas an ordinary string macro can leave the parser in any state it
likes at the end of the string. For example, a string macro can
be defined that returns "$x +", in which case a term is expected afterwards.
You can't do that with autounquoted variables--you have to use something
that looks more like a function.

But I admit that I'm handwaving here.
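
A small Python illustration of that difference, offered only as an analogy: an incomplete fragment such as "x +" can be returned as text and completed by whatever follows, but it has no standalone tree form. The variable names are made up, and Python's parser stands in for the Perl 6 one.

```python
import ast

fragment = "x +"   # what a string macro might return

# As text, the fragment is joined with what follows and parsed then;
# the parser is left expecting a term, which the " 1" supplies.
joined = ast.parse(fragment + " 1", mode="eval")

# On its own, the fragment cannot be turned into a tree at all.
try:
    ast.parse(fragment, mode="eval")
    fragment_is_tree = True
except SyntaxError:
    fragment_is_tree = False
```

An interpolated tree is always a complete term in this picture, so the only thing that can sensibly follow it is an operator.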

: > Hence, while the quasiquote itself is being parsed, the syntactic
: > interpolation of a variable into the quasiquote always results in
: > the expectation of an operator following the variable. (You must
: > use a call to a submacro if you want to expect something else.)
: > Of course, the macro definition as a whole can expect whatever it
: > likes afterwards, according to its syntactic category. (Generally,
: > a term expects a following postfix or infix operator, and an operator
: > expects a following term or prefix operator.)
:
: Do @arrays of ASTs interpolate/splice?

I would think the right thing to do depends on the type of each element.

: Lisp needs ,@ (comma-at) to do splatty interpolation, that is remove the
: outer pair of parens. Depending on what the ASTs look like and how they
: splice together, such a form may or may not be necessary.

Insert more handwaving here. For features like this I'll be relying
heavily on feedback from the lambdacamel combinators. My main goal
in participating in the design is to make sure Perl 6 remains usable
by mere mortals for most of the things mere mortals might want to do.

: > In case of name ambiguity, prefix with C<COMPILING::> to indicate a
: > name in the compiling scope, and anything else (such as C<OUTER::>)
: > to indicate a name in the macro definition's scope, since that's the
: > default. In particular, any variable declared within the quasiquote
: > block is assumed to scope to the quasiquote; to scope the declaration
: > to the macro call's scope, you must say
: >
: > my COMPILING::<$foo> = 123;
: > env COMPILING::<@bar> = ();
: > our COMPILING::<%baz>;
: >
: > or some such if you wish to force the compiler to install the variable
: > into the symbol table being constructed by the macro call.
:
: "COMPILING" here means the scope in which the macro is being expanded,
: rather than the scope in which the macro itself is being compiled, is that correct?

Yes.

: Perhaps a twigil would be clearer? Such huffmanization is probably
: undeserved and would be seen as encouraging promiscuous lexical
: intercourse...

I used to have a twigil for it, and dehuffmanized it. (Used to be the +
twigil, which I've since reused for env vars.)

: What are the variable visibility rules when interpolating in quasiquotes?
: Does a variable unbound in a spliced AST bind to one in the enclosing
: quasiquote?

Good question. I don't profess to know the right answer. My guess
is that, if the AST was passed in as an argument, it has already
been matched against the COMPILING scope, so anything unbound would
be an error if we don't bind it into the macro body (assuming the
user is in "use strict" land, but if not, the parser could already
have bound an unrecognized variable into the current package, so an
unbound variable could still be an unambiguous indication of desire
to bind to the macro in that case). But I think we'll just have to
play with it and see what makes the most sense.

: The consequences of this when inserting an AST from a parsed parameter
: need to be considered. If the enclosing quote's variables are visible
: then an unintended binding may occur.

Yes, depends on when variables are bound to the AST snippet.

: > [Conjecture: Due to these dwimmy scoping rules, there is no need of
: > a special "unquote" construct as in Scheme et al.]
:
: No gensym shenanigans either. The scoping rules seem to be hygienic,
: no unintended variable leaking. Unintended variable capture seems unlikely
: too, only if you forget to declare a variable with the macro declaration
: and coincidently declare the same variable in the macro use scope will
: everything go haywire.

Yes, my intent is to trade a few unlikely "Don't do thats" for an
increase in naturalness. But again, I'm just doing this by the seat
of my pants. It's not like I really know all the ins and outs of
what I'm doing. I'm not smart enough to do an exhaustive search--I'm
more like one of those neural net chess programs that is too organic
to tell you why it made any particular move...

Larry
