I came across J.Amsterdam's Iterate package and would like to address its main weakness: correct, lexical aware code expansion.
I believe in its over 10 years of existence, it's never been able to correctly handle this form: (macrolet ((over(x) `(collect ,x))) (iterate (for n in '(1 2 3)) (flet ((over(x)(declare (ignore x)) (collect 2))) (over n)))) ; would yield (2 2 2) if correct
One reason is that it does not contain a code walker aware of lexical bindings.
I actually wonder whether Iterate actually needs a full code walker, thereby duplicating functionality found in every Lisp implementation, or whether Iterate could be written without one, possibly by clever use of MACROLET and/or *MACROEXPAND-HOOK*?
Here's one of the problems: Iterate allows a (COLLECT x INTO var) clause to appear anywhere deep inside the body of (Iterate ...), contrary to LOOP. As a result, it's only after COLLECT has been encountered that Iterate knows which bindings it must create in its top level expansion. Information flows from deep inside to the outermost level. How to do that?
E.g. (iterate ... (collect x into a) ... (collect ... into b) ...) -> `(let (a b) ... ...) (iterate ... no-collect-at-all ...) -> `(let () ...) or (progn ...)
Do you have an idea how such a COLLECT INTO macro could be written without duplicating a code walker?
This seems hairy to me if one wants it to work both in interpreted and compiled modes, since with an interpreter, the COLLECT form may only be encountered upon evaluation of (COLLECT #) --really late.
If it's not possible without a code walker, are there portable means to use the implementation's code walking engine? It seems a pity to me that something like AUGMENT-ENVIRONMENT (in CLtL2) was not standardized in ANSI-CL. Section 8.5, Environments, of CLtL2 see http://www.supelec.fr/docs/cltl/clm/node102.html
Thanks for your help, Jorg Hohle Telekom/T-Systems Technology Center
Joerg Hoehle <hoe...@users.sourceforge.net> wrote: > I came across J.Amsterdam's Iterate package and would like to address > its main weakness: correct, lexical aware code expansion.
I'm not sure if Andreas Fuchs (the maintainer of Iterate) reads comp.lang.lisp, so you could consider sending your question to the iterate mailing list on common-lisp.net.
(This is the first time I've opened c.l.l in a long time, and I didn't intend to post anything, but then... (-:)
I'm cross-posting this to the iterate-devel mailing list.
On Mon, 15 Nov 2004 10:15:35 +0100, Joerg Hoehle wrote: > I came across J.Amsterdam's Iterate package and would like to address > its main weakness: correct, lexical aware code expansion.
> I believe in its over 10 years of existence, it's never been able to > correctly handle this form: > (macrolet ((over(x) `(collect ,x))) > (iterate > (for n in '(1 2 3)) > (flet ((over(x)(declare (ignore x)) (collect 2))) > (over n)))) ; would yield (2 2 2) if correct
Actually, the current version of iterate can handle that clause pretty well (since iterate--main--1.0--patch-6, since when we pass the *env* argument to macroexpand calls).
What it can't handle (and what I think you meant to post, anyway) are nested macrolet clauses, like:
(iterate (macrolet ((over(x) `(collect ,x))) (for n in '(1 2 3)) (flet ((over(x)(declare (ignore x)) (collect 2))) (over n)))) ; would yield (2 2 2) if correct
because the code walker does not know how to extend the environment with this nested macrolet.
> One reason is that it does not contain a code walker aware of lexical > bindings.
Right. I think that's partly due to the fact that it's impossible to have a macrolet (and symbol-macrolet)-aware code walker in portable CL without reimplementing half of common lisp. (-:
> I actually wonder whether Iterate actually needs a full code walker, > thereby duplicating functionality found in every Lisp implementation, > or whether Iterate could be written without one, possibly by clever > use of MACROLET and/or *MACROEXPAND-HOOK*?
An iterate that doesn't code walk could end up being more friendly to its maintainer. Iterate in its current form relies very heavily on the code walker, so you'd end up re-implementing most of it from scratch (probably not such a bad thing, either (-:)
*macroexpand-hook*, now there's an idea. From the code walker, it might be possible to put a closure in there which can expand the macrolet by itself. That would be interesting to try out (you'd still have to create a macro function from the macrolet's definition. hmm)
> Here's one of the problems: Iterate allows a (COLLECT x INTO var) > clause to appear anywhere deep inside the body of (Iterate ...), > contrary to LOOP. As a result, it's only after COLLECT has been > encountered that Iterate knows which bindings it must create in its > top level expansion. Information flows from deep inside to the > outermost level. How to do that?
> E.g. > (iterate ... (collect x into a) ... (collect ... into b) ...) > -> `(let (a b) ... ...) > (iterate ... no-collect-at-all ...) > -> `(let () ...) or (progn ...)
> Do you have an idea how such a COLLECT INTO macro could be written > without duplicating a code walker?
Unfortunately, ANSI CL doesn't specify the order in which macros are expanded, but if it did, you /might/ be able to signal conditions which the outer *macroexpand-hook* functions could know how to treat. All this is highly speculative, but it would be a pretty neat hack. (-:
> This seems hairy to me if one wants it to work both in interpreted > and compiled modes, since with an interpreter, the COLLECT form may > only be encountered upon evaluation of (COLLECT #) --really late.
Right. That's probably part of the reason why ANSI doesn't specify macroexpand order.
> If it's not possible without a code walker, are there portable means > to use the implementation's code walking engine? It seems a pity to me > that something like AUGMENT-ENVIRONMENT (in CLtL2) was not > standardized in ANSI-CL. > Section 8.5, Environments, of CLtL2 see > http://www.supelec.fr/docs/cltl/clm/node102.html
It seems like most implementations offer code walker support similar to ClTl2's. While not ANSI CL, these might be your best bet.
Cheers, -- Andreas Fuchs, <a...@boinkor.net>, a...@jabber.at, antifuchs
Andreas Fuchs <a...@boinkor.net> writes: > It seems like most implementations offer code walker support similar > to ClTl2's. While not ANSI CL, these might be your best bet.
I think not even a complete code walker can solve Iterate. Consider:
(symbol-macrolet ((paradox t)) (macrolet ((when-symbol-macro (form &environment env) (when (nth-value 1 (macroexpand-1 'paradox env)) form))) (iterate (for x in '(1 2 3)) (when-symbol-macro (collect x into paradox)))))
If Iterate binds PARADOX as a variable, then that binding hides the symbol macro, and WHEN-SYMBOL-MACRO expands to NIL, so Iterate shouldn't see the COLLECT clause and has no reason to bind PARADOX. But in that case, the symbol macro remains visible and WHEN-SYMBOL-MACRO passes the COLLECT clause through, so Iterate should bind PARADOX after all.
I don't think any standard Common Lisp constructs have this problem.
toomas.altos...@hut.fi (Toomas Altosaar) writes: > If this is so, does that imply that Iterate's specification has > too many "degrees of freedom"?
I don't know if that term applies to discrete systems. Even if it does, it seems the very number of degrees of freedom in Iterate is ill-specified.
At first, I thought the binding problem could be solved by first walking the body once with an environment that doesn't include any accumulation variables, selecting the variables, and then walking the body again with the resulting environment and verifying that the same variables get selected again. However, this scheme does not allow a macro in the body to expand to a form that accumulates to a gensym variable. One could just forbid that but I think it would feel unobvious to the programmer.
I see the following ways to resolve the situation:
(a) Specify that any macros expanding to accumulation forms must not depend on whether the accumulation variables are already in the environment. This does not require any change in the Iterate implementation. However, I am not sure it is not possible to violate this requirement inadvertently.
(b) Require the programmer to list all the accumulation variables with an (:into a b c) form at the top of the body. Signal an error if the code attempts to accumulate into any other variables. This too would be incompatible with gensyms chosen in the body, but at least that would be obvious to the programmer. On the plus side, this would allow a nested Iterate form to accumulate to a variable bound by the outer Iterate form.
(c) Deduce the accumulation variables from the body, but not by expanding macros. Instead, define Iterate-specific walkers for all standard CL macros, and give the programmer a way to define walkers for her own macros. The walkers must then satisfy the requirement of (a) but that is not a big problem because they are already specific to Iterate. This would also ensure that the bindings made by
(iterate (case t (() (:collect t :into y))))
will not depend on how smart the CASE macro is trying to be.
> At first, I thought the binding problem could be solved by first > walking the body once with an environment that doesn't include > any accumulation variables, selecting the variables, and then > walking the body again with the resulting environment and > verifying that the same variables get selected again. However, > this scheme does not allow a macro in the body to expand to a > form that accumulates to a gensym variable. One could just > forbid that but I think it would feel unobvious to the > programmer.
> I see the following ways to resolve the situation:
I haven't been following this discussion closely enough, so sorry if it's been covered, but couldn't you have the ITERATE macro bind an accumulation "environment"? Make it basically an alist of (var-name . accumulator), and an accumulation macro could add its variable to the alist if it isn't there already. Or are these accumulation variables supposed to also be available as variables to the user? In that case, I'm pretty sure it can't be done without a code walker.
> Consider: > (symbol-macrolet ((paradox t)) > (macrolet ((when-symbol-macro (form &environment env) > (when (nth-value 1 (macroexpand-1 'paradox env)) > form))) > (iterate (for x in '(1 2 3)) > (when-symbol-macro > (collect x into paradox)))))
Are you sure this is correct code? IIRC the MACROLET expander is defined in the lexical environment, but CLHS clearly says "the consequences are undefined if the local macro definitions reference any local variable or function bindings that are visible in that lexical environment."
Doesn't this mean that (nth-value ...) is undefined as well?
Oops, this arguments sounds like use of macroexpand need be forbidden inside macrolet!?!
BTW, my interpretation is opposite of yours: Iterate does not detect '(collect into paradox) from an eagle's perspective. Instead, it proceeds top-down to walk the code. Therefore, when when-symbol-macro is encountered, Iterate has not yet seen the collect clause, therefore it has not yet bound paradox as a variable. As a result, NTH-VALUE returns T because the symbol-macro got expanded. Then only, Iterate gets to see the collect clause, which causes it to generate a variable binding for paradox.
As a result, the symbol-macro is shadowed. Iterate collects into a distinct variable and returns nil (you may wish to add a (finally (return paradox)) clause to see that).
>I don't think any standard Common Lisp constructs have this problem.
What's the problem, and why should e.g. cl:loop behave differently (except that loop does not accept collect inside some form)?
There's some point, though: if Iterate could yield an expansion without itself resorting to macroexpand, then when-symbol-macro would be expanded in a environment context where the Iterate expansion would have bound variables (via LET), causing different behaviour.
But that's little surprise, no? It all depends on what time and environments macroexpansion occurs. I can't remember a point where CL makes some guarantees or recommendatiosn on this.
Did I miss something?
Regards, Jorg Hohle Telekom/T-Systems Technology Center
Joerg Hoehle <hoe...@users.sourceforge.net> writes: > Kalle Olavi Niemitalo <k...@iki.fi> writes: >> Consider: >> (symbol-macrolet ((paradox t)) >> (macrolet ((when-symbol-macro (form &environment env) >> (when (nth-value 1 (macroexpand-1 'paradox env)) >> form))) >> (iterate (for x in '(1 2 3)) >> (when-symbol-macro >> (collect x into paradox)))))
> Are you sure this is correct code?
Yes, apart from the use of Iterate.
> IIRC the MACROLET expander is > defined in the lexical environment, but CLHS clearly says "the > consequences are undefined if the local macro definitions reference > any local variable or function bindings that are visible in that > lexical environment."
In my example, there are no local variable or function bindings in the lexical environment of the MACROLET form. Thus, the macro definition obviously does not reference any such bindings.
> Doesn't this mean that (nth-value ...) is undefined as well?
NTH-VALUE is defined in the global environment.
> Therefore, when when-symbol-macro is encountered, Iterate has > not yet seen the collect clause, therefore it has not yet bound > paradox as a variable.
The current implementation may do that, but in my opinion, it is the wrong thing to do. Iterate is effectively lying to the macro function about the environment. There is no general guarantee that the expansion will then operate correctly, and I don't think one should be added, as that would disable some macroexpand-time optimizations that are too difficult to be deferred to the compiler.
>>I don't think any standard Common Lisp constructs have this problem. > What's the problem, and why should e.g. cl:loop behave differently > (except that loop does not accept collect inside some form)?
I hope I have shown the problem. LOOP analyzes its clauses statically without macroexpansion and thus need not guess what to place in environments.
> Did I miss something?
I think you missed the (case t (() (:collect t :into x))) issue. Macros are not in general guaranteed to preserve the existence of forms, but Iterate seems to be relying on such a guarantee, in order to decide which variables to bind and how to initialize them.
> Kalle Olavi Niemitalo <k...@iki.fi> writes: > I don't think any standard Common Lisp constructs have this problem.
I mean you want to say that no CL macro needs macroexpand to produce its expansion (except maybe for compiler-let/macrolet tricks)? That's also why LOOP clauses appear at top-level.
> > I see the following ways to resolve the situation: > (b) Require the programmer to list all the accumulation variables > with an (:into a b c) form at the top of the body.
Me too sees that requiring collection variables be declared at top-level should not cause much harm to users, and also not prevent gensym'ed variables (they already have to be named twice since (collect foo into (gensym)) is quite nonsense, you want the value).
It would be an Iterate rather different from the one we know. But it might be worthwhile if it can then work without macroexpanding its body.
I haven't investigated whether that would be the only change from the current Iterate though. Currently, quite a few of Iterate forms lead to the creation of internal variables (but internal means they may not affect macroexpansion).
t...@conquest.OCF.Berkeley.EDU (Thomas F. Burdick) writes:
> I haven't been following this discussion closely enough, so sorry if > it's been covered, but couldn't you have the ITERATE macro bind an > accumulation "environment"? Make it basically an alist of > (var-name . accumulator), and an accumulation macro could add its > variable to the alist if it isn't there already. Or are these > accumulation variables supposed to also be available as variables to > the user? In that case, I'm pretty sure it can't be done without a > code walker.
They are available to the user as lexical variables (or at least symbol-macrolet) like loop variables so you can write (finally (return x)). That's why an internal plist like you propose doesn't seem to be enough. See Message-ID: <uy8go4np9....@users.sourceforge.net> Subject: macro flow from inside to outside where I raised this particular question :-) of how to possibly get lexical bindings for something named only deep within the body.
I believe Iterate uses lexical variables because that's what's felt efficient. a p-list of name/values would have lookup cost at run-time. Ten years ago, that would have been reason enough not to advertise Iterate as a good replacement for LOOP/DO etc.
However, I've never seen a documented restriction on loop variables. E.g. in both Iterate-1.0.9 and CLISP's LOOP, this does not work as expected: (loop initially (setq x (list 'a 'b)) ; try and prefill x for i in '(1 2 3) collect i into x finally (return x)) I don't know whether this is correct (portable) code.
BTW, drop me a note if you want to test my version of Iterate. I did nearly two dozen small changes and bugfixes to Iterate-1.0.9.
Regards, Jorg Hohle Telekom/T-Systems Technology Center