Hi folks,
Now that the tree manipulation problem is solved, another problem becomes noticeable - the problem of hygiene / referential transparency. To quote Simon from [1]:
You can write code like tq"""Enum[$className]""", but if you want to be really sure that things don't break if people define their own Enum class in scope, one better does: tq"""_root_.java.lang.Enum[$className]""".
Of course, this doesn't have to be like that. In most LISPs and in a lot of other macro-enabled programming languages, tree construction is hygienic in the sense that bindings in macro-generated code are established according to some well-defined rules rather than using the HULK SMASH BIND principle. For example, in my candidacy exam write-up [2] I outlined such rules for Template Haskell, Nemerle and Racket, and of course there's a lot of other info to read on that matter on the web.
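To make the capture concrete without involving macros at all, here is a minimal sketch in plain Scala. The names CaptureDemo and ExpansionSite and the user-defined Enum class are made up for illustration; they stand in for a scope into which a tree like tq"""Enum[$className]""" might get expanded.

```scala
object CaptureDemo {
  // Hypothetical expansion site: the user has defined their own Enum here.
  object ExpansionSite {
    class Enum[T]

    // What an unqualified `Enum` binds to when resolved in this scope.
    val unqualified: Class[_] = classOf[Enum[_]]

    // What the fully qualified spelling binds to, whatever is defined locally.
    val qualified: Class[_] = classOf[_root_.java.lang.Enum[_]]
  }
}
```

The unqualified reference silently picks up the local class, which is exactly the kind of binding that hygiene rules are meant to pin down.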
Therefore I propose that we now discuss what rules suit Scala macros and decide how to make it happen. I suggest we don't discuss whether these rules are going to be implementable or not - that's my problem - and focus solely on design.
1) The main goal to pursue is pretty clear: we want to write "import java.lang.Enum" at the definition site of the quasiquote mentioned above and then have this information carried to whatever macro expansion site the quasiquote ends up in.
2) And then there's the first tough choice. Do we preserve only top-level bindings, i.e. just "List" in "List(x, y)", or all sorts of bindings, i.e. "List", "x" and "y" in "List(x, y)"? The former represents a compromise between rigor and practicality, whereas the latter is rigorous, but quite inflexible. [3] provides a Nemerle vs Template Haskell discussion on this matter.
3) Sometimes we will want to break hygiene, so we need syntax for expressing the difference between hygienic and unhygienic names. Firstly, what should be the default? (I'd argue, hygiene). Secondly, how do we express the non-default? (I'd suggest introducing a special flavor of names that can then be spliced in identifier positions). Thirdly, do we provide special syntax for non-default? (In Template Haskell, they have/had %pi as a shortcut for $(dyn "pi"), but then we'll have to escape calls to methods named %).
4) How do we play nicely with 2.11, because with a very high probability we won't be ready by 2.11 code freeze (mid September)? Imho, the only thing to make sure here is aligning q"..." in paradise and q"..." in trunk. If we decide that hygiene should be the default, we should rename trunk's q/tq/cq/pq into something like uq/utq/ucq/upq to signify that they are unhygienic to make sure that we don't break people's code afterwards.
[1] https://groups.google.com/d/msg/scala-language/C7Pm6ab1sPs/-Sm4Pkz3O9EJ
[2] https://github.com/scalamacros/scalamacros.github.com/blob/master/paperstalks/2012-09-10-CandidacyExamPaper.pdf
[3] http://www.haskell.org/pipermail/template-haskell/2004-February/000250.html
[4] http://research.microsoft.com/~simonpj/tmp/notes2.ps
Cheers,
Eugene
I just had an idea: If you want hygiene, why not write: tq"""$Enum[$className]"""?
You can write code like ...
tq"""Enum[$className]"""
... but if you want to be really sure that things don't break if people define their own Enum class in scope, one better does:
tq"""_root_.java.lang.Enum[$className]"""
... but that gets pretty verbose and tedious, so one refactors it a bit:
val Enum = "_root_.java.lang.Enum"
tq"""$Enum[$className]"""
But this causes:
[error] exception during macro expansion:
[error] java.lang.AssertionError: assertion failed: "java.lang.Enum"
[error] at scala.reflect.internal.Trees$AppliedTypeTree.<init>(Trees.scala:481)
[error] at scala.reflect.internal.Trees$AppliedTypeTree$.apply(Trees.scala:478)
[error] at scala.reflect.internal.Trees$AppliedTypeTree$.apply(Trees.scala:486)
Ouch. Of course, the right way to do it is:
val Enum = Select(Select(Ident(newTermName("java")), newTermName("lang")), newTypeName("Enum"))
Now I suddenly have to care about TermNames vs. TypeNames again!
I think we have two issues here with the current situation:
- The "easy" way is extremely fragile and one gets no warning about it
- Doing it "correctly" (fully-qualified names, no duplication) is a lot more verbose and invalidates one of the key features of quasi-quotes: Not having to mess with TermNames vs. TypeNames.
I think it is pretty unfortunate that the wrong way is easy, readable and concise while the right way is more complicated and verbose.
I really wonder why it breaks down so fast ... if string interpolators can figure out what's a type and what's a term in ...
tq"""_root_.java.lang.Enum[$className]"""
I really wonder why they can't do it for ...
val Enum = "_root_.java.lang.Enum"
tq"""$Enum[$className]"""
The problem is that in your example the string gets silently promoted to Literal(Constant("...")), and then the compiler tries to apply a type argument to it, which doesn't make sense.
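A small sketch of that promotion, assuming Scala 2.11 or later, where quasiquotes ship with scala-reflect and can be tried against the runtime universe. The object name StringSpliceDemo and its fields are made up for illustration.

```scala
import scala.reflect.runtime.universe._

object StringSpliceDemo {
  val enumAsString = "_root_.java.lang.Enum"

  // Unquoting a String lifts it to a string literal tree,
  // not to a path of term and type names.
  val lifted: Tree = q"$enumAsString"

  val isStringLiteral: Boolean = lifted match {
    case Literal(Constant(s: String)) => s == enumAsString
    case _                            => false
  }

  // A literal is a term tree, so it cannot serve as the head of an applied type.
  val isType: Boolean = lifted.isType
}
```

Since the lifted tree is a term, putting it in front of a type argument list is what trips the assertion in AppliedTypeTree.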
To be honest, I'm proud of Scala starting to enforce some tree invariants. In the good old days, this would probably crash much later, somewhere between mixin and erasure. Maybe in refchecks :)
So back to the problem we were discussing. In a sense, quasiquotes do exactly what you requested, but I think we should still disallow splicing strings into positions where names are expected to avoid confusion.
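For illustration, here is a hedged sketch of the refactoring with a tree in place of the string, again assuming Scala 2.11 or later with scala-reflect and the runtime universe. The names TreeSpliceDemo, enumRef and the class name Color are hypothetical.

```scala
import scala.reflect.runtime.universe._

object TreeSpliceDemo {
  // Parse the qualified path once; tq"..." decides which segments are
  // term names and which one is a type name.
  val enumRef: Tree = tq"_root_.java.lang.Enum"

  // Hypothetical class name, standing in for $className.
  val className: TypeName = TypeName("Color")

  // Splice the tree (not a string) into the applied type.
  val applied: Tree = tq"$enumRef[$className]"

  // The last segment of the parsed path is a type name.
  val lastNameIsType: Boolean = enumRef match {
    case Select(_, name) => name.isTypeName && name.toString == "Enum"
    case _               => false
  }

  // The result is an applied type whose head is the spliced path.
  val isAppliedType: Boolean = applied match {
    case AppliedTypeTree(head, List(Ident(arg))) =>
      head.equalsStructure(enumRef) && arg == className
    case _ => false
  }
}
```

This keeps the qualified name in one place without hand-writing Select/Ident nodes, but note that it is still not hygienic: the path works only because it is spelled out from _root_.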
***
As for Martin's suggestion, I think he meant that when we write:
import java.lang.Enum
q"$Enum"
Quasiquotes should: a) allow that, b) remember the lexical context for names introduced that way.
This is a very cute shortcut - concise and not requiring any special syntax. My only concern here is that in that case we make hygiene optional, because people (especially the newcomers) will sometimes forget to put dollars in front of those names.
Yes, I agree with your last suggestion here. That's what I meant by proposing to turn hygiene on by default.
You can read more on how people model that in my candidacy write-up linked above (sections II-B and III, and also the papers in its References).
Referentially transparent
Preserves global bindings
bar=> (ns bar)
nil
bar=> `(let (list 100))
(clojure.core/let (clojure.core/list 100))
Disregards local bindings
user=> (defmacro m [x] (let [list 1] `(list ~x)))
#'user/m
user=> (m 2)
(2)
Doesn’t distinguish binders from bindees
bar=> `(let [foo 100] (foo 100))
(clojure.core/let [bar/foo 100] (bar/foo 100))
When used in a macro expansion, this won’t compile!
Not hygienic
Locally introduced names (`'foo) can be inadvertently captured
There’s no way to specify the target of the capture
Has convenient gensym that works within the same quasiquote
user=> (defmacro m [x] `(let [x# 1] x#))
#'user/m
user=> (m 2)