On 13.08.2009 at 22:30, Brian Hurt wrote:
> Now, I can certainly see a lot of potential downsides to this.
> Redefining what #{} or #() means is just the start.
I think this is the reason Rich is not very positive about that idea:
because nobody came up with a way of defining "namespaces" for reader
macros, so that they don't interfere with each other.
> But it'd make it a lot easier to do things with DSLs.
I'm happy with macros for DSLs. Actually the macros just quasiquote
their arguments and pass them on to actual functions.
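A minimal sketch of that pattern, assuming the hypothetical names `query` and `run-query` (neither comes from this thread or from any library): the macro only quotes its arguments, and an ordinary function does the work.

```clojure
;; Hypothetical names for illustration only.
(defn run-query
  "Plain function that receives the DSL forms as unevaluated data."
  [table clauses]
  {:table table :clauses (vec clauses)})

(defmacro query
  "Thin macro: quotes its arguments and hands them to run-query."
  [table & clauses]
  `(run-query '~table '~clauses))

(query users (where (> age 21)))
;; => {:table users, :clauses [(where (> age 21))]}
```

Because the logic lives in `run-query`, it can be tested and composed like any other function; the macro only saves the caller from quoting.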
> So, what are people's thoughts?
I've yet to see the desire for a self-defined reader macro. But I'm no
Common Lisper (a Schemer actually). So I'm not used to reader macros.
Maybe I'm missing the paradise.
I'm not necessarily opposed to the idea. But I wouldn't give it high
priority either.
Sincerely
Meikel
Indeed, Rich has said that he's strongly disinclined to add them:
http://groups.google.com/group/clojure/browse_frm/thread/f1148b3569e8d275
For my part, I would like to have them available, simply because I
like concise, purpose-built languages (xpath-esque tree traversal
stuff lately). But then, I can see the damage they've caused in other
environments. Power and responsibility aren't always well-managed.
- Chas
Would it make any difference if the scope of the reader macro was
limited to the file which defines/uses it? Any file that wanted to
use a custom reader macro would then have to add its own
(use-reader-macro ...) statements, and there'd be no possibility for
conflicts.
Something like:
(defn comment-block-begin
  "Dispatch function for beginning of block comments"
  [])
(use-reader-macro '#| comment-block-begin)
Oops, I see on the link given elsewhere in this thread that that was
already discussed and dismissed.
>
> What is the main point of reader macros? Is it so you can define your
> own short-hand syntax, or is it the ability to get more direct access
> to the reader?
> If it is the first point, then I'd be happy to not have them - to me
> shorthand doesn't buy much.
> If it is the second point then why not simply have the reader pipe raw
> text into a reader macro?
> Ie, usage would be like
> (def-reader-macro pass-through [raw-string]
> (read-string raw-string))
>
> Usage would be (pass-through (+ 1 2)). The pass-through call would be
> given "(+ 1 2)" in the raw-string argument and must return a value
> that can be eval'd.
>
> More complex "reader" macros could be (infix x + y + z / 3).
I think you can already do that with regular macros.
I'm under the impression reader macros usually have to be prefixed
with #, like #(foo %) which expands to (fn [x] (foo x)) or something
like that. I can't find the common lisp units package I was talking
about, but if it were perfect it would have worked something like Frink:
http://futureboy.us/frinkdocs/
So you could do something like (+ 1m 2m) and have it give you back 3m,
and if you did (/ 10m 5s) it would give you back 2m/s, and blow up if
you tried to do (+ 1m 2m^2).
It would be difficult to do such a thing unobtrusively without a
reader macro.
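For comparison, here is a rough sketch of what unit arithmetic might look like with ordinary functions and no reader support. The names `qty`, `m`, `s`, `q+` and `qdiv` are made up for this illustration; this is not how Frink or the Common Lisp package works.

```clojure
;; Hypothetical representation: a quantity is a value plus a map of
;; unit -> exponent, e.g. {:m 1 :s -1} for metres per second.
(defn qty [value dims] {:value value :dims dims})
(defn m [v] (qty v {:m 1}))
(defn s [v] (qty v {:s 1}))

(defn q+
  "Adds two quantities; throws if their units differ."
  [a b]
  (when-not (= (:dims a) (:dims b))
    (throw (IllegalArgumentException. "incompatible units")))
  (qty (+ (:value a) (:value b)) (:dims a)))

(defn qdiv
  "Divides a by b, subtracting exponents and dropping units that cancel."
  [a b]
  (let [negated (into {} (for [[unit power] (:dims b)] [unit (- power)]))
        dims    (merge-with + (:dims a) negated)]
    (qty (/ (:value a) (:value b))
         (into {} (remove (fn [[_ power]] (zero? power)) dims)))))

(q+ (m 1) (m 2))     ;; => {:value 3, :dims {:m 1}}
(qdiv (m 10) (s 5))  ;; => {:value 2, :dims {:m 1, :s -1}}
;; (q+ (m 1) (qty 2 {:m 2})) throws IllegalArgumentException
```

It works, but every literal needs a wrapping call such as `(m 1)`, which is the obtrusiveness a reader macro would remove.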
The other project I had experience with, CL-SQL, I think having a
reader macro was sort of an abuse; it doesn't matter much if you have
to put a call around your SQL or even stick it in quotes. Especially
because it's unclear to me what value you should get from reading an
SQL expression other than a string, unless it's going to blow up when
you have a syntax error at compile time, but as an abstraction layer
it certainly didn't have enough information to do that correctly at
compile time.
—
Daniel Lyons
> A single "super quoted" string reader would avoid this problem.
> Instead of defining a new read syntax like:
>
> #my-syntax(your DSL goes between here and here)
>
> Clojure could provide a general purpose string creating read syntax.
> Something like #"...
A good thought, but #"foo" is reader syntax for defining a regular
expression with the pattern "foo". :-/
Double-quoted strings are decent for stuff like this. (Triple-quotes
in python always appealed to me, though triple-quoting things can get
tiring.)
- Chas
AllegroGraph uses ! as a reader macro to introduce URIs and RDF
literals, which means they appear quite frequently in user code. You
can write
!rdf:type
!<http://example.com/foo>
!"chien"@fr
and the appropriate object (a so-called future-part) is integrated
directly into the read code, just like a string or a list. This has
three advantages over something like
(resource "rdf" "type")
—
* This process happens at read-time, which means that you can use this
syntax in *unevaluated* positions, such as in macros, query forms that
are only read, not evaluated, etc. That can be really convenient.
* The syntax is much closer to the conventional syntax used in
serialization formats (simply add a '!' to the front of the usual
syntax). Users who are more familiar with Java or Python don't seem to
mind, which is not true when you add parentheses to your syntax.
* You can raise a syntax error at read time, rather than at compile-
or run-time. That's a nice feature in a dynamic language, where you
might not know your code is broken until much later. It's nice for
DSLs to have distinct syntax and semantics, and requiring the use of
macros or functions to implicitly construct syntactic elements blurs
that line.
I've observed this in the discrepancy between the 'primary' Clojure
types, such as set, map, et al, which have syntax, and those which do
not: sorted-set or struct-map, for example.
However, for all these benefits I'm still not sure whether it would be
worthwhile allowing user-generated reader macros. They are absolute
hell when integrating conflicting libraries — it only takes one major
library to use a reader macro character, and it's pretty much verboten
for all other libraries, just in case you want to combine them.
Franz's Allegro Common Lisp offers a workaround for this problem —
named readtables, which makes using readtables in CL much tidier. If a
library's readtable does not take effect in user code until demanded,
then conflicts do not occur. Code that uses libraries with conflicting
syntax can be partitioned to use different readtables.
If it were possible to signal to Clojure's reader which readtable to
use when reading a file/stream/string, and readtables were named in a
similar manner to namespaces, and could be composed, then I think user-
defined reader macros would be safe and convenient. Just as in-ns and
ns adjust the current namespace, they could adjust the readtable:
(ns com.example.foo
(:refer-clojure)
(:use com.example.library1)
(:readtables
clojure.core
com.example.library1))
I'm sure this would complicate Clojure's reader significantly.
-R
>>> More complex "reader" macros could be (infix x + y + z / 3).
>>
>> I think you can already do that with regular macros.
>
> I don't think so. Macros are invoked after the read stage but before
> evaluation of arguments. This kind of macro would be invoked without
> the text going through any kind of reader expansion.
Isn't reader expansion exactly what you don't want?
user> (defmacro infix [& args]
`(quote ~args))
#'user/infix
user> (infix 3 + 4 * 7)
(3 + 4 * 7)
I don't understand what's stopping anyone from implementing the body
of that macro to make it actually implement infix arithmetic.
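One possible body, as a sketch only: it folds the arguments strictly left to right and ignores operator precedence, which is an assumption made here to keep the example short. The helper name `infix->prefix` is made up.

```clojure
(defn infix->prefix
  "Turns (a op b op c ...) into nested prefix calls, left to right.
  No operator precedence is applied."
  [[x & more]]
  (reduce (fn [acc [op y]] (list op acc y))
          x
          (partition 2 more)))

(defmacro infix [& args]
  (infix->prefix args))

(macroexpand-1 '(infix 3 + 4 * 7))
;; => (* (+ 3 4) 7)
(infix 3 + 4 * 7)
;; => 49
```

Adding precedence would only change `infix->prefix`; the macro already receives the tokens as plain symbols and numbers because they are separated by spaces.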
—
Daniel Lyons
On Aug 14, 2009, at 12:48 AM, Daniel Lyons <fus...@storytotell.org>
wrote:
That particular example may not have to be a macro at all:
http://paste.lisp.org/display/75230
I think people want reader macros for a couple different reasons.
Sometimes it's just to remove parens from a function or macro call."
(sorry, finger slipped on the send button)
Another rather different reason is to implement features that would
otherwise require manually escaped strings as was mentioned earlier.
Perhaps these different desires can be fulfilled with two different
constructs.
--Chouser
Wonderful! :)
> I think people want reader macros for a couple different reasons.
> Sometimes it's just to remove parens from a function or macro call."
>
> (sorry, finger slipped on the send button)
>
> Another rather different reason is to implement features that would
> otherwise require manually escaped strings as was mentioned earlier.
Speaking personally, the only reader macro I can think of I would
actually use would be the units one. What makes that interesting is
that you have to modify what the reader does with a token that almost
already parses. I wouldn't be upset if I had to wrap an expression in
some syntax but the downside to string quoting is mainly the editor.
People don't seem to like putting raw SQL in their code either (I
don't agree but whatever).
> Perhaps these different desires can be fulfilled with two different
> constructs.
The two being:
1. To remove parens from a function or macro call. You mean e.g. #"
and #()?
2. To achieve DSLs that would screw up the reader, such as the units
one? Or is there a better example?
I halfway like the named readtable idea proposed by Richard Newman,
but I have to admit I still feel uneasy for some reason.
—
Daniel Lyons
Clojure spells this #_ instead of / and it is indeed
implemented as a (builtin) reader macro.
--Chouser
Right, though I guess I didn't state that very precisely.
For example, #() could be implemented as a regular macro if
you called it like (lambda ...) or something, or maybe ($ + 5 %)
Not as pretty, but it's a pretty minor syntactic difference.
Similarly #"" is pretty close to what the re-pattern
function does. One difference is that #"" compiles the
regex at read time while re-pattern compiles it at runtime. If
re-pattern were a macro that difference would essentially
disappear. However, another difference is that #"" has
different quoting rules than "" strings: #"foo\sbar" is more
pleasant than (re-pattern "foo\\sbar"), which leads us to
point 2:
> 2. To achieve DSLs that would screw up the reader, such as the units
> one? Or is there a better example?
Regex patterns are my favorite example. Why does Rich get
to let everyone embed pretty regex expressions, but if
I want to have my own regex-like mini-language, people will
have to go through string-escaping contortions to use it? :-)
Units is another fine example, as is infix notation without
tortured spacing: (infix 5*2) The reader gets upset about
that.
So in general 1 is in my opinion a fairly minor syntax
thing, while 2 could be somewhat alleviated if Clojure had
a string literal format that allowed un-escaped double
quotes and left backslashes unmolested. This would allow
things like (infix #'''5*2'''). Again, not pretty but
perhaps better than nothing.
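Until such a literal exists, the same idea can be tried with an ordinary string. This is a sketch under stated assumptions: `tokenize` and `infix-str` are made-up names, only non-negative integers and the four basic operators are recognised, and evaluation is left to right with no precedence.

```clojure
(defn tokenize
  "Splits a string like \"5*2\" into numbers and operator symbols."
  [s]
  (map read-string (re-seq #"\d+|[-+*/]" s)))

(defmacro infix-str
  "Expands a literal infix string into nested prefix calls at compile time."
  [s]
  (let [[x & more] (tokenize s)]
    (reduce (fn [acc [op y]] (list op acc y))
            x
            (partition 2 more))))

(infix-str "5*2")
;; => 10
(infix-str "1+2*3")
;; => 9, because evaluation is left to right
```

A plain string is enough here because this mini-language contains no double quotes or backslashes; the escaping contortions only appear once it does.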
One argument against wide-open user defined reader macros
that I don't think I've heard is that currently any .clj
file can be parsed without evaluating any of the code. It
can't be compiled without executing macros, but at least it
can be parsed. If code can define new reader behaviour,
this would no longer be true -- Clojure could be more like
Perl! http://www.perlmonks.org/?node_id=663393
--Chouser
In code, yes, it would disappear because you can shift the operation
into a later phase. In data, not so, because those later phases never
occur.
user=> (type (second (read-string "(foo #\"hello\")")))
java.util.regex.Pattern
user=> (type (second (read-string "(foo (re-pattern \"hello\"))")))
clojure.lang.PersistentList
user=> (type (eval (read-string "(re-pattern \"hello\")")))
java.util.regex.Pattern
This is to say, forms that have reader support can be treated as data
but still contain concrete objects without evaluation. Forms that do
not must be somehow evaluated.
When you have arbitrary forms arriving from user data -- such as a
query language -- it's really nice to avoid having to write a generic
tree walker...
We use these concrete reader-supported objects all the time, such as
when reading config files off-disk, taking advantage of Clojure's
literal reader support for maps, strings, numbers, etc.
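A small sketch of that use, with a made-up config string standing in for a file on disk and assuming the text is trusted:

```clojure
;; Hypothetical config contents; a real file would be read with slurp.
(def config-text
  "{:port 8080
    :hosts [\"alpha\" \"beta\"]
    :rate 1.5M
    :pattern #\"ab+\"}")

(def config (read-string config-text))

(:port config)                         ;; => 8080
(class (:rate config))                 ;; => java.math.BigDecimal
(class (:pattern config))              ;; => java.util.regex.Pattern
(re-matches (:pattern config) "abbb")  ;; => "abbb"
```

Nothing is evaluated: the BigDecimal and the Pattern are already concrete objects when `read-string` returns.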
Right now you can embed a regex literal, or a BigDecimal, in a literal
form. You can't embed a URI, or a complex number, without involving
explicit evaluation. I imagine that if Rich was building Semantic Web
tools, or working in complex mathematics, there would be syntactic
support for these things.
(This is not to gripe, or bitch, or say that Rich has made wrong
decisions. I'm just pointing out that extensibility allows you to
build the tools you need for your domain without waiting for the
language implementor. That's the touted advantage of Lisp...)
> So in general 1 is in my opinion a fairly minor syntax
> thing, while 2 could be somewhat alleviated if Clojure had
> a string literal format that allowed un-escaped double
> quotes and left backslashes unmolested. This would allow
> things like (infix #'''5*2'''). Again, not pretty but
> perhaps better than nothing.
That's probably convenient, but note that this supports your own regex-
like mini-language (;)) but requires the same contortions for
something like language annotations:
"chien"@fr
The point of reader macros is to offer the same extensibility
processing strings in the reader as programmers get from macros
processing forms at compile time.
Ultimately, disallowing reader macros puts the burden of syntactic
extension on the end user and the developer: they must come up with
some alternative, probably involving parsing strings at compile- or
run-time, and the user must use them.
That might be a good tradeoff if syntactic extension is uncommon --
Clojure is certainly very useful now, extensible reading carries a
cost, and I'm hardly one to say that aping Common Lisp is a great idea
-- but maybe syntactic extension is avoided because it has thus far
been impossible (Clojure) or unsafe (Common Lisp)?
> One argument against wide-open user defined reader macros
> that I don't think I've heard is that currently any .clj
> file can be parsed without evaluating any of the code. It
> can't be compiled without executing macros, but at least it
> can be parsed.
That's a good point, though it's only true to say "without evaluating
any of the code in user files". Clojure core reader macro code runs
all the time when reading; that's the point. I think it's fair to say
that reader macros are in a different category to 'ordinary' code, and
I wouldn't object to restrictions such as "no reader macros operating
in the same file or namespace".
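To see the core reader macros at work, read a few strings without evaluating them:

```clojure
;; Each result is plain data produced by a built-in reader macro.
(read-string "'x")
;; => (quote x)
(read-string "@a")
;; => (clojure.core/deref a)
(first (read-string "#(+ % 1)"))
;; => fn*
(read-string "#_ignored kept")
;; => kept
```

In every case the expansion happens inside the reader, before any compiler or evaluator sees the form.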
-R
> So in general 1 is in my opinion a fairly minor syntax
> thing, while 2 could be somewhat alleviated if Clojure had
> a string literal format that allowed un-escaped double
> quotes and left backslashes unmolested. This would allow
> things like (infix #'''5*2'''). Again, not pretty but
> perhaps better than nothing.
...and then someone wants to embed python source in a clojure file,
and comes calling on the clojure group to complain about the broken
triple-quoted string literal reader macro. ;-)
- Chas
On Aug 13, 2009, at 2:30 PM, Brian Hurt wrote:
I'm just wondering what people's response would be to allow user-generated reader macros. I'm not sure, but I think the only change to the clojure core that would be necessary in order to do this would be that in clojure/src/jvm/clojure/lang, LispReader.dispatchMacros would have to be made public. This would allow user code to update this array and add new macro functions.
Now, I can certainly see a lot of potential downsides to this. Redefining what #{} or #() means is just the start. But it'd make it a lot easier to do things with DSLs.
So, what are people's thoughts?
Trying to use them in Common Lisp has frustrated the crap out of me. The only library I know of that promulgates them seriously is CL-SQL and the gymnastics you have to do to install the reader macros are frustrating. Another library I tried to use I couldn't get to work at all (units, I think).
In general I don't think they're worth the trouble. Reducing code compatibility is also a problem. Perhaps with namespaces there could be a better way to do it than CL does but I'm very wary of them. I don't think Scheme lets you have them at all.
Reading != evaluation.
The form
{:foo (let [x 5] (fn [y] (+ x y)))}
contains no closures, no lambdas… nothing executable at all. It's a
single-pair map, where the only object is a PersistentList.
Try it:
user=> (:foo (read-string "{:foo (let [x 5] (fn [y] (+ x y)))}"))
(let [x 5] (fn [y] (+ x y)))
user=> (type *1)
clojure.lang.PersistentList
To get a function object or a closure you need eval:
user=> (eval *2)
#<user$eval__91$fn__93 user$eval__91$fn__93@60498b39>
user=> (*1 3)
8
A reader macro allows you to produce a data structure at read time
from the user input; the reader turns a stream of characters into a
data structure. It's a parser.
You could write your own parser to produce data structures from a
stream of user input, but all that allows you to do is support non-
Lispy syntax and throw more meaningful errors. If you use Clojure's
reader, simply check the data structure you get out the other end —
you probably have to do that anyway.
The reader *is* a parser for reading Clojure data structures from a
file or stdin. You'd be doing a lot of unnecessary work if you wrote
your own. Furthermore, extending the built-in reader means that your
work can be reused — your tests can use the same syntax, your code
can, other people can use your library to express byte arrays. I'd use
it for sure.
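A sketch of the "read, then check" approach for the byte-array case. The function name `read-bytes` and the choice of a vector of integers in 0..255 as the surface syntax are assumptions made for this illustration.

```clojure
(defn read-bytes
  "Reads a vector literal such as \"[1 2 255]\" with the standard reader,
  checks its shape, and returns a Java byte array."
  [s]
  (let [form (read-string s)]
    (when-not (and (vector? form)
                   (every? #(and (integer? %) (<= 0 % 255)) form))
      (throw (IllegalArgumentException. (str "Not a byte vector: " s))))
    ;; unchecked-byte wraps 128..255 into the signed byte range
    (byte-array (map unchecked-byte form))))

(vec (read-bytes "[1 2 255]"))
;; => [1 2 -1]
```

The reader does all the parsing; only the validation step is specific to byte arrays.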
[[
Sidenote:
A big motivation for reader macros is that, over the years, there have
been and will be hundreds of people like you who end up having to
write their own parser because they need just a tiny bit more
syntactic support. Individually it's easy to dismiss them -- "oh, it's
no big deal to write a custom parser for byte arrays", "oh, I can just
do this with a macro instead", "never mind, I didn't really need that
feature" -- but aggregated there's a lot of pointless work being done.
]]
> And second of all, the cost of allowing readers and rewriting the
> syntax of the language as it's parsed, is huge. I'd be imposing a
> huge and lasting cost on all clojure developers to save me a small
> cost right now.
Brian, I'm not sure that's the case, for two reasons:
1. Common Lisp implementations have supported reader macros for many
years, and their readers are very, very fast.
2. Clojure already has extensible reader macros — it simply doesn't
expose it to non-core code. I even hacked one in a couple of months
ago when I was experimenting with syntax support for multisets.
There should be no speed penalty for allowing user reader macros when
compared to core reader macros.
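For the curious, a sketch of what such a hack might look like. It reaches into the non-public `LispReader.dispatchMacros` array by reflection, so it is unsupported, changes the reader globally, and may break between Clojure versions. The helper name `register-dispatch-macro!` and the `#|` syntax are made up for this example.

```clojure
;; Unsupported hack, illustration only: relies on a private field.
(defn register-dispatch-macro!
  "Installs f as the reader handler for # followed by the character ch."
  [ch f]
  (let [field (doto (.getDeclaredField clojure.lang.LispReader "dispatchMacros")
                (.setAccessible true))
        table (.get field nil)]
    (aset ^objects table (int ch) f)))

;; Hypothetical syntax: #|form reads as (shout form).
;; The handler is variadic because the number of arguments the reader
;; passes differs between Clojure versions.
(register-dispatch-macro!
  \|
  (fn [^java.io.PushbackReader rdr _ch & _]
    (list 'shout (clojure.lang.LispReader/read rdr true nil true))))

(read-string "#|hello")
;; => (shout hello)
```

The handler is looked up in the same table as the built-in `#` dispatch macros, which is why no extra read-time cost is expected.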
-R
It would be nice if someone wrote a separate extension to clojure that
reads in a text file and does tokenization and manipulation of
said tokens (I'm thinking YACC, flex/bison sort of thing).
(Then you could substitute in clojure/clojure macros to make a 'syntax'
tree, and pass it to eval...).