[UNN] Feature Macros, not an alternative to Feature Expressions

Alan Dipert

Jan 22, 2015, 9:02:02 PM

to cloju...@googlegroups.com

Hi everyone,

Micha Niskin and I wish to share that we identified concretely an
aspect of the Feature Macros proposal that we think makes the whole
thing unsound. We unannounce it :-)

We couldn't have reached this conclusion without the vigorous argument
and discussion the Clojure community was kind enough to indulge us
with. We invite everyone involved to share in the joy that comes with
working an idea completely through. Go us!

The problem is one of composition. While operations being applied
under the usual Lisp eval rules receive arguments after evaluation,
macros applied under the usual Lisp macro expansion rules *do not*
receive their arguments after macro-expansion. It is a curious
asymmetry.

Because macros have the option of expanding their arguments, and
because most don't, we can't pass macros code containing other macros
and expect the same kind of inside-out composition we get with normal
Lisp. Consider the normal Lisp function, which can be ignorant of the
code contributing to the argument values it sees:

(+ 1 2)
(+ (inc 0) (* 2 1)).

In both cases, + sees an arg list of [1 2]. It doesn't know or care
what code represented its arguments prior to evaluation.

Macros aren't like this. We can't pass arguments to macros that
contain macro calls and expect the top-most macro to be ignorant of
how its arguments came to be.
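
The asymmetry can be seen at a REPL. This is a minimal sketch; the names show-fn-args and show-macro-args are made up for illustration and are not part of the proposal.

```clojure
;; A function receives its arguments after evaluation.
(defn show-fn-args [& args]
  (vec args))

;; A macro receives its arguments as unevaluated, unexpanded forms.
(defmacro show-macro-args [& args]
  (list 'quote (vec args)))

(show-fn-args (inc 0) (* 2 1))
;; => [1 2]

(show-macro-args (inc 0) (when true 1))
;; => [(inc 0) (when true 1)]
```

The inner `when` is itself a macro call, yet the outer macro sees it verbatim. Whether it is ever expanded is up to the outer macro.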

Consider the ns macro and the case-platform macro from our proposal
[1]. If ns macro-expanded its arguments, we could achieve FX-like
functionality like:

(ns example-portable-ns
  (:require (case-platform
              :clj clojure.core
              :cljs cljs.core)))

But ns doesn't macroexpand its args, and never will.

The workaround, which we applied in ignorance of the asymmetry as the
ns+ macro in the proposal, is to wrap. If you wrap an existing macro,
you have an opportunity to control the expansion of its arguments.
This means that for every macro you want to exhibit the new semantic,
you need to wrap it. This results in a code explosion problem (in the
form of wrapper macros) which is the same problem we're trying to
solve.
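
A rough sketch of the wrapping workaround follows. It is not the ns+ implementation from the proposal: case-platform here is a simplified stand-in that always picks the :clj branch, and requires / requires+ are hypothetical macros that only stand in for ns and its wrapper.

```clojure
;; Simplified stand-in for case-platform: always selects the :clj branch.
(defmacro case-platform [& {:keys [clj]}]
  clj)

;; Stand-in for a macro like ns: it returns its arguments unexpanded.
(defmacro requires [& libs]
  (list 'quote (vec libs)))

;; The wrapper expands each argument itself, then delegates to the original.
(defmacro requires+ [& libs]
  `(requires ~@(map macroexpand libs)))

(requires (case-platform :clj clojure.core :cljs cljs.core))
;; => [(case-platform :clj clojure.core :cljs cljs.core)]

(requires+ (case-platform :clj clojure.core :cljs cljs.core))
;; => [clojure.core]
```

Every macro that should honor case-platform needs its own `+` twin, which is the code explosion described above.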

The #+ and #- reader macros of Feature Expressions circumvent this
problem, because as reader constructs, they are the only
possibly-conditionalized thing preceding macro expand. The code
containing them cannot know that they exist, in the same way that a
macro which received its arguments expanded sees no macros. With
them, regular ns works fine, because it is ignorant of the
reader-level dispatch that preceded its expansion:

(ns example-portable-ns
  (:require #+clj clojure.core #+cljs cljs.core))

We're still not super-enthusiastic about FX as it stands, because
we're terrified by the prospect of losing code generation forever. We
think there might be alternatives to mitigate that, though, such as boxing
feature-read forms in a new special form with :+ and :- meta hung on
it. At least then we could generate and print things without
descending immediately into string munging.

We encourage you to think deeply and critically about FX too. We
tried to, and were rewarded by learning something awesome about Lisp.
Yes, FX was invented by geniuses in the beforetimes and is probably
good, but if we "cargo cult" without reasoning anew for ourselves why
it's good, we just might regret it.

We'd like to acknowledge Brandon Bloom, who imagined something similar
to the problem we describe on the Feature Expressions design page back
in 2013. [2] We also thank Colin Fleming, whose mention in IRC of a
"fear of an explosion of + macros" caused us to see ns+ in a new and
unsavory light.

Everything we mention was probably also known by somebody, but it was
hard to Google. If anyone knows any related references regarding the
weird missing macroexpand semantic, do send them our way.

Oh, and here is a prototype implementation of the Weird Semantic:
https://gist.github.com/alandipert/331885e36756e691f41a

Alan Dipert
Micha Niskin

1. https://github.com/feature-macros/clojurescript/tree/feature-macros/feature-macros-demo
2. http://dev.clojure.org/display/design/Feature+Expressions?focusedCommentId=6390065#comment-6390065

Alan Dipert

Jan 22, 2015, 11:58:46 PM

to cloju...@googlegroups.com

I forgot to acknowledge also Adrian, who responded to our thread yesterday with maybe the simplest formulation of the Feature Macro problem: macroexpansion works from the outside in.
Alan

James Reeves

Jan 23, 2015, 11:27:04 AM

to cloju...@googlegroups.com

Congratulations on your unannouncement. Being willing to try out new ideas is impressive; being able to accept and act upon criticism is even more so.

- James


Rich Hickey

Jan 23, 2015, 5:32:01 PM

to cloju...@googlegroups.com

Thanks for acknowledging the shortcomings, and prior work on the same idea.

I do think there are shortcomings to Common Lisp's #+, the primary being the one cited: it's not data. Without that, it is hard to make programs that generate feature-conditional programs, or ones that transform them.

That can easily be solved with:

1) a proper conditional-read form:

(#? ...)

and

2) a mode of the reader that does *not* do conditional read processing.

==
Another deficit of #+ is that each condition is floating around in the code, not visible as a single set of choices. The case-* of the feature-macro proposal was nice in organizing these choices.

I'm not sure about case-this vs case-that (i.e. features is a map with slots for this and that), or even 'case' as the right model (vs order-sensitive 'cond'), but the grouping is nice.

Setting aside features-as-map for a moment, given features-as-set, a form version could look like:

(#? :clj this :cljs that)

where #? reads as 'clojure.core/read-cond (or something).

Also interesting would be a splicing version ('clojure.core/read-cond-splicing):

(#?@ :clj [these-forms ...] :cljs [those-forms ...])

It's important that these be able to yield nothing, not nil, in order to be useful everywhere.

An obvious enhancement might be and/or/not boolean expressions:

(#? (or :clj :cljr) this
    :cljs that)

Clauses would be considered in order, first match wins.

:default can be reserved for supplying a default when no options are true.

(#? :clj this
    :cljs that
    :default 42)

If none true and no default, the form reads nothing (not nil, not error).
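
The selection rules above (clauses in order, first match wins, :default as fallback, and "nothing" when no clause matches) can be modeled as a plain function over data. This is only a sketch of the rules as described in this message, not reader code; feature-match? and select-branch are hypothetical names. The result is wrapped in a vector so that "reads nothing" (nil) stays distinct from a branch whose value is nil.

```clojure
(defn feature-match?
  "True when expr holds for the feature set. expr is a keyword or an
  (or ...), (and ...), or (not ...) list of such expressions."
  [features expr]
  (cond
    (keyword? expr) (or (= expr :default) (contains? features expr))
    (seq? expr)     (let [[op & args] expr]
                      (case op
                        or  (boolean (some #(feature-match? features %) args))
                        and (every? #(feature-match? features %) args)
                        not (not (feature-match? features (first args)))))
    :else           false))

(defn select-branch
  "Returns [form] for the first clause whose condition matches,
  or nil when the whole conditional should read as nothing."
  [features clauses]
  (some (fn [[expr form]]
          (when (feature-match? features expr)
            [form]))
        (partition 2 clauses)))

(select-branch #{:clj}  '[(or :clj :cljr) this :cljs that])  ;; => [this]
(select-branch #{:cljs} '[:clj this :cljs that :default 42]) ;; => [that]
(select-branch #{:cljr} '[:clj this :cljs that :default 42]) ;; => [42]
(select-branch #{:cljr} '[:clj this :cljs that])             ;; => nil
```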

'default-features' would still be available for macros to use.

====
Future possibilities

There was some talk about feature-macros being open, but in reality none of these things are very open. Either the feature is considered in the code or there is a default, there's no way to extend case/cond-like things without touching the code or having a true extension point.

Taking this idea further, we can say a final solitary (namespaced!) keyword can serve as a semantic label for extension:

(#? :clj this
    :cljs that
    ::whatever-extension)

Given this, some read-time mechanism can first be consulted to see if there is an entry for :your-lib/whatever-extension. If so, it is used as the value read from the form, else the form logic runs. Now if none of the conditions hold and no default it is an error, since there is a way to provide customization for your environment w/o changing the source.

This would also allow for shared, named snippets:

(#? :clojure.port/date-ctor)

There is a lot more infrastructure required for this extensibility, and it remains a future prospect.
====

For the present, Alex Miller will be coordinating prototyping of this mechanism ('read-conditionals' ?)

Feedback and help welcome.

Thanks,

Rich

Alex Miller

Jan 25, 2015, 10:48:11 PM

to cloju...@googlegroups.com

Quick update - Luke VanderHart built a prototype of this for Clojure over the weekend, which is attached at http://dev.clojure.org/jira/browse/CLJ-1424 as clj-1424-5.diff. I'll be doing some review of it Monday. (Thanks Luke!!)

One related piece of work that will be needed is a port of these changes to tools.reader (some of the code will already be present in the latest fx patch at http://dev.clojure.org/jira/browse/TRDR-14). If anyone is interested in helping out, I'd be happy to get a hand there! Let me know here so we're not duplicating effort.

The existing ClojureScript patch on CLJS-27 is probably the same, so that one should be ok.

My hope is that (pending feedback, which is welcome), we will have a complete prototype in the next day or two and can move towards inclusion in the next Clojure alpha (and tools.reader and ClojureScript).

Alex

Colin Fleming

Jan 26, 2015, 3:52:40 AM

to cloju...@googlegroups.com

So the plan is to move ahead with the #? and #?@ syntax?

al...@puredanger.com

Jan 26, 2015, 8:23:53 AM

to cloju...@googlegroups.com, cloju...@googlegroups.com

Well, as I said, pending feedback. Would love to have some!

Luke VanderHart

Jan 26, 2015, 10:13:26 AM

to cloju...@googlegroups.com

Hi Alex,

I'll be taking a crack at tools.reader today, probably.

Thanks,

-Luke

Chouser

Jan 26, 2015, 10:24:17 AM

to cloju...@googlegroups.com

Did I read the patch correctly, that :default, :none, and :else are all reserved but mean exactly the same thing? If that's correct, may I suggest we pick just one instead?

In my experience, synonyms for the same functionality provide little benefit, as everyone needs to know all the options anyway, in order to read each other's code. So it's just more things for people and tools to recognize, without providing any new meaning.

I'd recommend :default since that's already a special word in a couple other places in Clojure (tagged literals, and multimethods). But even if that's not chosen, I'd prefer any other single value over a set of three.

What do you think?

—Chouser

Alex Miller

Jan 26, 2015, 10:31:45 AM

to cloju...@googlegroups.com

Hey Chouser,

Rich wanted to reserve those for future possible meanings. I think I'd agree that we should pick one (:default seems right to me) as the canonical term. The others should probably throw errors for now.

Alex

Brent Millare

Jan 26, 2015, 11:01:35 AM

to cloju...@googlegroups.com

This sounds great but I'm still fuzzy on the basics. What's the purpose of "#? reads as 'clojure.core/read-cond"? Also, what shortcoming does "2) a mode of the reader that does *not* do conditional read processing." fix and how?

Alex Miller

Jan 26, 2015, 11:23:41 AM

to cloju...@googlegroups.com

On Mon, Jan 26, 2015 at 10:01 AM, Brent Millare <brent....@gmail.com> wrote:

This sounds great but I'm still fuzzy on the basics. What's the purpose of "#? reads as 'clojure.core/read-cond"?

Analogous to how #( ... ) reads as (fn [...] ...). It gives you a non-syntactic way to say the same thing as the dispatch macro.

Also, what shortcoming does "2) a mode of the reader that does *not* do conditional read processing." fix and how?

One problem people have identified with the former feature expressions proposal was that there was no way for someone to read and actually recover the original set of choices that were present in the text. This is useful for various kinds of tooling or where you want to do source transformation but maintain the feature conditionals. With this mode, you can get back the read form with the conditional and both branches in it.

Alex

Chouser

Jan 26, 2015, 1:22:25 PM

to cloju...@googlegroups.com

Ok, makes sense.

Also, is it intentional that reading (clojure.core/read-cond ...) does not behave the same as (#? ...)? That is, (#? ...) can be read as c.c/read-cond depending on read options, but having been read, if it is printed again it doesn't round-trip back to #?. This is different, for example, from how #(...) is read as (fn* [] (...)), which then retains its meaning.

—Chouser

Alex Miller

Jan 26, 2015, 3:20:18 PM

to cloju...@googlegroups.com

The intention is that clojure.core/read-cond can be read like #? (so while not identical in round-trip, they are at least semantically identical). There is a bug in the current patch in shouldReadConditionally() - should be .equals() instead of == for the symbol comparison.

After fixing that issue:

user=> (defn sr [s] (java.io.PushbackReader. (java.io.StringReader. s)))

user=> (read {} (sr "(#? :clj :x :default :y)"))

:x

user=> (read {:preserve-read-cond true} (sr "(#? :clj :x :default :y)"))

(clojure.core/read-cond :clj :x :default :y)

user=> (read {} (sr "(clojure.core/read-cond :clj :x :default :y)"))

:x
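
For comparison, here is a sketch of the same three behaviors using the reader options that shipped in Clojure 1.7 and later. Note the differences from the prototype session above: the released syntax is #?(...) rather than (#? ...), and the option is :read-cond with values :allow or :preserve rather than :preserve-read-cond. The var names in the sketch are illustrative.

```clojure
(def text "[#?(:clj :jvm :cljs :js) #?(:cljs :only-js)]")

;; :allow selects the matching branch; a conditional with no match reads as nothing.
(def selected (read-string {:read-cond :allow} text))
;; => [:jvm]

;; :preserve keeps every branch as a reader-conditional data object.
(def preserved (read-string {:read-cond :preserve} text))
;; preserved is a vector of two reader-conditional objects

;; :features adds custom features to the platform feature; clauses are tried in order.
(def custom (read-string {:read-cond :allow :features #{:my-feature}}
                         "#?(:my-feature 1 :clj 2)"))
;; => 1

;; Without a :read-cond option, reading a conditional throws.
(def no-opts-error
  (try
    (read-string "#?(:clj 1)")
    nil
    (catch RuntimeException e
      (.getMessage e))))
```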

Alex Miller

Jan 28, 2015, 10:23:52 AM

to cloju...@googlegroups.com

Forwarding some comments from Rich based on Luke's latest work in prototyping this. In particular, there has been discussion over how to represent tagged literals when reading conditionally in a non-chosen branch.

From Rich:

----------------

I think we should go with #?(…) and #?@(…), i.e. read macro outside of form.

So what do you get when reading #?(...) in suspend-conditional-read mode?

It immediately raises another question - what about #tagged literals occurring in read-conditionals? The whole point of #tagged literals is to support an extensible notion of data, but it often relies on context/platform-aware data readers, which might need to be different per branch of the read-conditional. They'll need to be skipped for non-winning branches when in normal (conditional-read) mode, the presumption being that a compatible set of handlers is installed (only) for the branch supported by the current features, and defaulted for the *whole* form in suspend-conditional-read mode.

In both cases the normal data-reader logic is skipped, and the form is read as a known data structure that will print again as what was read:

#some-tag some-form ===> x

(tagged-literal? x) -> true

(= (tagged-literal 'some-tag some-form) x) -> true

(:tag x) -> 'some-tag

(:form x) -> some-form

(pr x) -> #some-tag some-form ;; prints same as read

A similar approach can then be taken for reader conditionals when read in suspend-conditional-read mode:

#?(...) ==> x

(reader-conditional? x) -> true

(= (reader-conditional '(...)) x) -> true ;; only = when ... uses generic tagged literals

(:form x) -> (...)

(:splicing? x) -> false ;; true when #?@

(pr x) -> #?(...) ;; prints same as read

The data returned by these forms is defined by the above protocols, and the factory functions (tagged-literal and reader-conditional) can be used in code that needs to produce tagged literals and read-conditionals from scratch.

----------

The "known data structures" above for representing tagged literals and reader conditionals will be implemented as concrete Java classes with constructor, predicate, keyword accessors as seen in the code above. In general, these new instances would only be returned to the caller when reading in suspend-conditional-read mode. The tagged-literal and reader-conditional constructors would be useful for cases where these need to be defined without having the type available, so they can be more thoroughly treated as just data.

Luke is working on the patch for this.
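
As a sketch of how this data protocol looks in practice, the following assumes Clojure 1.7 or later, where tagged-literal, tagged-literal?, reader-conditional and reader-conditional? exist in clojure.core and the preserving mode is spelled {:read-cond :preserve} (the mode called suspend-conditional-read in this thread). The tag my/tag is made up and has no data reader installed.

```clojure
;; Read a conditional without choosing a branch; the unknown tag is kept as data.
(def rc (read-string {:read-cond :preserve}
                     "#?(:clj #my/tag [1 2] :cljs :none)"))

(reader-conditional? rc) ;; => true
(:splicing? rc)          ;; => false
(:form rc)               ;; the list (:clj #my/tag [1 2] :cljs :none)

;; The tagged literal inside the :clj branch is a generic data object.
(def tl (second (:form rc)))

(tagged-literal? tl) ;; => true
(:tag tl)            ;; => my/tag
(:form tl)           ;; => [1 2]

;; The factory functions build equal values from scratch.
(= tl (tagged-literal 'my/tag [1 2]))
;; => true
(= rc (reader-conditional (list :clj (tagged-literal 'my/tag [1 2]) :cljs :none)
                          false))
;; => true

;; Printing reproduces the text that was read.
(pr-str rc)
;; => "#?(:clj #my/tag [1 2] :cljs :none)"
```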

Chouser

Jan 28, 2015, 10:34:05 AM

to cloju...@googlegroups.com

Thanks very much for keeping us updated, Alex. Highly appreciated.

—Chouser

Chouser

Jan 28, 2015, 3:34:25 PM

to cloju...@googlegroups.com

Has there been any thought of splitting the functionality into two
functions, rather than adding a flag to 'read'?

One function, you could call it 'parse', could consume text and
produce the kind of objects described above as
suspend-conditional-read (including reader-conditional and
tagged-literal objects).

A second function, maybe called 'read-expand', would take those
objects as input, translate the reader conditionals and tagged
literals, and return what a regular 'read' does today.

Then 'read' could be defined as the composition of parse and
read-expand, and no flag would be necessary. Would that be strictly
simpler?

—Chouser

Rich Hickey

Jan 28, 2015, 4:00:46 PM

to cloju...@googlegroups.com

Doing this in two steps means two passes and tree-rewriting. suspend-conditional-read is not important enough to engender that overhead for normal read. Plus, there's only one substrate so we'll have a flag internally anyway.

Chouser

Jan 28, 2015, 4:46:04 PM

to cloju...@googlegroups.com

If both styles of read (with and without suspend-conditional-read)
consume text, I think that means tools that manipulate
reader-conditionals and then want to eval the results will round trip
back through text. That is, they will:
1. read with suspend-conditional-read on
2. do their tree-walking, manipulation, whatever
3. print the results out to text (byte-array/disk/etc.)
4. read that with suspend-conditional-read off
...and then will have data in the shape eval wants.

Did I get that right?
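
The four steps can be sketched with the reader options that shipped in Clojure 1.7 and later, where {:read-cond :preserve} plays the role of suspend-conditional-read on and {:read-cond :allow} the role of off. The add-default helper is a hypothetical manipulation chosen only to have something to do in step 2.

```clojure
(def source "[#?(:clj :jvm :cljs :js)]")

;; 1. Read with conditionals preserved as data.
(def tree (read-string {:read-cond :preserve} source))

;; 2. Manipulate: append a :default clause to every conditional in the vector.
(defn add-default [rc default]
  (reader-conditional (apply list (concat (:form rc) [:default default]))
                      (:splicing? rc)))

(def edited
  (mapv #(if (reader-conditional? %) (add-default % :unknown) %) tree))

;; 3. Print the result back out to text.
(def text (pr-str edited))
;; => "[#?(:clj :jvm :cljs :js :default :unknown)]"

;; 4. Read that text with conditional processing on.
(def program (read-string {:read-cond :allow} text))
;; => [:jvm]
```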

On an entirely unrelated note, has anyone serialized a tree such that
it can be run through a pipeline of transducers and efficiently built
into a tree again? :-)

—Chouser

Herwig Hochleitner

Jan 28, 2015, 9:37:40 PM

to cloju...@googlegroups.com

Ad. syntax proposal: I like how #? and #?@ are mirrors of `~ and `~@

Though, ~ normally occurs some levels down of `. Can something like this be done for ? and # .. what would this mean?

When (pr x) -> #tag form, then maybe (read {:preserve-read-cond true} ..) should be (read {:preserve true} ..) and also preserve white space and newlines, closing the gap to editors.

@ Alex, Rich: Do you think that giving :splicing? to regular reader tags and implementing #? and #?@ as such would be missing a critical API distinction? Invite people to mess with side effects in reader tags?

@ Chris: The idea of using a transducer on a tree intrigues me ;-). Do you think this can be efficient as in taking advantage of -XX:+DoEscapeAnalysis, thus avoiding to allocate intermediate trees on the heap? Also, how do you feel about "The compiler needs an entry-point distinct from read" vs "Conditional reading as well as Reader tags are not part of the compilation process"?

What is the story for going from (UUID/randomUUID) to (tagged-literal 'uuid "...")?

Currently it seems that this would be achieved by (read-string #{:preserve} (pr-str (UUID/randomUUID))).

Could anything be gained by splitting this out from print-method, e.g. *data-writers*?

Alex Miller

Jan 28, 2015, 11:03:33 PM

to cloju...@googlegroups.com

On Wed, Jan 28, 2015 at 8:37 PM, Herwig Hochleitner <hhochl...@gmail.com> wrote:

Ad. syntax proposal: I like how #? and #?@ are mirrors of `~ and `~@

Though, ~ normally occurs some levels down of `. Can something like this be done for ? and # .. what would this mean?

While they are intentionally reminiscent, they are different things and this is not meaningful.

When (pr x) -> #tag form, then maybe (read {:preserve-read-cond true} ..) should be (read {:preserve true} ..)

Why? Just to be shorter or are you getting at something else?

and also preserve white space and newlines, closing the gap to editors.

I don't think that's a goal here.

@ Alex, Rich: Do you think that giving :splicing? to regular reader tags and implementing #? and #?@ as such would be missing a critical API distinction? Invite people to mess with side effects in reader tags?

I don't think there's any reason to do that.

@ Chris: The idea of using a transducer on a tree intrigues me ;-). Do you think this can be efficient as in taking advantage of -XX:+DoEscapeAnalysis, thus avoiding to allocate intermediate trees on the heap? Also, how do you feel about "The compiler needs an entry-point distinct from read" vs "Conditional reading as well as Reader tags are not part of the compilation process"?

What is the story for going from (UUID/randomUUID) to (tagged-literal 'uuid "...")?

Currently it seems that this would be achieved by (read-string #{:preserve} (pr-str (UUID/randomUUID))).
Could anything be gained by splitting this out from print-method, e.g. *data-writers*?

I don't understand why you want to do that. You can leverage print-dup to print instances in a readable form, which is typically the tagged literal form right now. I don't see the benefit of using the tagged-literal constructor form for that but I could be missing something.

Herwig Hochleitner

Jan 29, 2015, 4:20:22 AM

to cloju...@googlegroups.com

2015-01-29 5:03 GMT+01:00 Alex Miller <al...@puredanger.com>:

When (pr x) -> #tag form, then maybe (read {:preserve-read-cond true} ..) should be (read {:preserve true} ..)

Why? Just to be shorter or are you getting at something else?

No. To me this is about: What is the canonical serialization of a suspended-read tree?

Rich seems to say: The text whence it came.

Remember, this will be the primary representation for emacs, intellij, ...

What other consumers is this aimed at?

@ Alex, Rich: Do you think that giving :splicing? to regular reader tags and implementing #? and #?@ as such would be missing a critical API distinction?

I don't think there's any reason to do that.

Is looking for simplexes to build this out of, a reason?

See, once there is :splicing?, people will try to use it for all kinds of purposes, not just feature expressions. Is discouraging such use part of the intended design, or a missed simplification?

What is the story for going from (UUID/randomUUID) to (tagged-literal 'uuid "...")?
Currently it seems that this would be achieved by (read-string #{:preserve} (pr-str (UUID/randomUUID))).
Could anything be gained by splitting this out from print-method, e.g. *data-writers*?

I don't understand why you want to do that.

Well, for a start, print does it. Having it fused to that makes it the only API.

You can leverage print-dup to print instances in a readable form,

I wouldn't call (comp read-string pr-str) instead of ?to-suspended? leverage.

Well, it is, but as a consumer, I've got the _shorter_ lever here. Again, this might either be unfortunate or intended as per 'canonical serialization' above.

which is typically the tagged literal form right now. I don't see the benefit of using the tagged-literal constructor form for that but I could be missing something.

Should a hypothetical structural editor based on codeq be saving tagged literals as literal strings starting with #, or work with the thing yielded by the tagged-literal constructor form?

Should it even be working directly with the runtime instances and only involve tagged literals when transforming source to be evaled in a foreign runtime? (<- I think not this)

Rich Hickey

Jan 29, 2015, 8:22:46 AM

to cloju...@googlegroups.com

Reader conditionals are not an evaluation feature. They are a reader feature. The reader reads text. That means they are *about* text. Basically they are a way to say "I want to write two substantially similar programs in one (text) file". Most tools and interpreters should be interested in one program, in one dialect, at a time, and are greatly simplified by not having to worry if they have been handed more than one :) Some tools need to manipulate files (text). Only those tools need to deal with this meta program.

Macros for this purpose do not work, for many reasons discussed at length. Were this an evaluation feature, it would need to be a phase of macroexpand that ran on code prior to its being sent to macro functions, and again prior to evaluation. And in both cases it would mean tree walking and rebuilding, potentially into nested data structures of arbitrary types, and likely not even possible given the rich, extensible set of types supported by Clojure and its reader.

Most important is this: *All of the branches is not a program in any dialect*

I don't see much need for many (any?) non-text tools to see all of the branches at once (as data). Because of the deeply nested contexts in which these can appear, direct interpretation would be quite convoluted and slow. And if you are writing a program that produces/transforms such a multi-program, you most likely are going to need to serialize it anyway (who will have more than one evaluator at this phase waiting for this as ephemeral data?).

Is this just a theoretical question or do people have particular tooling they envision that would not be well supported by this proposal?

In any case, if people want to write what you called 'read-expand' (really, 'find-one-program') over data returned by read with suspend-conditionals, they can. Doing the elision at the bottom is the efficient thing, and what the reader should do by default.
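
A minimal sketch of such a 'find-one-program' pass over preserved data, assuming Clojure 1.7 or later ({:read-cond :preserve}, reader-conditional?). The names pick-branch and find-one-program are hypothetical. The sketch only handles conditionals nested inside lists and vectors with plain keyword features; maps, sets, metadata and a conditional at the top level are left untouched.

```clojure
(defn pick-branch
  "Returns [form] for the first clause whose feature is in features
  (or is :default), or nil when no clause matches."
  [features clauses]
  (some (fn [[feature form]]
          (when (or (= feature :default) (contains? features feature))
            [form]))
        (partition 2 clauses)))

(defn find-one-program
  "Walks form, replacing each nested reader conditional with its chosen
  branch, splicing #?@ results, and dropping conditionals with no match."
  [features form]
  (letfn [(expand-children [coll]
            (mapcat (fn [child]
                      (if (reader-conditional? child)
                        (let [[chosen :as hit] (pick-branch features (:form child))]
                          (cond
                            (nil? hit)         []
                            (:splicing? child) (map #(find-one-program features %) chosen)
                            :else              [(find-one-program features chosen)]))
                        [(find-one-program features child)]))
                    coll))]
    (cond
      (seq? form)    (apply list (expand-children form))
      (vector? form) (vec (expand-children form))
      :else          form)))

(def preserved
  (read-string {:read-cond :preserve}
               "(foo #?(:clj 1 :cljs 2) [a #?@(:cljs [b c]) #?(:clj d)])"))

(find-one-program #{:cljs} preserved) ;; => (foo 2 [a b c])
(find-one-program #{:clj} preserved)  ;; => (foo 1 [a d])
```

This is the two-pass, tree-rebuilding approach, so it carries the overhead described above; it is only of interest to tools that already hold the preserved data.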

kovas boguta

Feb 3, 2015, 7:53:02 PM

to cloju...@googlegroups.com

On Wed, Jan 28, 2015 at 10:23 AM, Alex Miller <al...@puredanger.com> wrote:

From Rich:

It immediately raises another question - what about #tagged literals occurring in read-conditionals?

The "known data structures" above for representing tagged literals and reader conditionals will be implemented as concrete Java classes with constructor, predicate, keyword accessors as seen in the code above. In general, these new instances would only be returned to the caller when reading in suspend-conditional-read mode. The tagged-literal and reader-conditional constructors would be useful for cases where these need to be defined without having the type available, so they can be more thoroughly treated as just data.

I really like the new proposal overall.

I'd love to see *default-data-reader-fn* bound to a constructor for this canonical representation for uninterpreted tagged literals.

The current behavior, throwing an error, is not very useful, and it's very, very hard to imagine programs depending on it.

I know, not impossible, but I find it difficult to rely on tagged literals because inevitably they pass through tooling or other programs that have not set *default-data-reader-fn* to something generic. It feels dangerous, and I have been bitten many times. I've more or less given up on user-defined TL's within source code for this reason.

It's certainly consistent with the EDN spec to do so, and I wonder if the default behavior would have been what I suggest if this canonical representation were available earlier.
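
A sketch of what that opt-in looks like, assuming Clojure 1.7 or later, where clojure.core/tagged-literal takes a tag and a form and so has the same two-argument shape as a *default-data-reader-fn*. The tag my/tag is made up and has no reader installed.

```clojure
(def text "#my/tag {:a 1}")

;; Default behavior: an unknown tag makes the reader throw.
(def default-result
  (try
    (read-string text)
    (catch Exception e
      ::threw)))

;; Opting in: unknown tags are read as generic tagged-literal data.
(def preserved
  (binding [*default-data-reader-fn* tagged-literal]
    (read-string text)))

(tagged-literal? preserved) ;; => true
(:tag preserved)            ;; => my/tag
(:form preserved)           ;; => {:a 1}
(pr-str preserved)          ;; => "#my/tag {:a 1}"
```

Because the value prints the same way it was read, a tool that does not understand the tag can still pass it along unchanged.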

Alex Miller

Feb 4, 2015, 8:45:54 AM

to cloju...@googlegroups.com

On Tue, Feb 3, 2015 at 6:53 PM, kovas boguta <kovas....@gmail.com> wrote:

I'd love to see *default-data-reader-fn* bound to a constructor for this canonical representation for uninterpreted tagged literals.

We actually had exactly this conversation last week. :) I think Rich did not want to change default behavior but agree it's likely that this would have been the default behavior had this capability existed earlier. Certainly worth considering more.

Alex

kovas boguta

Feb 4, 2015, 2:36:39 PM

to cloju...@googlegroups.com

Not changing existing behavior is a good default.

However, I suspect there might be literally 0 programs that depend on this behavior. It's hard to contrive a use case where it could be fashioned to do something useful.

The cost of the existing behavior is that it is quite easy and common to create broken programs and processing pipelines. This undermines the value proposition of TLs and 'language of the system' type thinking, and leads to the perception that using TL's is not worth the hassle.

An example of this pain is ClojureScript. Getting user-defined TL's into ClojureScript source requires making all the Clojure-based ClojureScript tooling do the right thing. This is non-trivial, frustrating work.

I'm not gonna go to the mat over this question, but I hope it does get considered more :)

Steve Miner

Feb 4, 2015, 2:45:29 PM

to cloju...@googlegroups.com

[I’m changing the subject as my remarks don’t have much to do with feature expressions.]

For anyone interested in the history of *default-data-reader-fn*, you can look up CLJ-927 and related discussions on the dev mailing list. I’ll just say there was not a consensus about the best default, but the important thing was to provide an option for the programmer to take control of unknown tags. Throwing on an unknown tag had already been established in the previous release so keeping that behavior was the conservative approach. And to be fair, it is possible that people depended on it, or at least were thankful that casual testing found unintentionally unknown tags (due to typos or other mistakes.)

Regarding the problem of mixing libraries and unknown tags: It gets tricky if you're trying to handle a whole class (I use the term loosely) of potentially unknown tags. As a library author, I can tell you to use my fancy function as your *default-data-reader-fn*, but what do you do if the other library has the same idea? Who's in charge? It should be the user, not the libraries.

I wrote a little library called Tagged [1] that’s mostly concerned with treating Clojure Records as EDN tagged literals, but it also provides some utilities for authoring data-readers.

[1] https://github.com/miner/tagged

I just updated the README to explain how I use what I call the tag-reader convention:

I use the term tag-reader to describe a function taking two args, the tag symbol and a value, like a *default-data-reader-fn*. Unlike a data-reader, a tag-reader may return nil if it does not want to handle a particular value. (See CLJ-1138 for more information about why a data-reader is not allowed to return nil.) The tag-reader convention makes it simpler to compose multiple tag-reader functions using `some-tag-reader`. You can wrap one or more tag-readers to create a data-reader with `data-reader`. The `throw-tag-reader` always throws so it's appropriate to use as your last resort tag-reader.

I've found it convenient to provide appropriate tag-readers in my libraries and let the user create his own *default-data-reader-fn* (or :default option for clojure.edn/read) by composing those tag-readers.
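
The convention can be illustrated without the Tagged library itself. The sketch below does not use or reproduce that library's API; compose-tag-readers, point-reader and generic-reader are hypothetical names, and it assumes Clojure 1.7 or later for tagged-literal. A tag-reader that legitimately returns false or nil as a value would be skipped by this simple composition.

```clojure
(require '[clojure.edn :as edn])

(defn compose-tag-readers
  "Returns a two-arg function that tries each tag-reader in order and
  uses the first non-nil result, throwing if none handles the tag."
  [& tag-readers]
  (fn [tag value]
    (or (some #(% tag value) tag-readers)
        (throw (ex-info "No reader for tag" {:tag tag :value value})))))

;; A library-provided tag-reader: handles one tag, returns nil otherwise.
(defn point-reader [tag value]
  (when (= tag 'demo/point)
    {:point value}))

;; A catch-all chosen by the user: keep anything else as generic data.
(defn generic-reader [tag value]
  (tagged-literal tag value))

;; The user, not the libraries, decides the order and the last resort.
(def read-tag (compose-tag-readers point-reader generic-reader))

(edn/read-string {:default read-tag} "#demo/point [1 2]")
;; => {:point [1 2]}

(edn/read-string {:default read-tag} "#other/thing 42")
;; => a tagged literal that prints as #other/thing 42
```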

Steve Miner
steve...@gmail.com

kovas boguta

Feb 4, 2015, 3:23:09 PM

to cloju...@googlegroups.com

On Wed, Feb 4, 2015 at 2:45 PM, Steve Miner <steve...@gmail.com> wrote:

Regarding the problem of mixing libraries and unknown tags: It gets tricky if you're trying to handle a whole class (I use the term loosely) of potentially unknown tags. As a library author, I can tell you to use my fancy function as your *default-data-reader-fn*, but what do you do if the other library has the same idea? Who's in charge? It should be the user, not the libraries.

The problem is TLs need to move across applications. This is their main point - to serve as an extensible, semantically rich data format. The default behavior thwarts this.

I can easily control my own application, but getting inside other tools and systems is difficult or impossible. Getting the old cljs-build lein plugin, or various cljs repls to work with TLs was a nightmare.

In any case, the *default-data-reader-fn* still exists in my proposal, and the user is still in charge, and no libraries are making new decisions for the user in my proposal.

Brandon Bloom

Feb 5, 2015, 11:28:50 AM

to cloju...@googlegroups.com

This is their main point - to serve as an extensible, semantically rich data format. The default behavior thwarts this.

For what it's worth, this matches my experience.

If you do (defrecord Tagged [tag value]) yourself, now you've got a multi-standard problem in which no two pipeline processors can communicate unless they were written to the same Tagged type (read: written by the same author).

At minimum, it would be nice to have a Tagged type in the core language, because then a trivial default-data-reader can actually span the gap.

Even if the default behavior doesn't change outright, it could be readily normalized with a single flag.
