Is this behavior of clojure.core/pr a bug?

Blake Miller

Aug 3, 2016, 7:37:47 PM
to Clojure
I have tried this with Clojure 1.7.0, 1.8.0 and 1.9.0-alpha10

(clojure.core/read-string (clojure.core/with-out-str (clojure.core/pr (clojure.core/keyword "A valid keyword")))) ;; => :A

This just seems wrong. It's valid to have an instance of clojure.lang.Keyword with a space in its name.

(clojure.core/with-out-str (clojure.core/pr (clojure.core/keyword "A valid keyword"))) => ":A valid keyword"


So, it seems like clojure.core/pr and clojure.core/read-string disagree about EDN.

Is EDN formally specified? https://github.com/edn-format/edn/issues/56 seems to suggest it is not.

I ran into this problem using ptaoussanis/sente to pass EDN over a websocket. The EDN contained a keyword with a space in it; the Clojure (JVM) part of sente had no problem serializing it, but the ClojureScript part of sente barfed on it. I thought it was a bug in sente, but sente simply calls clojure.core/pr to do the serialization, so I played with pr vs read-string and found that they disagree.

The serialization that clojure.core/pr does on a keyword with a space in it seems broken to me:

user> (clojure.core/with-out-str (clojure.core/pr {:onekey 1
                                                   (clojure.core/keyword "two key") 2}))
"{:onekey 1, :two key 2}"

There doesn't seem to be any way to parse that unambiguously.
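
A quick way to see the ambiguity (a minimal sketch, not part of the original post; the two vectors are made-up examples): two different values print to exactly the same text, so no reader could tell them apart afterwards.

```clojure
;; A one-element vector holding a keyword whose name contains a space...
(def one-keyword [(keyword "a b")])
;; ...and a two-element vector holding the keyword :a and the symbol b.
(def keyword-and-symbol [:a 'b])

(pr-str one-keyword)        ;; => "[:a b]"
(pr-str keyword-and-symbol) ;; => "[:a b]"

;; Different values, identical printed form.
(= one-keyword keyword-and-symbol)                   ;; => false
(= (pr-str one-keyword) (pr-str keyword-and-symbol)) ;; => true
```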

I think this is a bug. What do you think?

https://github.com/ptaoussanis/sente/issues/251

Timothy Baldridge

Aug 3, 2016, 7:41:11 PM
to clo...@googlegroups.com
I highly suggest using transit. It's much faster and formally specified. https://github.com/cognitect/transit-format

It's issues like this that caused the creation of transit in the first place.
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Blake Miller

Aug 3, 2016, 7:44:17 PM
to Clojure
The docstring of clojure.core/pr

https://github.com/clojure/clojure/blob/clojure-1.7.0/src/clj/clojure/core.clj#L3552-L3555

actually says (in lieu of a formal EDN specification?)

"pr and prn print in a way that objects can be read by the reader"

...and the example I showed appears to violate that. Here's a minimal failing case:

user> (read-string (with-out-str (pr {(clojure.core/keyword "key word") 1})) )
RuntimeException Map literal must contain an even number of forms  clojure.lang.Util.runtimeException (Util.java:221)

Blake Miller

Aug 3, 2016, 7:45:32 PM
to Clojure
Thanks, Timothy. I'll give transit a try.

Sean Corfield

Aug 3, 2016, 8:16:47 PM
to Clojure Mailing List

You can programmatically create keywords that are illegal as literals, i.e., will not be accepted by the reader.

 

This is not a fault of clojure.core/pr – if it is given a value that uses legal (readable) keywords, its result will indeed be readable by clojure.core/read-string.

 

You can also programmatically create symbols that are illegal as far as the reader is concerned.
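
The symbol case can be sketched the same way as the keyword case (an illustrative example, not from the original message; `odd-symbol` is a made-up name):

```clojure
;; symbol accepts any string, including one the reader would never produce.
(def odd-symbol (symbol "foo bar"))

(pr-str odd-symbol)               ;; => "foo bar"
(read-string (pr-str odd-symbol)) ;; => foo  (read-string stops after the first form)
(= odd-symbol (read-string (pr-str odd-symbol))) ;; => false

;; A symbol that is legal as a literal round-trips as expected.
(= 'foo-bar (read-string (pr-str 'foo-bar)))     ;; => true
```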

 

Sean Corfield -- (970) FOR-SEAN -- (904) 302-SEAN
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

Blake Miller

Aug 3, 2016, 8:27:41 PM
to Clojure
Thanks for that concise explanation, Sean. It makes sense to me that not all valid Clojure data is serializable.

There's still something about this that doesn't quite make sense to me, though:

Rather than throwing an exception when asked to serialize an instance of clojure.lang.Keyword that cannot be serialized, clojure.core/pr simply produces bad output. Bad = the reader will either throw or silently read back something different.

Wouldn't it be preferable for pr to throw in this case?

The way I found out about this was the not-very-informative exception "Map literal must contain an even number of forms", because pr was fine with making a string that the reader wouldn't accept.

Can anyone think of a good reason why pr should *not* throw an exception on

(pr (keyword "foo bar"))

since there's no way of expressing that keyword as valid EDN?

Dan Burton

Aug 3, 2016, 8:57:12 PM
to clo...@googlegroups.com
Why not just have #keyword and #symbol reader syntax tags that pr could produce for situations like this? It's really preferable that pr not throw exceptions, but it's also quite an abomination that it silently produces bad edn.
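
Something along these lines can be approximated in user code today with tagged literals, without touching pr. The following is only a rough sketch under stated assumptions: the tag `my/keyword` and the helpers `readable-keyword?`, `tag-odd-keywords`, `write-edn` and `read-edn` are hypothetical names invented for this example (the edn spec reserves unprefixed tags such as `#keyword` for edn itself, so a namespaced tag is used), and only keywords are handled, not symbols.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.walk :as walk])

(defn readable-keyword?
  "True if k prints to text that clojure.edn reads back as the same keyword."
  [k]
  (try
    (= k (edn/read-string (pr-str k)))
    (catch Exception _ false)))

(defn tag-odd-keywords
  "Replaces every keyword that would not survive a print/read round trip
  with a tagged literal carrying the keyword's text as a string."
  [data]
  (walk/postwalk
    (fn [x]
      (if (and (keyword? x) (not (readable-keyword? x)))
        ;; (str :foo) is ":foo", so drop the leading colon.
        (tagged-literal 'my/keyword (subs (str x) 1))
        x))
    data))

(defn write-edn [data]
  (pr-str (tag-odd-keywords data)))

(defn read-edn [s]
  ;; The reader function turns the tagged string back into a keyword.
  (edn/read-string {:readers {'my/keyword keyword}} s))

(def original {:onekey 1, (keyword "two key") 2})

(= original (read-edn (write-edn original))) ;; => true
```

Both sides of the wire have to agree on the tag, which is the main cost of this approach compared with a change to pr itself.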




--
-- Dan Burton

Blake Miller

Aug 3, 2016, 9:14:16 PM
to Clojure
You're right, Dan. Having mulled it over a little more, it's not clear to me why there ought to be any pure Clojure data (no Java objects) that cannot be serialized as EDN. Emitting a #keyword reader literal for this edge case would make sense to me.

Blake Miller

Aug 3, 2016, 9:16:02 PM
to Clojure
Er, I mean "built-in reader macro dispatch".

Daniel Compton

Aug 4, 2016, 5:21:26 AM
to Clojure
> Can anyone think of a good reason why pr should *not* throw an exception on
> (pr (keyword "foo bar"))
> since there’s no way of expressing that keyword as valid EDN?

This would break backwards compatibility, something Clojure rarely does.

--
Daniel

Herwig Hochleitner

Aug 4, 2016, 5:56:45 AM
to clo...@googlegroups.com
2016-08-04 1:41 GMT+02:00 Timothy Baldridge <tbald...@gmail.com>:
> I highly suggest using transit. It's much faster and formally specified. https://github.com/cognitect/transit-format
>
> It's issues like this that caused the creation of transit in the first place.

 I thought transit was created to take advantage of fast JSON parsers. I've never understood it as a "more correct" edn. People reasonably expect transit (and fressian) to be on par with edn.

Are you suggesting we abandon edn, except for hand-written data?

Timothy Baldridge

Aug 4, 2016, 7:57:02 AM
to clo...@googlegroups.com
The problem is that many do not understand that Clojure data is a superset of EDN. The two were never meant to be completely compatible. There are many things, especially when dealing with keywords and symbols, where it's possible to have data that doesn't properly round-trip.

An added problem when dealing with EDN is that there are really only one or two languages that properly parse it: Clojure and ClojureScript. So it's also a poor choice to use in cases where you desire any sort of interop.

On top of all that, EDN parsing is really slow compared to other approaches, so you have a lot of compelling reasons to, as Herwig put it, "abandon edn, except for hand-written data".

And yes, the original problem that caused the creation of Transit was "how do we get data from language A to language B while still staying fast, not implementing a ton of code, and keeping rich data (dates should be dates, not strings)."


Herwig Hochleitner

Aug 4, 2016, 10:23:56 AM
to clo...@googlegroups.com
2016-08-04 13:56 GMT+02:00 Timothy Baldridge <tbald...@gmail.com>:
> The problem is that many do not understand that Clojure data is a superset of EDN. The two were never meant to be completely compatible. There are many things, especially when dealing with keywords and symbols, where it's possible to have data that doesn't properly round-trip.

Then fressian and transit are supersets of edn as well. Are those, at least, meant to be the same set as clojure data?
Also, reader tags are a fantastic opportunity to make arbitrary data round-trippable.

> An added problem when dealing with EDN is that there are really only one or two languages that properly parse it: Clojure and ClojureScript. So it's also a poor choice to use in cases where you desire any sort of interop.

There are many edn libraries for various languages: https://github.com/edn-format/edn/wiki/Implementations
It is true that there is a lack of compatibility, especially in the handling of symbols and keywords, and the community is hurting for it (I remember a couple of tedious discussions on the matter).


> On top of all that, EDN parsing is really slow compared to other approaches, so you have a lot of compelling reasons to, as Herwig put it, "abandon edn, except for hand-written data".

My view is that those reasons should be eliminated, starting with interoperability concerns. I still think edn is a fantastic idea, and to me it still holds the promise of being a replacement for json and xml, but only if we can get our act together and develop it towards that goal.

Please note that my "except for hand-written data" was meant to be hyperbole. All data is eventually machine-written.

Abandoning edn would send a fatal signal not just to people in the community. Especially if we let it slowly die instead of declaring it a failed experiment in data exchange.

Imagine if pr didn't handle embedded " quotes in strings and the unofficial recommendation were to just avoid that use case or use a different encoding.

> And yes, the original problem that caused the creation of Transit was "how do we get data from language A to language B while still staying fast, not implementing a ton of code, and keeping rich data (dates should be dates, not strings)."

I like the idea of having various encodings for different uses, but we should strive towards compatibility.

Blake Miller

Aug 5, 2016, 9:42:38 PM
to clo...@googlegroups.com
I agree with Herwig in principle ... even though EDN is not meant to cover the whole set of possible pure Clojure data, if it can be made to cover more (all other things being equal) that would be a Good Thing.

I think it would be possible to fix these edge cases with reader macro dispatches without breaking compatibility. The major snag though is that performance would suffer ... every single keyword or symbol being `pr`d would have to be tested, even those in the vast majority that don't need to be emitted in any special way. So my conclusion is it's not worth trying ...

It sucks that the docstring for pr https://github.com/clojure/clojure/blob/clojure-1.7.0/src/clj/clojure/core.clj#L3552-L3555 fails to mention that the function may succeed and produce a string that the reader will barf on, but I think we're pretty much stuck with it.

For posterity: I switched to using Transit for the Clojure(Script) app that had me run across this issue.


Stuart Halloway

Aug 6, 2016, 1:10:38 PM
to clo...@googlegroups.com
Has anybody written a pr-edn that enforces the rules of https://github.com/edn-format/edn, refusing to print anything nonconformant?

This would solve the original problem on this thread without requiring any changes to Clojure.
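
A rough sketch of what such a function might look like, under stated assumptions: `pr-edn-strict` and `round-trips?` are hypothetical names, not an existing library, and the check is only "does clojure.edn read this keyword or symbol back as an equal value", which is weaker than enforcing every rule of the edn spec. Other non-edn values (records, regexes, Java objects) are not checked here.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.walk :as walk])

(defn round-trips?
  "True if x prints to text that clojure.edn reads back as an equal value."
  [x]
  (try
    (= x (edn/read-string (pr-str x)))
    (catch Exception _ false)))

(defn pr-edn-strict
  "Like pr-str, but throws instead of emitting a keyword or symbol
  that would not read back as itself."
  [data]
  ;; Visit every nested value; the walk is only used for its checks.
  (walk/postwalk
    (fn [x]
      (when (and (or (keyword? x) (symbol? x))
                 (not (round-trips? x)))
        (throw (ex-info "Value cannot be printed as readable EDN" {:value x})))
      x)
    data)
  (pr-str data))

(pr-edn-strict {:onekey 1})               ;; => "{:onekey 1}"
;; (pr-edn-strict {(keyword "two key") 2}) ;; throws ExceptionInfo
```

The round-trip test per keyword or symbol is exactly the per-value cost discussed earlier in the thread, so a wrapper like this trades printing speed for an early, descriptive error.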


Andy Fingerhut

Aug 6, 2016, 2:46:30 PM
to clo...@googlegroups.com
The user-provided examples and comments at ClojureDocs.org are not official Clojure documentation, but in many cases contain useful additional info.

I have just added examples to clojure.core/pr similar to those that motivated this thread, and Timothy Baldridge's suggestion to use transit in cases where such Clojure data may be present.



Andy

