Bug in (keys/seq (IdentityHashMap. ...))?


Jason Wolfe

Nov 1, 2011, 2:54:21 PM
to Clojure Dev
Not sure if this is an issue with keys, IdentityHashMap, or some
interaction, but it's definitely not what I expected:

user> *clojure-version*
{:major 1, :minor 4, :incremental 0, :qualifier "alpha1"}
user> (def x (keys (java.util.IdentityHashMap. {:a true :b true})))
#'user/x
user> x
(:b :a)
user> x
(:a :a)

Should I create a ticket?

-Jason

Stuart Halloway

Nov 2, 2011, 8:18:00 AM
to cloju...@googlegroups.com

Clojure builds seqs over iterators to provide a stable, immutable view of them. But the nastiness of IdentityHashMap runs deeper: it isn't just the iterator that is mutable, the thing *returned* by the iterator is mutable too. (In fact, the iterator returns itself.)

So the process of consuming the iterator wrecks the objects returned by the iterator.
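
To make the hazard concrete, here is a minimal Java sketch of a defensive copy: each entry's key and value are copied into a fresh immutable entry the moment next() returns it, so a later call to next() cannot change what was already collected. This is only an illustration, not what Clojure does; the class name EntrySnapshot and the helper snapshotEntries are hypothetical, and the copy costs one allocation per entry.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class EntrySnapshot {
    // Hypothetical helper (not part of Clojure or the JDK): copies key and
    // value out of each entry before the iterator advances again, so the
    // result never aliases an object that the iterator may reuse.
    static <K, V> List<Map.Entry<K, V>> snapshotEntries(Map<K, V> m) {
        List<Map.Entry<K, V>> out = new ArrayList<>();
        for (Map.Entry<K, V> e : m.entrySet()) {
            out.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Boolean> m = new IdentityHashMap<>();
        m.put("a", true);
        m.put("b", true);
        // Iteration order of an IdentityHashMap is unspecified.
        for (Map.Entry<String, Boolean> e : snapshotEntries(m)) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Because the copies are made while each entry is still current, the list holds one distinct entry per key regardless of whether the underlying iterator reuses its entry object.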

To patch this up on the Clojure side would require special casing for IdentityHashMap (and more special casing for any other classes that behave like this).

I am tempted to lay the blame with IdentityHashMap, document, and move on. What do others think?

Stu


Rich Hickey

Nov 2, 2011, 8:24:24 AM
to cloju...@googlegroups.com

Did you draw your conclusions from looking at docs/behavior/code?

Pointers please.

Thanks,

Rich


Stuart Halloway

Nov 2, 2011, 8:37:51 AM
to cloju...@googlegroups.com

Behavior, code, and comments:

Behavior:

(-> (java.util.IdentityHashMap. {:a true :b true})
    (.entrySet)
    (.iterator)
    (.next))
=> #<EntryIterator :a=true> ;; returns the iterator! yikes!

I don't have a link into the JDK source (will look after I hit send), but the OpenJDK source has the following comment above the EntryIterator nested class of IdentityHashMap:

/**
* Since we don't use Entry objects, we use the Iterator
* itself as an entry.
*/

The behavior is *not* documented in the javadoc http://download.oracle.com/javase/6/docs/api/java/util/IdentityHashMap.html.

Stu

Jason Wolfe

Nov 2, 2011, 2:02:41 PM
to Clojure Dev
I couldn't find much in the Javadocs either (Iterators don't seem to
promise much about their return values either way). This stack
overflow answer may be of use in describing the actual state of
affairs, however:
http://stackoverflow.com/questions/5455824/doesnt-that-iterate-thru-entryset-create-too-many-map-entry-instances/5455903#5455903.

It's also the case that if keys and vals used iterators on .keySet
and .values rather than on .entrySet they would avoid this problem,
perhaps be more performant, and also be robust to changes in the
underlying map (although this is still of no help to seq, obviously):

user> (def h (java.util.HashMap. {1 2 3 4}))
#'user/h
user> (def x (vals h))
#'user/x
user> x
(2 4)
user> (.put h 1 5)
2
user> x
(5 4)

(this behavior is documented in the javadoc:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/Map.Entry.html,
although I suppose it's not much more surprising than when you make a
seq of an array and then mutate it.)
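
The same write-through effect can be sketched directly in Java. The example below is illustrative only: the class EntryViewDemo and its helpers liveEntries and copiedValues are hypothetical names. It also assumes an eager copy out of .values(), which is a stand-in rather than how keys and vals are actually implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EntryViewDemo {
    // Keeps references to the map's own Map.Entry objects, much as a seq
    // built over entrySet does.
    static <K, V> List<Map.Entry<K, V>> liveEntries(Map<K, V> m) {
        return new ArrayList<>(m.entrySet());
    }

    // Assumption for illustration: copies the values out of m.values()
    // immediately, so later puts cannot affect the result.
    static <K, V> List<V> copiedValues(Map<K, V> m) {
        return new ArrayList<>(m.values());
    }

    public static void main(String[] args) {
        Map<Integer, Integer> h = new HashMap<>();
        h.put(1, 2);
        h.put(3, 4);

        List<Map.Entry<Integer, Integer>> entries = liveEntries(h);
        List<Integer> values = copiedValues(h);

        h.put(1, 5); // overwrite the value for an existing key

        for (Map.Entry<Integer, Integer> e : entries) {
            System.out.println("live entry: " + e.getKey() + " -> " + e.getValue());
        }
        System.out.println("copied values: " + values);
    }
}
```

After the put, the retained entry for key 1 reports the new value 5, while the list copied from .values() still holds the old value 2.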

Thanks,
Jason

Jason Wolfe

Nov 11, 2011, 4:20:10 PM
to Clojure Dev
I made a ticket [1].

This was a particularly nasty bug to track down, so if it is
possible to fix it or clearly document it, that could be a significant
help to others.

Perhaps the most relevant javadoc quote: "These Map.Entry objects are
valid *only* for the duration of the iteration; more formally, the
behavior of a map entry is undefined if the backing map has been
modified after the entry was returned by the iterator, except through
the iterator's own remove operation, or through the setValue operation
on a map entry returned by the iterator." [2].

This description is especially nasty, since (1) the first half of the
statement is ambiguous as to whether "duration of the iteration"
refers to the time between each call to .next(), or the time until the
entire iterator is consumed (and IdentityHashMap seems to satisfy the
former description but not the latter), and (2) the first and second
half of the statement seem to be saying quite different things, where
in my limited understanding Clojure's behavior is okay with respect to
the latter "more formally" description but not the former "duration of
the iteration" description.

-Jason

[1] http://dev.clojure.org/jira/browse/CLJ-875
[2] http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Map.Entry.html
