Peculiar transients behaviour


Peter Taoussanis

Aug 24, 2011, 2:22:06 AM
to Clojure
Hi all,

I've just run into what appears to be either some subtle transients
behaviour I'm misunderstanding, or a bug (I'm running the 1.3 beta).

Given the function:

(defn test-fn1
  [map-context1]
  (persistent!
   (reduce
    (fn [map-context2 key]
      (when-not (get map-context2 key) (throw (Exception. "Woops!")))
      (dissoc! map-context2 key))
    (transient map-context1)
    (keys map-context1))))

the call

(test-fn1
 (zipmap (repeatedly 1000 #(rand))
         (repeatedly 1000 #(rand))))

operates as I would expect, and returns the empty map {}.


Running against a larger input map, however:

(test-fn1
 (zipmap (repeatedly 10000 #(rand))
         (repeatedly 10000 #(rand))))

and the exception starts triggering.


Now I thought maybe this had something to do with transients having a
limit to the number of ops that can be performed on them (?), so I
tried an alternative definition:

(defn test-fn2
  [map-context1]
  (persistent!
   (reduce
    (fn [map-context2 key]
      (when-not (get map-context1 key) (throw (Exception. "Woops!")))
      (dissoc! map-context2 key))
    (transient map-context1)
    (keys map-context1))))

And what's getting me is that the exception is -still- triggering here
for sufficiently large input maps.

It won't trigger if transients aren't being used.

Does this make sense? Even if I'm missing something and using the
transients incorrectly, why would the second example still be
exhibiting the same problem assuming the initial map-context1 is still
persistent?

Would really appreciate any kind of input- thanks!

--
Peter Taoussanis
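
For comparison, here is a minimal sketch of the same reduction without transients, which is the variant reported above as not triggering the exception. The name `test-fn-persistent` is hypothetical, and `contains?` is swapped in for `get` so the check does not depend on the stored value being truthy.

```clojure
;; Hypothetical baseline: same shape as test-fn1, but using the
;; persistent `dissoc` instead of `transient` / `dissoc!` / `persistent!`.
(defn test-fn-persistent
  [m]
  (reduce
   (fn [acc k]
     ;; Every key must still be present immediately before it is removed.
     (when-not (contains? acc k) (throw (Exception. "Woops!")))
     (dissoc acc k))
   m
   (keys m)))

;; Usage: remove every key from a 10000-entry map of random doubles.
(test-fn-persistent
 (zipmap (repeatedly 10000 #(rand))
         (repeatedly 10000 #(rand))))
```

If this version succeeds on an input where the transient version throws, that points at the transient code path rather than at the input map.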

Ken Wesson

Aug 24, 2011, 2:38:17 AM
to clo...@googlegroups.com
What does zipmap do if the key seq contains duplications?

--
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.

Peter Taoussanis

Aug 24, 2011, 2:45:28 AM
to Clojure
Hi Ken,

> What does zipmap do if the key seq contains duplications?

It acts like a merge:

(zipmap
'(:a :b :c :d :c :c)
'(:A :B :C :D :E :F))

gives {:a :A :b :B :c :F :d :D}.

The input map itself seems to be irrelevant: the example I gave here
was synthetic just for simplicity.
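
One way to rule duplicate keys in or out for a given input is to compare the number of keys generated with the number that end up in the map. This is only a sketch; `key-counts` is a hypothetical helper, not something from the thread.

```clojure
;; Hypothetical helper: report how many keys were supplied, how many of
;; them are distinct, and how many entries zipmap actually produced.
(defn key-counts
  [ks vs]
  (let [m (zipmap ks vs)]
    {:generated (count ks)
     :distinct  (count (distinct ks))
     :in-map    (count m)}))

;; With the duplicated :c keys from the example above, the map is
;; smaller than the key seq.
(key-counts [:a :b :c :d :c :c] [:A :B :C :D :E :F])
```

If `:generated` and `:in-map` agree for the failing input, duplicate keys are not the cause.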

Alan Malloy

Aug 24, 2011, 3:27:04 AM
to Clojure
On Aug 23, 11:38 pm, Ken Wesson <kwess...@gmail.com> wrote:
> What does zipmap do if the key seq contains duplications?

That was my instinct too, but (a) a few thousand numbers won't collide
very often at all given the problem space, and (b) some experimenting
indicates that the key-seq is always 100k elements large - no
duplicates.

HOWEVER, I did find the problem: the keys themselves don't collide,
but their hashes do. The number of elements that get messed up is
related to the number of hash collisions, but I still don't quite
understand how they interact.

Here's a modified code snippet that demonstrates the issue:

user> (let [m (apply zipmap (repeatedly 2 #(repeatedly 100000 rand)))]
        (println (count (distinct (map hash (keys m)))))
        ((juxt count identity) (persistent!
                                (reduce dissoc! (transient m) (keys m)))))
100000 ;; no collisions
[0 {}] ;; map is empty at the end

user> (let [m (apply zipmap (repeatedly 2 #(repeatedly 100000 rand)))]
        (println (count (distinct (map hash (keys m)))))
        ((juxt count identity) (persistent!
                                (reduce dissoc! (transient m) (keys m)))))
99996 ;; four collisions
[8 {0.30426231137219917 0.8531183785687654,
    0.8893047006425385 0.4788315896128895,
    0.47854633997540674 0.45133768991797785,
    0.5265638224227486 0.7724779126227945}]
;; a four-element map that reports its count as eight!!!

That last comment seems to indicate a very serious error somewhere:
not only is the transient map broken, but it creates a broken
persistent object. I'll file a JIRA issue for this, and see if I can
find out any more about the cause.

FWIW, I'm using 1.2.1 for the above output.
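
Because the random version only fails when hashes happen to collide, a deterministic variant may be easier to work with. The sketch below relies on the strings "Aa" and "BB" sharing the same Java `hashCode`, and uses `hash-map` so that the small map is a hash map rather than an array map. The names `colliding-keys`, `collision-map` and `dissoc-all-transient` are hypothetical, and whether a case this small triggers the bug on the affected versions is an assumption that has not been checked here.

```clojure
;; Hypothetical helper: group keys by hash and keep only the groups
;; that contain more than one key, i.e. the colliding ones.
(defn colliding-keys
  [m]
  (->> (keys m)
       (group-by hash)
       vals
       (filter #(> (count %) 1))))

;; "Aa" and "BB" have equal String hashCodes, so their hashes collide.
;; hash-map forces a hash map even for this small number of entries.
(def collision-map (hash-map "Aa" 1 "BB" 2 "x" 3))

;; Same experiment as above: dissoc! every key from a transient copy,
;; then report the count next to the resulting persistent map.
(defn dissoc-all-transient
  [m]
  (let [p (persistent! (reduce dissoc! (transient m) (keys m)))]
    [(count p) p]))

(colliding-keys collision-map)
(dissoc-all-transient collision-map)
```

On a release where transient maps handle collisions correctly, the last call should return [0 {}]; any non-zero count or leftover entries would indicate the problem described above.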

Alan Malloy

Aug 24, 2011, 4:32:02 AM
to Clojure
Ticket is at http://dev.clojure.org/jira/browse/CLJ-829

Peter Taoussanis

Aug 24, 2011, 10:41:15 AM
to Clojure
> Ticket is at http://dev.clojure.org/jira/browse/CLJ-829

Thanks Alan, that's great!

Aaron Bedra

Aug 25, 2011, 9:03:26 AM
to clo...@googlegroups.com
Please search through previous messages/tickets before posting new
issues. This issue has been fixed as of commit da412909d36551a526ed and
will be included in the -beta2 release. It was originally ticketed here:

http://dev.clojure.org/jira/browse/CLJ-816

(let [m (into {} (for [x (range 100000)] [(rand) (rand)]))]
  (println (count (distinct (map hash (keys m)))))
  ((juxt count identity) (persistent!
                          (reduce dissoc! (transient m) (keys m)))))
100000
[0 {}]

Cheers,

Aaron Bedra
--
Clojure/core
http://clojure.com

Alan Malloy

Aug 25, 2011, 1:14:23 PM
to Clojure
I did, of course. I searched for "transient", and didn't find any
describing this issue. And looking at the issue you link to, I still
don't see how it's related: it's a patch specifically for vectors, and
this code doesn't touch vectors; and it involves changes to a
transient "leaking" back into the original persistent, while this code
is about getting the wrong persistent object back after some changes.

I don't have a lot of experience with building clojure, but I'll try
compiling latest master and trying it out. In the meantime, did you
try this code more than once? Because it includes randomness,
sometimes it works without problems.

On Aug 25, 6:03 am, Aaron Bedra <aaron.be...@gmail.com> wrote:
> Please search through previous messages/tickets before posting new
> issues.  This issue has been fixed as of commit da412909d36551a526ed and
> will be included in the -beta2 release.  It was originally ticketed here:
>
> http://dev.clojure.org/jira/browse/CLJ-816
>
> (let [m (into {} (for [x (range 100000)] [(rand) (rand)]))]
>   (println (count (distinct (map hash (keys m)))))
>   ((juxt count identity) (persistent!
>                           (reduce dissoc! (transient m) (keys m)))))
> 100000
> [0 {}]
>
> Cheers,
>
> Aaron Bedra
> --
> Clojure/core
> http://clojure.com

Alan Malloy

Aug 25, 2011, 1:21:56 PM
to Clojure
Update: just built master, and issue still exists. If you want to be
"sure" you get it, just add another zero to the input range. I'll
mention that in the ticket as well.
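
A rough estimate of why one more zero makes the failure close to certain: if the key hashes are treated as uniformly distributed 32-bit values (a simplifying assumption, not something established in this thread), the expected number of colliding pairs among n keys is about n(n-1)/2 divided by 2^32. `expected-collisions` below is a hypothetical helper for that arithmetic.

```clojure
;; Hypothetical helper: expected number of colliding key pairs among n
;; keys, ASSUMING hashes are uniform over 2^32 possible values.
(defn expected-collisions
  [n]
  (/ (* n (dec n))
     (* 2.0 (Math/pow 2 32))))

(expected-collisions 100000)   ; on the order of 1 colliding pair
(expected-collisions 1000000)  ; on the order of 100 colliding pairs
```

Under that assumption a 100000-key run has a real chance of seeing no collisions at all, while a 1000000-key run almost always has some, which fits the observation that the smaller test sometimes passes.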

Aaron Bedra

Aug 26, 2011, 10:22:07 AM
to clo...@googlegroups.com
Adding the additional zero did trigger the issue for me. I have bumped
the ticket into Release.Next. Thanks for the clarification.

Cheers,

Aaron Bedra
--
Clojure/core
http://clojure.com
