Duplicate key bug in hash-maps

Tim Robinson

Jun 25, 2010, 6:27:19 AM
to Clojure
I tried Clojure via GitHub today.

Has anyone noticed this bug, which didn't exist in version 1.1?

user=> #{:item1 {:a "A" :b "B"} :item2 {:a "A" :b "B"}}
java.lang.IllegalArgumentException: Duplicate key: {:a "A", :b "B"}

Tim

Stuart Halloway

Jun 25, 2010, 7:55:03 AM
to clo...@googlegroups.com
Duplicate key prevention is a feature added in commit c733148ba0fb3ff7bbab133f5375422972e62d08.

Stu

Michael Wood

Jun 25, 2010, 9:36:31 AM
to clo...@googlegroups.com

You're trying to put duplicate values into a set.

Did you mean this instead?

user=> (def x {:item1 {:a "A" :b "B"} :item2 {:a "A" :b "B"}})
#'user/x
user=> (:item1 x)
{:a "A", :b "B"}
user=> (:item2 x)
{:a "A", :b "B"}
user=>

--
Michael Wood <esio...@gmail.com>

Mike Meyer

Jun 25, 2010, 9:53:54 AM
to clo...@googlegroups.com, esio...@gmail.com
On Fri, 25 Jun 2010 15:36:31 +0200
Michael Wood <esio...@gmail.com> wrote:

> On 25 June 2010 12:27, Tim Robinson <tim.bl...@gmail.com> wrote:
> > I tried Clojure via Githhub today.
> >
> > Anyone notice this bug that hadn't existed in Version 1.1
> >
> > user=> #{:item1 {:a "A" :b "B"} :item2 {:a "A" :b "B"}}
> > java.lang.IllegalArgumentException: Duplicate key: {:a "A", :b "B"}
>
> You're trying to put duplicate values into a set.

So? In most places, putting a value that's already in a set into the set
is a no-op. That holds even in the Clojure build that exhibits the above behavior:

user=> #{:a :a}
java.lang.IllegalArgumentException: Duplicate key: :a
user=> (set [:a :a])
#{:a}
user=> (conj #{:a} :a)
#{:a}
user=>

Apparently, duplicate keys in sets are only disallowed in set
literals. Arguably, that must be a mistake on the user's part, but
it sure seems to clash with the behavior of sets elsewhere.

<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Daniel Gagnon

Jun 25, 2010, 10:15:04 AM
to clo...@googlegroups.com, esio...@gmail.com

> Apparently, duplicate keys in sets are only disallowed in set
> literals. Arguably, that must be a mistake on the users part, but
> it sure seems to clash with the behavior of sets elsewhere.

Why would you ever want to write a duplicate in a set literal? 

Stuart Halloway

Jun 25, 2010, 10:31:57 AM
to clo...@googlegroups.com
Duplicate keys in maps/sets are disallowed in literals and factory functions, where data is generally literal & inline and therefore likely represents coder error:

; all disallowed
#{:a :a}
{:a 1 :a 2}
(hash-map :a 1 :a 2)
(hash-set :a :a)

They are allowed in other contexts, where the data could come from anywhere:

; dumb, but these forms not generally called with a literal
(set [:a :a])
(into #{} [:a :a])

I find this behavior consistent and easy to explain, but I was involved in the design conversation so maybe I have participant blindness. :-)

Stu
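
For anyone who wants to try the distinction at a REPL, here is a minimal sketch. It reads the literals from strings so that a rejected form shows up as a catchable exception rather than stopping the whole file from loading. `read-outcome` is a hypothetical helper made up for this illustration, not part of Clojure. The factory functions (`hash-map`, `hash-set`) are deliberately left out; check them against the Clojure build you are running.

```clojure
;; Sketch only. read-outcome is a hypothetical helper: it returns the value
;; read from the string s, or :rejected if the reader throws.
(defn read-outcome [s]
  (try
    (read-string s)
    (catch Exception _ :rejected)))

;; Literals with a repeated element or key are rejected by the reader.
(read-outcome "#{:a :a}")
(read-outcome "{:a 1 :a 2}")

;; A literal without duplicates reads normally.
(read-outcome "#{:a :b}")

;; Building a set from a collection tolerates repeats.
(set [:a :a])
(into #{} [:a :a])
```

Wrapping the read in try/catch is only there to make the outcome visible as a value; typed directly at the REPL, the rejected literals report the "Duplicate key" error shown earlier in the thread.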

Mike Meyer

Jun 25, 2010, 10:46:59 AM
to clo...@googlegroups.com, stuart....@gmail.com
On Fri, 25 Jun 2010 10:31:57 -0400
Stuart Halloway <stuart....@gmail.com> wrote:

> Duplicate keys in maps/sets are disallowed in literals and factory functions, where data is generally literal & inline and therefore likely represents coder error:
>
> ; all disallowed
> #{:a :a}
> {:a 1 :a 2}
> (hash-map :a 1 :a 2)
> (hash-set :a :a)

Maps I can see being an error - you lose data in the process.

However, since you can plug variables of unknown provenance into
either the constructor or the literal, that's liable to create a nasty
surprise for someone at some point.

user=> (def a :a)
#'user/a
user=> (def b :a)
#'user/b
user=> (hash-set a b)
java.lang.IllegalArgumentException: Duplicate key: :a (NO_SOURCE_FILE:6)
user=> #{a b}
java.lang.IllegalArgumentException: Duplicate key: :a (NO_SOURCE_FILE:0)
user=>


> They are allowed in other contexts, where the data could come from anywhere:

It could come from anywhere in the two "forbidden" contexts as well.

> ; dumb, but these forms not generally called with a literal
> (set [:a :a])
> (into #{} [:a :a])
>
> I find this behavior consistent and easy to explain, but I was involved in the design conversation so maybe I have participant blindness. :-)

My initial reaction was "that's a bit odd, but probably a good idea."
However, given that I can use variables inside the literal and
constructor, I'm leaning the other way.

Or is (set [a b c]) idiomatic usage in this case, and (hash-set a b c)
or #{a b c} to be avoided?
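
For comparison, a small sketch of the collection-based forms with the same two vars. It only shows that `set` and `into` absorb the repeat; whether that makes them the preferred idiom is the open question above. The names `from-set` and `from-into` are made up for this example.

```clojure
;; Two vars that happen to hold the same value, as in the transcript above.
(def a :a)
(def b :a)

;; Both forms take a collection, so the repeated element is absorbed
;; instead of raising an error.
(def from-set  (set [a b]))
(def from-into (into #{} [a b]))
```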

Stuart Halloway

Jun 25, 2010, 11:37:32 AM
to clo...@googlegroups.com
I think there are two important considerations in favor of how it works now:

(1) The "common case" presumptions (which admittedly may need to be learned).

(2) The need for both flavors. If there wasn't a flavor that rejected duplicate keys, somebody would surely ask for it.

Add to these considerations the names of the functions already in play, and you get the implementation you see.

Mike Meyer

Jun 25, 2010, 12:00:24 PM
to clo...@googlegroups.com, stuart....@gmail.com
On Fri, 25 Jun 2010 11:37:32 -0400
Stuart Halloway <stuart....@gmail.com> wrote:

> (2) The need for both flavors. If there wasn't a flavor that rejected duplicate keys, somebody would surely ask for it.

I guess it makes as much sense as anything, given that you don't want
to get into -unique or some such.

But it does leave me wondering where the duplicate-rejecting version
of into is :-).
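
Such a function is easy enough to write by hand. What follows is a rough sketch, assuming sets only; `into-unique` is a hypothetical name and nothing like it is claimed to exist in clojure.core.

```clojure
;; Hypothetical strict counterpart to `into` for sets: throws if an item is
;; already present in the accumulated set instead of silently absorbing it.
(defn into-unique [to-set items]
  (reduce (fn [acc item]
            (if (contains? acc item)
              (throw (IllegalArgumentException. (str "Duplicate key: " item)))
              (conj acc item)))
          to-set
          items))

(into-unique #{} [:a :b])   ; no repeats, so the set is built as usual
;; (into-unique #{} [:a :a]) would throw IllegalArgumentException
```

The error message copies the wording of the built-in check so that both flavors fail the same way; that choice is purely cosmetic.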

Tim Robinson

Jun 25, 2010, 12:24:34 PM
to Clojure
Can I change the title to:

"Duplicate key error handling feature in hash-sets" ?

I was using the '#' thinking it was short for a hash-map, rather than
a hash-set.

Clojure has more data structures available than I'm used to working
with.
So thanks for the error handling.

Tim

Mike Anderson

Jun 26, 2010, 8:24:34 AM
to Clojure
I agree that duplicate keys in literals are probably a coder error but
IMO this deserves some kind of compiler warning rather than an error.

You're going to get into lots of sticky situations otherwise that only
confuse people if the semantics are different between literals and
other usage. Simple is good.

Anton Josua

Jun 26, 2010, 10:34:07 AM
to Clojure
Easy to explain - absolutely, consistent - mm, not really...

I found this new behavior a bit confusing; IMO it breaks the principle of
least surprise.
This feature is uncommon in dynamic languages (even Scala allows
duplicate keys - Set('a,'a)/Map('a->1,'a->1)).

Also, from a practical point of view, when prototyping:
- repl: user>(some-fn #{1 ...}) ;just experiment here
- file: #{"http://xyz" ...} ;bunch of statistical data
In both cases an occasional duplicate is not critical.
However, one will be forced to spend extra time double-checking the input
and/or dealing with error message(s).

-anton
