ordered map in Clojure?

4,245 views
Skip to first unread message

Alex Baranosky

unread,
Jun 25, 2011, 3:55:57 PM6/25/11
to clo...@googlegroups.com
What are some options for having a map that guarantees ordering of its keys in Clojure? (note: sorted-map won't do!)  My first try was to use LinkedHashMap, but am running into exceptions of the " java.util.LinkedHashMap cannot be cast to clojure.lang.Associative" variety.  

So then I tried to use extend-type LinkedHashMap to Associative.  Then I received the error message: "interface clojure.lang.Associative is not a protocol" ... makes sense :)

So now that I've gone through all of that, which was fun while it lasted, could any of you help me figure out how to use a map that guarantees the entries are in the order they were entered?

Thanks for the help!

Alex

James Estes

unread,
Jun 25, 2011, 4:00:38 PM6/25/11
to clo...@googlegroups.com
ArrayMap?
http://clojure.org/data_structures#toc21

James

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Alex Baranosky

unread,
Jun 25, 2011, 4:11:32 PM6/25/11
to clo...@googlegroups.com
Ha!  Perfect.  Thanks.

Alan Malloy

unread,
Jun 25, 2011, 4:17:08 PM6/25/11
to Clojure
ArrayMap isn't very performant for large collections. You might like
https://github.com/flatland/ordered

On Jun 25, 1:11 pm, Alex Baranosky <alexander.barano...@gmail.com>
wrote:

Laurent PETIT

unread,
Jun 25, 2011, 4:18:23 PM6/25/11
to clo...@googlegroups.com
Beware the devil hidden in the details:
"
Note that an array map will only maintain sort order when
un-'modified'. Subsequent assoc-ing will eventually cause it to
'become' a hash-map.
"

2011/6/25 Alex Baranosky <alexander...@gmail.com>:

Ken Wesson

unread,
Jun 25, 2011, 4:33:34 PM6/25/11
to clo...@googlegroups.com
On Sat, Jun 25, 2011 at 4:18 PM, Laurent PETIT <lauren...@gmail.com> wrote:
> Beware the devil hidden in the details:
> "
> Note that an array map will only maintain sort order when
> un-'modified'. Subsequent assoc-ing will eventually cause it to
> 'become' a hash-map.
> "

Besides a LinkedHashMap-alike, Clojure could probably use at least
these other map variations:

* WeakHashMap (keys weakly bound)
* Map with weak values (useful for caching)
* ConcurrentHashMap

and possibly combinations of the above. Even Java seems to lack
weak-value maps; you can fudge it though by putting a
WeakReference(val) in in a wrapper layer and using a ReferenceQueue
and a thread that drains the queue every so often to prune dead
Map.Entries from the map.

Note that a map of refs doesn't cut it as a substitute for a
ConcurrentHashMap if keys will be added/removed by multiple threads,
and sticking the entire map in an atom doesn't if you want to support
more concurrency than a plain old Collections.synchronizedMap.
Sticking the entire map in a ref and using commute doesn't quite work,
if two concurrent assoces of the same key should force a retry, and
sticking the entire map in a ref and using alter allows no more
concurrency than synchronizedMap.

We really could use an STM-map that supports normal Clojure map get
semantics, but not assoc or dissoc, and supports alter-key and
remove-key which have to be called in a transaction and act like
commuted assoc and dissoc except that one transaction retries if there
are concurrent attempts to change the *same* key.

Perhaps we should also have a way to simply specify, separately and
combinably, in both stm-map and hash-map, options to make the keys
weak, the values weak, or the seq/keys/vals views use the insertion
order. With both weakable keys and weakable vals you can also throw in
a weak-hash-set which can be used to "intern" or "canonicalize" any
desired immutable objects, without packratting them and eventually
blowing the heap, and is implemented under the hood as a weak-keyed,
weak-valued map of objects to themselves.

--
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.

Ivan Koblik

unread,
Jun 25, 2011, 6:52:33 PM6/25/11
to clo...@googlegroups.com
Hi Ken,

Having a mutable stm-map would be awesome. I actually had to implement one for my project at work. Sorry can't publish it due to licensing issues, but anyway my implementation is dead simple and won't perform well on big data sets. I was trying to understand how to implement detection of new/removed keys with current Clojure's STM, and could find a solution. As you mentioned, second issue is merging of changes to the same map from two different transactions.

I know that there is a transactional map somewhere inside Clojure that is based on references, but its partitioning is too coarse and doesn't depend on the data set.

I would love to implement stm-map, but I couldn't find a data structure that would give good performance and small memory usage. I would appreciate if you could point me to relevant documentation or research.

Cheers,
Ivan.



--

Alex Baranosky

unread,
Jun 25, 2011, 6:56:30 PM6/25/11
to clo...@googlegroups.com
I wonder at what point array-map becomes a hash-map?  Maybe I could use it?  .... On second thought, I don't really want to depend on something so flimsy.

Alan, is your ordered map code available on clojars?

Alex


Alan Malloy

unread,
Jun 25, 2011, 7:09:17 PM6/25/11
to Clojure
array-map turns into a hashmap after ten insertions, currently, but
that's far from guaranteed:

user> (->> {}
(iterate #(assoc % (rand-int 1e6) 1))
(drop-while (comp #{(class {})} class))
(first)
(count))
10

My ordered-set/map are now available on clojars: depend on [ordered
"0.3.0"]

On Jun 25, 3:56 pm, Alex Baranosky <alexander.barano...@gmail.com>
wrote:

Alan Malloy

unread,
Jun 25, 2011, 7:14:21 PM6/25/11
to Clojure
Note that it's still pretty immature - I started writing it last
weekend. It's tested fairly thoroughly, and I spent four or five days
hunting down the best performance I could, so it should be perfectly
usable. However, if you have any problems please let me know.

Especially, if you plan to dissoc a lot, that will degrade the
performance of seq, keys, etc. - anything that traverses in sorted
order will get a bit slower every time you dissoc. You can call
(compact m) to clean up the mess, but it's not documented yet; I
wasn't planning for anyone to start using it so soon.

Tassilo Horn

unread,
Jun 28, 2011, 3:54:04 PM6/28/11
to clo...@googlegroups.com
Alan Malloy <al...@malloys.org> writes:

Hi Alan,

> ArrayMap isn't very performant for large collections. You might like
> https://github.com/flatland/ordered

in my clojure app, I totally rely on ordered sets. Currently, I use

https://github.com/ninjudd/ordered-set

for which I've implemented transient support which is already
incorporated and released. Looking at your code, it basically looks
identical wrt features, except that your implementation is comletely in
clojure (and includes maps) while ninjudd's implementation is mainly
java.

If you know ninjudd's lib, can you give advice in what use-cases you'd
prefer your own lib? Basically, in my scenario I only conjoin using
`into' but never disjoin, and performance for that is very important.
That's why I've implemented the transient support. But as it turned
out, that didn't improve performance significantly, cause the
calculation of set members is far more expensive than aggregating
them...

Well, in any case, I think I'll simply try it out at the weekend.
Hopefully, I'm able to get some totally non-empirical benchmark results
from my test suite that I can poste here.

Bye,
Tassilo

Alan Malloy

unread,
Jun 29, 2011, 1:12:58 AM6/29/11
to Clojure
Well, I wrote the clojure version after boasting to ninjudd that
deftype would let you do it without writing any java at all. I did
some simple-minded benchmarking, and while I don't recall the exact
numbers his java version is ~10-50% faster. If you look through the
git logs, somewhere I have a version that is only 5% slower than his,
but when I added efficient disjoin support that slowed down the conj
case a bit; we decided that the purity was worth the performance cost.
If you want to resurrect that version, let me know and I'll hunt it
down.

I also have a version with even more-efficient disjoin but it slowed
conjing down to log2(n) instead of log32(n), and that performance hit,
while theoretically nil, was very noticeable in practice.

If I were using this for some performance-critical task and never
disjoined, I'd use ninjudd's version. Otherwise I'd use mine, just
because it's easier to build and tweak since it's in clojure.

On Jun 28, 12:54 pm, Tassilo Horn <tass...@member.fsf.org> wrote:

Tassilo Horn

unread,
Jul 2, 2011, 11:29:56 AM7/2/11
to clo...@googlegroups.com
Alan Malloy <al...@malloys.org> writes:

Hi Alan,

> If I were using this for some performance-critical task and never


> disjoined, I'd use ninjudd's version. Otherwise I'd use mine, just
> because it's easier to build and tweak since it's in clojure.

I tried your implementation, and I couldn't see any relevant slowdown
compared to the java version. So I'll stick with that.

Bye,
Tassilo

Reply all
Reply to author
Forward
0 new messages