Error in documentation? (sorted-map vs. array-map)

47 views
Skip to first unread message

Brian Marick

unread,
Jun 17, 2013, 4:55:13 PM6/17/13
to clojure...@googlegroups.com
In http://clojuremongodb.info/articles/querying.html, there's this example:

;; find top 10 scores that will be returned as Clojure maps
(with-collection "scores"
(find {})
(fields [:score :name])
;; it is VERY IMPORTANT to use sorted maps with sort
(sort (sorted-map :score -1 :name 1))
(limit 10))

Is that advice about sorted-map right? It seems like it should be an array-map (relying on the [undocumented?] feature that array-maps are in key order). For example:

I have this compound index:
{
"v" : 1,
"key" : {
"user" : NumberLong(1),
"timestamp" : NumberLong(1)
},
"ns" : "student.student_records",
"name" : "user_1_timestamp_1"
}

It was created like this (per "Creating indexes" in http://clojuremongodb.info/articles/collections.html):

(mc/ensure-index collection (array-map :user 1, :timestamp 1))

I populated the collection like this:

(mc/insert ps/collection {:_id 6, :user 111, :timestamp (DateTime. 343)})
(mc/insert ps/collection {:_id 4, :user 11, :timestamp (DateTime. 10000)})
(mc/insert ps/collection {:_id 2, :user 11, :timestamp (DateTime. 100)})
(mc/insert ps/collection {:_id 3, :user 11, :timestamp (DateTime. 1000)})
(mc/insert ps/collection {:_id 5, :user 111, :timestamp (DateTime. 10)})
(mc/insert ps/collection {:_id 1, :user 1, :timestamp (DateTime. 1)})

The expected order after a sort should be ascending order by ids. I get that expected order from these three queries:

(with-collection collection (find {}) (sort (sorted-map :user 1)))
;; including these two from the Mongo shell
> db.student_records_test.find().sort({user: 1})
> db.student_records_test.find().sort({user: 1, timestamp: 1})

However, this doesn't work:

(with-collection collection (find {}) (sort (sorted-map :user 1, :timestamp 1)))

It's in timestamp order:

:user 1, :timestamp (DateTime. 1
:user 111, :timestamp (DateTime. 10)
:user 11, :timestamp (DateTime. 100
:user 111, :timestamp (DateTime. 343
:user 11, :timestamp (DateTime. 1000)})
:user 11, :timestamp (DateTime. 10000)})

That's what I'd expect because the :timestamp will be sorted before the :user.

If I use an array-map, things work as I'd expect:

(map :_id (with-collection collection (find {}) (sort (array-map :user 1, :timestamp 1))))
(1 2 3 4 5 6)

---
O wad some Pow'r the giftie gie us
To see oursels as ithers see us!
-- Bobbie Burns

Michael Klishin

unread,
Jun 17, 2013, 5:01:06 PM6/17/13
to Monger, a Clojure MongoDB driver

2013/6/18 Brian Marick <br...@getset.com>

If I use an array-map, things work as I'd expect:

so, sorted-map does not preserve ordering of entries?

That's unexpected. Feel free to submit a PR that replaces all sorted-map uses with array-map.

The idea is to use maps that preserve ordering 'cause that's obviously crucial for indexes.
--
MK

http://github.com/michaelklishin
http://twitter.com/michaelklishin

Brian Marick

unread,
Jun 17, 2013, 5:41:57 PM6/17/13
to clojure...@googlegroups.com

On Jun 17, 2013, at 4:01 PM, Michael Klishin <michael....@gmail.com> wrote:
> so, sorted-map does not preserve ordering of entries?

sorted-map is sorted in natural sort order of keys, not order of insertion. For keywords, that's alphabetical:

owl.main=> (sorted-map :z 1, :b 2)
{:b 2, :z 1}

Michael Klishin

unread,
Jun 17, 2013, 5:46:34 PM6/17/13
to Monger, a Clojure MongoDB driver

2013/6/18 Brian Marick <br...@getset.com>

sorted-map is sorted in natural sort order of keys, not order of insertion. For keywords, that's alphabetical:

owl.main=> (sorted-map :z 1, :b 2)
{:b 2, :z 1}

This makes sense. We should use array-map everywhere then. Including tests (people sometimes use them
for examples).

Brian Marick

unread,
Jun 17, 2013, 5:55:35 PM6/17/13
to clojure...@googlegroups.com

On Jun 17, 2013, at 4:46 PM, Michael Klishin <michael....@gmail.com> wrote:

> This makes sense. We should use array-map everywhere then. Including tests (people sometimes use them
> for examples).

I'll put a pull request on my non-work todo list, but it will be a while.

Eric Arthen

unread,
Jun 18, 2013, 8:16:35 PM6/18/13
to clojure...@googlegroups.com
Yes, I've run into this with needing to use array-map for sort params, as Brian described.

Eric

Brian Marick

unread,
Jun 19, 2013, 9:52:44 AM6/19/13
to clojure...@googlegroups.com

On Jun 18, 2013, at 7:16 PM, Eric Arthen <ear...@brightcove.com> wrote:

> Yes, I've run into this with needing to use array-map for sort params, as Brian described.

One thought: the Midje internals use the ordered maps from https://github.com/flatland/ordered Might it be better to use those than rely on an implementation detail of array map?

Eric Arthen

unread,
Jun 19, 2013, 8:46:16 PM6/19/13
to clojure...@googlegroups.com
Preserving insert order is a defined property of array-maps according to here, not an implementation detail. So I think we can count on it.

  http://clojure.org/data_structures#Data%20Structures-ArrayMaps

The one warning they have there is: "Note that an array map will only maintain sort order when un-'modified'. Subsequent assoc-ing will eventually cause it to 'become' a hash-map."

So if you wanted to do much manipulation to a list and still maintain order then something like 'flatland/ordered' could be good. In my use for sorting in mongo I've generally made the list once and used it, so that is not a problem. Also, of course, array-map's speed is only good for small numbers of entries since it is linear access.

Eric

Michael Klishin

unread,
Jun 20, 2013, 1:35:54 AM6/20/13
to Monger, a Clojure MongoDB driver

2013/6/19 Brian Marick <br...@getset.com>

Might it be better to use those than rely on an implementation detail of array map?

We can borrow their implementation to ClojureWerkz Support.

I need to take a look at what it does. Not asking the user to know what map impl
is required is a good thing.
Reply all
Reply to author
Forward
0 new messages