Optimizaton for finding the next key in a sorted-map

85 views
Skip to first unread message

Harmon Nine

unread,
May 25, 2020, 9:47:58 PM5/25/20
to Clojure
Is there an optimization for sorted-maps that, when you have a given key in the map, you can get the next key in O(log n) time?

Given "compare" is the boolean function on which the sorted-map is based, the following code will get the next-key given a current-key:

(defn get-next-key [my-map current-key]
  (first (filter #(compare current-key %) (keys my-map))))


In the worst case, this would take O(n) time, as the "keys" function would iterate through the n keys of the map.

However, if it could be detected that the the filter is using the "compare" function on which the map is based, this could speed up the search:  the "current-key" could be found first and iteration could start from there.  Then the "first" function could cut the search short, resulting in worst-case O(log n) time to get the next key (given the sorted-map is based on a tree).


Does such an optimization exist?  If not, is there an means of getting the next-key in a sorted-map given a current-key that is better than O(n)?

Thanks :)
-- Harmon

Alex Miller

unread,
May 25, 2020, 9:58:49 PM5/25/20
to Clojure
Hi, there is no such optimization and that's not really feasible in the sorted-map impl. However there are other sorted map data structures like https://github.com/clojure/data.avl which have facilities in this area.

Carlo Zancanaro

unread,
May 25, 2020, 10:32:26 PM5/25/20
to clo...@googlegroups.com, Harmon Nine
Hey Harmon,

On Tue, May 26 2020, Harmon Nine wrote:
> Does such an optimization exist? If not, is there an means of getting the
> next-key in a sorted-map given a current-key that is better than O(n)?

I just had a look at clojure.core and found that subseq operates on a
sorted collection. Based on my quick test I think it can find the next
element in a sublinear number of comparisons:

(def ^:dynamic comparisons nil)
(defn count-comparisons [x]
(when comparisons (swap! comparisons inc))
x)
(def s (into (sorted-set-by (comp count-comparisons compare))
(range 1000)))

(= (for [i (range 1001)]
(first (subseq s > (- i 0.5))))
(concat (range 1000) [nil]))
;; true

(frequencies
(for [i (range 1001)]
(binding [comparisons (atom 0)]
(first (subseq s > (- i 0.5)))
@comparisons)))
;; => {10 256, 11 384, 12 192, 13 96, 14 56, 15 16, 16 1}

I can only confirm that this works for > and >= as the comparison,
though, which are interpreted relative to the sort order for the
collection. Switching the sort order in the collection effectively flips
the meaning of >, too.

(def s2 (into (sorted-set-by (comp - compare))
(range 1000)))
(= (for [i (range 1001)]
(first (subseq s2 > (- i 0.5))))
(cons nil (range 1000)))
;; true

Other comparisons seem to fall back to just calling take-while, which
doesn't seem right to me.

I hope that helps!

Carlo

Matthew Downey

unread,
May 25, 2020, 11:27:11 PM5/25/20
to Clojure
Don't forget about subseq and rsubseq, the technology behind [the world's smallest time series database](https://www.dotkam.com/2015/12/02/time-series-database-in-one-line-of-clojure/) ;) 


On Monday, May 25, 2020 at 4:47:58 PM UTC-5, Harmon Nine wrote:

Harmon Nine

unread,
May 26, 2020, 4:31:41 PM5/26/20
to Clojure
Thanks very much for your responses. :)  'subseq' looks like it will do what's needed.
-- Harmon

On Monday, May 25, 2020 at 4:47:58 PM UTC-5, Harmon Nine wrote:
Reply all
Reply to author
Forward
0 new messages