Hi,
I am pleased to announce the initial release of maxiphobe, a meldable
priority queue library based on Okasaki's maxiphobic heaps (see
Okasaki, "Alternatives to Two Classic Data Structures"). Maxiphobic
heaps are a strikingly simple purely functional approach to
implementing priority queues that offers very reasonable performance
given the available operations.
Maxiphobe priority queues are, in effect, persistent lists that
maintain a non-decreasing order of elements, as determined either by
Clojure's default comparator or a custom comparator passed in by the
user.
Additionally, instances using the same comparator can be "melded" into
a single priority queue containing all the elements of the inputs.
(require '[maxiphobe.core :as pq])
(pq/pq [0 -1 3 4 2])
;= (-1 0 2 3 4)
(pq/pqueue 0 -1 3 4 2)
;= (-1 0 2 3 4)
(pq/pq-by > [0 -1 3 4 2])
;= (4 3 2 0 -1)
(pq/pqueue-by > 0 -1 3 4 2)
;= (4 3 2 0 -1)
(peek (pq/pqueue-by > 0 -1 3 4 2)) ; or first
;= 4
(pop (pq/pqueue-by > 0 -1 3 4 2)) ; or next
;= (3 2 0 -1)
(pq/meld (pq/pqueue -1 3 2 8) (pq/pqueue 0 -3 2 4))
;= (-3 -1 0 2 2 3 4 8)
To include maxiphobe in your project:
[maxiphobe "0.0.1"]
<dependency>
<groupId>maxiphobe</groupId>
<artifactId>maxiphobe</artifactId>
<version>0.0.1</version>
</dependency>
compile "maxiphobe:maxiphobe:0.0.1"
See below for some benchmark results.
Cheers,
Michał
Benchmarks
==========
1. Compare basic operations to sorted-set-as-PQ and to regular (FIFO)
peristent queues to get a high-level picture of maxiphobe
performance. FIFO queue operations are included to provide a point
of reference – they are, of course, completely different
semantically.
(let [pq (pq/pq (range 1000))
s (apply sorted-set (range 1000))
q (into clojure.lang.PersistentQueue/EMPTY (range 1000))]
(assert (= (seq s) pq q))
(println "peek")
(c/quick-bench (peek pq))
(c/quick-bench (first s))
(c/quick-bench (peek q))
(println "pop")
(c/quick-bench (pop pq))
(c/quick-bench (disj s (first s)))
(c/quick-bench (pop q))
(let [pq (nth (iterate pop pq) 500)
s (apply sorted-set pq)
q (nth (iterate pop q) 500)]
(assert (= (seq s) pq q))
(println "pop2")
(c/quick-bench (pop pq))
(c/quick-bench (disj s (first s)))
(c/quick-bench (pop q))))
peek
;;; maxiphobe PQ:
;;; (peek pq)
Evaluation count : 32624694 in 6 samples of 5437449 calls.
Execution time mean : -2.422742 ns
Execution time std-deviation : 0.630313 ns
Execution time lower quantile : -2.934682 ns ( 2.5%)
Execution time upper quantile : -1.681600 ns (97.5%)
Overhead used : 21.406378 ns
;;; sorted set:
;;; (first s)
Evaluation count : 1944576 in 6 samples of 324096 calls.
Execution time mean : 3.262937 µs
Execution time std-deviation : 440.634979 ns
Execution time lower quantile : 2.728010 µs ( 2.5%)
Execution time upper quantile : 3.753706 µs (97.5%)
Overhead used : 21.406378 ns
;;; queue
;;; (peek q)
Evaluation count : 22712976 in 6 samples of 3785496 calls.
Execution time mean : 7.395679 ns
Execution time std-deviation : 0.615558 ns
Execution time lower quantile : 6.766551 ns ( 2.5%)
Execution time upper quantile : 8.050620 ns (97.5%)
Overhead used : 21.406378 ns
pop
;;; maxiphobe PQ:
;;; (pop pq)
Evaluation count : 1107840 in 6 samples of 184640 calls.
Execution time mean : 2.143321 µs
Execution time std-deviation : 1.064519 µs
Execution time lower quantile : 362.028152 ns ( 2.5%)
Execution time upper quantile : 3.175097 µs (97.5%)
Overhead used : 21.406378 ns
;;; sorted set:
;;; (disj s (first s))
Evaluation count : 508608 in 6 samples of 84768 calls.
Execution time mean : 10.498305 µs
Execution time std-deviation : 2.658746 µs
Execution time lower quantile : 6.260925 µs ( 2.5%)
Execution time upper quantile : 12.859465 µs (97.5%)
Overhead used : 21.406378 ns
;;; queue:
;;; (pop q)
Evaluation count : 8807040 in 6 samples of 1467840 calls.
Execution time mean : 730.736239 ns
Execution time std-deviation : 122.924195 ns
Execution time lower quantile : 561.778923 ns ( 2.5%)
Execution time upper quantile : 839.133234 ns (97.5%)
Overhead used : 21.406378 ns
pop2
;;; maxiphobe PQ:
;;; (pop pq)
Evaluation count : 1816404 in 6 samples of 302734 calls.
Execution time mean : 2.874078 µs
Execution time std-deviation : 310.794245 ns
Execution time lower quantile : 2.614476 µs ( 2.5%)
Execution time upper quantile : 3.233927 µs (97.5%)
Overhead used : 21.406378 ns
;;; sorted set:
;;; (disj s (first s))
Evaluation count : 554400 in 6 samples of 92400 calls.
Execution time mean : 11.149426 µs
Execution time std-deviation : 1.675888 µs
Execution time lower quantile : 8.494310 µs ( 2.5%)
Execution time upper quantile : 12.890621 µs (97.5%)
Overhead used : 21.406378 ns
;;; queue:
;;; (pop q)
Evaluation count : 6264192 in 6 samples of 1044032 calls.
Execution time mean : 698.396459 ns
Execution time std-deviation : 245.916218 ns
Execution time lower quantile : 209.744627 ns ( 2.5%)
Execution time upper quantile : 850.794780 ns (97.5%)
Overhead used : 21.406378 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 81.8747 % Variance is severely inflated by outliers
2. Compare conj on maxiphobe PQs to cons/conj on sorted seqs and conj
on sorted sets. NB. this benchmark uses an initial collection of
elements with no repeats so that the set case can be incorporated
in the most straightforward way.
(let [xs (interleave (range 3 999 6) (range 1000 0 -10))
p (pq/pq xs)
l (sort xs)
s (apply sorted-set xs)]
(assert (= l p (seq s)))
(assert (= (cons -9999 l) (conj l -9999) (seq (conj s -9999)) (conj p -9999)))
(c/quick-bench (cons -9999 l))
(c/quick-bench (conj l -9999))
(c/quick-bench (conj s -9999))
(c/quick-bench (conj p -9999))
(c/quick-bench (conj s 9999))
(c/quick-bench (conj p 9999)))
;;; sorted seq:
;;; (cons -9999 l)
Evaluation count : 13687788 in 6 samples of 2281298 calls.
Execution time mean : 26.029969 ns
Execution time std-deviation : 3.806416 ns
Execution time lower quantile : 22.853333 ns ( 2.5%)
Execution time upper quantile : 30.453554 ns (97.5%)
Overhead used : 21.437655 ns
;;; sorted seq:
;;; (conj l -9999)
Evaluation count : 15064446 in 6 samples of 2510741 calls.
Execution time mean : 24.059539 ns
Execution time std-deviation : 11.080865 ns
Execution time lower quantile : 18.574134 ns ( 2.5%)
Execution time upper quantile : 43.196391 ns (97.5%)
Overhead used : 21.437655 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 82.4808 % Variance is severely inflated by outliers
;;; sorted set:
;;; (conj s -9999)
Evaluation count : 1355100 in 6 samples of 225850 calls.
Execution time mean : 476.006908 ns
Execution time std-deviation : 60.102301 ns
Execution time lower quantile : 431.885619 ns ( 2.5%)
Execution time upper quantile : 576.525876 ns (97.5%)
Overhead used : 21.437655 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 31.5184 % Variance is moderately inflated by outliers
;;; maxiphobe PQ:
;;; (conj p -9999)
Evaluation count : 7766124 in 6 samples of 1294354 calls.
Execution time mean : 59.089168 ns
Execution time std-deviation : 3.717189 ns
Execution time lower quantile : 55.532863 ns ( 2.5%)
Execution time upper quantile : 64.748707 ns (97.5%)
Overhead used : 21.437655 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 14.8388 % Variance is moderately inflated by outliers
;;; sorted set:
;;; (conj s 9999)
Evaluation count : 1099272 in 6 samples of 183212 calls.
Execution time mean : 553.592811 ns
Execution time std-deviation : 29.862010 ns
Execution time lower quantile : 521.789261 ns ( 2.5%)
Execution time upper quantile : 587.150710 ns (97.5%)
Overhead used : 21.437655 ns
;;; maxiphobe PQ:
;;; (conj p 9999)
Evaluation count : 4952460 in 6 samples of 825410 calls.
Execution time mean : 103.749482 ns
Execution time std-deviation : 2.629640 ns
Execution time lower quantile : 100.909840 ns ( 2.5%)
Execution time upper quantile : 106.530328 ns (97.5%)
Overhead used : 21.437655 ns
3. Compare maxiphobe-as-sort to one-off sorting. Note that the
maxiphobe PQ then supports efficient order-preserving insertion,
whereas the sorted seq only supports prepending efficiently. The
second pair of benchmarks includes a complete traversal of the
returned seq.
(let [xs (interleave (range 100) (range 1000 0 -10))]
(assert (= (sort xs) (pq/pq xs)))
(c/quick-bench (pq/pq xs))
(c/quick-bench (sort xs)))
;;; maxiphobe PQ:
Evaluation count : 10680 in 6 samples of 1780 calls.
Execution time mean : 56.008406 µs
Execution time std-deviation : 367.611690 ns
Execution time lower quantile : 55.374716 µs ( 2.5%)
Execution time upper quantile : 56.302697 µs (97.5%)
Overhead used : 21.434811 ns
;;; sort:
Evaluation count : 15522 in 6 samples of 2587 calls.
Execution time mean : 39.135425 µs
Execution time std-deviation : 911.312830 ns
Execution time lower quantile : 38.147855 µs ( 2.5%)
Execution time upper quantile : 40.423672 µs (97.5%)
Overhead used : 21.434811 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
(let [xs (interleave (range 100) (range 1000 0 -10))]
(assert (= (sort xs) (pq/pq xs)))
(c/quick-bench (last (pq/pq xs)))
(c/quick-bench (last (sort xs))))
;;; maxiphobe PQ:
;;; (last (pq/pq xs))
Evaluation count : 4878 in 6 samples of 813 calls.
Execution time mean : 116.111769 µs
Execution time std-deviation : 6.373020 µs
Execution time lower quantile : 111.956882 µs ( 2.5%)
Execution time upper quantile : 126.463648 µs (97.5%)
Overhead used : 21.462580 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 14.2655 % Variance is moderately inflated by outliers
;;; sorted seq:
;;; (last (sort xs))
Evaluation count : 13434 in 6 samples of 2239 calls.
Execution time mean : 45.642911 µs
Execution time std-deviation : 1.385229 µs
Execution time lower quantile : 44.438177 µs ( 2.5%)
Execution time upper quantile : 47.916844 µs (97.5%)
Overhead used : 21.462580 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
4. Benchmark meld, compare to merging sorted seqs. The PQ operations
are slower, but again, the resulting PQs support performant
order-preserving conj.
(defn merge-lists [xs ys]
(lazy-seq
(if-let [xs (seq xs)]
(if-let [ys (seq ys)]
(let [x (first xs)
y (first ys)]
;; NB. specialized – not going through the default comparator
(if (<= x y)
(cons x (merge-lists (next xs) ys))
(cons y (merge-lists xs (next ys)))))
xs)
ys)))
(let [xs (range 3 999 3)
ys (range 2 1000 2)
sx (sort xs)
sy (sort ys)
px (pq/pq xs)
py (pq/pq ys)]
(assert (= (+ (count xs) (count ys))
(count (merge-lists sx sy))
(count (pq/meld px py))))
(assert (= (merge-lists sx sy) (pq/meld px py)))
(assert (= (last (merge-lists sx sy)) (last (pq/meld px py))))
(c/quick-bench (pq/meld px py))
(c/quick-bench (last (pq/meld px py)))
(c/quick-bench (merge-lists sx sy))
(c/quick-bench (last (merge-lists sx sy)))
[(mapv count [xs ys sx sy px py])
(mapv last [sx sy px py])
(last (merge-lists sx sy))
(last (pq/meld px py))])
;;; initial meld call:
Evaluation count : 1874994 in 6 samples of 312499 calls.
Execution time mean : 321.135792 ns
Execution time std-deviation : 5.338891 ns
Execution time lower quantile : 314.126845 ns ( 2.5%)
Execution time upper quantile : 327.154227 ns (97.5%)
Overhead used : 21.434811 ns
;;; traversal of the melded heap:
Evaluation count : 1086 in 6 samples of 181 calls.
Execution time mean : 554.958613 µs
Execution time std-deviation : 13.638703 µs
Execution time lower quantile : 540.363691 µs ( 2.5%)
Execution time upper quantile : 574.684430 µs (97.5%)
Overhead used : 21.434811 ns
;;; initial merge-lists call:
Evaluation count : 15446616 in 6 samples of 2574436 calls.
Execution time mean : 22.614174 ns
Execution time std-deviation : 1.360240 ns
Execution time lower quantile : 21.169671 ns ( 2.5%)
Execution time upper quantile : 24.090622 ns (97.5%)
Overhead used : 21.434811 ns
;;; traversals of the merged sorted seqs:
Evaluation count : 4998 in 6 samples of 833 calls.
Execution time mean : 126.278406 µs
Execution time std-deviation : 7.495649 µs
Execution time lower quantile : 120.479810 µs ( 2.5%)
Execution time upper quantile : 138.715916 µs (97.5%)
Overhead used : 21.434811 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 14.6136 % Variance is moderately inflated by outliers
;= [[332 499 332 499 332 499] [996 998 996 998] 998 998]