[ANN] maxiphobe 0.0.1 – Persistent Meldable Priority Queues

258 views

Skip to first unread message

Michał Marczyk

unread,

Nov 21, 2016, 3:52:56 PM11/21/16

to clojure

Hi,

I am pleased to announce the initial release of maxiphobe, a meldable

priority queue library based on Okasaki's maxiphobic heaps (see

Okasaki, "Alternatives to Two Classic Data Structures"). Maxiphobic

heaps are a strikingly simple purely functional approach to

implementing priority queues that offers very reasonable performance

given the available operations.

https://github.com/michalmarczyk/maxiphobe

https://clojars.org/maxiphobe

Maxiphobe priority queues are, in effect, persistent lists that

maintain a non-decreasing order of elements, as determined either by

Clojure's default comparator or a custom comparator passed in by the

user.

Additionally, instances using the same comparator can be "melded" into

a single priority queue containing all the elements of the inputs.

(require '[maxiphobe.core :as pq])

(pq/pq [0 -1 3 4 2])

;= (-1 0 2 3 4)

(pq/pqueue 0 -1 3 4 2)

;= (-1 0 2 3 4)

(pq/pq-by > [0 -1 3 4 2])

;= (4 3 2 0 -1)

(pq/pqueue-by > 0 -1 3 4 2)

;= (4 3 2 0 -1)

(peek (pq/pqueue-by > 0 -1 3 4 2)) ; or first

;= 4

(pop (pq/pqueue-by > 0 -1 3 4 2)) ; or next

;= (3 2 0 -1)

(pq/meld (pq/pqueue -1 3 2 8) (pq/pqueue 0 -3 2 4))

;= (-3 -1 0 2 2 3 4 8)

To include maxiphobe in your project:

[maxiphobe "0.0.1"]

<groupId>maxiphobe</groupId>

<artifactId>maxiphobe</artifactId>

</dependency>

compile "maxiphobe:maxiphobe:0.0.1"

See below for some benchmark results.

Cheers,

Michał

Benchmarks

==========

1. Compare basic operations to sorted-set-as-PQ and to regular (FIFO)

peristent queues to get a high-level picture of maxiphobe

performance. FIFO queue operations are included to provide a point

of reference – they are, of course, completely different

semantically.

(let [pq (pq/pq (range 1000))

s (apply sorted-set (range 1000))

q (into clojure.lang.PersistentQueue/EMPTY (range 1000))]

(assert (= (seq s) pq q))

(println "peek")

(c/quick-bench (peek pq))

(c/quick-bench (first s))

(c/quick-bench (peek q))

(println "pop")

(c/quick-bench (pop pq))

(c/quick-bench (disj s (first s)))

(c/quick-bench (pop q))

(let [pq (nth (iterate pop pq) 500)

s (apply sorted-set pq)

q (nth (iterate pop q) 500)]

(assert (= (seq s) pq q))

(println "pop2")

(c/quick-bench (pop pq))

(c/quick-bench (disj s (first s)))

(c/quick-bench (pop q))))

peek

;;; maxiphobe PQ:

;;; (peek pq)

Evaluation count : 32624694 in 6 samples of 5437449 calls.

Execution time mean : -2.422742 ns

Execution time std-deviation : 0.630313 ns

Execution time lower quantile : -2.934682 ns ( 2.5%)

Execution time upper quantile : -1.681600 ns (97.5%)

Overhead used : 21.406378 ns

;;; sorted set:

;;; (first s)

Evaluation count : 1944576 in 6 samples of 324096 calls.

Execution time mean : 3.262937 µs

Execution time std-deviation : 440.634979 ns

Execution time lower quantile : 2.728010 µs ( 2.5%)

Execution time upper quantile : 3.753706 µs (97.5%)

Overhead used : 21.406378 ns

;;; queue

;;; (peek q)

Evaluation count : 22712976 in 6 samples of 3785496 calls.

Execution time mean : 7.395679 ns

Execution time std-deviation : 0.615558 ns

Execution time lower quantile : 6.766551 ns ( 2.5%)

Execution time upper quantile : 8.050620 ns (97.5%)

Overhead used : 21.406378 ns

pop

;;; maxiphobe PQ:

;;; (pop pq)

Evaluation count : 1107840 in 6 samples of 184640 calls.

Execution time mean : 2.143321 µs

Execution time std-deviation : 1.064519 µs

Execution time lower quantile : 362.028152 ns ( 2.5%)

Execution time upper quantile : 3.175097 µs (97.5%)

Overhead used : 21.406378 ns

;;; sorted set:

;;; (disj s (first s))

Evaluation count : 508608 in 6 samples of 84768 calls.

Execution time mean : 10.498305 µs

Execution time std-deviation : 2.658746 µs

Execution time lower quantile : 6.260925 µs ( 2.5%)

Execution time upper quantile : 12.859465 µs (97.5%)

Overhead used : 21.406378 ns

;;; queue:

;;; (pop q)

Evaluation count : 8807040 in 6 samples of 1467840 calls.

Execution time mean : 730.736239 ns

Execution time std-deviation : 122.924195 ns

Execution time lower quantile : 561.778923 ns ( 2.5%)

Execution time upper quantile : 839.133234 ns (97.5%)

Overhead used : 21.406378 ns

pop2

;;; maxiphobe PQ:

;;; (pop pq)

Evaluation count : 1816404 in 6 samples of 302734 calls.

Execution time mean : 2.874078 µs

Execution time std-deviation : 310.794245 ns

Execution time lower quantile : 2.614476 µs ( 2.5%)

Execution time upper quantile : 3.233927 µs (97.5%)

Overhead used : 21.406378 ns

;;; sorted set:

;;; (disj s (first s))

Evaluation count : 554400 in 6 samples of 92400 calls.

Execution time mean : 11.149426 µs

Execution time std-deviation : 1.675888 µs

Execution time lower quantile : 8.494310 µs ( 2.5%)

Execution time upper quantile : 12.890621 µs (97.5%)

Overhead used : 21.406378 ns

;;; queue:

;;; (pop q)

Evaluation count : 6264192 in 6 samples of 1044032 calls.

Execution time mean : 698.396459 ns

Execution time std-deviation : 245.916218 ns

Execution time lower quantile : 209.744627 ns ( 2.5%)

Execution time upper quantile : 850.794780 ns (97.5%)

Overhead used : 21.406378 ns

Found 1 outliers in 6 samples (16.6667 %)

low-severe 1 (16.6667 %)

Variance from outliers : 81.8747 % Variance is severely inflated by outliers

2. Compare conj on maxiphobe PQs to cons/conj on sorted seqs and conj

on sorted sets. NB. this benchmark uses an initial collection of

elements with no repeats so that the set case can be incorporated

in the most straightforward way.

(let [xs (interleave (range 3 999 6) (range 1000 0 -10))

p (pq/pq xs)

l (sort xs)

s (apply sorted-set xs)]

(assert (= l p (seq s)))

(assert (= (cons -9999 l) (conj l -9999) (seq (conj s -9999)) (conj p -9999)))

(c/quick-bench (cons -9999 l))

(c/quick-bench (conj l -9999))

(c/quick-bench (conj s -9999))

(c/quick-bench (conj p -9999))

(c/quick-bench (conj s 9999))

(c/quick-bench (conj p 9999)))

;;; sorted seq:

;;; (cons -9999 l)

Evaluation count : 13687788 in 6 samples of 2281298 calls.

Execution time mean : 26.029969 ns

Execution time std-deviation : 3.806416 ns

Execution time lower quantile : 22.853333 ns ( 2.5%)

Execution time upper quantile : 30.453554 ns (97.5%)

Overhead used : 21.437655 ns

;;; sorted seq:

;;; (conj l -9999)

Evaluation count : 15064446 in 6 samples of 2510741 calls.

Execution time mean : 24.059539 ns

Execution time std-deviation : 11.080865 ns

Execution time lower quantile : 18.574134 ns ( 2.5%)

Execution time upper quantile : 43.196391 ns (97.5%)

Overhead used : 21.437655 ns

Found 1 outliers in 6 samples (16.6667 %)

low-severe 1 (16.6667 %)

Variance from outliers : 82.4808 % Variance is severely inflated by outliers

;;; sorted set:

;;; (conj s -9999)

Evaluation count : 1355100 in 6 samples of 225850 calls.

Execution time mean : 476.006908 ns

Execution time std-deviation : 60.102301 ns

Execution time lower quantile : 431.885619 ns ( 2.5%)

Execution time upper quantile : 576.525876 ns (97.5%)

Overhead used : 21.437655 ns

Found 1 outliers in 6 samples (16.6667 %)

low-severe 1 (16.6667 %)

Variance from outliers : 31.5184 % Variance is moderately inflated by outliers

;;; maxiphobe PQ:

;;; (conj p -9999)

Evaluation count : 7766124 in 6 samples of 1294354 calls.

Execution time mean : 59.089168 ns

Execution time std-deviation : 3.717189 ns

Execution time lower quantile : 55.532863 ns ( 2.5%)

Execution time upper quantile : 64.748707 ns (97.5%)

Overhead used : 21.437655 ns

Found 1 outliers in 6 samples (16.6667 %)

low-severe 1 (16.6667 %)

Variance from outliers : 14.8388 % Variance is moderately inflated by outliers

;;; sorted set:

;;; (conj s 9999)

Evaluation count : 1099272 in 6 samples of 183212 calls.

Execution time mean : 553.592811 ns

Execution time std-deviation : 29.862010 ns

Execution time lower quantile : 521.789261 ns ( 2.5%)

Execution time upper quantile : 587.150710 ns (97.5%)

Overhead used : 21.437655 ns

;;; maxiphobe PQ:

;;; (conj p 9999)

Evaluation count : 4952460 in 6 samples of 825410 calls.

Execution time mean : 103.749482 ns

Execution time std-deviation : 2.629640 ns

Execution time lower quantile : 100.909840 ns ( 2.5%)

Execution time upper quantile : 106.530328 ns (97.5%)

Overhead used : 21.437655 ns

3. Compare maxiphobe-as-sort to one-off sorting. Note that the

maxiphobe PQ then supports efficient order-preserving insertion,

whereas the sorted seq only supports prepending efficiently. The

second pair of benchmarks includes a complete traversal of the

returned seq.

(let [xs (interleave (range 100) (range 1000 0 -10))]

(assert (= (sort xs) (pq/pq xs)))

(c/quick-bench (pq/pq xs))

(c/quick-bench (sort xs)))

;;; maxiphobe PQ:

Evaluation count : 10680 in 6 samples of 1780 calls.

Execution time mean : 56.008406 µs

Execution time std-deviation : 367.611690 ns

Execution time lower quantile : 55.374716 µs ( 2.5%)

Execution time upper quantile : 56.302697 µs (97.5%)

Overhead used : 21.434811 ns

;;; sort:

Evaluation count : 15522 in 6 samples of 2587 calls.

Execution time mean : 39.135425 µs

Execution time std-deviation : 911.312830 ns

Execution time lower quantile : 38.147855 µs ( 2.5%)

Execution time upper quantile : 40.423672 µs (97.5%)

Overhead used : 21.434811 ns

Found 1 outliers in 6 samples (16.6667 %)

low-severe 1 (16.6667 %)

Variance from outliers : 13.8889 % Variance is moderately inflated by outliers

(let [xs (interleave (range 100) (range 1000 0 -10))]

(assert (= (sort xs) (pq/pq xs)))

(c/quick-bench (last (pq/pq xs)))

(c/quick-bench (last (sort xs))))

;;; maxiphobe PQ:

;;; (last (pq/pq xs))

Evaluation count : 4878 in 6 samples of 813 calls.

Execution time mean : 116.111769 µs

Execution time std-deviation : 6.373020 µs

Execution time lower quantile : 111.956882 µs ( 2.5%)

Execution time upper quantile : 126.463648 µs (97.5%)

Overhead used : 21.462580 ns

Found 1 outliers in 6 samples (16.6667 %)

low-severe 1 (16.6667 %)

Variance from outliers : 14.2655 % Variance is moderately inflated by outliers

;;; sorted seq:

;;; (last (sort xs))

Evaluation count : 13434 in 6 samples of 2239 calls.

Execution time mean : 45.642911 µs

Execution time std-deviation : 1.385229 µs

Execution time lower quantile : 44.438177 µs ( 2.5%)

Execution time upper quantile : 47.916844 µs (97.5%)

Overhead used : 21.462580 ns

Found 1 outliers in 6 samples (16.6667 %)

low-severe 1 (16.6667 %)

Variance from outliers : 13.8889 % Variance is moderately inflated by outliers

4. Benchmark meld, compare to merging sorted seqs. The PQ operations

are slower, but again, the resulting PQs support performant

order-preserving conj.

(defn merge-lists [xs ys]

(lazy-seq

(if-let [xs (seq xs)]

(if-let [ys (seq ys)]

(let [x (first xs)

y (first ys)]

;; NB. specialized – not going through the default comparator

(if (<= x y)

(cons x (merge-lists (next xs) ys))

(cons y (merge-lists xs (next ys)))))

xs)

ys)))

(let [xs (range 3 999 3)

ys (range 2 1000 2)

sx (sort xs)

sy (sort ys)

px (pq/pq xs)

py (pq/pq ys)]

(assert (= (+ (count xs) (count ys))

(count (merge-lists sx sy))

(count (pq/meld px py))))

(assert (= (merge-lists sx sy) (pq/meld px py)))

(assert (= (last (merge-lists sx sy)) (last (pq/meld px py))))

(c/quick-bench (pq/meld px py))

(c/quick-bench (last (pq/meld px py)))

(c/quick-bench (merge-lists sx sy))

(c/quick-bench (last (merge-lists sx sy)))

[(mapv count [xs ys sx sy px py])

(mapv last [sx sy px py])

(last (merge-lists sx sy))

(last (pq/meld px py))])

;;; initial meld call:

Evaluation count : 1874994 in 6 samples of 312499 calls.

Execution time mean : 321.135792 ns

Execution time std-deviation : 5.338891 ns

Execution time lower quantile : 314.126845 ns ( 2.5%)

Execution time upper quantile : 327.154227 ns (97.5%)

Overhead used : 21.434811 ns

;;; traversal of the melded heap:

Evaluation count : 1086 in 6 samples of 181 calls.

Execution time mean : 554.958613 µs

Execution time std-deviation : 13.638703 µs

Execution time lower quantile : 540.363691 µs ( 2.5%)

Execution time upper quantile : 574.684430 µs (97.5%)

Overhead used : 21.434811 ns

;;; initial merge-lists call:

Evaluation count : 15446616 in 6 samples of 2574436 calls.

Execution time mean : 22.614174 ns

Execution time std-deviation : 1.360240 ns

Execution time lower quantile : 21.169671 ns ( 2.5%)

Execution time upper quantile : 24.090622 ns (97.5%)

Overhead used : 21.434811 ns

;;; traversals of the merged sorted seqs:

Evaluation count : 4998 in 6 samples of 833 calls.

Execution time mean : 126.278406 µs

Execution time std-deviation : 7.495649 µs

Execution time lower quantile : 120.479810 µs ( 2.5%)

Execution time upper quantile : 138.715916 µs (97.5%)

Overhead used : 21.434811 ns

Found 1 outliers in 6 samples (16.6667 %)

low-severe 1 (16.6667 %)

Variance from outliers : 14.6136 % Variance is moderately inflated by outliers

;= [[332 499 332 499 332 499] [996 998 996 998] 998 998]

Reply all

Reply to author

Forward

0 new messages