[ann] persistentsummary - persistent summaries for immutable scala collections

rklaehn

Feb 1, 2016, 1:38:36 PM
to scala-user
Hi all,

I just published a tiny library that lets you define persistent summaries / aggregations for immutable Scala collections. The currently supported collections are Vector, TreeSet/TreeMap and HashSet/HashMap, so basically all tree-based immutable collections except IntMap/LongMap.

https://github.com/rklaehn/persistentsummary

Here is a simple example:

val sumOfElements = new Summary[Int, Int] {
  def empty = 0
  def combine(a: Int, b: Int) = a + b
  def apply(value: Int) = value
}

val sum = PersistentSummary.vector(sumOfElements)

val xs = Vector(0 until 10000: _*)

println(sum(xs))

// update an element. Most of the underlying tree of the vector will be reused
val xs1 = xs.updated(5000, 20000)

// will reuse most node summaries from last call
println(sum(xs1))

Calculating the sum of the updated collection xs1 is very quick, because xs1 reuses most of the tree structure of xs, and the summaries for those shared nodes are already known.
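
To make the reuse concrete, here is a minimal sketch of the general idea. It is not the library's implementation: it uses a toy persistent binary tree rather than the internals of Vector, and the names Tree, Leaf, Branch, SumCache and SummaryDemo are made up for illustration. Summaries are memoized per node by reference identity, so after a path-copying update only the freshly created nodes miss the cache.

```scala
import java.util.IdentityHashMap

// Toy persistent binary tree (hypothetical; NOT the layout of scala.collection.immutable.Vector).
sealed trait Tree
final case class Leaf(value: Int) extends Tree
final case class Branch(left: Tree, right: Tree) extends Tree

object Tree {
  // Build a balanced tree over xs(from until until); the range must be non-empty.
  def build(xs: IndexedSeq[Int], from: Int, until: Int): Tree =
    if (until - from == 1) Leaf(xs(from))
    else {
      val mid = from + (until - from) / 2
      Branch(build(xs, from, mid), build(xs, mid, until))
    }

  // Path-copying update: only the nodes on the path to index i are recreated,
  // every other subtree is shared with the original tree.
  def updated(t: Tree, size: Int, i: Int, v: Int): Tree = t match {
    case Leaf(_) => Leaf(v)
    case Branch(l, r) =>
      val leftSize = size / 2
      if (i < leftSize) Branch(updated(l, leftSize, i, v), r)
      else Branch(l, updated(r, size - leftSize, i - leftSize, v))
  }
}

// Memoizes the sum of each node, keyed by reference identity.
// Identity is used because hashing a case-class node structurally would walk the whole subtree.
final class SumCache {
  private val cache = new IdentityHashMap[Tree, java.lang.Integer]()
  var computed = 0 // number of cache misses, i.e. node summaries actually computed

  def sum(t: Tree): Int = {
    val hit = cache.get(t)
    if (hit != null) hit.intValue
    else {
      computed += 1
      val s = t match {
        case Leaf(v)      => v
        case Branch(l, r) => sum(l) + sum(r)
      }
      cache.put(t, Int.box(s))
      s
    }
  }
}

object SummaryDemo {
  val xs = Vector.range(0, 1024)
  val cache = new SumCache
  val t0 = Tree.build(xs, 0, xs.size)
  val total0 = cache.sum(t0)                    // first call: every node is a miss
  private val missesAfterFirst = cache.computed
  val t1 = Tree.updated(t0, xs.size, 500, 20000)
  val total1 = cache.sum(t1)                    // second call: only new nodes are misses
  val recomputed = cache.computed - missesAfterFirst
}
```

In this toy tree, recomputed equals the length of the path from the root to the changed leaf, while the first call has to visit every node.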

There is a less trivial example in the docs. The implementation uses a Guava cache, which you can configure to suit your needs. For example, you might want to keep only a limited number of node summaries by using an LRU cache.
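
As an illustration of the bounded-cache idea only, the sketch below uses java.util.LinkedHashMap in access order as a stand-in for a size-limited cache. It is not Guava and not the library's API; LruCache, maxEntries and LruDemo are hypothetical names. With Guava itself, size-based eviction is configured through CacheBuilder.maximumSize; how a cache is handed to the library is described in the repository, not here.

```scala
import java.util.{LinkedHashMap, Map => JMap}

// Hypothetical stand-in for a size-bounded cache of node summaries.
// accessOrder = true makes iteration order go from least to most recently used.
final class LruCache[K, V](maxEntries: Int)
    extends LinkedHashMap[K, V](16, 0.75f, true) {
  // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
  override protected def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
    size() > maxEntries
}

object LruDemo {
  val cache = new LruCache[String, String](2)
  cache.put("a", "summary-a")
  cache.put("b", "summary-b")
  cache.get("a")              // touch "a", so "b" becomes the least recently used entry
  cache.put("c", "summary-c") // exceeds the limit of 2, so "b" is evicted
}
```

The trade-off is the usual one: a smaller limit bounds memory, but summaries for evicted nodes have to be recomputed the next time they are needed.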

The current implementation necessarily relies on some implementation details of the Scala collections. It would be great if a future version of the Scala collections offered hooks to do something like this in a cleaner way.


Cheers,

Rüdiger

