I've started trying Incanter, but I've run into serious performance problems with relatively small datasets.
Maybe Incanter/Clojure just isn’t meant to be used this way, but I figured I would ask first. This is the most basic form of the problem:
(require 'incanter.core) ; load the namespace so the fully qualified names resolve
(def a (incanter.core/to-dataset {:a (range 10000) :b (range 5 10005)}))
(time (count (incanter.core/$ :a a)))
Elapsed time: 2188.47 msecs
Which isn’t awful, but it’s not great, I guess.
But if I increase the size of that dataset to just 100k instead of 10k, the same operation takes much longer:
(def a (incanter.core/to-dataset {:a (range 100000) :b (range 5 100005)}))
(time (count (incanter.core/$ :a a)))
Elapsed time: 238673.793 msecs
The increase in time is far from linear. Hopefully this is something obvious that I'm misunderstanding and that is easily fixed.
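For what it's worth, here is a quick check of how the time scales, using only the two timings above and assuming they are representative (the names `t-10k`, `t-100k`, and `exponent` are just mine for illustration):

```clojure
;; Timings copied from the two runs above, in msecs.
(def t-10k 2188.47)
(def t-100k 238673.793)

;; If time grows like n^k, then k = log(t2/t1) / log(n2/n1).
;; The row count went up by a factor of 10 (10k -> 100k).
(def exponent
  (/ (Math/log (/ t-100k t-10k))
     (Math/log 10.0)))

(println exponent) ; comes out a little above 2
```

An exponent near 2 would suggest roughly quadratic growth in the number of rows, though two data points are not much to go on.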
Thanks in advance for any advice.