Julian Fondren <
ayr...@gmail.com> writes:
> OK, the CL takes ~12 hours on 30k values.
> How much time does your code need on 30k values?
The issue is that the CL used a crazy O(n**2) algorithm to do the
lookups. WJ's version, like the one he criticized, was also incredibly
crude, using a beautiful functional programming language to code what
someone once called "a pig trampling in a rose garden". But it used
a hash table and should be able to do 30k values in close to no time.
Here's my Haskell version:
import qualified Data.HashMap.Strict as M
import Distribution.Simple.Utils (ordNub)
ks = ["a","b","c","b","a","f","e","g","h","k","z","k","r","u","f"]
vs = [1,5,8,7,14,8,3,7,9,4,3,21,5,7,9]
mm = foldl (\m (k,v) -> M.insertWith (+) k v m) M.empty $ (zip ks vs)
main = print $ [(k,v) | k <- ordNub ks, let Just v = M.lookup k mm]
Output (linebreak inserted):
[("a",15),("b",12),("c",8),("f",17),("e",3),("g",7),("h",9),
("k",25),("z",3),("r",5),("u",7)]
It does 30k values in about 0.07 seconds on my laptop.