I was looking for a way to speed up some heavy computations
we run at work, so I wrote a small benchmark that:
1. reads 5000 sparse vectors from disk
2. computes the inner products of the last 100 vectors
   with all the vectors before them
The representation that we use for sparse vectors is
hash tables. I have written this small test case in
several languages, including PIR. In every program the load
time is under 2 seconds and the math is quite simple, so
what this really stress-tests is the hash table code.
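For reference, step 2 could look roughly like this in C++ with
unordered_map<int,float>. This is only a sketch, not the program that
was timed: the names SparseVec, dot and tail_products are made up
here, and it assumes that each of the last vectors is multiplied with
every vector that precedes it.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Sparse vector: maps an index to its non-zero value.
// (SparseVec is a hypothetical name, not taken from the benchmark.)
using SparseVec = std::unordered_map<int, float>;

// Inner product of two sparse vectors: walk the smaller table and
// probe the larger one, so only shared indices contribute.
float dot(const SparseVec& a, const SparseVec& b) {
    const SparseVec& small = a.size() <= b.size() ? a : b;
    const SparseVec& large = a.size() <= b.size() ? b : a;
    float sum = 0.0f;
    for (const auto& kv : small) {
        auto hit = large.find(kv.first);
        if (hit != large.end()) sum += kv.second * hit->second;
    }
    return sum;
}

// Assumed reading of step 2: each of the last `tail` vectors is
// multiplied with every vector that has a lower position.
std::vector<float> tail_products(const std::vector<SparseVec>& vecs,
                                 std::size_t tail) {
    std::vector<float> out;
    std::size_t start = vecs.size() > tail ? vecs.size() - tail : 0;
    for (std::size_t i = start; i < vecs.size(); ++i)
        for (std::size_t j = 0; j < i; ++j)
            out.push_back(dot(vecs[i], vecs[j]));
    return out;
}
```

With 5000 vectors and tail = 100 this is a bit under half a million
calls to dot, and nearly all of the work in each call is hash lookups.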
Here are the results (some of them surprising):

           time  mem      notes
clojure    32    449536
racket     70    168060
c++        40    75180    map<int,float>
c++        43    68960    unordered_map<int,float>
perl       33    117904
lua        26    108540
luajit     6     68072
haskell    23    1027504  Data.IntMap Float (compiled)
parrot     28    360992   Hash PMC_keys PMC_vals
parrot     15    263628   Hash int_keys PMC_vals
So, my small comment: we are not bad at all. LuaJIT comes
first, but we are quite fast even without a JIT.
The biggest disappointment for me is the statically typed
compiled languages. They had all the time to optimize
the code and all the type information needed to use
native, unboxed values, but their performance is quite bad:
C++ takes 7x the time of the fastest (LuaJIT) and Haskell
uses 10x the memory.
Another observation: it looks like luajit infers key and
value datatypes and stores them unboxed.
I hope you find this interesting
Best regards
--
Luben Karavelov
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev
Can you also try parrot with int keys and float values? Just out of curiosity.
--
Bacek
I was looking into making it possible (yesterday I pushed a commit
that makes iterating Hash PMCs with int keys possible), but it is not
possible at the moment.
There is also some uncertainty about how to proceed. On my PC (amd64)
FLOATVAL is defined as a synonym for double (64 bits). Hash values
have the maximum size of a pointer, and there are architectures where
sizeof(void *) < sizeof(double)
(i386, to be precise).
So if we want to store FLOATVAL directly in hash tables we have 2
choices:
1. Make FLOATVAL = float (32 bit) on platforms where
   sizeof(void *) < sizeof(double)
   I am not sure if this is the right choice - what guarantees do we
   give for FLOATVALs (Num registers)?
2. Make the hashval field big enough for doubles - it will impose some
   penalty on 32-bit platforms
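To make the trade-off concrete, here is a rough C++ sketch of the two
options. It is not Parrot code: PointerSlot, WideSlot and the helper
names are hypothetical stand-ins for the hash value field, used only
to show where the size and precision costs come from.

```cpp
#include <cstddef>

// Hypothetical stand-in for a hash value that is limited to pointer size.
struct PointerSlot {
    void* raw;
};

// Option 2 (assumed layout): widen the slot so a double always fits.
union WideSlot {
    void*  ptr;
    double num;
};

// True on platforms where a double already fits in a pointer-sized slot
// (for example amd64); false where sizeof(void *) < sizeof(double).
constexpr bool double_fits_in_pointer = sizeof(double) <= sizeof(void*);

// Extra bytes per stored value that option 2 costs on this platform.
// Zero when pointers are already at least as wide as a double.
constexpr std::size_t widening_cost = sizeof(WideSlot) - sizeof(PointerSlot);

// Option 1 narrows values to float: report whether a given double
// survives the trip through a 32-bit float unchanged.
inline bool survives_float_roundtrip(double x) {
    return static_cast<double>(static_cast<float>(x)) == x;
}

static_assert(sizeof(WideSlot) >= sizeof(double),
              "the widened slot must be able to hold a double");
```

Under option 1 a value such as 0.1 no longer round-trips exactly on
the affected platforms; under option 2 every hash entry on those
platforms grows by widening_cost bytes, whether or not it holds a
float.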
Either way, this will not be the main use case for hash tables. I am
not even sure whether Rakudo can do hashes with int keys.
If we have consensus on how to proceed, I am willing to code it even
if it is not the major use case.
Better use of integer keys is a very good thing.
--Andrew Whitworth
Whiteknight, you missed it. It's all about FLOATVAL _values_. And you
are right, FLOATVAL keys don't make any sense.
--
Bacek