Hash benchmarks

Luben Karavelov

unread,

Mar 14, 2012, 8:30:01 PM3/14/12

to parro...@lists.parrot.org

Hello guys,

I was looking for a way to speed up some heavy computations
we are making @work so I have written small benchmark that:

1. reads 5000 sparse vectors from disk
2. computes inner product of the last 100 vectors
with all the vectors before them

The representation that we are using for sparse vectors is
hash tables. I have written this small test case in
several languages, including PIR. In every program the load
time is under 2 seconds and the math is quite simple, so
what this is stress testing hash tables code.

Here are the results (some of them surprising)

time mem
clojure 32 449536
racket 70 168060
c++ 40 75180 map<int,float>
c++ 43 68960 unordered_map<int,float>
perl 33 117904
lua 26 108540
luajit 6 68072
haskell 23 1027504 Data.IntMap Float (compiled)
parrot 28 360992 Hash PMC_keys PMC_vals
parrot 15 263628 Hash int_keys PMC_vals

So, my small comment: we are not bad at all. Luajit comes
first but we are quite fast even without JIT.

The biggest disappointment for me are statically typed
compiled languages - they had all the time to optimize
the code, they had proper information in order to use
native, unboxed values but their performance is quite bad
C++ uses 7x the time of the first (luajit) and haskel
uses 10x times the memory.

Another observation: it looks like luajit infers key and
value datatypes and stores them unboxed.

I hope you find this interesting
Best regards

--
Luben Karavelov
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev

Vasily Chekalkin

unread,

Mar 14, 2012, 8:51:37 PM3/14/12

to Luben Karavelov, parro...@lists.parrot.org

Yay!

Can you also try parrot with int keys and float values? Just for curiosity.

--
Bacek

_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev

Luben Karavelov

unread,

Mar 14, 2012, 9:36:50 PM3/14/12

to Vasily Chekalkin, parro...@lists.parrot.org

On Thu, 15 Mar 2012 11:51:37 +1100, Vasily Chekalkin <ba...@bacek.com>
wrote:

> Yay!
>
> Can you also try parrot with int keys and float values? Just for
> curiosity.
>
> --
> Bacek

I was looking to make it possible (yesterday I pushed a commit to make
iterating
Hash PMCs with int keys possible) but it is not at the moment.

There as also some uncertainty how to proceed. At my PC(amd64) FLOATVAL
is
defined as synonym for double (64 bits). Hash values have maximum size
of a
pointer and there are architectures that have
sizeof(void *) < sizeof(double)
i386 to be precise.

So if we want to store FLOATVAL directly in hash tables we have 2
choises:

1. Make FLOATVAL = float (32 bit) on platforms where
sizeof(void *) < sizeof(double)

I am not sure if this is the right choice - what guarantees do we give
for
FLOATVALs (Num registers)?

2. make hashval field big enough for doubles - it will pose some
penalty on
32bit platforms

In either case it will not be the major case of use of hash tables. I
am not
even sure if Rakudo could do hashes with int keys

If we have consensus how to proceed I am willing to code it even if it
is not
the major use case.

Andrew Whitworth

unread,

Mar 16, 2012, 10:45:11 AM3/16/12

to Luben Karavelov, parro...@lists.parrot.org

I don't think we need to make hashes with FLOATVAL keys possible. I
can't imagine a single use-case where such a thing would be worth the
added costs. This is especially true when you consider that common
arithmetic roundoff errors of wider formats is going to make equality
testing of floats difficult and error-prone.

Better use of integer keys is a very good thing.

--Andrew Whitworth

_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev

Vasily Chekalkin

unread,

Mar 16, 2012, 7:39:15 PM3/16/12

to Andrew Whitworth, parro...@lists.parrot.org

Hello.

Whiteknight, you missed it. It's all about FLOATVAL _values_. Any you
are right, FLOATVAL keys doesn't make any sense.

--
Bacek

Reply all

Reply to author

Forward