Seeking good benchmark code for immutable hash and hash set usage

40 views
Skip to first unread message

Jon Zeppieri

unread,
Aug 14, 2019, 8:08:22 PM8/14/19
to racket users list
Hello Racketeers,

I'm looking for examples of code that would make good benchmarks for
evaluating the performance of immutable hashes.

Some background:
Immutable hashes used to be implemented in Racket as red-black trees.
That was changed to hash array mapped tries (HAMTs) a number of years
back. In Racket CS, the current implementation is a Patricia trie.
That choice was based largely on the fact that it performed
significantly faster on the hash demo benchmarks[1] that any of the
HAMT variants I tested against. In particular, the write performance
of the HAMTs seemed especially poor.

However, on some real-world code[2] I recently had occasion to test, a
HAMT consistently performs better than the Patricia trie, _especially_
on writes, which makes me think I put entirely too much weight on the
hash demo benchmark.

So I'm looking for more realistic cases to use for benchmarking. If
you have a Racket application or library that makes heavy use of
immutable hashes or hash sets (from the racket/set module), please let
me know.

Thanks,
Jon

[1] https://github.com/racket/racket/blob/master/racket/src/cs/demo/hash.ss
I also tried to use the expander demo as a benchmark, but the timings
weren't significantly different with different hash implementations.

[2] https://github.com/racket/racket/pull/2766#issuecomment-520173585
Well, it's a lot closer to real-world code than the hash demo.

Ben Greenman

unread,
Aug 21, 2019, 10:04:27 PM8/21/19
to Jon Zeppieri, racket users list
A few of the GTP benchmarks [1] use immutable hashes. Here's a link
with the ones that look most interesting:

http://ccs.neu.edu/home/types/tmp/gtp-hash.tar.gz


And here's a small (untyped) program that uses code from the tr-pfds
library to make a trie:

http://ccs.neu.edu/home/types/tmp/trie.tar.gz


Redex and the graph library might be good places to look for other
examples. I know graph uses lots of hashes internally, and I bet Redex
does too.

[1] https://github.com/bennn/gtp-benchmarks
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/racket-users/CAKfDxxwjPcWewzo2X5uXGH06Ud1hjURNuKFW59duXrZq4-7tWQ%40mail.gmail.com.
>

Jon Zeppieri

unread,
Aug 22, 2019, 9:55:31 AM8/22/19
to Ben Greenman, racket users list
Thanks Ben! - Jon

Robby Findler

unread,
Aug 22, 2019, 10:33:40 AM8/22/19
to Ben Greenman, Jon Zeppieri, racket users list
It looks like the pattern unifier uses hashes. I'm not sure if it they
would end up being good benchmarks (probably best to try to instrument
the hashes to see if they actually get used a lot or not) but there
are redex benchmarks that measure how long it takes to find specific
bugs and one of the generators they use involves the pattern unifier.

https://docs.racket-lang.org/redex/benchmark.html

I'm happy to help provide more pointers if it seems useful

Robby
> To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/CAFUu9R4dwNWS28MyApTYUkiQ-%2BhV%3DuWPBAF77xMV-wNsDMduLw%40mail.gmail.com.

Robby Findler

unread,
Aug 22, 2019, 10:34:58 AM8/22/19
to Ben Greenman, Jon Zeppieri, racket users list
(sorry for the self-followup; the unifier uses immutable hashes; grep
suggests the other hashes that get used a lot are mutable hashes)
Reply all
Reply to author
Forward
0 new messages