SwissTable / F14 in Java

Roman Leventov

unread,

Oct 9, 2019, 6:11:01 AM10/9/19

to hashtable-...@googlegroups.com

My very long and laborious detour to implementing SwissTable (https://github.com/TimeAndSpaceIO/SmoothieMap/blob/2.x/src/main/java/io/timeandspace/smoothie/SwissTable.java) and F14 in Java (SmoothieMap's segments are fixed-sized, 8-group F14 tables with 8 slots in each group) has lead to meh results: https://medium.com/@leventov/smoothiemap-2-the-lowest-memory-hash-table-ever-6bebd06780a3#042b, but the picture might improve with Project Panama's Vector API for intrinsics.

Roman Perepelitsa

unread,

Oct 9, 2019, 6:24:34 AM10/9/19

to Roman Leventov, hashtable-benchmarks

On Wed, Oct 9, 2019 at 12:11 PM Roman Leventov <leven...@gmail.com> wrote:

My very long and laborious detour to implementing SwissTable (https://github.com/TimeAndSpaceIO/SmoothieMap/blob/2.x/src/main/java/io/timeandspace/smoothie/SwissTable.java) and F14 in Java (SmoothieMap's segments are fixed-sized, 8-group F14 tables with 8 slots in each group) has lead to meh results: https://medium.com/@leventov/smoothiemap-2-the-lowest-memory-hash-table-ever-6bebd06780a3#042b, but the picture might improve with Project Panama's Vector API for intrinsics.

The lengthy discussion and conclusions derived from a single synthetic benchmark may be misleading. See https://groups.google.com/d/msg/hashtable-benchmarks/2-cbY4cJfIc/-31tk8izDgAJ. The only conclusion I can draw from the graph is that the three lines are spookily similar. Unless this is an artifact of an unusual scale (is this truncated log scale? cannot tell), it's suspicious. Usually different hashtable implementations have large differences in performance, at least for some table sizes.

Roman.

Roman Leventov

unread,

Oct 9, 2019, 6:34:45 AM10/9/19

to Roman Perepelitsa, hashtable-benchmarks

Yes, that's log scale. Actual data points are here: https://docs.google.com/spreadsheets/d/1g-Et7zAbjSJddx2pHEQKuWn6fu_yIfWxOaxGdwLlUOs/edit?usp=sharing. There are map sizes at which HashMap is almost 2 times faster than SmoothieMap, and vice versa. And these are consistent results at the given map sizes.

I've experimented with various benchmarks (unsuccessful lookup, insert, String key, etc, different GC algorithms) and they all produced completely different results past small map sizes, e. g. where memory access (locality) and fitting data into cache factors start to dominate. No common ground whatsoever. So I omitted this data and reflected this in the discussion in the post.

Reply all

Reply to author

Forward