ObjectOpenHashSet performance anomaly

Daniel Rabe

Mar 7, 2023, 11:13:31 AM
to fast...@googlegroups.com

Hi,

 

I have noticed a performance anomaly with ObjectOpenHashSet when adding elements to a new ObjectOpenHashSet in the order we get them back from a different ObjectOpenHashSet. If the new ObjectOpenHashSet is pre-sized to be large enough, there is no problem, but if it isn't, performance can get very bad (many orders of magnitude worse than using the original insertion order).

In the attached test, things work great up to a collection size of about 8,300,000 elements. We can add the generated Strings to an ObjectOpenHashSet in 2-3 seconds, regardless of the order of insertion. But when we try 8,400,000 elements, things go bad - we can add the generated Strings in about 3.6 seconds, but if we try to insert them into another ObjectOpenHashSet in the order the iterator returns them to us, insertion time goes up to 542 seconds!

If I'm patient enough to wait for the 8,500,000 element test to complete, we see that insertion time goes up to 2,472 seconds (41 minutes); for 8,600,000 elements, it goes up to 3,914 seconds (65 minutes). I understand that I "should" pre-size the Set appropriately to start with; in my particular use case, I'm loading data from an external data source, and I can't know ahead of time how many entries there will be. I expect the performance not to depend too heavily on the order of insertion.

I suspect the poor performance may be related to hash collisions, but I am failing to understand why the insertion order makes THAT much of a difference.
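In essence, the attached test does something like the following (a simplified sketch of the pattern, not the exact attached code; the key format and element count are illustrative):

import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;

public class FastutilPerfSketch {
    public static void main(String[] args) {
        final int n = 8_400_000;

        // Fill the first set in "natural" generation order.
        ObjectOpenHashSet<String> original = new ObjectOpenHashSet<>();
        for (int i = 0; i < n; i++) original.add("str-" + i);   // illustrative key format

        // Re-inserting in iteration order into a pre-sized set: fast.
        long t0 = System.currentTimeMillis();
        ObjectOpenHashSet<String> preSized = new ObjectOpenHashSet<>(n);
        for (String s : original) preSized.add(s);
        System.out.println("pre-sized: " + (System.currentTimeMillis() - t0) + " ms");

        // Re-inserting in iteration order into a default-sized set: this is
        // where the time explodes past roughly 8,400,000 elements.
        long t1 = System.currentTimeMillis();
        ObjectOpenHashSet<String> defaultSized = new ObjectOpenHashSet<>();
        for (String s : original) defaultSized.add(s);
        System.out.println("default-sized: " + (System.currentTimeMillis() - t1) + " ms");
    }
}

The pre-sized copy stays fast at every size I tried; the default-sized copy is the one that blows up.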

I am running this test under Java 17, with a 4 GB heap. I don't see heap usage go higher than about 1.2 GB. I'm using fastutil-8.5.11.

Here is the output from the test:

Machine OS: Mac OS X 12.6.3 (x86_64)

Java JDK: Azul Systems, Inc. 17.0.6+10-LTS

Adding 1000000 Strings to the set...

Added to new default-sized set in 363 ms

Iterated the set in 18 ms

Created a new set via constructor in 47 ms

Added in iteration order to a new pre-sized set in 46 ms

Added in iteration order to a new default-sized set in 147 ms

 

Adding 2000000 Strings to the set...

Added to new default-sized set in 628 ms

Iterated the set in 32 ms

Created a new set via constructor in 132 ms

Added in iteration order to a new pre-sized set in 135 ms

Added in iteration order to a new default-sized set in 446 ms

 

Adding 4000000 Strings to the set...

Added to new default-sized set in 1386 ms

Iterated the set in 61 ms

Created a new set via constructor in 170 ms

Added in iteration order to a new pre-sized set in 157 ms

Added in iteration order to a new default-sized set in 538 ms

 

Adding 8000000 Strings to the set...

Added to new default-sized set in 3088 ms

Iterated the set in 188 ms

Created a new set via constructor in 445 ms

Added in iteration order to a new pre-sized set in 350 ms

Added in iteration order to a new default-sized set in 1198 ms

 

Adding 8100000 Strings to the set...

Added to new default-sized set in 3055 ms

Iterated the set in 131 ms

Created a new set via constructor in 412 ms

Added in iteration order to a new pre-sized set in 310 ms

Added in iteration order to a new default-sized set in 1211 ms

 

Adding 8200000 Strings to the set...

Added to new default-sized set in 2784 ms

Iterated the set in 131 ms

Created a new set via constructor in 344 ms

Added in iteration order to a new pre-sized set in 340 ms

Added in iteration order to a new default-sized set in 1484 ms

 

Adding 8300000 Strings to the set...

Added to new default-sized set in 3098 ms

Iterated the set in 140 ms

Created a new set via constructor in 497 ms

Added in iteration order to a new pre-sized set in 462 ms

Added in iteration order to a new default-sized set in 2274 ms

 

Adding 8400000 Strings to the set...

Added to new default-sized set in 3324 ms

Iterated the set in 152 ms

Created a new set via constructor in 509 ms

Added in iteration order to a new pre-sized set in 491 ms

Added in iteration order to a new default-sized set in 542879 ms

 

Adding 8500000 Strings to the set...

Added to new default-sized set in 3635 ms

Iterated the set in 145 ms

Created a new set via constructor in 413 ms

Added in iteration order to a new pre-sized set in 348 ms

Added in iteration order to a new default-sized set in 2472647 ms

 

Adding 8600000 Strings to the set...

Added to new default-sized set in 4200 ms

Iterated the set in 154 ms

Created a new set via constructor in 621 ms

Added in iteration order to a new pre-sized set in 565 ms

Added in iteration order to a new default-sized set in 3914421 ms

 

fastutilPerf.java

Dawid Weiss

Mar 7, 2023, 11:32:18 AM
to fast...@googlegroups.com

I think it's a known problem; the cause is explained a bit here:

I can't remember what the workaround was in fastutil but I'm sure Sebastiano will chime in with some comments soon.

Dawid


Sebastiano Vigna

Mar 7, 2023, 11:49:02 AM
to fast...@googlegroups.com


> On 7 Mar 2023, at 16:12, Daniel Rabe <dra...@comcast.net> wrote:
>
> Hi,
> I have noticed a performance anomaly with ObjectOpenHashSet, when adding elements to a new ObjectOpenHashSet in the order we get them back from a different ObjectOpenHashSet. If the new ObjectOpenHashSet is pre-sized to be large enough, there is not a problem, but if it isn't, performance can get very bad (many orders of magnitude worse than using the original insertion order).
> In the attached test, things work great up to a collection size of about 8,300,000 elements. We can add the generated Strings to an ObjectOpenHashSet in 2-3 seconds, regardless of the order of insertion. But when we try 8,400,000 elements, things go bad - we can add the generated Strings in about 3.6 seconds, but if we try to insert them into another ObjectOpenHashSet in the order the iterator returns them to us, insertion time goes up to 542 seconds!

See the discussion here:

https://github.com/vigna/fastutil/issues/53

This is well known and there's basically no way around it without slowing down normal operations. It's an ugly artifact of the type of linear probing used in fastutil.

I take it you're not using addAll() to do the insertion; that would fix the problem, as addAll() pre-sizes the target set (which might be inefficient, admittedly).
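Something along these lines (just a sketch, with illustrative names):

import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;

final class AddAllSketch {
    // addAll() sizes the target from original.size() before inserting,
    // so the iteration-order avalanche never materializes.
    static ObjectOpenHashSet<String> rebuild(ObjectOpenHashSet<String> original) {
        ObjectOpenHashSet<String> target = new ObjectOpenHashSet<>();
        target.addAll(original);
        return target;
    }
}

The copy constructor, new ObjectOpenHashSet<>(original), behaves the same way (your test output shows it is equally fast).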

Ciao,

seba

Daniel Rabe

Mar 21, 2023, 11:01:35 AM
to fast...@googlegroups.com
Thank you - those references are very helpful! 

My scenario is very similar to https://github.com/vigna/fastutil/issues/53, in that custom serialization is involved, and at deserialization time I don't have enough information to pre-size the Set.

I'll take a closer look at HPPC's ObjectHashSet, as it sounds like they solved this problem with a slight CPU overhead (which I'm willing to pay, given how much memory the implementation saves compared to java.util.HashSet).


Sebastiano Vigna

Mar 21, 2023, 12:09:15 PM
to fast...@googlegroups.com


> On 21 Mar 2023, at 15:01, Daniel Rabe <dra...@comcast.net> wrote:
>
> Thank you - those references are very helpful!
>
> My scenario is very similar to https://github.com/vigna/fastutil/issues/53, in that custom serialization is involved, and at deserialization time I don't have enough information to pre-size the Set.
>
> I'll take a closer look at HPPC's ObjectHashSet, as it sounds like they solved this problem with a slight CPU overhead (which I'm willing to pay, given how much memory the implementation saves compared to java.util.HashSet).

Consider that you can always super-size and then trim(); that might have a little overhead at creation time, but no overhead when using the data structure afterwards.
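That is, roughly (a sketch; the upper bound is whatever generous estimate you can make for your data source):

import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
import java.util.Iterator;

final class SupersizeAndTrim {
    // generousUpperBound is a hypothetical over-estimate of the final element count.
    static ObjectOpenHashSet<String> load(Iterator<String> source, int generousUpperBound) {
        // Allocate large enough that no rehash happens while loading...
        ObjectOpenHashSet<String> set = new ObjectOpenHashSet<>(generousUpperBound);
        while (source.hasNext()) set.add(source.next());
        // ...then shrink the backing table to fit the actual size.
        set.trim();
        return set;
    }
}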

Ciao,

seba

Dawid Weiss

Mar 21, 2023, 12:41:14 PM
to fast...@googlegroups.com

Maybe the serialization algorithm should prepend the desired size for the data structure so that it deserializes without problems?
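For example (a sketch of a hypothetical size-prefixed format; writeUTF/readUTF are used only for illustration):

import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class SizePrefixedFormat {
    static void write(ObjectOpenHashSet<String> set, DataOutputStream out) throws IOException {
        out.writeInt(set.size());                 // the sizing hint
        for (String s : set) out.writeUTF(s);
    }

    static ObjectOpenHashSet<String> read(DataInputStream in) throws IOException {
        final int size = in.readInt();
        // Pre-sizing from the hint means the read-back insertion order is harmless.
        ObjectOpenHashSet<String> set = new ObjectOpenHashSet<>(size);
        for (int i = 0; i < size; i++) set.add(in.readUTF());
        return set;
    }
}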

With regard to HPPC, and if you're curious, this is what's currently implemented (with due thanks to Vincent Sonnier and Bruno Roustant): https://issues.carrot2.org/browse/HPPC-186

Mind you, HPPC has no serialization infrastructure at all - you'd have to implement that externally (which isn't terribly difficult). Also, fastutil offers a lot more data structures than HPPC, so it may be wiser to improve the serialization (for example with the sizing hint) than to switch libraries. :)

D.

Sebastiano Vigna

Mar 21, 2023, 1:50:19 PM
to fast...@googlegroups.com


On March 21, 2023 4:40:44 PM GMT+00:00, Dawid Weiss <dawid...@gmail.com> wrote:

>
>With regard to HPPC, and if you're curious, this is what's currently
>implemented (with due thanks to Vincent Sonnier and Bruno Roustant):
>https://issues.carrot2.org/browse/HPPC-186

OMG NOBODY TOLD ME ABOUT THIS 😭

Dawid Weiss

Mar 21, 2023, 1:56:33 PM
to fast...@googlegroups.com
>With regard to HPPC, and if you're curious, this is what's currently
>implemented (with due thanks to Vincent Sonnier and Bruno Roustant):
>https://issues.carrot2.org/browse/HPPC-186

>OMG NOBODY TOLD ME ABOUT THIS 😭

Lol. :) Not a secret, really - I just didn't think to mention it to you, sorry. I think the improvement is a tradeoff: iteration is not as cache-friendly as a linear scan. So it's not all rosy. But it's certainly an interesting concept and it seems to work in practice.

D.
 

Sebastiano Vigna

Mar 21, 2023, 2:05:10 PM
to fast...@googlegroups.com


On March 21, 2023 3:01:01 PM GMT+00:00, Daniel Rabe <dra...@comcast.net> wrote:
>Thank you - those references are very helpful!
>
>My scenario is very similar to https://github.com/vigna/fastutil/issues/53, in that custom serialization is involved, and at deserialization time I don't have enough information to pre-size the Set.
>

If the serialization is custom, you could subclass the set and implement HPPC's idea, that is, scan the key array with a skip (say, an odd number approximating the golden ratio in fixed point) so that when you read the keys back they are not laid out adversarially. The advantage is that you'd have just a small overhead at serialization time, where I/O would probably be dominant anyway...
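Roughly like this (a sketch only; it scrambles a toArray() copy rather than scanning the internal key array, so it doesn't touch fastutil's protected fields, and the writeUTF encoding is just for illustration):

import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
import java.io.DataOutputStream;
import java.io.IOException;

final class ScrambledWriter {
    // Write the elements while walking them with a stride close to
    // length * golden ratio, so that reading them back in that order
    // does not recreate the adversarial table layout.
    static void write(ObjectOpenHashSet<String> set, DataOutputStream out) throws IOException {
        final Object[] elements = set.toArray();  // stand-in for scanning the key array directly
        final int m = elements.length;
        out.writeInt(m);
        if (m == 0) return;
        int stride = (int) Math.round(m * 0.6180339887498949) | 1;  // odd
        while (gcd(stride, m) != 1) stride += 2;  // must be coprime with m to visit every slot once
        for (int i = 0, pos = 0; i < m; i++, pos = (pos + stride) % m)
            out.writeUTF((String) elements[pos]);
    }

    private static int gcd(int a, int b) {
        while (b != 0) { final int t = a % b; a = b; b = t; }
        return a;
    }
}

Reading back can then be the same size-prefixed loop as in the earlier sketch; by the time the keys are re-inserted they are no longer in the old probe order.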

Daniel Rabe

Mar 21, 2023, 4:34:31 PM
to fast...@googlegroups.com
Ok, I guess I had been looking at some of the older HPPC docs that discussed the difference between their HashSet and ScatterSet. I thought the perturbation value used when rehashing (HPPC-115) was brilliant, because it guards against the collision avalanche (great term, by the way!) regardless of where the input was coming from (the result of iterating another HashSet, or some other mechanism that accidentally put things in a "bad" insertion order). But now I see that ScatterSet and HashSet were later merged, with the randomized iteration order taking the place of the perturbation value (HPPC-186).

So if the only advantage of HPPC over fastutil in this case is the iteration order, then I can certainly take a look at some of the other ideas you have presented and maybe stick with fastutil.

Dawid Weiss

unread,
Mar 21, 2023, 4:52:08 PM3/21/23
to fast...@googlegroups.com
 
>because it guards against the collision avalanche (great term, by the way!) regardless of where the input was coming from (the result of iterating another HashSet, or some other mechanism that accidentally put things in a "bad" insertion order).

You're correct. The simplicity of the iterator approach has its appeal though...

D.