Feature request: forEach() Predicates also for maps, custom hash startegies for Object keys / Sets

Vincent Sonnier

unread,

Aug 1, 2013, 3:27:29 AM8/1/13

to java-high-performance...@googlegroups.com

Hello,

HPPC is great, but I feel it lacks some features I consider interesting from Trove :

- forEach() with Procedures, the possibility to stop it early, for example in order to perform a search-like iteration, with potential performance gain.

Indeed, the Predicates behave like that, but only apply to sequential containers. Maybe adding Predicates to maps

could be great as well ?

- Custom Hashing startegies. I understand that MurmurHash3 has supernatural powers, so could be considered optimal for primitive keys or sets.

Unfortunately, this only aplies for primitive types, and I'm also interested in classical Object keys/sets containers

usage because of the memory efficiency and lack of dynamic allocation.

In this case, we are stuck with the native Object.equals() / Object.hash() and the defects of it could destroy perfomance, especially in case of the linear probing used.

Either for performance optimization, of because the Object is kind of external to the current project or cannot really be altered, maybe a custom hash startegy (redefinition of equals() / hash() for the particular type) a la Trove could be added for Object keys maps and Object sets.

Regards,

Vincent

Dawid Weiss

unread,

Aug 1, 2013, 12:50:20 PM8/1/13

to java-high-performance...@googlegroups.com

There is always a feature worth adding ;) And there's always a
question what impact does it have on performance and how many actual
users a given feature will have. Please consider forking and
submitting a pull request for the things you have mentioned -- we will
definitely consider pulling them in if they don't affect performance
(and are not too complex). Otherwise people will still be able to use
them from your fork, that's the beauty of open source.

Dawid

> --
> You received this message because you are subscribed to the Google Groups
> "High Performance Primitive Collections for Java" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to
> java-high-performance-primi...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Vincent Sonnier

unread,

Aug 1, 2013, 4:01:46 PM8/1/13

to java-high-performance...@googlegroups.com

Thanks for the suggestion. Nevertheless, I'm not going to fork in the immediate future, because

I'm just currently evaluating several libraries for our current project.

We have indeed 2 "competitors" at that point: Trove and HPPC.

Trove is mostly feature complete in terms of API, but only really have Maps/Sets, while HPPC

has more containers.

The previous features I suggested was the ones that looked especially interesting for us, but lacking in HPPC.

However, those missing features are not blocking though, so would HPPC be finally selected, I probably would have some time budget to fork it to add the missing features.

As a side note, Fastutil would have been considered too, but since apparently the canonical Iterator way is the only available for container traversial, that creates dynamically allocated objects, and that is a no-no for us.

Regards,

Vincent

> java-high-performance-primitive-collections+unsub...@googlegroups.com.

Dawid Weiss

unread,

Aug 1, 2013, 5:19:15 PM8/1/13

to java-high-performance-primitive-collections

Writing from my phone so quickly. Fastutil has both boxed and unboxed iterators. It also supports most of the things you have requested of HPPC so I would reconsider choosing it. Dawid

To unsubscribe from this group and stop receiving emails from it, send an email to java-high-performance-primi...@googlegroups.com.

Vincent Sonnier

unread,

Aug 3, 2013, 4:33:58 AM8/3/13

to java-high-performance...@googlegroups.com

Good morning, (I think)

I don't know why my last message was eaten out (maybe I never pressed "publish")

Well, I said that the drawback of Iterators is that they are dynamically allocated, i.e every time you need to iterate a given container a new instance needs to be created.

Anyway, I finally registered to GitHub, forked your repo under the name "vsonnier", run Maven, load Eclipse project, and such. So much easier to browse the code at least.

About my forEach() remarks before, I see now that Predicates on keys() or values() views of maps are quite enough for lots of practical needs, so finally I'm not going to toy with that.

I'm still sticking to my idea of custom hash/equals strategies though, and will try something about it in the future.

Another idea I got while seeing implementation of clear(), is the possibility of using batch blanking using System.arraycopy() instead of Arrays.fill() to reset values (especially if sizes are power of 2!) in order to gain some perfomance in case of big sizes, or poor code generation for Arrays.fill() which is not a JVM intrinsic I think.

Alas, not everybody is using Oracle JVM, or even JITing.

Have a nice week-end,

Vincent

> java-high-performance-primitive-collections+unsubscribe@googlegroups.com.

> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Dawid Weiss

unread,

Aug 3, 2013, 4:09:22 PM8/3/13

to java-high-performance...@googlegroups.com

> Well, I said that the drawback of Iterators is that they are dynamically
> allocated, i.e every time you need to iterate a given container a new
> instance needs to be created.

I didn't check but I'm pretty sure you can expose the internals by
subclassing and exposing protected stuff from there. Anyway, whatever
suits you best of course :)

> Another idea I got while seeing implementation of clear(), is the
> possibility of using batch blanking using System.arraycopy() instead of
> Arrays.fill() to reset values (especially if sizes are power of 2!) in order

> [snip]

> Alas, not everybody is using Oracle JVM, or even JITing.

What do you mean by batch blanking? In general even if .fill is not an
intrinsic then it'll be jitted into pretty darn fast code -- bounds
checking will be eliminated, loop unrolled, etc. I really don't worry
much about people who cannot run a fairly recent (and decent)
implementation of the JVM -- for those special needs you'll have to
cater in special ways anyway (most of the time) by writing
JVM-specific (or device-specific) code. If you're stuck with such an
environment, your bad luck. Fork and tweak what you need.

All this said, any improvements are welcome, Vincent. Thanks for your
efforts. Even if they stay on a fork/ branch these things are still
out there for future reference (if somebody comes along and asks for
them) and will be credited to you appropriately.

Dawid

Vincent Sonnier

unread,

Aug 3, 2013, 6:08:23 PM8/3/13

to java-high-performance...@googlegroups.com

> Well, I said that the drawback of Iterators is that they are dynamically
> allocated, i.e every time you need to iterate a given container a new
> instance needs to be created.

I didn't check but I'm pretty sure you can expose the internals by
subclassing and exposing protected stuff from there. Anyway, whatever
suits you best of course :)

I'll check that out. A reset-able Iterator would be cool.

> Another idea I got while seeing implementation of clear(), is the
> possibility of using batch blanking using System.arraycopy() instead of
> Arrays.fill() to reset values (especially if sizes are power of 2!) in order
> [snip]
> Alas, not everybody is using Oracle JVM, or even JITing.

What do you mean by batch blanking? In general even if .fill is not an
intrinsic then it'll be jitted into pretty darn fast code -- bounds
checking will be eliminated, loop unrolled, etc. I really don't worry
much about people who cannot run a fairly recent (and decent)
implementation of the JVM -- for those special needs you'll have to
cater in special ways anyway (most of the time) by writing
JVM-specific (or device-specific) code. If you're stuck with such an
environment, your bad luck. Fork and tweak what you need.

Yes, exactly my situation: No JITting, rather poor code generation.

For the "blanking", it means using a preallocated array representing "nulls", and using

System.arraycopy() to overwrite the destination. Such preallocated array would be not so big, say 1024 elements,

an so the destination array would be reset by applying several times arraycopy(), shifting the destination offset each time.

Since System.arraycopy() is an intrinsic, probably ASM hand made its performance is huge everywhere, and

in my case certainly faster that an Arrays.fill() of sufficient size.

All this said, any improvements are welcome, Vincent. Thanks for your
efforts. Even if they stay on a fork/ branch these things are still
out there for future reference (if somebody comes along and asks for
them) and will be credited to you appropriately.

Well, indeed I've started a branch about the hash strategy.

I pushed a first commit minutes ago on GitHub :

As far as I can see, for now primitive keys map/sets are generated exactly as before,

and the Objects versions look generated properly with an HashStrategy.

I've not investigated anything yet, except that the existing tests seems to pass. (mvn clean compile test)

Enough for now, Have a good night !

Vincent

Dawid Weiss

unread,

Aug 6, 2013, 2:55:08 AM8/6/13

to java-high-performance...@googlegroups.com

> I'll check that out. A reset-able Iterator would be cool.

I think you should be able to write one. For VMs with a TLAB
(thread-local allocation buffer) minor allocations such as getting the
iterator are not a major issue. If you reiterate a *lot* then I'd just
get direct access to the underlying buffers and iterate over them
manually (it's what we do at certain performance-crucial points in our
commercial code).

> Yes, exactly my situation: No JITting, rather poor code generation.

Yeah. This is really an exception from the general population I think.
Can you tell what exactly you're using? Just curious -- if this cannot
be disclosed, fine.

> For the "blanking", it means using a preallocated array representing
> "nulls", and using System.arraycopy() to overwrite the destination.

I checked and it seems array.fill is not an intrinsic. It probably
isn't because it jits and inlines so well so they didn't consider it
crucial. We could do partial-memcpy but I think this is an overkill,
to be honest. I'll benchmark it to make sure.

D.

Vincent Sonnier

unread,

Aug 6, 2013, 3:44:26 PM8/6/13

to java-high-performance...@googlegroups.com

Can you tell what exactly you're using? Just curious -- if this cannot
be disclosed, fine.

Well I can't tell you, sorry.

> For the "blanking", it means using a preallocated array representing
> "nulls", and using System.arraycopy() to overwrite the destination.

I checked and it seems array.fill is not an intrinsic. It probably
isn't because it jits and inlines so well so they didn't consider it
crucial. We could do partial-memcpy but I think this is an overkill,
to be honest. I'll benchmark it to make sure.

Indeed, I've done so, using a copycat of your ObjectObjectOpenHashMapBenchmark using a int ==> long and Integer ==> long map.

On all cases, primitves clear() are always super fast, simply because there is nothing to clear except the array of booleans for bookeeping.

On the other hand, Integer ==> long is several times slower, but still very fast on JDK.

On our specific solution, their may have some gain in batch clearing, but only in case of arrays of Objects. So I've commited

something on GitHub about this.

For now, JDK looks very marginally faster (2% at best), tomorrow I'll confirm if its worth it on our own solution.

Finally, using the same kind of ObjectObjectOpenHashMapBenchmark-like benchmark, I've tested my strategies vs v0.52 using a 5 Million element map: (10 warmup runs JDK6_30)

A) int ==> long behaves exactly like v0.52, small suprise here since the generated code is the same,

B) with Integer ==> long, the "default" strategy, equivalent to v0.52 functionality, is no more than 2.5% slower.

C)Then, If we use a custom trivial strategy (equals is comparing Integer.intValue(), hash() is == Integer.intValue()) we are close to 5% slower.

Note this is an incredibly trivial case, where equals() / hash() methods cannot really be faster in practice, so I think 2.5% difference is not bad at all. Hail to the JIT !

What about our solution, you ask ? well, slowdown is about in the 20% mark between v0.52 and my customized v0.60. Luckily, the important part is to check

that there are still performance gains compared to the actual libraries we are using :)

Vincent

Dawid Weiss

unread,

Aug 6, 2013, 4:17:48 PM8/6/13

to java-high-performance...@googlegroups.com

> Well I can't tell you, sorry.

Not an issue. :)

> On all cases, primitves clear() are always super fast, simply because there
> is nothing to clear except the array of booleans for bookeeping.

Yep, exactly.

> On the other hand, Integer ==> long is several times slower, but still very
> fast on JDK.

Interesting. I'll take a look at your code tomorrow, if I find the time.

> cannot really be faster in practice, so I think 2.5% difference is not bad
> at all. Hail to the JIT !

No, this isn't bad. What I'm slightly worried about is what's going to
happen if you have a call site that is not homogenic (calls different
strategy implementations) -- this may end up being rejitted as a
virtual call and then the performance will drop significantly. But
then -- who cares if we only have one implementation at the moment
anyway :)

There's also a side-effect on the local CPU cache -- if you use memcpy
you actually fetch two memory pages instead of just the one being
cleared... I don't think this will surface in practice though (not at
this level of abstraction anyway).

Dawid

Vincent Sonnier

unread,

Aug 7, 2013, 3:02:17 AM8/7/13

to java-high-performance...@googlegroups.com

Good morning,

I came to think that the "default strategy" approach, although elegant, is indeed not very smart.

In my initial commit of my branch: https://github.com/vsonnier/hppc/commit/2dc112fb99c2c380d349ecd89569e92e776ea8b0

I simply applied the following pattern for computing equality or hashcode :

if (strategy == null)

use Object.equals()/hashCode()

else

use strategy.

If the JIT is smart enough to inline an anonymous strategy instance to be almost as fast as before, it will be really

bad luck not to be able to inline/optimize away a simple branch for which strategy handle is constant in practice.

Besides, for much less smart solutions, a null test and a branch is probably better than a virtual call if you see what I mean.

It is all the more easy because the "== null" test is a specific bytecode, so can be perfectly identified by any VM.

I'll test again tomorrow with the reverted changes.

In the meantime, I'll check for the "batch blanking" effect on our non-Oracle VM.

Vincent

Vincent Sonnier

unread,

Aug 7, 2013, 6:42:55 AM8/7/13

to java-high-performance...@googlegroups.com

About the "batch blanking" benchmarks, I understand the situation now.

Runing your own ObjectObjectOpenHashMapBenchmark to compare with or without is actually not meaningful ,because there is actually only 1 clear

operation per test, in the middle of all the other operations. Comparing the global duration then leads to very small difference, because clear() is actually quite fast anyway.

On the other hand, when measuring clear() alone performed on an Integer ==> long map, with or without "batch blanking" leads to the batch blanking to be consistently

about TWICE as fast as the original clear() method.

This is observed in clearing a 5 Million element, and the speed gain is almost the same on JDK6 or in our non-Oracle VM.

Vincent

Dawid Weiss

unread,

Aug 7, 2013, 6:46:33 AM8/7/13

to java-high-performance...@googlegroups.com

Thanks Vincent. Can you prepare a pull request with just this change?

Dawid

> --
> You received this message because you are subscribed to the Google Groups
> "High Performance Primitive Collections for Java" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to

> java-high-performance-primi...@googlegroups.com.

Vincent Sonnier

unread,

Aug 7, 2013, 2:10:05 PM8/7/13

to java-high-performance...@googlegroups.com

A sent a pull request right now :)

Vincent

Reply all

Reply to author

Forward