Do primitive maps still return arbitrary values for nonexistent keys?

Michael Ekstrand

May 15, 2018, 4:02:43 PM
to High Performance Primitive Collections for Java
The JavaDoc for e.g. LongDoubleMap.get says the following:
 Important note: For primitive type values, the value returned for a non-existing key may not be the default value of the primitive type (it may be any value previously assigned to that slot).

Is that still accurate? I examined the code for LongDoubleHashMap.get, and it seems that it returns the empty value (0d; Intrinsics.<VType>empty()) when it does not find the key. Am I missing some subtle edge case that causes it to return old values?

Thank you,
- Michael

Dawid Weiss

May 15, 2018, 4:08:41 PM
to java-high-performance-primitive-collections
Correct, I think this has changed at some point... although note the
'may' in the quote you provided... I wouldn't rely on the default and
use a method that is semantically more explicit (like getOrDefault,
for example).

Dawid
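
To illustrate the difference between `get` and `getOrDefault` discussed above, here is a minimal sketch of an open-addressing long-to-double map. It is not HPPC's implementation: the class name `TinyLongDoubleMap`, the fixed capacity, and the convention that key 0 marks an empty slot are all assumptions made for this example.

```java
/**
 * Illustrative fixed-capacity open-addressing map (NOT HPPC's code).
 * Assumption: key 0 marks an empty slot, so 0 cannot be stored as a key.
 */
final class TinyLongDoubleMap {
    private final long[] keys;
    private final double[] values;
    private final int mask;
    private int size;

    TinyLongDoubleMap(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        keys = new long[capacity];
        values = new double[capacity];
        mask = capacity - 1;
    }

    /** Linear probing: returns the slot holding the key, or the empty slot where it would go. */
    private int slotOf(long key) {
        if (key == 0) throw new IllegalArgumentException("key 0 is reserved in this sketch");
        int slot = Long.hashCode(key) & mask;
        while (keys[slot] != 0 && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    void put(long key, double value) {
        int slot = slotOf(key);
        if (keys[slot] == 0) {
            // Keep one slot free so that probing always terminates.
            if (size == mask) throw new IllegalStateException("map is full");
            keys[slot] = key;
            size++;
        }
        values[slot] = value;
    }

    boolean containsKey(long key) {
        return keys[slotOf(key)] == key;
    }

    /** Returns 0.0 for a missing key, so a stored 0.0 and "absent" look the same. */
    double get(long key) {
        return getOrDefault(key, 0d);
    }

    /** The caller chooses what "absent" looks like. */
    double getOrDefault(long key, double defaultValue) {
        int slot = slotOf(key);
        return keys[slot] == key ? values[slot] : defaultValue;
    }

    public static void main(String[] args) {
        TinyLongDoubleMap m = new TinyLongDoubleMap(8);
        m.put(42L, 0.0);
        System.out.println(m.get(42L));                      // 0.0 (present)
        System.out.println(m.get(7L));                       // 0.0 (absent)
        System.out.println(m.getOrDefault(7L, Double.NaN));  // NaN (absent, explicit)
    }
}
```

With `get` alone, a stored 0.0 and a missing key cannot be told apart; `getOrDefault` (or a `containsKey` check) makes the caller's intent explicit regardless of what the map guarantees.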

Michael Ekstrand

May 16, 2018, 2:59:38 PM
to High Performance Primitive Collections for Java
On Tuesday, May 15, 2018 at 2:08:41 PM UTC-6, Dawid Weiss wrote:
> Correct, I think this has changed at some point... although note the
> 'may' in the quote you provided... I wouldn't rely on the default and
> use a method that is semantically more explicit (like getOrDefault,
> for example).

Is there a good reason not to update the documentation to guarantee an 'empty' return? Some potential future performance benefit?

I understand that HPPC makes very different decisions from JCL for a lot of things, and that is in general a good thing. Expectation violation, however, makes it more difficult to adopt HPPC, especially for less-expert developers. I am not worried about the clearly at-your-own-risk features like slot-index-based access and exposed internals; equivalent methods that (may) violate expectations unnecessarily are concerning, though. My initial impression when I saw that note in the docs was 'that is quite a footgun'.

I'm considering adopting HPPC for LensKit (it's so much smaller than fastutil, and its APIs fit very well with some of our custom data structures), but I want the code to be accessible to newer developers.

Dawid Weiss

May 16, 2018, 3:28:03 PM
to java-high-performance-primitive-collections
> Is there a good reason not to update the documentation to guarantee an
> 'empty' return? Some potential future performance benefit?

Once you guarantee something you'll have to stick to it -- this is
probably the only reason I can think of. HPPC is very much driven by
selfish needs, so we leave certain things under- rather than
overspecified.

In this case I think historically we found out the hard way that it
was a better idea to return the default value (rather than something
random). And yes, I think we'll stick to it, so if you feel like
providing a pull request via github, it'd be great.

> I understand that HPPC makes very different decisions from JCL for a lot of
> things, and that is in general a good thing.

Don't know whether it's good, but certainly different. ;) I do
personally feel like HPPC belongs to the world of Java before streams
and we shouldn't try to make it a first-class citizen in streams world
(what Koloboke does, for example). The memory footprint is definitely
one aspect where it matters.

> makes it more difficult to adopt HPPC, especially for less-expert developers.

I'd really not recommend HPPC for less-expert developers... I mean: it
only matters if you crunch a lot of numbers in a repeated way (and it
should be an expert replacement for typical collections which run just
fine in most cases).

> I'm considering adopting HPPC for LensKit (it's so much smaller than
> fastutil, and its APIs fit very well with some of our custom data
> structures), but I want the code to be accessible to newer developers.

Think carefully about what I mentioned above and consider the
consequences, Michael. LensKit sounds great and is definitely a good
match for HPPC, but learning a different collection abstraction may be
a showstopper for people coming to the project. Now, this is a shot in
the foot, but I'm just being honest about how I feel.

Dawid

Michael Ekstrand

May 16, 2018, 4:34:28 PM
to High Performance Primitive Collections for Java
On Wednesday, May 16, 2018 at 1:28:03 PM UTC-6, Dawid Weiss wrote:
>> Is there a good reason not to update the documentation to guarantee an
>> 'empty' return? Some potential future performance benefit?
>
> Once you guarantee something you'll have to stick to it -- this is
> probably the only reason I can think of. HPPC is very much driven by
> selfish needs, so we leave certain things under- rather than
> overspecified.

Yes - we take a similar approach in many places in LensKit.
 
> In this case I think historically we found out the hard way that it
> was a better idea to return the default value (rather than something
> random). And yes, I think we'll stick to it, so if you feel like
> providing a pull request via github, it'd be great.

Done.
 
>> I understand that HPPC makes very different decisions from JCL for a lot of
>> things, and that is in general a good thing.
>
> Don't know whether it's good, but certainly different. ;) I do
> personally feel like HPPC belongs to the world of Java before streams
> and we shouldn't try to make it a first-class citizen in streams world
> (what Koloboke does, for example). The memory footprint is definitely
> one aspect where it matters.

That makes sense. If we adopt it, and if we want to use stream ops, I'll probably just make a utility method that exposes a double stream from a long iterable. We've got them for other things, and will need more if we use HPPC (for interoperability, we'll need to create shims that implement `List<Long>` on top of `LongList`).
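
A rough sketch of what such shims could look like follows. The `IndexedLongs` interface is a hypothetical stand-in for a primitive long list (it is not HPPC's `LongList` API, whose method names may differ), and `LongAdapters` is an invented utility class.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

/** Hypothetical adapter utilities; names are assumptions, not an existing library API. */
final class LongAdapters {
    private LongAdapters() {}

    /** Read-only List<Long> view: boxes on access, copies nothing. */
    static List<Long> asList(IndexedLongs longs) {
        return new AbstractList<Long>() {
            @Override
            public Long get(int index) {
                if (index < 0 || index >= longs.size()) {
                    throw new IndexOutOfBoundsException("index: " + index);
                }
                return longs.get(index);
            }

            @Override
            public int size() {
                return longs.size();
            }
        };
    }

    /** Exposes the elements as a primitive stream without boxing. */
    static LongStream stream(IndexedLongs longs) {
        return IntStream.range(0, longs.size()).mapToLong(longs::get);
    }

    /** Convenience wrapper over an array, used only to exercise the adapters. */
    static IndexedLongs of(long... values) {
        return new IndexedLongs() {
            @Override
            public int size() {
                return values.length;
            }

            @Override
            public long get(int index) {
                return values[index];
            }
        };
    }

    public static void main(String[] args) {
        IndexedLongs longs = of(3L, 1L, 4L);
        System.out.println(asList(longs));        // [3, 1, 4]
        System.out.println(stream(longs).sum());  // 8
    }
}

/** Hypothetical stand-in for a primitive long list. */
interface IndexedLongs {
    int size();

    long get(int index);
}
```

Because `AbstractList` only needs `get` and `size` for an unmodifiable list, the view stays small; mutation methods throw `UnsupportedOperationException` by default.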
 
>> I'm considering adopting HPPC for LensKit (it's so much smaller than
>> fastutil, and its APIs fit very well with some of our custom data
>> structures), but I want the code to be accessible to newer developers.
>
> Think carefully about what I mentioned above and consider the
> consequences, Michael. LensKit sounds great and is definitely a good
> match for HPPC, but learning a different collection abstraction may be
> a showstopper for people coming to the project. Now, this is a shot in
> the foot, but I'm just being honest about how I feel.

Absolutely. It's a tough road, and we've screwed it up in the past (earlier versions of LensKit had this thing called a 'SparseVector', which was weird and didn't behave like anything anyone expected).

We're basically looking at 3 options if we move away from fastutil:
  • HPPC + custom adapters for JCL compat + our custom sorted array LongDoubleMap implementation.
  • Eclipse Collections + custom adapters for JCL compat + our custom sorted array LongDoubleMap implementation. I'm pretty hesitant about this because of all the additional machinery it brings with it, much of which is redundant with Guava and Java Streams, both of which we're using for our non-primitive data structure manipulations.
  • Roll our own implementations of the few collections we really need (long array list, long double hash map, long hash set, maybe long int hash map and double array list).
Our own implementations would wind up looking a lot like HPPC - it basically has the API we need. The weird documented behavior of 'get' was really the only thing I saw in my review that gave me pause, in part because we rely on default-0 in a number of places.
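
For context, a sorted-array map of the kind mentioned above might look roughly like this. It is a simplified sketch under assumed names (`SortedArrayLongDoubleMap` here is not LensKit's actual class), showing how a default of 0 for missing keys lets sparse arithmetic skip an explicit membership check.

```java
import java.util.Arrays;

/** Immutable long-to-double map over parallel arrays sorted by key (illustrative sketch). */
final class SortedArrayLongDoubleMap {
    private final long[] keys;      // strictly increasing
    private final double[] values;  // values[i] belongs to keys[i]

    SortedArrayLongDoubleMap(long[] sortedKeys, double[] values) {
        if (sortedKeys.length != values.length) {
            throw new IllegalArgumentException("keys and values differ in length");
        }
        for (int i = 1; i < sortedKeys.length; i++) {
            if (sortedKeys[i - 1] >= sortedKeys[i]) {
                throw new IllegalArgumentException("keys must be strictly increasing");
            }
        }
        this.keys = sortedKeys.clone();
        this.values = values.clone();
    }

    int size() {
        return keys.length;
    }

    boolean containsKey(long key) {
        return Arrays.binarySearch(keys, key) >= 0;
    }

    /** Missing keys yield 0.0; this is the behavior callers rely on. */
    double get(long key) {
        return getOrDefault(key, 0d);
    }

    double getOrDefault(long key, double defaultValue) {
        int i = Arrays.binarySearch(keys, key);
        return i >= 0 ? values[i] : defaultValue;
    }

    public static void main(String[] args) {
        SortedArrayLongDoubleMap ratings = new SortedArrayLongDoubleMap(
                new long[] {2, 5, 9}, new double[] {1.5, 0.0, -3.0});
        double sum = 0;
        for (long item : new long[] {2, 7, 9}) {
            sum += ratings.get(item);  // item 7 is absent and contributes 0
        }
        System.out.println(sum);  // -1.5
    }
}
```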

I very much appreciate your input. It's not a decision we're making lightly.

- Michael