> I might have overstated the cost of the class cast, but it's still
> non-trivial.
> Take a map of 1 million elements and 0.5 fill factor (these are averages
> over 1000s of trials):
> I can iterate over the entire map of 1 mil elements in 3ms, including
> allocated checks.
> I can pull out the specific keys in question in 5ms
> I can pull out the values as Object in 7ms
> but to coerce them into double[] takes maybe 18ms.
Your estimates feel very wrong to me. Don't know how you measure these
but it just doesn't seem right. Hotspot is very clever in terms of
class casts and optimizing code away, especially in tight loops. You
should *not* try to estimate every operation in isolation because it's
simply not how JVM works at lower level.
> This is a very interesting idea. I like it.
> You're proposing something like:
>
> IntIntOpenHashMap indexMap;
> double[][] data;
> int nextIndex;
No. What I proposed was to use:
IntIntOpenHashMap indexMap;
double[] data;
int firstEmptySlotInData;
double[][] is basically an array of objects (of type double[]) so the
overhead is significant. If you know there will be a few values in
each list associated with a key then you can use a single double[] and
encode the list in "blocks" of consecutive elements, overallocating a
bit. So the first block is, say, a maximum of 3 elements + header, the
next one is 7 elements + header etc. The header (first double on the
list) is typically used to encode the number of elements in the block
(if negative, for example) or the jump pointer to the next block.
All this requires some bit-fiddling and trickiness (like encoding
integer values using the first double -- Double.doubleToRawLongBits,
etc.) but will conserve memory.
> overhead of the class casting, but regardless, it can't possibly be good. I
> bet I can get a 20%+ perf improvement by avoiding it.
Again, I think you're trying to overoptimize things that you shouldn't
worry about until you really hit a performance problem.
D.