I'm working on a JVM implementation of Haskell called Eta  and I'm looking for ways to make the lazy evaluation implementation more efficient. In GHC, Haskell's native compiler, they have an optimisation called "dynamic pointer tagging." Effectively what they do is store the evaluated status as the least significant bit(s) of the pointer address to the heap object. When they want to refer to the object itself, they'll "untag" the pointer using a bitmask.
It has shown that this leads to a 10% performance improvement because checking whether a thunk (suspended expression evaluation) is evaluated or not is a pretty frequent operation. Now the easiest way to do this is simply to store a boolean field in a thunk that stores the evaluation status, but this leads to a significant memory increase since a boolean is represented by an int and thunks are one of the most common types of objects in the Eta runtime. So I want a cheap way (in both space and time) to check whether a thunk is evaluated or not.
Use the bits in the object header that aren't used, such as the locking bits. The Eta runtime does not use the native locking mechanism at all and locking in the concurrent part of the runtime is implemented via atomic operations in sun.misc.Unsafe. It seems a shame that all that unused space is just lying there. I want to know whether the JVM makes key decisions based on the values of the locking bits. And I also want to know if the GC touches these bits (meaning if I set the bits, will I lose the evaluation status after a GC?) and any other consequences of pursuing this rather hacky solution.
Another property of being evaluated is that it's a one-time operation and once the transition goes from 0 to 1, it stays at 1 for the lifetime of the object, effectively like a switch. The reason I bring this up is that I think such a behaviour can be achieved via invokedynamic, but I'm not sure if it's worth pursuing that direction.