I'm working on a JSON encoder that in theory could be 30-200% faster than Jackson.
I have a working prototype.
I made a few tradeoffs, and one of them is that it never resizes the underlying buffer.
You pass it a pre-allocated buffer, encode your object into it, and it returns a ByteBuffer that you can use to get at the encoded JSON.
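To make the contract concrete, here's a rough sketch of what I mean — the class and method names (`FixedBufferEncoder`, `encode`) are illustrative, not my actual prototype's API, and the body just fakes the serialization step:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: the point is the zero-resize contract --
// the caller owns the buffer, and the encoder never grows it.
final class FixedBufferEncoder {
    ByteBuffer encode(Object value, byte[] preallocated) {
        // A real implementation would serialize `value`; this stand-in
        // writes a fixed payload to demonstrate the buffer handling.
        byte[] json = "{\"ok\":true}".getBytes(StandardCharsets.UTF_8);
        if (json.length > preallocated.length) {
            // No resize ever happens -- the caller must supply a bigger buffer.
            throw new IllegalStateException("buffer too small");
        }
        System.arraycopy(json, 0, preallocated, 0, json.length);
        // Return a view over the caller's array: position 0, limit = bytes written.
        return ByteBuffer.wrap(preallocated, 0, json.length);
    }
}
```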
Now I'm curious about the best way to share the memory across threads/cores.
If you had a pool of buffers, how would you go about picking a buffer that was allocated on the same NUMA node?
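For context, the naive pool I have in mind is just a lock-free free list — nothing NUMA-aware, so a buffer can come back to a thread on a different node than the one that allocated it:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Strawman pool: buffers go to whichever thread polls first,
// with no regard for which NUMA node originally touched the pages.
final class BufferPool {
    private final ConcurrentLinkedQueue<byte[]> free = new ConcurrentLinkedQueue<>();
    private final int size;

    BufferPool(int count, int size) {
        this.size = size;
        for (int i = 0; i < count; i++) free.add(new byte[size]);
    }

    byte[] acquire() {
        byte[] b = free.poll();
        // Fall back to a fresh allocation if the pool is drained.
        return (b != null) ? b : new byte[size];
    }

    void release(byte[] b) {
        free.add(b);
    }
}
```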
I think I can cheat and just access the buffer directly, because since it's a byte[] it would be cached in the L1/L2 cache.
However, I'm going to have to read a new cache line every cache_size/buffer_length intervals, and my code will have to wait for that (probably getting context-switched).
I can't really use the Thread ID directly, because if you have 100 threads you will end up with either too many buffers or too few.
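The closest compromise I've considered is striping: a fixed number of buffers, indexed by thread id modulo the stripe count, so the buffer count no longer depends on the thread count. Sketch only — the obvious downside is that two threads hashing to the same stripe contend for it, which is exactly what I'm trying to avoid:

```java
// Striping sketch: N buffers for any number of threads.
// Threads that hash to the same stripe would still need external
// synchronization before touching the shared buffer.
final class StripedBuffers {
    private final byte[][] stripes;

    StripedBuffers(int nStripes, int size) {
        stripes = new byte[nStripes][size];
    }

    byte[] forCurrentThread() {
        int idx = (int) (Thread.currentThread().getId() % stripes.length);
        return stripes[idx];
    }
}
```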
I guess I could use JNA to call into libnuma and allocate node-local memory.
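Something along these lines — completely untested, and it assumes Linux with libnuma installed plus the jna jar on the classpath:

```java
import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.Pointer;

// Untested sketch: bind libnuma through JNA so each worker thread can
// allocate its buffer on its own NUMA node.
public interface LibNuma extends Library {
    LibNuma INSTANCE = Native.load("numa", LibNuma.class);

    // Returns -1 if the kernel has no NUMA support; check before using the rest.
    int numa_available();

    // Allocates pages on the node the calling thread is currently running on.
    Pointer numa_alloc_local(long size);

    void numa_free(Pointer p, long size);
}
```

The returned `Pointer` could then be exposed as a direct buffer with `p.getByteBuffer(0, size)`, though that means the encoder would have to work against ByteBuffer rather than byte[].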
Is there a decent pattern here? Maybe something from Disruptor?