Thanks for your questions!
My job configuration supports multiple window sizes at the same time, and I want to run all the windows over the same buffer. I evict based on the largest window size (i.e. headMap(maxWindowEarliest).clear()), then take time-based submaps for the smaller windows via tailMap(eachWindowEarliest).
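For concreteness, here's a minimal sketch of that layout, assuming a plain java.util.TreeMap keyed by event timestamp (class and method names are mine, not from any real job):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: several window sizes sharing one sorted buffer.
final class MultiWindowBuffer {
    private final NavigableMap<Long, Double> buffer = new TreeMap<>();
    private final long[] windowSizesMs; // ascending; the largest drives eviction

    MultiWindowBuffer(long... windowSizesMs) {
        this.windowSizesMs = windowSizesMs;
    }

    void append(long timestampMs, double value) {
        buffer.put(timestampMs, value);
    }

    // Evict based on the largest window only: drop everything strictly older
    // than (now - maxWindow), i.e. headMap(maxWindowEarliest).clear().
    void evict(long nowMs) {
        long maxWindow = windowSizesMs[windowSizesMs.length - 1];
        buffer.headMap(nowMs - maxWindow).clear();
    }

    // A smaller window is a live submap view, i.e. tailMap(eachWindowEarliest);
    // taking the view is O(log n), no copying.
    NavigableMap<Long, Double> window(long nowMs, long windowMs) {
        return buffer.tailMap(nowMs - windowMs, true);
    }

    int size() {
        return buffer.size();
    }
}
```

Note that headMap(key) is exclusive of the cutoff key and clearing the view mutates the backing map, which is exactly the eviction behaviour described above.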
The subsetting is presumably accelerated by using a sorted map, although there will certainly be a memory/compute tradeoff compared to using a simpler data structure, depending on the window size. At the moment I'm trying to optimize for heap usage, because this is per-session state in a streaming job, and we want to carry a lot of active sessions on each node when we go live.
I guess in the degenerate case where a job has only 1-2 window sizes, I might be better off with 1-2 linked lists. 🤔
btw I've heard rumours over the years that linked lists don't always suffer from memory locality issues, largely thanks to thread-local allocators... But in my case, since I'm running in Apache Flink, and there are typically multiple seconds between appends, I think it's pretty unlikely that the same JVM thread that created a buffer will also be the one appending to it, so I expect I'd suffer from memory bandwidth issues if I used linked lists.
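For that degenerate case, one alternative sketch that sidesteps the per-node pointer chasing entirely is an array-backed deque per window, evicting from the head on each append (again, all names here are hypothetical, and this assumes appends arrive in timestamp order):

```java
import java.util.ArrayDeque;

// Hypothetical sketch: one window size, one array-backed FIFO buffer.
// ArrayDeque stores elements contiguously, so there is no per-node
// allocation or pointer chasing as with a linked list.
final class SingleWindowBuffer {
    // Each entry is {timestampMs, Double.doubleToLongBits(value)}.
    private final ArrayDeque<long[]> entries = new ArrayDeque<>();
    private final long windowMs;

    SingleWindowBuffer(long windowMs) {
        this.windowMs = windowMs;
    }

    void append(long timestampMs, double value) {
        entries.addLast(new long[] { timestampMs, Double.doubleToLongBits(value) });
        // Assuming in-order appends, the oldest entries are always at the
        // front, so eviction is a loop of pollFirst() calls.
        long earliest = timestampMs - windowMs;
        while (!entries.isEmpty() && entries.peekFirst()[0] < earliest) {
            entries.pollFirst();
        }
    }

    int size() {
        return entries.size();
    }
}
```

The obvious cost is that with two window sizes you either duplicate the data across two deques or fall back to the shared sorted buffer, which is the memory/compute tradeoff mentioned above.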
It's really easy to burn a lot of time on these micro-optimizations... I call it a Dangerously Fun Engineering Problem :)
-0xe1a
--
You received this message because you are subscribed to the Google Groups "fastutil" group.