So I was pretty happy with getting the CachingResourceManager plugged into my code, but I figured I would run some tests to make sure it was
actually caching stuff. Turns out it doesn't work. It actually has a number of fairly serious issues that seem to make it a no-op. I'm curious whether anyone has ever really tested it before, or if everyone just assumes it's working since it doesn't error. I spent a couple of days reverse engineering how it's supposed to work, and I'll try to break down all the issues I ran into categorically. Some of these issues are also in the DirectBufferCache and likely affect other uses of it outside of the CachingResourceManager. The good news is I have what appear to be fixes for many of these issues locally, so I can try to help with pull requests.
CacheEntry claimEnable() has code paths that leave invalid cache items
So the cyclomatic complexity of CachedResource's serve() method is pretty crazy. There are a ton of code paths which use non-idempotent methods like cacheItem.reference() or calls that require "un-doing" like cacheItem.claimEnable(). A number of those code paths leave cache entries in invalid states. Most notably, in its current form, a cached resource won't actually get cached until the SECOND time it's served (more on why below), but the serve() method tends to leave CacheEntries with an enabled status of "1", which means they can never be claimed by another thread, basically killing them. I had to completely refactor CachedResource.serve() to ensure any claimed CacheEntry which didn't get wrapped by the ResponseCachingSender was careful to disable the entry so it could get picked up by the next thread.
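To illustrate the invariant I refactored toward, here's a minimal sketch. This is a hypothetical model, not Undertow's actual classes: the states, the `handedToSender` flag, and the `serve()` helper are all illustrative. The point is the try/finally shape: any entry we claim but do not hand off to the caching sender must be disabled again, or no other thread can ever claim it.

```java
// Sketch of the claim/disable invariant (hypothetical model, not Undertow code).
public class ClaimSketch {
    static final int DISABLED = 0, CLAIMED = 1, ENABLED = 2;

    static class Entry {
        int state = DISABLED;
        boolean claimEnable() {            // only one thread may win the claim
            if (state != DISABLED) return false;
            state = CLAIMED;
            return true;
        }
        void enable()  { state = ENABLED; }
        void disable() { state = DISABLED; }
    }

    // 'handedToSender' models whether the entry was wrapped by a
    // ResponseCachingSender that takes over enabling/disabling it.
    static int serve(Entry e, boolean handedToSender) {
        boolean claimed = e.claimEnable();
        try {
            if (claimed && handedToSender) {
                e.enable();                // sender completed the cache fill
            }
            return e.state;
        } finally {
            if (claimed && !handedToSender) {
                e.disable();               // release the claim for the next thread
            }
        }
    }

    public static void main(String[] args) {
        Entry e = new Entry();
        serve(e, false);                   // a path that skips the caching sender
        System.out.println(e.state);       // back to DISABLED, claimable again
    }
}
```

With this shape, a code path that bails out before wrapping the sender can no longer strand the entry in the claimed state.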
CacheEntries aren't allocated right away, and are only allocated once the get() hit sampling runs (every 5 hits)
So this is the reason it takes TWO hits to a cached resource to actually cache it. There is no logic anywhere that actually asks newly created cache entries to allocate their buffers! This means the first time through CachedResource.serve(), the entry is unallocated, so it can't be cached. The ONLY code that eventually allocates an entry is inside the hit-sampling counter, which attempts to purge items from the cache to see if it can finally allocate buffers for unallocated entries. So it's only after 5 "hits" (the SAMPLE_INTERVAL) to the cache entry that it finally gets allocated. This clearly appears to be an oversight, as there's no reason not to immediately attempt buffer allocation for each new cache entry.
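The fix I'd propose looks roughly like this. Again, a sketch with made-up names (`add()`, `tryAllocate()`, a slice counter), not Undertow's API: attempt buffer allocation the moment an entry is added, and leave the hit sampler as a fallback for entries that couldn't get buffers at add time.

```java
// Hypothetical model of "allocate on add" instead of "allocate after 5 hits".
public class AllocSketch {
    static class Entry {
        boolean allocated;
        boolean tryAllocate(int freeSlices, int needed) {
            if (!allocated && freeSlices >= needed) allocated = true;
            return allocated;
        }
    }

    static int freeSlices = 8;   // stand-in for the slice pool's free regions

    // add() tries to allocate immediately; the sampling/purge path would only
    // need to retry for entries that failed here.
    static Entry add(int slicesNeeded) {
        Entry e = new Entry();
        if (e.tryAllocate(freeSlices, slicesNeeded)) freeSlices -= slicesNeeded;
        return e;
    }

    public static void main(String[] args) {
        Entry e = add(2);
        System.out.println(e.allocated); // allocated on the FIRST request, not the second
    }
}
```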
CacheEntries log way more hits than they should
I mentioned above that it takes 5 "hits" to a cache entry before it's allocated, so you may be wondering how it gets cached after only two HTTP requests. The reason is that the CacheEntry receives FOUR hits for every HTTP request due to multiple dataCache.get() calls throughout the code (such as getting the content length). So by the 2nd HTTP request, the cacheEntry is getting its 5th (through 8th) "hit", which finally manages to allocate it. To reduce contention and make the "hits" more accurate, I would introduce a "getQuiet()" method on the direct buffer cache which fetches an item from the cache without bumping the hit counter and triggering the cache reaping. I would have the CachedResource.serve() method be the only place that uses "get()" and have all the other places use "getQuiet()" so there is only a single "hit" registered per HTTP request serving the file. Ironically, all the extra hits helped mask the issue above.
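Here's a sketch of what I mean by "getQuiet()". This is my proposed addition, not an existing Undertow method, and the map-backed cache here is just a stand-in for the DirectBufferCache: same lookup as get(), but no hit accounting, so only serve() registers one hit per HTTP request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed getQuiet() alongside the existing get().
public class QuietGetSketch {
    static final int SAMPLE_INTERVAL = 5;   // matches the sampling interval above

    static class Entry { int hits; }
    static final Map<String, Entry> cache = new HashMap<>();

    static Entry get(String key) {          // bumps hits, may trigger sampling
        Entry e = cache.computeIfAbsent(key, k -> new Entry());
        e.hits++;
        if (e.hits % SAMPLE_INTERVAL == 0) {
            // the purge/allocate sampling pass would run here
        }
        return e;
    }

    static Entry getQuiet(String key) {     // lookup only, no hit accounting
        return cache.get(key);
    }

    public static void main(String[] args) {
        get("index.html");                  // the one hit from serve()
        getQuiet("index.html");             // content-length lookup: no hit
        System.out.println(cache.get("index.html").hits); // 1
    }
}
```

With this split, the hit counter actually counts requests, and the SAMPLE_INTERVAL means what it appears to mean.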
ResponseCachingSender won't cache transfers, only sends
So I'll be honest-- I've read the code through about a dozen times and I still don't quite understand the difference between a sender that "transfers" and one that "sends". I also can't find a shred of information online about what "TransferMinSize" really does and what it should be set to. I see several of the Undertow test cases set the transfer min size to 10 MB, but I can't find any advice on how to tune it. The reason it matters is that any file larger than the transfer min size (or any request serving a range) uses a transfer, not a send (this logic is in the PathResource.serveImpl() method). This is problematic because the ResponseCachingSender doesn't implement any caching in its transferFrom() method. There's no explanation as to why, other than a comment simply saying "Transfer never caches". Can I get some clarity on why that is? This causes several issues, which I'll cover below.
PathResource.serveImpl() only uses serve() if the file is smaller than the transferMinSize, which means the ResponseCachingSender will never cache the buffers for files larger than the transferMinSize, regardless of the maxFileSize set in your CachingResourceManager
Given what I explained above, this creates a weird catch-22: files that are smaller than the CachingResourceManager's maxFileSize look like they are eligible to be cached as far as the CachedResource class is concerned. The CachedResource wraps the sender in a ResponseCachingSender with the expectation that
- The resource will be cached
- The cacheEntry will be enabled afterward (or disabled on error)
However, the transfer min size logic in the PathResource can create a separate code path through the ResponseCachingSender's transferFrom() method which never enables or disables the cacheEntry, leaving it in an invalid state and never caching the item. Furthermore, the default transfer min size is only 1024 bytes, which is very small and means all files over 1KB will never actually be cached! Is it expected that you always set your transfer min size to be at least as large as the caching resource manager's maxFileSize? This isn't documented anywhere and wasn't even obvious until I did some very deep debugging.
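To make the catch-22 concrete, here's a tiny model of the interaction as I understand it (assumed semantics from my debugging, not Undertow API): a file only ever gets cached when it is BOTH under the CachingResourceManager's maxFileSize AND under the PathResourceManager's transferMinSize, because larger files go through transferFrom(), which never caches.

```java
// Hypothetical predicate modeling which files actually get cached.
public class CacheGapSketch {
    static boolean willBeCached(long fileSize, long maxFileSize, long transferMinSize) {
        return fileSize <= maxFileSize      // looks cacheable to CachedResource
            && fileSize < transferMinSize;  // but only the send() path caches
    }

    public static void main(String[] args) {
        long maxFileSize = 1024 * 1024;     // a 1 MB cache limit
        long transferMin = 1024;            // the 1 KB default transfer min size
        System.out.println(willBeCached(512, maxFileSize, transferMin));     // true
        System.out.println(willBeCached(100_000, maxFileSize, transferMin)); // false: in the gap
    }
}
```

With the defaults, everything between 1 KB and maxFileSize falls into that uncached gap.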
Files smaller than the CachingResourceManager's maxFileSize but larger than a PathResourceManager's transferMinSize clog up the dataCache and keep trying to allocate buffers every time they are served, even though they will never be cached.
Furthermore (on the point above), all files small enough to be cached but larger than the transfer min size are not only not cached, but their CacheEntry instances are left in invalid states, clogging the dataCache with allocated buffers which will never be used. I think this needs some thought, as I can't find a way to detect from inside CachedResource.serve() that the file will not be cached in order to optimize. The CachedResource class doesn't know what specific type of Resource is being served, and the transferMinSize is specific only to the PathResourceManager, so there's just no way to get that information. I'd really like to know why we can't just cache the transfers, but assuming there's a good reason for that, there needs to be a better way for CachedResource.serve() to know when a resource isn't going to be cacheable and completely skip creating and allocating the CacheEntry instance for it. Otherwise, it's just a waste of CPU cycles and memory every time.
CachedResource.serve() references the cache entry twice, leaving the refs count at 2, meaning it can never be destroyed
This goes back to the crazy code paths I talked about in the first section above. The code in CachedResource.serve() can call the entry.reference() method twice, which creates two references to the item. Furthermore, there are code paths which reference the cacheEntry BUT opt not to use the ResponseCachingSender with its DereferencingCallback, leaving the cached entry in an invalid state where the number of refs gets stuck at 2. This means these cache entries can NEVER be destroyed, and their buffers will never be cleared, even after the cache entry is removed from the DirectBufferCache. The slices will stay used in the slice pool forever and will never even be garbage collected. It was pretty easy to put the cache in a completely invalid state by setting the size small enough to only have a few cache regions in the slice pool. Once they were all used, I could never allocate another cache entry since there were no more available regions in the slice pool, EVEN THOUGH all the cacheEntries had been removed from the cache. This is pretty bad as it renders the cache useless.
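The invariant I fixed toward is the usual reference-counting one, sketched below with stand-in classes (not Undertow's): exactly one reference() per serve, paired with exactly one dereference when the response completes, so the count can actually reach zero and destroy() can run.

```java
// Illustrative model of balanced reference counting (hypothetical classes).
public class RefSketch {
    static class Entry {
        int refs;
        boolean destroyed;
        void reference()   { refs++; }
        void dereference() { if (--refs == 0) destroy(); }
        void destroy()     { destroyed = true; /* free buffers here */ }
    }

    // One serve: a single reference, released by the completion callback
    // (playing the role of the DereferencingCallback).
    static void serveOnce(Entry e) {
        e.reference();
        // ... response is written ...
        e.dereference();   // callback fires exactly once
    }

    public static void main(String[] args) {
        Entry e = new Entry();
        serveOnce(e);
        System.out.println(e.refs);   // 0: the entry is destroyable again
    }
}
```

Any path that references twice, or references without arranging a dereference, breaks this balance permanently, which is exactly the stuck-at-2 behavior I observed.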
A DirectBufferCache$CacheEntry's destroy method doesn't actually release buffers
This item is similar to the one above, but instead of the refs counter being bumped too high so the cacheEntry can't be destroyed, here the actual destroy() method, when called, doesn't even work! Here's the code (from DirectBufferCache$CacheEntry.destroy()):
    private void destroy() {
        this.buffers = EMPTY_BUFFERS;
        for (PooledByteBuffer buffer : buffers) {
            buffer.free();
        }
    }
You can see this.buffers is replaced by EMPTY_BUFFERS BEFORE the buffers are looped over and freed, so the loop iterates over the empty array and frees nothing. This means the old buffers are never freed for any cache item, ever, so far as I can tell! The regions remain used in the slice pool forever, and the entire reaping mechanism of the direct buffer cache is moot: it may remove the cache entries, but the pool, once full, will never be able to allocate again. The order of those two operations needs to be reversed so the old buffers are freed BEFORE they are replaced by the empty buffers.
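Here's the one-line fix demonstrated in a self-contained form (a stand-in Buffer type instead of PooledByteBuffer so you can run it directly):

```java
// Demonstrates the corrected ordering in destroy(): free first, then swap.
public class DestroyFix {
    static class Buffer {
        boolean freed;
        void free() { freed = true; }
    }

    static final Buffer[] EMPTY_BUFFERS = new Buffer[0];
    Buffer[] buffers;

    DestroyFix(Buffer[] b) { buffers = b; }

    // Fixed order: free the old buffers FIRST, then replace the field.
    void destroy() {
        for (Buffer buffer : buffers) {
            buffer.free();
        }
        this.buffers = EMPTY_BUFFERS;
    }

    public static void main(String[] args) {
        Buffer b = new Buffer();
        DestroyFix entry = new DestroyFix(new Buffer[]{ b });
        entry.destroy();
        System.out.println(b.freed);   // true with the corrected order
    }
}
```

With the original ordering, the for loop reads the freshly assigned empty array, so `b.freed` would never become true.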
Note on slice relationship to files
This is more just something that wasn't obvious at first and wouldn't hurt to be documented somewhere. The number of slices in the direct buffer cache represents the max number of files which can be cached. I found this out when playing with larger slice sizes. Basically, one file may use 5 slices, but 1 slice can only be assigned to a single file. Therefore, a 10 MB cache with 1 MB slices can only cache 10 files, regardless of how small they are. The "left over" portions of the slices are just wasted. Again, this may be obvious to some people, but it wasn't to me at first, and I only figured it out via trial and error, so it's worth documenting somewhere. Most examples of how to use the direct buffer cache online were copied from the Undertow test suite, which tends to use very unrealistic numbers just for the sake of the test, such as this very non-production-ready example from the test suite, which you can see has worked its way into this online tutorial even though a cache with those settings would be nearly useless for a real app.
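The arithmetic behind that observation, as a quick runnable check (my own helper, reflecting the behavior I saw, not an Undertow method):

```java
// Since a slice is never shared between files, the slice count caps the
// number of cacheable files regardless of file size.
public class SliceMath {
    static long maxCachedFiles(long cacheBytes, long sliceBytes) {
        return cacheBytes / sliceBytes;   // one slice belongs to at most one file
    }

    public static void main(String[] args) {
        long cache = 10L * 1024 * 1024;   // 10 MB total cache
        long slice = 1L * 1024 * 1024;    // 1 MB slices
        System.out.println(maxCachedFiles(cache, slice)); // 10 files max, however small
    }
}
```

So when sizing the cache, pick the slice size for the number of files you want cached, not just for the largest file.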
-----------------------------------------
Whew-- so that represents about 3 full days of debugging the Undertow source and endless tests :) The good news is I have fixes for most of the issues described above working in my local build of Undertow, and the CachingResourceManager and DirectBufferCache seem to be working great. I can help by sending pull requests, but perhaps I should send everything I changed in one big pull because, honestly, many of the issues were so intertwined that the fixes only work together. I also added a few new methods to CacheEntry purely for debugging purposes. Perhaps leaving those in could be helpful for building a visualizer to see what a cache is doing; right now it's just a black box and you can't even tell if it's working. Please give me your thoughts on how you want to proceed. I'm really anxious to roll out this caching feature in my product, but it's basically DOA as it stands right now.
Thanks!
~Brad
Developer Advocate
Ortus Solutions, Corp