Hi All,

In preparation for discussing and working on the disk cache during our
upcoming necko work week, I wanted to start some discussion with a
wider audience about directions we can go for making our disk cache
present- (and future-) compliant. Right now, I'm starting with a focus
on (1) making the cache more performant, and (2) increasing
stability/preventing data loss under certain circumstances. With that
in mind, here is my initial list of ideas (in no particular order) of
things we can/should do. I would love to hear what others have to say:
different ideas, reasons my ideas suck (or are great), or anything
I've missed.

(A) Reduce lock contention. Currently, the entire cache is protected
by one big lock, which effectively makes our cache single-threaded. My
first step for this is to start making our locking finer-grained.
[1] is the first bug in this work, with more to follow as more
opportunities for finer-grained locking are identified. Eventually, it
would be nice to turn some of our locks into multiple-reader,
single-writer locks for data structures where that makes sense (the
global cache metadata, for example).
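
As a rough illustration of what "finer-grained" could mean in (A), here
is a minimal lock-striping sketch. It is not the necko cache code: the
class name `StripedCache`, the stripe count, and the in-memory dict
buckets are all assumptions made for the example. The point is only
that two operations on keys in different stripes never wait on the same
lock.

```python
import threading


class StripedCache:
    """Toy key/value map guarded by several stripe locks (hypothetical).

    Contrast with a single global lock: here only operations whose keys
    hash to the same stripe contend with each other.
    """

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._buckets = [{} for _ in range(stripes)]

    def _index(self, key):
        # Pick the stripe (lock + bucket) responsible for this key.
        return hash(key) % len(self._locks)

    def put(self, key, value):
        i = self._index(key)
        with self._locks[i]:
            self._buckets[i][key] = value

    def get(self, key, default=None):
        i = self._index(key)
        with self._locks[i]:
            return self._buckets[i].get(key, default)
```

This only shows the structure; in CPython the interpreter lock limits
how much real parallelism a toy like this gains.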

(B) Reduce round-trips between the main thread and the cache thread.
This would mean pageloads aren't pushed to the tail of the queue each
time they need to get more data from the cache (among other things).
Discussion on this particular topic has already started at [2].

(C) Increase parallelism within the cache. Currently all cache I/O
happens serially on one thread, even if we're trying to read/write
more than one entry in the cache at a time. We should improve this
situation by having more than one thread available for reading/writing
cache entries (this depends on having finer-grained locking of some
sort).
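
To make (C) a bit more concrete, here is a small sketch of reading
several entries on a pool of worker threads instead of one after
another on a single thread. The function names `read_file` and
`read_entries`, the one-file-per-entry layout, and the default of four
workers are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Read one entry file fully and return its bytes."""
    with open(path, "rb") as f:
        return f.read()


def read_entries(paths, workers=4):
    """Read several entry files concurrently; returns {path: bytes}.

    Hypothetical helper: each path stands in for one cache entry.
    """
    paths = list(paths)  # allow any iterable; we need to walk it twice
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input paths.
        return dict(zip(paths, pool.map(read_file, paths)))
```

Each read here is independent, which is why no shared lock appears; a
real cache would still need the locking discussed in (A) around any
shared state.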

(D) Make changes to the on-disk format of the cache less likely to
require a wholesale deletion of the cache. Discussion has started
in [3] with some suggestions for strategies to achieve this goal,
which include versioning each cache entry independently of the cache
as a whole, keeping a bitmask of "options" used in the particular
cache entry (enabling us to ignore entries with options we don't
recognize or missing options we require), or making the vast majority
of the metadata more freeform (similar to the bitmask solution, but
text-based and allowing for key/value pairs).
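
The first two strategies in (D) can be sketched together: a small
per-entry header carrying a version and an option bitmask, with a
loader that skips just the one entry it cannot handle. The header
layout, the constants, and the function names below are all
hypothetical; nothing here reflects the real on-disk format.

```python
import struct

# Hypothetical per-entry header: u16 entry version, u32 option bits.
HEADER = struct.Struct("<HI")

SUPPORTED_VERSIONS = {1, 2}   # assumed versions this reader understands
KNOWN_OPTIONS = 0b0011        # assumed option bits this reader recognizes
REQUIRED_OPTIONS = 0b0001     # assumed option bits every entry must set


def pack_entry(version, options, payload):
    """Prefix the payload with its own version and option bits."""
    return HEADER.pack(version, options) + payload


def load_entry(blob):
    """Return the payload, or None if this one entry should be ignored."""
    if len(blob) < HEADER.size:
        return None
    version, options = HEADER.unpack_from(blob)
    if version not in SUPPORTED_VERSIONS:
        return None  # unknown entry version: skip the entry, keep the cache
    if options & ~KNOWN_OPTIONS:
        return None  # entry uses an option we don't recognize
    if (options & REQUIRED_OPTIONS) != REQUIRED_OPTIONS:
        return None  # entry is missing an option we require
    return blob[HEADER.size:]
```

With this shape, a format change means bumping the per-entry version or
adding an option bit, and older or newer entries are ignored one at a
time.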

(E) Make crash recovery more robust. There is a long-standing bug [4]
about the fact that we have precisely one dirty bit for the entire
cache, and if we ever crash (shut down uncleanly), the entire cache is
deleted when the browser next starts up.
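
To make the all-or-nothing behavior in (E) concrete, here is a toy
comparison of recovery with one cache-wide dirty bit versus a
per-entry flag. The `Entry` class, the per-entry `dirty` field, and
both function names are hypothetical; the per-entry variant is just one
possible direction, not a description of the actual code or of what
bug [4] proposes.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    data: bytes
    dirty: bool = False  # hypothetical: set while a write is in flight


def recover_global(entries, cache_dirty):
    """One dirty bit for the whole cache: keep everything or nothing."""
    return [] if cache_dirty else list(entries)


def recover_per_entry(entries):
    """Assumed alternative: drop only the entries that were mid-write."""
    return [e for e in entries if not e.dirty]
```

After an unclean shutdown with a single entry mid-write,
`recover_global` discards every entry while `recover_per_entry` keeps
all the others.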

(F) More efficient storage of entries. One incremental improvement
might be to store an entry's metadata with that entry when the entry
is not stored in a block file. Currently, the entry and its metadata
are stored in two separate files, which can affect things like
directory seek times in a very full cache.
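
One way to picture (F) is a single file holding a length-prefixed
metadata section followed by the entry body, so one open and one
directory lookup cover both. The layout, the use of JSON for the
metadata, and the function names are assumptions for this sketch only.

```python
import json
import struct

# Hypothetical layout: u32 metadata length, metadata bytes, then the body.
META_LEN = struct.Struct("<I")


def write_entry(path, metadata, body):
    """Store metadata and body together in one file."""
    meta = json.dumps(metadata).encode("utf-8")
    with open(path, "wb") as f:
        f.write(META_LEN.pack(len(meta)))
        f.write(meta)
        f.write(body)


def read_entry(path):
    """Return (metadata, body) from a file written by write_entry."""
    with open(path, "rb") as f:
        (meta_len,) = META_LEN.unpack(f.read(META_LEN.size))
        metadata = json.loads(f.read(meta_len).decode("utf-8"))
        body = f.read()
    return metadata, body
```

A reader that only needs the metadata can stop after the first
`meta_len` bytes without touching the body.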

That's the list I've created for myself as a starting point. Of course
none of this is The One Final Answer, and the priorities of these (and
other suggestions) are still up in the air. Comments are highly
desired so we can get our cache in way better shape.

Thanks!
-Nick

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=717761
[2] https://groups.google.com/d/topic/mozilla.dev.tech.network/RTqPxjy-Eq0/discussion
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=715752
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=105843