Comment #14 on issue 375297 by
dmu...@chromium.org: the total blobs' size
Brain dump after exploration:
The limit exists here:
https://code.google.com/p/chromium/codesearch#chromium/src/storage/browser/blob/blob_storage_context.cc&type=cs&sq=package:chromium&l=37&rcl=1418672972
The reason this exists is that all blobs are stored in memory in the
browser process. Obviously we can't let that be unbounded, so after
roughly 0.5 GB we start handing back broken ("garbage") blobs. Also
notice that this data is streamed to the browser process, so the problem
doesn't depend on URL.createObjectURL; simply keeping around blobs that
total more than 512MB triggers it.
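To make the mechanics concrete, here's a minimal sketch of the kind of cap
this implies. The names (BlobStore, kMaxMemoryUsage, Blob::broken) are
illustrative only, not the actual Chromium code; the real logic is in the
blob_storage_context.cc file linked above.

  #include <cstdint>
  #include <vector>

  // Illustrative cap on total in-memory blob bytes (~0.5 GB).
  constexpr int64_t kMaxMemoryUsage = 500 * 1024 * 1024;

  struct Blob {
    std::vector<uint8_t> data;
    bool broken = false;  // past the cap, the blob comes back "garbage"
  };

  class BlobStore {
   public:
    // Appends bytes to |blob|; once the store-wide budget is exhausted,
    // the blob is marked broken instead of growing.
    void Append(Blob& blob, const std::vector<uint8_t>& bytes) {
      if (blob.broken)
        return;
      if (total_usage_ + static_cast<int64_t>(bytes.size()) > kMaxMemoryUsage) {
        blob.broken = true;
        return;
      }
      blob.data.insert(blob.data.end(), bytes.begin(), bytes.end());
      total_usage_ += static_cast<int64_t>(bytes.size());
    }

   private:
    int64_t total_usage_ = 0;  // sum of all in-memory blob bytes
  };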
Multiple issues here:
1. A file can be larger than the memory limit. Such a blob will always
have to be represented at least partially, if not completely, on disk.
2. Multiple blobs (I believe this store is per-profile) can together fill
up the limit, so being able to boot some of them to disk would be great.
Ideas:
1. Universal algorithm: "If we're appending to a blob and we reach the
memory limit, boot the largest blob we have (possibly including the one
we're appending to) to disk." This solves both problems nicely, but could
result in a less-than-optimal distribution of blobs between memory and
disk. That could be mitigated by knowing the size of the currently growing
blob ahead of time; ideally we want the largest blobs on disk. (See the
eviction sketch after this list.)
2. Solve the "file > max memory" problem by always booting those blobs to
disk right away. This might depend on knowing the total size of the blob
before saving it, which might not be possible. It would likely be more
performant than #1, since we'd avoid a second partial store in memory
before booting to disk.
3. We could pursue spec changes to allow dynamic revocation of blobs
based on a cache strategy like LRU. Sites that would break under this
policy are already broken today once they hit the memory limit, so I think
this would be a not-too-dangerous change.
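To show what the eviction loop in idea #1 could look like, here's a rough
sketch. BlobRecord, BlobMemoryManager, and WriteBlobToDisk are hypothetical
names made up for this comment, not existing Chromium APIs, and the disk
write is stubbed out.

  #include <algorithm>
  #include <cstdint>
  #include <string>
  #include <vector>

  struct BlobRecord {
    std::string uuid;
    int64_t size = 0;
    bool on_disk = false;
  };

  // Stub for illustration; a real version would stream the blob's bytes
  // to a temp file and release the in-memory copy.
  void WriteBlobToDisk(BlobRecord& blob) { (void)blob; }

  class BlobMemoryManager {
   public:
    explicit BlobMemoryManager(int64_t limit) : limit_(limit) {}

    // Caller invokes this after a successful in-memory append.
    void DidAppend(int64_t bytes) { in_memory_usage_ += bytes; }

    // Idea #1: before an append that would cross the memory limit, keep
    // booting the largest in-memory blob (possibly the growing one itself)
    // to disk until the append fits.
    void EnsureRoomFor(int64_t incoming_bytes, std::vector<BlobRecord>& blobs) {
      while (in_memory_usage_ + incoming_bytes > limit_) {
        auto largest = std::max_element(
            blobs.begin(), blobs.end(),
            [](const BlobRecord& a, const BlobRecord& b) {
              // Only blobs still held in memory are eviction candidates.
              int64_t sa = a.on_disk ? -1 : a.size;
              int64_t sb = b.on_disk ? -1 : b.size;
              return sa < sb;
            });
        if (largest == blobs.end() || largest->on_disk)
          break;  // nothing left in memory; caller must spill straight to disk
        WriteBlobToDisk(*largest);
        largest->on_disk = true;
        in_memory_usage_ -= largest->size;
      }
    }

   private:
    int64_t limit_;
    int64_t in_memory_usage_ = 0;
  };

Knowing the growing blob's final size up front would let EnsureRoomFor pick
smarter victims, which is exactly the "less-than-optimal distribution"
caveat in idea #1.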
Looking at file cleanup: I believe all blobs are cleaned up on shutdown
(or on startup, if we crashed), so cleanup wouldn't be an issue for
file-backing these blobs.
Conclusion:
I'm a fan of doing both #1 and #2, as that would keep the system running
optimally, but it assumes we can know the total size of a blob before
streaming the data to the browser process, which might not be possible (or
feasible). The easier path would probably be doing just #1. The spec change
would be nice for the extra flexibility as well, and an LRU is easy to
implement (see the sketch below).
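To back up the "an LRU is easy to implement" claim, a minimal standalone
LRU keyed by blob UUID might look like this (again just a sketch with
made-up names, not spec or Chromium terms):

  #include <list>
  #include <string>
  #include <unordered_map>

  // Tracks blob usage recency so the coldest blob can be revoked first
  // under memory pressure (idea #3).
  class BlobLru {
   public:
    // Mark |uuid| as most recently used, inserting it if new.
    void Touch(const std::string& uuid) {
      auto it = index_.find(uuid);
      if (it != index_.end())
        order_.erase(it->second);
      order_.push_front(uuid);
      index_[uuid] = order_.begin();
    }

    // Pop the least recently used blob, e.g. to revoke it when the store
    // is over budget. Returns an empty string if nothing is tracked.
    std::string EvictOldest() {
      if (order_.empty())
        return std::string();
      std::string uuid = order_.back();
      order_.pop_back();
      index_.erase(uuid);
      return uuid;
    }

   private:
    std::list<std::string> order_;  // front = most recently used
    std::unordered_map<std::string, std::list<std::string>::iterator> index_;
  };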
Keep in mind there are a decent number of assumptions I'm making here, so
this is still speculation. I'm going to do more work to create a prototype
& design doc.