Storing "no-store" content to a Blob?

K. Moon

Aug 5, 2022, 10:15:25 AM
to storage-dev, net-dev, Lei Zhang
Apologies if these mailing lists aren't relevant; wasn't sure where to send this question.

crbug.com/158957 (saving original PDFs to disk) recently became hot again. Our basic problem is that we would like to keep an original copy of the PDF for saving if requested, but this requires an unpredictable amount of memory.

Today, we rely on the cache to serve the same response on the "save to disk" action, but this falls down in cases where the cache cannot re-serve the response, and the server returns a different response the second time.

Ideally, we'd have something like virtual memory, but for renderer memory (which is not unlimited). This led me to look at the Blob API, which roughly does what I want (stores large amounts of data, persists to disk when under memory pressure).

The flaw in this plan is the "no-store" case, as we shouldn't persist such data to disk. This rules out the Blob option, unless we can prevent persisting to disk, or somehow have a "no-store"-compliant version of persisting to disk. (Perhaps encrypting with an ephemeral key?)
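
To make the constraint concrete, here is a minimal sketch in Python (not Chromium code, and not how the Blob API is implemented). It assumes the response's Cache-Control header value is available as a string; `has_no_store` and `open_response_buffer` are hypothetical names used only for illustration. A "no-store" response stays in memory, and anything else goes to a buffer that may spill to a temporary file once it grows past a limit.

```python
import io
import tempfile


def has_no_store(cache_control: str) -> bool:
    """Return True if a Cache-Control header value names the no-store directive.

    Simplification (assumption): splits on commas and ignores quoted-string
    arguments that could themselves contain commas.
    """
    for directive in cache_control.split(","):
        # Directives may carry arguments (e.g. max-age=60); compare the name only.
        name = directive.split("=", 1)[0].strip().lower()
        if name == "no-store":
            return True
    return False


def open_response_buffer(cache_control: str, memory_limit: int = 1 << 20):
    """Pick a buffer for the original PDF bytes (hypothetical policy)."""
    if has_no_store(cache_control):
        # Must not touch disk: keep everything in memory.
        return io.BytesIO()
    # May roll over to an on-disk temporary file once memory_limit is exceeded.
    return tempfile.SpooledTemporaryFile(max_size=memory_limit)
```

The memory-only branch is exactly the unpredictable-memory problem described above, which is why a "no-store"-compliant way of spilling to disk would be needed.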

Another idea might be to modify the guarantees from the cache specifically for the PDF viewer's use case: Since the cache may have to store a copy of the response anyway, it'd be better to let it handle it, and not store the same data twice. (This would be analogous to treating the resource as if it were a file, rather than something loaded from the network.) This seems hard to me, but maybe it's a better approach.

I don't anticipate we'll act on this bug soon, due to these design challenges, but wanted to collect any ideas from Blob storage and network caching experts.

K. Moon

Aug 5, 2022, 10:21:06 AM
to storage-dev, net-dev, Lei Zhang
(A third approach that we may end up pursuing in the end is to handle "no-store" and other responses that might not be re-served perfectly from the cache differently, but my ideal solution wouldn't put these special cases in the PDF code.)

K. Moon

Aug 5, 2022, 10:24:25 AM
to storage-dev, net-dev, Lei Zhang
Oops, one more thing: I've heard there's a new file system API which may also do the job, but haven't looked into it deeply yet. We would want any files created by the PDF viewer to be ephemeral (go away when the document unloads).

Joshua Bell

Aug 5, 2022, 12:54:15 PM
to K. Moon, storage-dev, net-dev, Lei Zhang
The new FS API won't directly help; it doesn't have ephemerality guarantees.

We are building a "Storage Buckets" API that allows origins to partition their own private storage space with different qualities (e.g. persistence levels, write durability, subsets of quota, expiry dates). Encryption and/or ephemerality is a potential future feature there, so a site could designate a subset of its storage to be encrypted and discarded as soon as possible after the key is destroyed. But there's been no formal design for such a thing, and that also implies a web-facing key management API.

For incognito sessions which currently handle storage in memory, there's been discussion about moving that to being disk-backed but encrypted. IIRC there are also some storage types that in incognito sessions use OS-level "delete on close" support. 

It might be worth checking on whether the OS-level "delete on close" guarantees are enough in your scenario. I could imagine a non-web API where the browser process could mint a read/write/delete-on-close filehandle for the PDF viewer to use; the viewer could dump data there that would live only as long as the FH stayed open, and it would go away if the process died.
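
As a rough illustration of the delete-on-close idea (a Python sketch, not the browser-minted file handle described above), the standard library's `tempfile.TemporaryFile` gives similar lifetime behavior: the file is removed as soon as it is closed, and on POSIX systems it typically has no visible directory entry at all. The `EphemeralStore` class and its method names are hypothetical.

```python
import os
import tempfile


class EphemeralStore:
    """Holds bytes in an anonymous temporary file that goes away when closed."""

    def __init__(self):
        # TemporaryFile opens in binary read/write mode ("w+b") by default.
        self._file = tempfile.TemporaryFile()

    def append(self, chunk: bytes) -> None:
        """Append a chunk of the original document to the end of the file."""
        self._file.seek(0, os.SEEK_END)
        self._file.write(chunk)

    def read_all(self) -> bytes:
        """Return everything written so far, e.g. for a "save to disk" action."""
        self._file.seek(0)
        return self._file.read()

    def close(self) -> None:
        """Close the handle; the backing file is deleted with it."""
        self._file.close()


store = EphemeralStore()
store.append(b"%PDF-")
store.append(b"1.7")
data = store.read_all()
store.close()
```

Note that the bytes still reach the disk unencrypted while the handle is open, so whether this lifetime guarantee is enough for "no-store" content remains the open question.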

K. Moon

Aug 5, 2022, 9:20:15 PM
to Joshua Bell, storage-dev, net-dev, Lei Zhang
Thanks for the suggestion! Minting a small Mojo API specifically for the PDF viewer sounds like it might be the best way forward.

I also had the impression that the new FS API didn't support an ephemeral temporary file concept, but wasn't sure if I just missed something. The storage bucket thing sounds interesting; perhaps something to converge on in the future.

Another idea I had was to encrypt in the PDF renderer before storing externally, but that seems fraught with "roll your own encryption"-type dangers.

I'll be out next week, but it sounds like it's worth following up on these alternatives in a small design doc.