Intent to Implement: CompressStream

向井かのん

Aug 29, 2019, 9:17:49 AM
to blin...@chromium.org
Contact emails
canon...@google.com, ri...@chromium.org

Explainer
https://github.com/ricea/compressstream-explainer/blob/master/README.md

Design doc/spec
Not yet.

TAG review
https://github.com/w3ctag/design-reviews/issues/410

Summary
CompressStream is a JavaScript API for gzip compression using Streams. It is possible to compress stream data without this feature, but common libraries like zlib are complex to use. CompressStream makes it easy for developers to do this, and avoids the need to bundle a compressor with their application.
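
For concreteness, a minimal usage sketch (assuming the API shape described in the explainer; details may change before shipping):

    async function compressArrayBuffer(input) {
      const cs = new CompressStream('gzip');
      // Write the whole input, then close to flush the gzip trailer.
      // The write is deliberately not awaited: the Response below drains
      // the readable side, which is what lets the write complete.
      const writer = cs.writable.getWriter();
      writer.write(input);
      writer.close();
      // Collect all compressed output into a single ArrayBuffer.
      return new Response(cs.readable).arrayBuffer();
    }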
Risks

Interoperability and Compatibility
The main risk is that it fails to become an interoperable part of the web platform if other browsers do not implement it. Browsers may produce different compressed output for the same input.

Firefox: No public signals
Edge: No public signals
Safari: No public signals
Web developers: No signals

Ergonomics
The API is very minimal in this first version, which should make it easy to use. Developers familiar with the Streams API will be able to use it without learning anything new. The composability of Streams makes interoperability with other APIs such as TextEncoderStream easy.

Activation
CompressStream can be polyfilled using a JavaScript or wasm implementation of gzip. This should make it easy to adopt.

Security
CompressStream is agnostic to the data that passes through it, but users of the API may have to use caution if the data is to be sent or stored in such a way that an attacker might be able to observe its length. If most of the compressed data is known to an attacker, or parts of it can be controlled by an attacker, then they may be able to infer information about the unknown contents from its length. This is similar to the CRIME attack. This is not a problem with CompressStream itself, but fundamental to the way compression works, and it cannot be solved within the API.

Debuggability
Normal DevTools JavaScript debugging will work. No special support is needed.

Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?
Yes

Is this feature fully tested by web-platform-tests?
No. The feature will be tested in web platform tests.

Link to entry on the Chrome Platform Status
https://chromestatus.com/feature/5855937971617792
This intent message was generated by Chrome Platform Status.

PhistucK

Aug 29, 2019, 1:34:18 PM
to 向井かのん, blink-dev
I am interested in the level of web developer interest here. Is this something that is highly (or even moderately) needed?
I do not pretend to know a lot of projects (especially size-sensitive projects), but I do not remember this ever being an issue, a missing feature, or even a need in web development.
If compressed uploads are the main problem this is trying to solve, the browser could just upload compressed data without exposing a stream API for it (perhaps via a flag on fetch). I believe requests support compression just as much as responses, at the HTTP protocol level.

PhistucK



Ilya Grigorik

Aug 29, 2019, 1:47:10 PM
to PhistucK, 向井かのん, blink-dev, Yoav Weiss
Really excited to see this! 

The need for this has come up many times in the context of various WebPerf WG discussions. Use cases we have heard that immediately come to mind: analytics vendors asking for the ability to compress data on the client before sending a beacon, optimizing the use of local storage in SPAs, etc. Would love to have you join one of the upcoming group calls to discuss the design and use cases; it might be a good TPAC topic, in fact.
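
To make the beacon use case concrete, here is a hedged sketch of how an analytics library might gzip its payload before sending (this assumes the proposed CompressStream, and that the receiving endpoint expects gzipped bodies, since sendBeacon sets no Content-Encoding header):

    async function sendCompressedBeacon(url, data) {
      // Serialize, pipe through the compressor, and collect the output as a Blob.
      const compressed = await new Response(
        new Blob([JSON.stringify(data)]).stream().pipeThrough(new CompressStream('gzip'))
      ).blob();
      navigator.sendBeacon(url, compressed);
    }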

EricLaw-MSFT

Aug 29, 2019, 9:26:59 PM
to blink-dev
I'm excited to see this as well, and I'm glad to see that there's a corresponding DecompressStream. One immediate thought is that this would be a useful primitive for building JavaScript apps that manipulate files in the popular ZIP format (which is based on DEFLATE). I'd also be very excited if brotli encoders and decoders were available, given the size and performance cost of polyfilling them.

With regard to the Beacon use case, is the async nature of the API compatible with the new/upcoming restrictions on what sorts of code can run in beforeunload handlers?

PhistucK

Aug 30, 2019, 3:11:44 AM
to Ilya Grigorik, 向井かのん, blink-dev, Yoav Weiss
Optimizing local storage should be a browser feature - the browser sees more content and has a larger surface to analyze, which makes the compression more effective.
Otherwise, applications would be forced to use fewer fields to carry the information in order to make the compression effective, which could create awkward implementations, or greater data loss when data is evicted by the browser.

Or am I missing something?

PhistucK

Adam Rice

Aug 30, 2019, 3:34:24 AM
to EricLaw-MSFT, blink-dev
> With regard to the Beacon use case, is the async nature of the API compatible with the new/upcoming restrictions on what sorts of code can run in beforeunload handlers?

My understanding is that Promises in beforeunload handlers are not considered "async", and so if we specify it carefully, we should be able to make sure this works.


Adam Rice

Aug 30, 2019, 3:40:35 AM
to PhistucK, 向井かのん, blink-dev
> I believe requests support compression just as much as responses, at the HTTP protocol level.

Unfortunately, this is not the case. HTTP is asymmetrical in this respect: a client can send "accept-encoding" in the request to tell a server which encodings it may use in the response, but in the other direction there is a bootstrapping problem, because the client has no way to learn which encodings the server accepts before it sends the request. There was a long effort to add compressed uploads to the HTTP protocol, and many approaches were tried, but it was not successful because middleboxes kept misinterpreting the payload.
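
For illustration, a hedged sketch of what a compressed upload could look like today when the client already knows the specific server accepts it (assumes CompressStream; with no way to negotiate, this fails against servers and middleboxes that do not expect a compressed request body):

    async function postGzipped(url, text) {
      // Compress the whole body up front rather than streaming it.
      const body = await new Response(
        new Blob([text]).stream().pipeThrough(new CompressStream('gzip'))
      ).arrayBuffer();
      // Declare the encoding explicitly; the server must already agree to it.
      return fetch(url, {
        method: 'POST',
        headers: { 'Content-Encoding': 'gzip' },
        body,
      });
    }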

PhistucK

Aug 30, 2019, 3:48:13 AM
to Adam Rice, 向井かのん, blink-dev
That actually means that it is supported at the protocol level. Pre-advertising the feature is not supported.
(Middleboxes are broken in many other aspects anyway and are not part of the protocol)

What if individual fields were compressed rather than the entire payload (sometimes less efficient, but it may still have some benefit)? Would this also be incompatible with middleboxes, or simply very inefficient?

PhistucK

Adam Rice

Sep 3, 2019, 3:21:57 AM
to PhistucK, 向井かのん, blink-dev
> That actually means that it is supported at the protocol level. Pre-advertising the feature is not supported.
> (Middleboxes are broken in many other aspects anyway and are not part of the protocol)

Yes. However, broken middleboxes are the environment in which browsers have to function.

> What if individual fields were compressed rather than the entire payload (sometimes less efficient, but it may still have some benefit)? Would this also be incompatible with middleboxes, or simply very inefficient?

This would work, though short fields would take more bytes rather than fewer. It is also less flexible than CompressStream: CompressStream can be used to implement per-field compression, but the converse is not true.
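
As a hedged sketch of the per-field approach (assuming CompressStream; the helper name is made up for illustration), each field value could be compressed and base64-encoded so that it survives as an ordinary form field:

    async function compressField(value) {
      const buffer = await new Response(
        new Blob([value]).stream().pipeThrough(new CompressStream('gzip'))
      ).arrayBuffer();
      // base64-encode so the compressed bytes can travel as plain text.
      // (Fine for a demonstration; spreading a large array can overflow the stack.)
      return btoa(String.fromCharCode(...new Uint8Array(buffer)));
    }

The base64 step alone adds about a third in size, which is part of why short fields come out larger than they went in.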

nicj...@gmail.com

Sep 4, 2019, 2:55:22 PM
to blink-dev
CompressStream is something we would find useful for Boomerang, a Real User Monitoring (RUM) library: https://github.com/akamai/boomerang

In Boomerang we gather and beacon a lot of performance data from the browser, including NavigationTiming, ResourceTiming, UserTiming, etc.  Some of those interfaces can produce a lot of data (e.g. 10 KB+) if we serialize it straight to JSON and put it on the beacon.

For Boomerang we've jumped through a lot of hoops implementing custom compression schemes[1][2] for each set of data, which takes a lot of support code in the library[3], and is often the most expensive operation we execute on the page[4].  We also have to write code on the server to decode and convert it back into something usable.

Taking one example: on a popular media website, serializing all of the ResourceTiming entries to JSON produces about 54 KB.

When we apply our compression techniques in Boomerang, that reduces to around 5.7 KB.

If we were to simply gzip the original JSON, the size would be around 6.3 KB, only a smidge larger than our custom compression techniques.

If it were available in most browsers, we would probably skip all of that custom logic and just gzip the JSON'd data.  That would save us complexity, library size, CPU time, maintenance effort, etc.
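
As a hedged illustration of that simplification (again assuming the proposed CompressStream), the whole pipeline would collapse to something like:

    async function gzippedResourceTimings() {
      // Serialize every ResourceTiming entry, then gzip the JSON.
      const json = JSON.stringify(performance.getEntriesByType('resource'));
      return new Response(
        new Blob([json]).stream().pipeThrough(new CompressStream('gzip'))
      ).blob(); // ready to put on a beacon
    }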



PhistucK

Sep 5, 2019, 2:57:00 PM
to nicj...@gmail.com, blink-dev
Another solution for compressed uploads could be a built-in way to send compressed content with fetch, using a new HTTP header like Request-Transfer-Encoding or something, different from the current mechanism so as to avoid the existing incompatibility with middleboxes. (The compressed body itself is not the problematic part; otherwise the compression would not work at all.)

ZIP manipulation sounds like a niche use case that could be helped by WebAssembly and JavaScript engine optimizations.

Can you list significant use cases for this feature other than uploading and local storage? I am struggling to find a real use case that should not be built in.
(I know the Extensible Web Manifesto says browsers should have less magic, and that is a nice goal, but I am not sure this is the right building block.)

PhistucK



Adam Rice

Sep 9, 2019, 3:02:17 AM
to PhistucK, nicj...@gmail.com, blink-dev
> Can you list significant use cases for this feature other than uploading and local storage? I am struggling to find a real use case that should not be built in.
  • WebRTC, WebTransport and other non-HTTP network APIs. 
  • Lazy decompression of resources (rather than paying the cost up-front with HTTP).
  • In-memory databases.
  • Loading and saving gzipped files with the file system API (see the sketch below).
But I think the most important use cases are the ones we haven't thought of yet.
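
A hedged sketch of the gzipped-file case from the list above, assuming the corresponding DecompressStream and a File obtained from a file input:

    async function readGzippedFile(file) {
      // Lazily decompress the picked .gz file through the stream.
      return new Response(
        file.stream().pipeThrough(new DecompressStream('gzip'))
      ).text();
    }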

PhistucK

Sep 10, 2019, 1:49:51 AM
to Adam Rice, nicj...@gmail.com, blink-dev
Thank you!
It still feels like something that the browser should do on its own (by default, in most cases), because it has a much broader perspective on the content and can therefore compress it better.
But I realize that adding compression options to each and every API (or, in some cases, just doing it without options), even where that would be far more efficient, can feel like "the browser is doing much too much to please the developer" in a way.

Going this route feels like a micro-optimization.

PhistucK
