Intent to Ship: Zstd Content-Encoding

Nidhi Jaju

Feb 12, 2024, 7:11:05 PM
to blink-dev, Adam Rice

Contact emails

nidh...@chromium.org


Explainer

https://docs.google.com/document/d/1aDyUw4mAzRdLyZyXpVgWvO-eLpc4ERz7I_7VDIPo9Hc/edit?usp=sharing


Specification

https://datatracker.ietf.org/doc/html/rfc8878


Design docs

https://docs.google.com/document/d/14dbzMpsYPfkefAJos124uPrlkvW7jyPJhzjujSWws2k/edit?usp=sharing


Summary

Zstandard, or “zstd”, is a data compression mechanism described in RFC8878. It is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. The "zstd" token was added as an IANA-registered Content-Encoding token as per https://datatracker.ietf.org/doc/html/rfc8878#name-content-encoding


Adding support for "zstd" as a Content-Encoding will help pages load faster and use less bandwidth, and will let servers spend less time and CPU/power on compression, resulting in reduced server costs.


Blink component

Internals>Network


TAG review

https://github.com/w3ctag/design-reviews/issues/930 


TAG review status

Pending


Risks


Interoperability and Compatibility

Servers that have a broken implementation of zstd might exist, but the risk of this is small. Additionally, middleware and middleboxes such as virus checkers that intercept HTTPS connections might not support zstd yet fail to remove it from the Accept-Encoding header in the request.


Another known risk is interoperability between zstd-capable clients with respect to window sizes. In Chrome, we limit the window size to 8MB to prevent excessive memory usage, but this limit does not exist in curl or when using zstd directly. We have seen very few sites use a window size larger than 8MB, which causes decoding errors in Chrome, and we have added new net error codes and debugging messages to help them understand what to do in this situation.
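
For sites that want to check a payload before serving it: the window size a decoder needs is declared in the frame header defined by RFC8878, so it can be read without decompressing anything. The Python sketch below is illustrative only. The function names and the reading of "8MB" as 8 * 1024 * 1024 bytes are assumptions, not taken from Chromium, and only the first frame of a payload is inspected.

```python
import struct

ZSTD_MAGIC = 0xFD2FB528                 # Magic_Number from RFC8878
ASSUMED_LIMIT = 8 * 1024 * 1024         # assumption: "8MB" read as 8 MiB


def zstd_window_size(frame: bytes) -> int:
    """Return the window size declared by the first Zstandard frame."""
    if len(frame) < 6 or struct.unpack_from("<I", frame, 0)[0] != ZSTD_MAGIC:
        raise ValueError("not a Zstandard frame")
    descriptor = frame[4]                # Frame_Header_Descriptor
    fcs_flag = descriptor >> 6           # Frame_Content_Size_flag
    single_segment = (descriptor >> 5) & 1
    dict_id_flag = descriptor & 3
    pos = 5
    if not single_segment:
        # Window_Descriptor: 5-bit exponent, 3-bit mantissa.
        exponent = frame[pos] >> 3
        mantissa = frame[pos] & 7
        window_base = 1 << (10 + exponent)
        return window_base + (window_base // 8) * mantissa
    # Single-segment frames carry no Window_Descriptor; the window
    # equals Frame_Content_Size, which follows the optional Dictionary_ID.
    pos += (0, 1, 2, 4)[dict_id_flag]
    fcs_size = (1, 2, 4, 8)[fcs_flag]
    if len(frame) < pos + fcs_size:
        raise ValueError("truncated frame header")
    value = int.from_bytes(frame[pos:pos + fcs_size], "little")
    return value + 256 if fcs_size == 2 else value


def within_assumed_limit(frame: bytes) -> bool:
    """True if the declared window fits the assumed 8 MiB limit."""
    return zstd_window_size(frame) <= ASSUMED_LIMIT
```

A frame produced with a 128MB window would report 134217728 here and fail the check, while one that declares an 8MB window passes.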


Gecko: Positive (https://github.com/mozilla/standards-positions/issues/775)


WebKit: Positive (https://github.com/WebKit/standards-positions/issues/168)


Web developers: Positive (https://crbug.com/1246971) Meta (Yann and Felix) and Akamai (Nic) are positive about zstd content-encoding on the browser. Meta has collaborated with us to improve the compression ratios for Meta origins during the experiment and is seeing positive user-level results. Alibaba is also supportive of shipping zstd support as they saw massive savings on their origins in terms of server CPU cost.


Other signals:


Ergonomics

While both Zstandard and Brotli are clear wins over gzip content-encoding, which of Zstandard or Brotli to use depends on many factors, and site authors may need to experiment to identify the optimal choice for their content.


Zstandard uses more memory for decompression than gzip. However, this is also true for Brotli, and we haven't seen any problems in practice.


Activation

The "zstd" Content-Encoding is not as widely supported by HTTP servers as gzip. Of the top 5 web servers, Nginx has a third-party module, which should also work for OpenResty (untested). Apache, IIS, and LiteSpeed appear to have no support. Explicit server support is often only necessary for dynamic content. For static (pre-compressed) content, Zstandard can often be supported just by configuration.
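
To illustrate the static (pre-compressed) case: the server only has to pick which stored variant to send based on the request's Accept-Encoding header. The following is a minimal Python sketch, not any particular server's implementation. The function names, the file names, and the server-side preference order are all hypothetical, and wildcard ("*") handling is left out for brevity.

```python
from typing import Dict, Optional, Tuple


def accepted_encodings(header: str) -> Dict[str, float]:
    """Parse an Accept-Encoding header into {coding: q-value}."""
    result: Dict[str, float] = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if not coding:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0          # treat a malformed q-value as "not acceptable"
        result[coding] = q
    return result


def pick_variant(
    header: str,
    available: Dict[str, str],
    preference: Tuple[str, ...] = ("zstd", "br", "gzip"),  # assumed order
) -> Tuple[Optional[str], str]:
    """Return (Content-Encoding or None, path of the variant to serve)."""
    accepted = accepted_encodings(header)
    for coding in preference:
        if coding in available and accepted.get(coding, 0.0) > 0.0:
            return coding, available[coding]
    return None, available["identity"]   # fall back to the uncompressed file
```

With variants stored as, say, app.js.zst, app.js.br and app.js.gz next to app.js, a request advertising "gzip, deflate, br, zstd" would be served the .zst file with Content-Encoding: zstd. A real server would also send Vary: Accept-Encoding so that caches keep the variants apart.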


Only one public CDN is known to be able to compress with Zstandard itself, and some CDNs may require custom configuration to pass Zstandard through correctly.


Zstd support is not particularly difficult to implement for a server that already implements multiple content encodings. The C implementation has a straightforward API and there are implementations for many other languages. There is also a lively community of Zstandard enthusiasts which should help accelerate adoption.


Security

CRIME and BREACH mean that resources being compressed can be considered readable by the document deploying them. That is bad if any of them contains information that the document cannot already obtain by other means. An attacker may provide correctly formed compressed frames with unreasonable memory requirements, and dictionaries may interact unexpectedly with a decoder, leading to possible memory or other resource-exhaustion attacks. It is possible to store arbitrary user metadata in skippable frames, so they can be used as a watermark to track the path of the compressed payload. It is important to note that these concerns apply to all compression formats, not just zstd.


To mitigate these risks, similar to Brotli, we'll be advertising support for "zstd" encoding only if transferred data is opaque to proxies, to ensure that resources don't contain private data that the origin cannot read otherwise.


Adding zstd to third_party/ in Chromium adds a large new code surface that processes untrusted data, which inevitably brings risks of new security holes. However, this is mitigated by the extensive fuzzing and security analysis done on zstd by Google and other community members.


Furthermore, zstd is implemented in C, which is not a memory-safe language, and the network service is not yet sandboxed on all platforms.


WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?

Apps which use a WebView to display content from Meta's servers will suddenly start using Zstandard. Since we've already extensively tested our implementation against Meta's servers in Chrome, no problems are expected. There is a killswitch. No special treatment should be needed.



Debuggability

No special support needed.

Zstd content-encoding support is exposed to the devtools protocol, so developers are able to override it and view the headers from the inspector.

A new net error has been added for decoding errors related to window frame size.


Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, ChromeOS, Android, and Android WebView)?

Yes


Is this feature fully tested by web-platform-tests?

Yes (https://wpt.fyi/results/fetch/content-encoding/zstd)


Flag name on chrome://flags

enable-zstd-content-encoding


Finch feature name

ZstdContentEncoding


Requires code in //chrome?

False


Tracking bug

https://bugs.chromium.org/p/chromium/issues/detail?id=1246971


Launch bug

https://launch.corp.google.com/launch/4266275


Measurement

https://chromestatus.com/metrics/feature/timeline/popularity/4629


Adoption plan

In our experimental group, around 1% of responses use "zstd" content-encoding. Given the significant benefits of Zstandard over gzip, we'd like to see that share increase to 10% within 2 years.


Estimated milestones

Shipping on desktop

123

DevTrial on desktop

117


Shipping on Android

123

DevTrial on Android

117


Shipping on WebView

123



Anticipated spec changes

Open questions about a feature may be a source of future web compat or interop issues. Please list open issues (e.g. links to known github issues in the project for the feature specification) whose resolution may introduce web compat/interop risk (e.g., changes to the naming or structure of the API in a non-backward-compatible way).

The current standard, RFC8878, doesn't require a limit on the window size used by HTTP servers when compressing with Zstandard. An update of some form will be needed to ensure interoperability.


Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/6186023867908096


Links to previous Intent discussions

Intent to Prototype: https://groups.google.com/a/chromium.org/g/blink-dev/c/GDsI0Hw-jYk/m/Yc5QZWD-AwAJ

Intent to Experiment: https://groups.google.com/a/chromium.org/g/blink-dev/c/I6IWfl95gRU



This intent message was generated by Chrome Platform Status.


Yoav Weiss (@Shopify)

Feb 13, 2024, 2:18:37 AM
to Nidhi Jaju, blink-dev, Adam Rice
Thanks for working on this!! This is extremely exciting!

On Tue, Feb 13, 2024 at 1:11 AM Nidhi Jaju <nidh...@chromium.org> wrote:

Another known risk is interoperability between zstd-capable clients with respect to window sizes. In Chrome, we limit the window size to 8MB to prevent excessive memory usage, but this limit does not exist in curl or when using zstd directly. We have seen very few sites use a window size larger than 8MB, which causes decoding errors in Chrome, and we have added new net error codes and debugging messages to help them understand what to do in this situation.


I know we discussed this at length at the WebPerfWG. 
Can you summarize developments and/or findings since that discussion on that front?
Should we expect the default output of CLI tools to be compatible with what we want to ship here?
Should we expect interoperability between Chromium and e.g. curl?

--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAMZNYAN7VRca4VfRqP7pi%2BnqwDuor4ZVjF9yDNH1mZcXteQURw%40mail.gmail.com.

Nidhi Jaju

Feb 13, 2024, 3:29:30 AM
to Yoav Weiss (@Shopify), blink-dev, Adam Rice
We've been discussing it with the zstd team at Meta at https://github.com/facebook/zstd/issues/2713. The plan is to take it to the HTTP WG at the IETF and either file an erratum or publish a new document with stricter window size guidelines. The zstd CLI tool currently uses windows of up to 8MB by default, which matches Chrome's limit. The library will use up to 128MB by default, however, and curl currently supports windows of up to 128MB. We expect those defaults to change to match any spec changes. In practice, we've seen very few reports of sites running into this limit, and we've added helpful messages in Chromium to guide anyone who does run into it.

Yoav Weiss (@Shopify)

Feb 13, 2024, 4:43:41 AM
to Nidhi Jaju, blink-dev, Adam Rice
Thanks! Pushing that limit into the standard and having curl (and other tools) follow that makes sense and seems important.

Thinking out loud, the main risk here is for folks to be testing their content outside of Chromium (e.g. with curl) and then have that content break in Chromium. At the same time, if content is tested in Chromium, it will work in another client that supports larger windows.
So the (seemingly small) risk here is one we take on ourselves, rather than a risk we externalize onto the ecosystem.


 


To mitigate these risks, similar to Brotli, we'll be advertising support for "zstd" encoding only if transferred data is opaque to proxies, to ensure that resources don't contain private data that the origin cannot read otherwise.


I'm not sure what that means. Can you elaborate on that?

Asif jutt Jutt

Feb 13, 2024, 5:24:55 AM
to Nidhi Jaju, blink-dev, ri...@chromium.org
S

Service delivery Manager


James Hartig

Feb 13, 2024, 12:55:31 PM
to blink-dev, Asif jutt Jutt, blink-dev, ri...@chromium.org, Nidhi Jaju
My employer ran into the window size limit during our pre-production validation, and it was difficult to debug since it was working in cURL and the zstd CLI, and only presented itself on certain URLs. I appreciate Nidhi responding to our issue so quickly and updating Chrome to have a more helpful error message in the future. The Go package we use already updated its default to 8MB (without any awareness of Chrome's limit), which should help future users of that package, but there might be other packages out there that do not have a low enough default. The updated Chrome error message will help, but only if you run into it when testing, which you might not if you happen to be testing with small responses. I'm not sure where developers should be looking to be aware of the window size. Does it make sense to mention it in the Chrome Status entry? If the spec is updated that might be good enough, but I just wanted to discuss other avenues that might reach more developers.

Asif jutt Jutt

Feb 13, 2024, 1:30:39 PM
to James Hartig, blink-dev, ri...@chromium.org, Nidhi Jaju
Support on update 

Service delivery Manager

Nidhi Jaju

Feb 13, 2024, 8:36:10 PM
to Yoav Weiss (@Shopify), James Hartig, blink-dev, Adam Rice
Thank you, I've included these details about the window size limits under the "Interoperability and Compatibility Risks" section in the ChromeStatus entry. 

Yes, that sounds right. We'll continue to push to standardize this behavior across the ecosystem.
 


 


To mitigate these risks, similar to Brotli, we'll be advertising support for "zstd" encoding only if transferred data is opaque to proxies, to ensure that resources don't contain private data that the origin cannot read otherwise.


I'm not sure what that means. Can you elaborate on that?

This essentially means that, like Brotli, zstd is only available in secure contexts, i.e. over HTTPS.

Yoav Weiss (@Shopify)

Feb 14, 2024, 9:20:05 AM
to blink-dev, Nidhi Jaju, blink-dev, Adam Rice, Yoav Weiss, James Hartig
LGTM1

Limiting zstd support to secure contexts makes perfect sense. However, I believe the reason we're doing that for Brotli has more to do with compatibility concerns with old network-based proxies that aren't ready for non-gzip content-encodings.
I don't think secure contexts do much to protect against BREACH if attackers can control parts of the response. At the same time, I don't know that we're doing anything on that front for other compression formats, so that seems fine.

 
 

David Benjamin

Feb 14, 2024, 10:50:29 AM
to Yoav Weiss (@Shopify), blink-dev, Nidhi Jaju, Adam Rice, James Hartig
Right, secure contexts don't magically make dangerous features safe. The only thing secure contexts do is make the name in the URL bar meaningful. The user may still be talking to evil.example.

It sounds like there are more risks discussed here than BREACH, so I think we need to examine them separately:

1. Information leaks when you compress together attacker-controlled data and secret data. (BREACH)
2. DoS risks from the decompressor
3. Watermarking from user-specific encodings of the resource

For BREACH, the description of the document not being able to read it confused me a little. When you compress something, the length of the compressed resource, even when encrypted, gets leaked to all manner of attackers via all manner of ways. I'm guessing the reference to the document is that resource timing APIs allow the document to learn the length of resources it otherwise cannot read? That is one attack vector (not at all mitigated by secure contexts), but there are others. Ultimately, BREACH means the server cannot just transparently compress every resource it sends. In particular, any kind of dynamic HTML resource will likely contain some attacker controlled strings.

That said, the mitigation is mostly up to the server. Once the resource gets to us, the leak has already happened. The only connection to proxies, and where we can do something on the client, is that sometimes proxies will try to transparently compress all HTTP resources indiscriminately. If that proxy is part of the network path and not the site, it has no hope of mitigating this. So being opaque to proxies is good and cuts out that minor component of the problem, but doesn't actually address the broader issue. It's just fine because the broader issue is for the server to address. (Though it means that our documentation to use it should mention the server's responsibility here!)
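
For readers unfamiliar with why compressing attacker-controlled and secret data together leaks information, here is a small Python sketch of the length side channel. It uses zlib from the standard library purely as a stand-in (the same principle applies to any LZ-style compressor, zstd included), and the page template, secret value, and function name are all made up for illustration.

```python
import zlib

# Hypothetical secret that the server embeds in every response.
SECRET = "csrf_token=9f8e7d6c5b4a"


def response_length(reflected: str) -> int:
    """Compressed size of a page holding both the secret and reflected input."""
    page = (
        f"<html><input value='{SECRET}'>"
        f"<p>Results for {reflected}</p></html>"
    )
    return len(zlib.compress(page.encode()))


# Two guesses of identical length: one matches the secret, one does not.
right = response_length("csrf_token=9f8e7d6c5b4a")
wrong = response_length("csrf_token=QWXKZJVYGHMU")

# The matching guess is encoded as one long back-reference to the secret,
# so the response is shorter, and the length is visible even under TLS.
print(right < wrong)
```

An attacker who can observe response sizes can extend a guess one character at a time and keep whichever candidate produces the shortest response, which is why the decision about what to compress has to be made by the server.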

For DoS risks, secure contexts also don't do anything. We assume that the attacker can direct the user to visit any website under their control, so users could well visit https://evil.example and securely get a DoS-triggering payload from it. As decompression happens in the network service, shared across sites, that DoS would impact other sites too. So, in order to deploy zstd or any such compression scheme, we need to mitigate DoS attacks directly, usually by applying limits. It sounds like you all have applied a frame size limit? Is that sufficient to avoid DoS, or are there other avenues for zstd decompression to consume excessive resources?

Finally, the watermarking concerns also aren't mitigated by secure contexts, but I think that's fine. This doesn't really apply to the web's security model because we already assume the resource may be user-specific in all manner of ways. (I mean, it can contain a Set-Cookie header!) Rather, when we want two contexts to be uncorrelated, we control whether they can communicate at all, rather than making assumptions on the encoding mechanism. (Network state partitioning, cookie controls, etc.)


Nidhi Jaju

Feb 14, 2024, 10:11:10 PM
to David Benjamin, Yoav Weiss (@Shopify), blink-dev, Adam Rice, James Hartig
Thank you for the additional discussion about the different security risks. I've added a note about the server's responsibility to the ChromeStatus entry to take care with including attacker-controlled data in compressed content.
 

For DoS risks, secure contexts also don't do anything. We assume that the attacker can direct the user to visit any website under their control, so users could well visit https://evil.example and securely get a DoS-triggering payload from it. As decompression happens in the network service, shared across sites, that DoS would impact other sites too. So, in order to deploy zstd or any such compression scheme, we need to mitigate DoS attacks directly, usually by applying limits. It sounds like you all have applied a frame size limit? Is that sufficient to avoid DoS, or are there other avenues for zstd decompression to consume excessive resources?

Yes, we added a window size limit of 8MB, which means that Chromium will use a memory buffer of at most 8MB to decompress frames, protecting it from unreasonable memory requirements. In addition, for zip bomb-like attacks, Chromium doesn't decompress faster than the renderer consumes data, so we won't accumulate excessive amounts of data in the network process.
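
The "don't decompress faster than the consumer reads" idea can be sketched as a pull-based decoder. The Python below is only an illustration of the principle and not Chromium's code: it uses zlib from the standard library as a stand-in for zstd, and the function name and chunk size are assumptions.

```python
import zlib
from typing import Iterator


def decompress_on_demand(compressed: bytes, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield decompressed data in bounded chunks, only as the caller asks."""
    decoder = zlib.decompressobj()
    pending = compressed
    while pending:
        # max_length caps how much output this step may produce; input that
        # was not needed yet is handed back in unconsumed_tail.
        chunk = decoder.decompress(pending, chunk_size)
        pending = decoder.unconsumed_tail
        if chunk:
            yield chunk   # execution pauses here until the consumer wants more
    tail = decoder.flush()  # whatever small remainder the decoder still holds
    if tail:
        yield tail


# A tiny input that expands to 1 MiB, consumed one bounded chunk at a time.
bomb = zlib.compress(b"\x00" * (1024 * 1024), 9)
total = sum(len(chunk) for chunk in decompress_on_demand(bomb, 4096))
print(total)  # 1048576
```

Because the generator only advances when the consumer requests the next chunk, a highly compressible payload cannot force the decoder to materialise its full expansion in memory up front.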


Finally, the watermarking concerns also aren't mitigated by secure contexts, but I think that's fine. This doesn't really apply to the web's security model because we already assume the resource may be user-specific in all manner of ways. (I mean, it can contain a Set-Cookie header!) Rather, when we want two contexts to be uncorrelated, we control whether they can communicate at all, rather than making assumptions on the encoding mechanism. (Network state partitioning, cookie controls, etc.)

Agreed, for the content itself, we’ll continue to rely on the existing partitioning present in Chromium, as with other content encodings.

Chris Harrelson

Feb 15, 2024, 1:39:36 PM
to Nidhi Jaju, David Benjamin, Yoav Weiss (@Shopify), blink-dev, Adam Rice, James Hartig

Mike Taylor

Feb 15, 2024, 2:20:20 PM
to Chris Harrelson, Nidhi Jaju, David Benjamin, Yoav Weiss (@Shopify), blink-dev, Adam Rice, James Hartig

Felix Handte

Feb 16, 2024, 10:57:09 AM
to blink-dev, Mike Taylor, David Benjamin, Yoav Weiss (@Shopify), blink-dev, Adam Rice, James Hartig, Chris Harrelson, Nidhi Jaju
Exciting! Meta supports shipping Zstd.

We've been running an experiment on our side and have seen positive user metrics movement from switching to serving zstd to Chrome traffic.