Yoav Weiss (@Shopify):
So I am not sure I'll be able to review this properly today (trying to reproduce an important crasher), but I am worried that it seems like you're re-writing stream 0 at every chunk, which may be particularly expensive on Windows, but is a non-trivial amount of work to serialize on any platform.
At the very least, it should probably only happen when this compression mode is on, as other kinds of decompression happen after cache. (Also, this comment likely needs refinement:
https://source.chromium.org/chromium/chromium/src/+/main:net/url_request/url_request.cc;drc=56c66e417c83e2096a4e4e8a5c4ab7bbd525c9f3;l=288)

Best may be to defer it until the entry is about to be closed; that will mean it's wrong for truncated entries, but that's probably OK if it fails cleanly.
This usage of size also looks like it has a chance of going very wrong:
https://source.chromium.org/chromium/chromium/src/+/main:services/network/url_loader.cc;drc=56c66e417c83e2096a4e4e8a5c4ab7bbd525c9f3;l=2683
...though I don't understand what it's doing with gzip, either.
Maks Orlovich:
> Best may be to defer it until the entry is about to be closed; that will mean it's wrong for truncated entries, but that's probably OK if it fails cleanly.

How often should we expect truncated entries? What would be web exposed in those cases?
Yoav Weiss (@Shopify):
Hmm, so they happen when we get interrupted by a network error, or when the data is too short (the case you changed!) and the entry looks like it may be resumable by a range request (are dictionary-compressed entries ever resumable?). I don't know how frequent that is in practice. If that happens, then the next time we fetch the resource we will try to resume the fetch using a range request.
I think I tackled it, but I'd appreciate a critical view 😊
> !response_info->did_use_shared_dictionary) {

Yoav Weiss (@Shopify): This seems like a separate bugfix?
Yoav Weiss (@Shopify): Yeah, I'll spin it off.
Done
Code-Review: +1
> There was also a case (discovered while testing this) where when DCB/DCZ
> resulted in larger assets (e.g. when the decoded asset is very small),
> the decoded asset was discarded when fetched from the cache. This CL
> fixes that as well, and adds tests that verify both assets that deflate
> and inflate as part of the compression process.

No longer in this CL, I think? (Thanks)
Done
> bool use_large = query.find("cacheable") != std::string::npos;

I think this may be better as a separate knob, even if you change them together, as I think it would make the use code easier to follow.
Done
> // body size should still be available via GetReceivedBodyBytes().

This comment looks out-of-date now?
Exportable changes to web-platform-tests were detected in this CL and a pull request in the upstream repo has been made: https://github.com/web-platform-tests/wpt/pull/58223.
When this CL lands, the bot will automatically merge the PR on GitHub if the required GitHub checks pass; otherwise, the ecosystem-infra@ team will triage the failures and may contact you.
WPT Export docs:
https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md#Automatic-export-process
16 is the latest approved patch-set.
The change was submitted with unreviewed changes in the following files:
```
The name of the file: net/http/http_cache_unittest.cc
Insertions: 4, Deletions: 3.
@@ -15430,9 +15430,10 @@
}
// Tests that encoded body size is preserved across cache reads.
-// When a response is fetched from the network, the encoded (compressed) body
-// size is recorded. When the same response is served from cache, the encoded
-// body size should still be available via GetReceivedBodyBytes().
+// When a shared dictionary compressed response is fetched from the network,
+// the encoded (on-the-wire) body size is stored in the cached response info.
+// When the same response is later served from cache, the encoded body size
+// should still be available via HttpResponseInfo::encoded_body_size.
TEST_F(HttpCacheTest, EncodedBodySizePreservedFromCache) {
MockHttpCache cache;
```
```
The name of the file: content/browser/network/shared_dictionary_browsertest.cc
Insertions: 13, Deletions: 11.
@@ -754,7 +754,7 @@
if (dict_hash) {
if (*dict_hash == kExpectedDictionaryHashBase64) {
if (HasSharedDictionaryAcceptEncoding(request.headers)) {
- bool use_large = query.find("cacheable") != std::string::npos;
+ bool use_large = query.find("large") != std::string::npos;
response->AddCustomHeader(
"content-encoding",
net::shared_dictionary::kSharedZstdContentEncodingName);
@@ -1962,9 +1962,10 @@
// First fetch: from network with dictionary compression.
// transferSize should be > 0 (network fetch).
- EXPECT_EQ(expected_network_sizes,
- EvalJs(GetTargetShell()->web_contents()->GetPrimaryMainFrame(),
- JsReplace(R"(
+ EXPECT_EQ(
+ expected_network_sizes,
+ EvalJs(GetTargetShell()->web_contents()->GetPrimaryMainFrame(),
+ JsReplace(R"(
(async () => {
const targetUrl = $1;
const promise = new Promise((resolve) => {
@@ -1984,15 +1985,16 @@
', ' + (entry.transferSize > 0)
);
)",
- GetURL("/shared_dictionary/path/test?cacheable")))
- .ExtractString());
+ GetURL("/shared_dictionary/path/test?cacheable&large")))
+ .ExtractString());
// Second fetch: should come from disk cache. The encodedBodySize should
// still reflect the original dictionary-compressed size, not the
// decompressed size stored in cache. transferSize should be 0 (cache hit).
- EXPECT_EQ(expected_cached_sizes,
- EvalJs(GetTargetShell()->web_contents()->GetPrimaryMainFrame(),
- JsReplace(R"(
+ EXPECT_EQ(
+ expected_cached_sizes,
+ EvalJs(GetTargetShell()->web_contents()->GetPrimaryMainFrame(),
+ JsReplace(R"(
(async () => {
const targetUrl = $1;
performance.clearResourceTimings();
@@ -2013,8 +2015,8 @@
', ' + entry.transferSize
);
)",
- GetURL("/shared_dictionary/path/test?cacheable")))
- .ExtractString());
+ GetURL("/shared_dictionary/path/test?cacheable&large")))
+ .ExtractString());
}
IN_PROC_BROWSER_TEST_P(SharedDictionaryBrowserTest,
```
CompressionDictionaries - ensure encodedBodySize survives the cache
DCB/DCZ assets are being decompressed before being cached, to avoid
having to rely on their dictionary when retrieved from the cache.
This creates a web-exposed issue with encodedBodySize, where it's
equal to the decodedBodySize in those cases.
This CL fixes that by storing the original encoded body size as metadata
on the cached asset, and setting it upon retrieval.
The WPT PR for this CL has been merged upstream! https://github.com/web-platform-tests/wpt/pull/58223