| Commit-Queue | +1 |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
base::flat_set<std::string_view> values;
size_t iter = 0;
std::optional<std::string_view> value;
while ((value = EnumerateHeader(&iter, name))) {
values.insert(*value);
}
return values;nit : refactoring
base::flat_set<std::string_view> values;
for (size_t iter = 0; auto value = EnumerateHeader(&iter, name); ) {
values.insert(*value);
}
return values;
base::flat_set<std::string_view>
HttpResponseHeaders::GetCacheControlHeaderValues() const {
return GetHeaderValues("cache-control");
}nit : is explicit function required ?
base::flat_set<std::string_view> cache_control_values) const {const base::flat_set<std::string_view>&
base::flat_set<std::string_view> cache_control_values) const {const base::flat_set<std::string_view>&
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
values.insert(*value);Don't insert into a flat_set one-by-one. That's quadratic.
Instead, insert into a vector and move it into a flat_set, using this constructor:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_set.h;drc=025a94257380eadfad2d705129e5863fca0bf89e;l=58
(which ends up here:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_tree.h;drc=58fb75d86a0ad2642beec2d6c16b1e6c008e33cd;l=539
)
base::flat_set<std::string_view>
HttpResponseHeaders::GetCacheControlHeaderValues() const {
return GetHeaderValues("cache-control");
}nit : is explicit function required ?
Possibly inline, though LTO and the like likely takes care of that.
base::flat_set<std::string_view> cache_control_values) const {const ref here, too.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
values.insert(*value);Don't insert into a flat_set one-by-one. That's quadratic.
Instead, insert into a vector and move it into a flat_set, using this constructor:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_set.h;drc=025a94257380eadfad2d705129e5863fca0bf89e;l=58(which ends up here:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_tree.h;drc=58fb75d86a0ad2642beec2d6c16b1e6c008e33cd;l=539
)
Generally, cache control values to be set upto 3 from usage perspective.
Does it go beyond that.
So Insertion should not be expensive compared to creation of vector and moving to flat_set right?
Could you share your opinion on this thought process?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
values.insert(*value);Mayur PatilDon't insert into a flat_set one-by-one. That's quadratic.
Instead, insert into a vector and move it into a flat_set, using this constructor:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_set.h;drc=025a94257380eadfad2d705129e5863fca0bf89e;l=58(which ends up here:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_tree.h;drc=58fb75d86a0ad2642beec2d6c16b1e6c008e33cd;l=539
)
Generally, cache control values to be set upto 3 from usage perspective.
Does it go beyond that.
So Insertion should not be expensive compared to creation of vector and moving to flat_set right?
Could you share your opinion on this thought process?
So creating a vector and moving to flat_set is basically sort of free in that flat_set wraps a vector anyway (see the second of the links above). What's not 100% obvious is that on reasonable number of values doing the sort all at once is cheaper than inserting stuff in right spot one by one, even if it's n*log n instead of n^2.
Of course, if there are 3 values it's unclear that a flat_set is actually useful in the first place.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
...Have you actually measured this?
Thanks for suggesting this point Maks...
I added a simple Logger to verify Time difference in NanoSeconds for one Parameterized Test Case (RequiresValidationTest.RequiresValidation)...
Added below snippet:
const auto startTime = base::TimeTicks::Now();
FreshnessLifetimes lifetimes = GetFreshnessLifetimes(response_time);
const auto endTime = base::TimeTicks::Now();
LOG(INFO) << "Time difference = " << (endTime - startTime).InNanoseconds();
Performance of HasHeaderValue lookup is still better compared to storage based lookup validation...
Performance:-
HasHeaderValue < Changes without vector while creating flat_set object < Changes with vector while creating flat_set object
with HasHeaderValue lookup 5 times been highest performant...
values.insert(*value);Mayur PatilDon't insert into a flat_set one-by-one. That's quadratic.
Instead, insert into a vector and move it into a flat_set, using this constructor:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_set.h;drc=025a94257380eadfad2d705129e5863fca0bf89e;l=58(which ends up here:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_tree.h;drc=58fb75d86a0ad2642beec2d6c16b1e6c008e33cd;l=539
)
Maks OrlovichGenerally, cache control values to be set upto 3 from usage perspective.
Does it go beyond that.
So Insertion should not be expensive compared to creation of vector and moving to flat_set right?
Could you share your opinion on this thought process?
So creating a vector and moving to flat_set is basically sort of free in that flat_set wraps a vector anyway (see the second of the links above). What's not 100% obvious is that on reasonable number of values doing the sort all at once is cheaper than inserting stuff in right spot one by one, even if it's n*log n instead of n^2.
Of course, if there are 3 values it's unclear that a flat_set is actually useful in the first place.
Made necessary changes to incorporate vector
base::flat_set<std::string_view> values;
size_t iter = 0;
std::optional<std::string_view> value;
while ((value = EnumerateHeader(&iter, name))) {
values.insert(*value);
}
return values;nit : refactoring
base::flat_set<std::string_view> values;
for (size_t iter = 0; auto value = EnumerateHeader(&iter, name); ) {
values.insert(*value);
}
return values;
Done
base::flat_set<std::string_view>
HttpResponseHeaders::GetCacheControlHeaderValues() const {
return GetHeaderValues("cache-control");
}Maks Orlovichnit : is explicit function required ?
Possibly inline, though LTO and the like likely takes care of that.
Done
base::flat_set<std::string_view> cache_control_values) const {const base::flat_set<std::string_view>&
Done
base::flat_set<std::string_view> cache_control_values) const {const base::flat_set<std::string_view>&
Done
base::flat_set<std::string_view> cache_control_values) const {Mayur Patilconst ref here, too.
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Mayur Patil...Have you actually measured this?
Thanks for suggesting this point Maks...
I added a simple Logger to verify Time difference in NanoSeconds for one Parameterized Test Case (RequiresValidationTest.RequiresValidation)...
Added below snippet:
const auto startTime = base::TimeTicks::Now();
FreshnessLifetimes lifetimes = GetFreshnessLifetimes(response_time);
const auto endTime = base::TimeTicks::Now();
LOG(INFO) << "Time difference = " << (endTime - startTime).InNanoseconds();Performance of HasHeaderValue lookup is still better compared to storage based lookup validation...
Performance:-
HasHeaderValue < Changes without vector while creating flat_set object < Changes with vector while creating flat_set objectwith HasHeaderValue lookup 5 times been highest performant...
Thanks for doing this measurement! It looks like we are losing so much time from doing the memory allocation for the flat_set that it is cancelling out any benefit from reducing the number of string comparisons.
I suggest giving up on the generic approach and just having a single `EnumerateHeader()` loop in `GetFreshnessLifetimes()` that checks for all 5 strings at the same time.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Mayur Patil...Have you actually measured this?
Adam RiceThanks for suggesting this point Maks...
I added a simple Logger to verify Time difference in NanoSeconds for one Parameterized Test Case (RequiresValidationTest.RequiresValidation)...
Added below snippet:
const auto startTime = base::TimeTicks::Now();
FreshnessLifetimes lifetimes = GetFreshnessLifetimes(response_time);
const auto endTime = base::TimeTicks::Now();
LOG(INFO) << "Time difference = " << (endTime - startTime).InNanoseconds();Performance of HasHeaderValue lookup is still better compared to storage based lookup validation...
Performance:-
HasHeaderValue < Changes without vector while creating flat_set object < Changes with vector while creating flat_set objectwith HasHeaderValue lookup 5 times been highest performant...
Thanks for doing this measurement! It looks like we are losing so much time from doing the memory allocation for the flat_set that it is cancelling out any benefit from reducing the number of string comparisons.
I suggest giving up on the generic approach and just having a single `EnumerateHeader()` loop in `GetFreshnessLifetimes()` that checks for all 5 strings at the same time.
I have added variables to store computed values.
Upon reviewing the test results, I noticed an improvement compared to the current values. However, the scope of testing was limited to the RequiresValidation Test Case in this scenario.
I have a few queries:
The methods GetMaxAgeValue() and GetStaleWhileRevalidateValue() internally call private method GetCacheControlDirective(). These methods are now only used in test cases. Should we consider removing them and modifying the test cases if these changes are good to go?
values.insert(*value);Mayur PatilDon't insert into a flat_set one-by-one. That's quadratic.
Instead, insert into a vector and move it into a flat_set, using this constructor:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_set.h;drc=025a94257380eadfad2d705129e5863fca0bf89e;l=58(which ends up here:
https://source.chromium.org/chromium/chromium/src/+/main:base/containers/flat_tree.h;drc=58fb75d86a0ad2642beec2d6c16b1e6c008e33cd;l=539
)
Maks OrlovichGenerally, cache control values to be set upto 3 from usage perspective.
Does it go beyond that.
So Insertion should not be expensive compared to creation of vector and moving to flat_set right?
Could you share your opinion on this thought process?
Mayur PatilSo creating a vector and moving to flat_set is basically sort of free in that flat_set wraps a vector anyway (see the second of the links above). What's not 100% obvious is that on reasonable number of values doing the sort all at once is cheaper than inserting stuff in right spot one by one, even if it's n*log n instead of n^2.
Of course, if there are 3 values it's unclear that a flat_set is actually useful in the first place.
Made necessary changes to incorporate vector
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
std::optional<base::TimeDelta> ParseCacheControlDirective(This calling convention is a bit confusing. I suggest you change the function to just expect a number like `"3600"` and then in the caller do `value->substr(kMaxAge.size())`. Then you can rename the function so something like `ParseSeconds()` or similar.
// @param directive_size The length of the directive name including the equalsWe don't normally use `@param` notation in Chromium code. I suggest just describing the parameter in a normal sentence.
// According to RFC 7234, "must-revalidate" takes precedence overI think it's better if the caller implements this restriction. Consider a header like
```
Cache-Control: stale-while-revalidate=30, must-revalidate
```
In this case this function would set `stale_while_revalidate` even though `must-revalidate` was present. It would be good to add a test to http_response_headers_unittest.cc for this case.
inline void ParseCacheControlDirectivesForFreshness(Prefer using return values over output parameters (see https://google.github.io/styleguide/cppguide.html#Inputs_and_Outputs). In this case it's best to define a struct like
```
struct CacheControlFreshnessDirectives {
bool must_revalidate;
std::optional<base::TimeDelta> max_age_value;
std::optional<base::TimeDelta> stale_while_revalidate;
};
```
inline bool HasCacheRestriction() const {We don't normally inline complex functions in Chromium code. For production builds we use full program optimisation, so the compiler can inline across translation units if it needs to. Placing functions in headers tends to lead to unnecessary inlining by the compiler, leading to binary bloat and worse performance. See https://chromium.googlesource.com/chromium/src/+/main/styleguide/c++/c++.md#inline-functions
while (start < end && *start == ' ') {
// leading spaces
++start;
}
while (start < end - 1 && *(end - 1) == ' ') {
// trailing spaces
--end;
}I realise this code was there originally, but it is not very clear. I suggest doing something like
```
value = base::TrimString(value, " ", base::TRIM_ALL);
```
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Mayur Patil...Have you actually measured this?
Adam RiceThanks for suggesting this point Maks...
I added a simple Logger to verify Time difference in NanoSeconds for one Parameterized Test Case (RequiresValidationTest.RequiresValidation)...
Added below snippet:
const auto startTime = base::TimeTicks::Now();
FreshnessLifetimes lifetimes = GetFreshnessLifetimes(response_time);
const auto endTime = base::TimeTicks::Now();
LOG(INFO) << "Time difference = " << (endTime - startTime).InNanoseconds();Performance of HasHeaderValue lookup is still better compared to storage based lookup validation...
Performance:-
HasHeaderValue < Changes without vector while creating flat_set object < Changes with vector while creating flat_set objectwith HasHeaderValue lookup 5 times been highest performant...
Mayur PatilThanks for doing this measurement! It looks like we are losing so much time from doing the memory allocation for the flat_set that it is cancelling out any benefit from reducing the number of string comparisons.
I suggest giving up on the generic approach and just having a single `EnumerateHeader()` loop in `GetFreshnessLifetimes()` that checks for all 5 strings at the same time.
I have added variables to store computed values.
Upon reviewing the test results, I noticed an improvement compared to the current values. However, the scope of testing was limited to the RequiresValidation Test Case in this scenario.
I have a few queries:
The methods GetMaxAgeValue() and GetStaleWhileRevalidateValue() internally call private method GetCacheControlDirective(). These methods are now only used in test cases. Should we consider removing them and modifying the test cases if these changes are good to go?
Yes, if `GetMaxAgeValue()` and `GetStaleWhileRevalidateValue()` are only used in test cases, please delete them.
Mayur Patil...Have you actually measured this?
Adam RiceThanks for suggesting this point Maks...
I added a simple Logger to verify Time difference in NanoSeconds for one Parameterized Test Case (RequiresValidationTest.RequiresValidation)...
Added below snippet:
const auto startTime = base::TimeTicks::Now();
FreshnessLifetimes lifetimes = GetFreshnessLifetimes(response_time);
const auto endTime = base::TimeTicks::Now();
LOG(INFO) << "Time difference = " << (endTime - startTime).InNanoseconds();Performance of HasHeaderValue lookup is still better compared to storage based lookup validation...
Performance:-
HasHeaderValue < Changes without vector while creating flat_set object < Changes with vector while creating flat_set objectwith HasHeaderValue lookup 5 times been highest performant...
Mayur PatilThanks for doing this measurement! It looks like we are losing so much time from doing the memory allocation for the flat_set that it is cancelling out any benefit from reducing the number of string comparisons.
I suggest giving up on the generic approach and just having a single `EnumerateHeader()` loop in `GetFreshnessLifetimes()` that checks for all 5 strings at the same time.
Adam RiceI have added variables to store computed values.
Upon reviewing the test results, I noticed an improvement compared to the current values. However, the scope of testing was limited to the RequiresValidation Test Case in this scenario.
I have a few queries:
The methods GetMaxAgeValue() and GetStaleWhileRevalidateValue() internally call private method GetCacheControlDirective(). These methods are now only used in test cases. Should we consider removing them and modifying the test cases if these changes are good to go?
Yes, if `GetMaxAgeValue()` and `GetStaleWhileRevalidateValue()` are only used in test cases, please delete them.
it is use in http_cache_unittest & http_response_headers_unittests. keeping them to easily access EnumerateHeader method...
Hence just added a comment stating they are meant for test purpose only
std::optional<base::TimeDelta> ParseCacheControlDirective(This calling convention is a bit confusing. I suggest you change the function to just expect a number like `"3600"` and then in the caller do `value->substr(kMaxAge.size())`. Then you can rename the function so something like `ParseSeconds()` or similar.
Done
// @param directive_size The length of the directive name including the equalsWe don't normally use `@param` notation in Chromium code. I suggest just describing the parameter in a normal sentence.
Done
// According to RFC 7234, "must-revalidate" takes precedence overI think it's better if the caller implements this restriction. Consider a header like
```
Cache-Control: stale-while-revalidate=30, must-revalidate
```
In this case this function would set `stale_while_revalidate` even though `must-revalidate` was present. It would be good to add a test to http_response_headers_unittest.cc for this case.
Done
Prefer using return values over output parameters (see https://google.github.io/styleguide/cppguide.html#Inputs_and_Outputs). In this case it's best to define a struct like
```
struct CacheControlFreshnessDirectives {
bool must_revalidate;
std::optional<base::TimeDelta> max_age_value;
std::optional<base::TimeDelta> stale_while_revalidate;
};
```
Done
We don't normally inline complex functions in Chromium code. For production builds we use full program optimisation, so the compiler can inline across translation units if it needs to. Placing functions in headers tends to lead to unnecessary inlining by the compiler, leading to binary bloat and worse performance. See https://chromium.googlesource.com/chromium/src/+/main/styleguide/c++/c++.md#inline-functions
Done
while (start < end && *start == ' ') {
// leading spaces
++start;
}
while (start < end - 1 && *(end - 1) == ' ') {
// trailing spaces
--end;
}I realise this code was there originally, but it is not very clear. I suggest doing something like
```
value = base::TrimString(value, " ", base::TRIM_ALL);
```
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
auto start = value.cbegin();
auto end = value.cend();
if (start == end ||
!std::all_of(start, end, [](char c) { return '0' <= c && c <= '9'; })) {
return std::nullopt;
}
int64_t seconds = 0;
base::StringToInt64(base::MakeStringPiece(start, end), &seconds);`std::string_view` guarantees memory safety, so try to stay in the `std::string_view` world as much as possible. Like this:
```suggestion
if (value.empty() ||
!std::ranges::all_of(value, absl::is_digit)) {
return std::nullopt;
}
int64_t seconds = 0;
base::StringToInt64(value, &seconds);
```
} else if (!directives.must_revalidate &&I think it's better not to check `must_revalidate` here, since it doesn't work in every case. `GetFreshnessLifetimes()` already does the check correctly, so you can move the comment about RFC 7234 there.
const auto& directives = ParseCacheControlDirectivesForFreshness();You can make the code more concise with destructuring assignment:
```suggestion
auto [must_revalidate, max_age, stale_while_revalidate] = ParseCacheControlDirectivesForFreshness();
```
bool has_freshness;`has_freshness` and `has_staleness` are not actually needed. You can just compare `freshness` and `staleness` directly. This will make the test cases easier to understand.
EXPECT_TRUE(lifetimes.freshness > base::TimeDelta());This check is redundant and should be removed (also it should be using `EXPECT_GT` for better diagnostics on failure).
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
auto start = value.cbegin();
auto end = value.cend();
if (start == end ||
!std::all_of(start, end, [](char c) { return '0' <= c && c <= '9'; })) {
return std::nullopt;
}
int64_t seconds = 0;
base::StringToInt64(base::MakeStringPiece(start, end), &seconds);`std::string_view` guarantees memory safety, so try to stay in the `std::string_view` world as much as possible. Like this:
```suggestion
if (value.empty() ||
!std::ranges::all_of(value, absl::is_digit)) {
return std::nullopt;
}
int64_t seconds = 0;
base::StringToInt64(value, &seconds);
```
Done
I think it's better not to check `must_revalidate` here, since it doesn't work in every case. `GetFreshnessLifetimes()` already does the check correctly, so you can move the comment about RFC 7234 there.
yes, the condition was actually added to serve one purpose, to ignore computation of stale_while_revalidate in case must_revalidate is turned to true, to avoid extra steps for execution...
eg : must-revlidate; stale-while-revalidate=20
actual assignment of correct value is already taken care in GetFreshness Method....
const auto& directives = ParseCacheControlDirectivesForFreshness();You can make the code more concise with destructuring assignment:
```suggestion
auto [must_revalidate, max_age, stale_while_revalidate] = ParseCacheControlDirectivesForFreshness();
```
Done
`has_freshness` and `has_staleness` are not actually needed. You can just compare `freshness` and `staleness` directly. This will make the test cases easier to understand.
Done
This check is redundant and should be removed (also it should be using `EXPECT_GT` for better diagnostics on failure).
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
How much did this end up helping?
// should not be used in production code....You could rename it to GetMaxAgeValueForTesting() to help ensure that.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// should not be used in production code....You could rename it to GetMaxAgeValueForTesting() to help ensure that.
GetMaxAgeValue & StaleWhileRevalidate are used in 2 files at multiple instancees... To avoid those changes I did not renamed them and added just a comment stating method not be used in production....
I think it should be sufficient enough....
Refactor GetFreshnessLifetimes to Extract Cache-Control Once
Refactored the HttpResponseHeaders::GetFreshnessLifetimes method to
improve efficiency by extracting and parsing the Cache-Control header
values only once. This change eliminates the need to loop for
Cache-Control header in the header list, reducing redundant operations
and enhancing overall performance.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |