Andrew BrownHappy new year everyone! Apologies for dragging this up right as everyone returns from a well-deserved break :)
Figured I could use some of my summer free time to close this out for good. I worked out (some of, hopefully all) the failing devtools tests, which needed changes inside the preflight-summoning code - tagged Min Qin for review. I'm a little wary that I ended up needing to have weird case handling in the preflight code.
I have two outstanding questions:
- I've commented out the assertion in a test for frame_fetch_context. It's expecting that the User-Agent is set after calling PrepareRequest, but because this CL moves it to its own thing, that's no longer the case. Is this test checking something "deeper" that I should work out, or can it be safely removed? (I suspect this is the root cause of a few other test failures, but I never have been able to accurately run the full suite locally.)
- Do my cases for User-Agent handling make sense: a fetch-set User-Agent overrides a devtools User-Agent overrides the default? I think we've settled this, but it may be different when you consider which User-Agent should be present on preflight OPTIONS requests. At this point, it follows the same order.
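For concreteness, the precedence order being asked about could be sketched like this (all names here are invented for illustration; this is not the real Chromium code):

```cpp
#include <optional>
#include <string>

// Illustrative sketch only: a fetch-set User-Agent overrides a DevTools
// override, which overrides the browser default. Per the discussion,
// preflight OPTIONS requests follow the same order.
std::string ResolveUserAgent(const std::optional<std::string>& fetch_set_ua,
                             const std::optional<std::string>& devtools_ua,
                             const std::string& default_ua) {
  if (fetch_set_ua)
    return *fetch_set_ua;  // Set explicitly via fetch()/XHR headers.
  if (devtools_ua)
    return *devtools_ua;  // Network.setUserAgentOverride emulation.
  return default_ua;  // The browser's normal UA string.
}
```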
Andrew BrownOk, not quite all the failing devtools tests :(
Some remaining failures I've traced back down to `ThrottlingURLLoader::StartNow()`, where it enters IPC/inheritance hell. I am almost willing to bet this will characterise all the remaining failures: issues in the various implementations of `URLLoaderFactory::CreateLoaderAndStart`.
Basically, I have reason to believe there are one or more `URLLoader` implementations that don't combine the `cors_exempt_headers` in with the normal `headers` on a request - similar to the change I've already made in `preflight_controller.cc`. There are different `URLLoader`s for different things, and I bet the one that deals with extensions will have a problem, and so will the one that deals with non-returning delayed requests for `fetchLater()`.
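The kind of merge being described - folding `cors_exempt_headers` into the normal headers before the request goes out - might look roughly like this; the map type and function name are invented for illustration (the real code uses Chromium's header classes):

```cpp
#include <map>
#include <string>

using HeaderMap = std::map<std::string, std::string>;

// Illustrative sketch: fold cors_exempt_headers into the ordinary
// headers, with the ordinary headers winning on any name collision.
// A URLLoader implementation that skips this step would "lose" the UA.
HeaderMap MergeCorsExemptHeaders(HeaderMap headers,
                                 const HeaderMap& cors_exempt_headers) {
  for (const auto& [name, value] : cors_exempt_headers) {
    headers.insert({name, value});  // insert() keeps an existing entry.
  }
  return headers;
}
```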
Do you have any tips for tracking down which one(s) live on the path below the DevTools (and extension) code? The debug docs are fantastic for working bottom-up, where you have a crash and you need to work out where, but not so helpful for working top-down, where I have correct behaviour and I don't know where it stops.
There are ~43 different types of `CreateLoaderAndStart` that could lead to my problem, each with their own very large stack of behaviour, and often calling into each other in hard-to-grasp ways. I can work through them eventually, but it's going to take a long time.
I was *almost* correct; I actually had to tell the parallel stack that the inspector protocol uses, that it still needs to go find the UA elsewhere.
This means unfortunately all the work to keep the UA inside the FetchContext is useless, because the inspector protocol has no idea what context the request was made in, so there's just no way that the context-based approach is compatible with the behaviour inspector needs. I've gone back to the *original* way I did it, with a new field on the ResourceRequest.
This lets us keep basically all of the original behaviour (stuff like PrepareRequest being the function call that actually sets the UA onto the blink::ResourceRequest, instead of it happening much later when populating the network::ResourceRequest). So a lot of this CL is now semi-superfluous, as there's still some remnants of the move-to-context (though these remnants are pretty ok, in my estimation. Maybe some string copying behaviour could be optimised, idk.)
If it works as expected, there should (fingers crossed) be only a few failing tests:
- ExtensionWebRequestApiPrerenderingTest.Load
- ExtensionWebRequestApiPrerenderingTest.LoadIntoNewTab (next goal/s)
- OutOfProcessPPAPITest.URLLoader3
- PPAPINaClNewlibTest.URLLoader3 (ChromeOS only, so I can't test these locally)
It may be time for a chromestatus.com entry? I feel like there's a lot of implementation-defined behaviour here that would be good to get more eyes on, and I'm planning to be able to work on this for the next 4-5 weeks straight.
Andrew Brown...ah, I'm terribly sorry - I thought that would just avoid pinging you both again, not delete your reviews entirely :(
Done
net::HttpRequestHeaders::kUserAgent, *user_agent);I think this code has a problem, which is that if the user has set a UA, we don't bother setting the context UA onto the cors_exempt_headers (see below comment in `request_conversion.cc`). So if there's a user UA, a preflight will be sent asking whether UA is an OK header, but *using* the custom UA that the user already set. Which is daft.
I don't know if we can fix it nicely, as I think we avoided setting UA onto both sets of headers in order to avoid merging issues with them later (header lists have no guaranteed merging behaviour, I think?). So we might have to add a special UA hiding spot to the `network::ResourceRequest` too, which is not great.
}There has to be a better way to use `getString()` here, or use another function instead, but I could not for the life of me find documentation for `DictionaryValue`, and the code I found for it was autogenerated and didn't look like it had much of a better approach available.
headers, std::make_optional(request.GetContextUserAgent())))I don't think CORS preflights go through this function, as CORS preflights still appear to be UA-less in Devtools (the actual request still has the header, just the inspector's `Network.requestWillBeSent` event is wrong.)
But also, this behaviour isn't tested ATM (I think?) - the existing tests don't appear to be checking the inspector's view of the preflight, just the network's view of it (in `http/tests/inspector-protocol/emulation/user-agent-override-cors-preflight.js`)
So, outstanding questions (the last of which I can answer myself, given time):
- Should there be a test for this behaviour (sending the right UA on a preflight, and making sure Devtools sees it too)?
- Where do preflights spawn into the inspector code? Preflights are spawned in the network process, as best I can tell (code lives in services/network, not third_party/blink), so I don't know how that information gets sent back to the render process's inspector stuff.
.setHeaders(BuildObjectForHeaders(headers_map, std::nullopt))I'm also not fully convinced that optionals are what we want here, but it feels better than creating a `BuildObjectForRequestHeaders` function. Maybe not. Thoughts welcome.
// EXPECT_EQ("hi", request.HttpHeaderField(http_names::kUserAgent));Adam RiceThis test is funny - it's now always going to fail, because there's simply no need for `PrepareRequest` to put the useragent in the headers. The title of this test implies it's testing something more meaningful than just the user-agent-setting behaviour - what sort of backup should I add instead?
Andrew BrownIt's verifying that resetting `dummy_page_holder` results in a call to `client->UserAgent()`. It wants to verify that `"hi"` will actually be set as the User-Agent request header on the request, but I don't know how to do that here.
Alright, I fiddled with it in patchset 19, but ended up needing to put the UA *back* into the ResourceRequest (unfortunately) due to inspector protocol things, so the new test just checks that the UA is correctly set inside its new hiding spot.
PrepareRequestWhenWhenDetached) {Typo, will fix in next patchset.
}See above comment in `preflight_controller.cc` - it's both these pieces of code together.
The key thing is at this point, we can't put the context UA onto the cors_exempt_headers without potentially accepting dodgy merge behaviour later when they get combined. So we have a `network::ResourceRequest` that has a user UA on the normal headers and *no* UA on the cors_exempt headers, which then goes to `preflight_controller`, which uses the user UA.
I guess we could work out where the merge happens and make sure we remove the context UA from the `cors_exempt_headers` *if* the user UA is set? But then we need to make sure preflights avoid copying the user UA, too.
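In sketch form, the fix being floated here could look like the following (hypothetical names; the real types would be `net::HttpRequestHeaders`):

```cpp
#include <map>
#include <string>

using HeaderMap = std::map<std::string, std::string>;

// Illustrative sketch of the idea above: if the user already set a
// User-Agent on the normal headers, drop the context UA from the
// cors-exempt set before the two lists are merged, so the merge can
// never end up holding two competing values.
void DropContextUaIfUserSet(const HeaderMap& headers,
                            HeaderMap& cors_exempt_headers) {
  if (headers.count("User-Agent") > 0)
    cors_exempt_headers.erase("User-Agent");
}
```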
I think you're going to need to add a DevTools expert as a reviewer to be able to answer some of these questions.
if (context_ua.has_value() andNit: use `&&` not `and` in C++ code.
Added Alex Rudenko to review for devtools knowledge... first alphabetical owner, sorry, I get it, my name starts with A too - I have a few changes that touch devtools stuff (in `inspector_network_agent.cc`) that feel like they could be done in smarter ways. Thoughts welcome.
Also if someone could put it through a CQ dry run again that would be hugely appreciated - locally it looks like the devtools changes might have ended up fixing the remaining tests, but it would be good to check.
Nit: use `&&` not `and` in C++ code.
Done
Typo, will fix in next patchset.
@sa...@chromium.org could you please do the initial round of reviews for the devtools code?
I took a look only at the `.../inspector_network_agent.cc`. I will leave the judgement WRT `GetContextUserAgent` method to the fetch owners.
// We need to make special care in the case of outgoing ResourceRequestsPlease move this block to `BuildObjectForResourceRequest`.
headers.Set(http_names::kReferer, AtomicString(request.ReferrerString()));As long as this is the only place the `context_ua` is set, please move the logic for adding the user agent header from `BuildObjectForHeaders` extra parameter here.
Per spec, fetch & XMLHttpRequest should be able to set customcould you please link to the parts of the spec that define this behavior?
headers, std::make_optional(request.GetContextUserAgent())))I don't think CORS preflights go through this function, as CORS preflights still appear to be UA-less in Devtools (the actual request still has the header, just the inspector's `Network.requestWillBeSent` event is wrong.)
But also, this behaviour isn't tested ATM (I think?) - the existing tests don't appear to be checking the inspector's view of the preflight, just the network's view of it (in `http/tests/inspector-protocol/emulation/user-agent-override-cors-preflight.js`)
So, outstanding questions (the last of which I can answer myself, given time):
- Should there be a test for this behaviour (sending the right UA on a preflight, and making sure Devtools sees it too)?
- Where do preflights spawn into the inspector code? Preflights are spawned in the network process, as best I can tell (code lives in services/network, not third_party/blink), so I don't know how that information gets sent back to the render process's inspector stuff.
I do not think we need to add extra headers in this event. We have requestWillBeSentExtraInfo that should report actual headers set by the network service. Could you please clarify why requestWillBeSentExtraInfo is not sufficient here?
I've spent all week compiling and attempting to test and recompiling. I still have no clue if it's actually better, but it will at least not have the stupid failures the last CL run had (the bad optional access stuff) - if someone would be so kind as to hit the tryjob button that would probably save me ~6 more hours running stuff locally...
Per spec, fetch & XMLHttpRequest should be able to set customcould you please link to the parts of the spec that define this behavior?
It all derives from https://fetch.spec.whatwg.org/#forbidden-request-header -- this list was updated to remove User-Agent in 2015.
Now User-Agent isn't forbidden, it should be settable by users, and should not be on `net/http/http_util.cc`'s forbidden header list.
Unfortunately, once you remove it from that list, it also starts always generating CORS preflights. Whether it should send preflights is not directly answered by the spec (it's ambiguous), but we are emulating Firefox's behaviour, as it seems unintuitive to send preflights for every single fetch request ever asking for "User-Agent". Plus, it will break large swathes of the web for things that aren't expecting to have to explicitly say that "User-Agent" is OK.
There's other discussion on this CL about this, but to summarise where we're at:
- user-agent set by user: preflight sent
- user-agent set by devtools or browser: no preflight sent
So the solution to the CORS issue is that we take the (devtools or browser) UA from the FetchContext, and instead of putting it straight in the headers, we have to hide it around until we get to a state where it won't trigger a CORS preflight.
Unfortunately this means I've had to run around to a heap of random spots and make sure they're looking in the new spot for the UA, but eh. I probably have to update that description to summarise the new behaviour soon, as it seems to change every 2nd patchset.
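To make the summarised rule concrete, it amounts to something like this (enum and function names invented for illustration):

```cpp
// Illustrative sketch of the Firefox-style rule described above: only
// a User-Agent set by page script counts as a non-safelisted header
// for CORS purposes, so only it forces an OPTIONS preflight.
enum class UaSource { kBrowserDefault, kDevToolsOverride, kFetchScript };

bool UserAgentTriggersPreflight(UaSource source) {
  return source == UaSource::kFetchScript;
}
```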
// We need to make special care in the case of outgoing ResourceRequestsPlease move this block to `BuildObjectForResourceRequest`.
Done - your approach looks miles better.
There has to be a better way to use `getString()` here, or use another function instead, but I could not for the life of me find documentation for `DictionaryValue`, and the code I found for it was autogenerated and didn't look like it had much of a better approach available.
Done
headers.Set(http_names::kReferer, AtomicString(request.ReferrerString()));As long as this is the only place the `context_ua` is set, please move the logic for adding the user agent header from `BuildObjectForHeaders` extra parameter here.
Yeah, it is - moved over, done!
headers, std::make_optional(request.GetContextUserAgent())))Alex RudenkoI don't think CORS preflights go through this function, as CORS preflights still appear to be UA-less in Devtools (the actual request still has the header, just the inspector's `Network.requestWillBeSent` event is wrong.)
But also, this behaviour isn't tested ATM (I think?) - the existing tests don't appear to be checking the inspector's view of the preflight, just the network's view of it (in `http/tests/inspector-protocol/emulation/user-agent-override-cors-preflight.js`)
So, outstanding questions (the last of which I can answer myself, given time):
- Should there be a test for this behaviour (sending the right UA on a preflight, and making sure Devtools sees it too)?
- Where do preflights spawn into the inspector code? Preflights are spawned in the network process, as best I can tell (code lives in services/network, not third_party/blink), so I don't know how that information gets sent back to the render process's inspector stuff.
I do not think we need to add extra headers in this event. We have requestWillBeSentExtraInfo that should report actual headers set by the network service. Could you please clarify why requestWillBeSentExtraInfo is not sufficient here?
I'll be honest, I am completely unfamiliar with DevTools and didn't know that existed; everything that I've done has been prompted by a few failing tests. These changes are my attempt to deal with the fact that the new, cors-free "hiding" spot for the UA also needs to be given to the inspector.
Looking at requestWillBeSentExtraInfo, that seems like exactly what we want here, but this will require some tests to be updated - there's at least 8 tests failing in `web_tests/http/tests/inspector_protocol` that all are expecting a User-Agent to be present on the original `requestWillBeSent` event (which it won't be, anymore).
I could update those 8 tests to check requestWillBeSentExtraInfo, and then ensure the User-Agent is present there?
.setHeaders(BuildObjectForHeaders(headers_map, std::nullopt))I'm also not fully convinced that optionals are what we want here, but it feels better than creating a `BuildObjectForRequestHeaders` function. Maybe not. Thoughts welcome.
headers, std::make_optional(request.GetContextUserAgent())))Alex RudenkoI don't think CORS preflights go through this function, as CORS preflights still appear to be UA-less in Devtools (the actual request still has the header, just the inspector's `Network.requestWillBeSent` event is wrong.)
But also, this behaviour isn't tested ATM (I think?) - the existing tests don't appear to be checking the inspector's view of the preflight, just the network's view of it (in `http/tests/inspector-protocol/emulation/user-agent-override-cors-preflight.js`)
So, outstanding questions (the last of which I can answer myself, given time):
- Should there be a test for this behaviour (sending the right UA on a preflight, and making sure Devtools sees it too)?
- Where do preflights spawn into the inspector code? Preflights are spawned in the network process, as best I can tell (code lives in services/network, not third_party/blink), so I don't know how that information gets sent back to the render process's inspector stuff.
Andrew BrownI do not think we need to add extra headers in this event. We have requestWillBeSentExtraInfo that should report actual headers set by the network service. Could you please clarify why requestWillBeSentExtraInfo is not sufficient here?
I'll be honest, I am completely unfamiliar with DevTools and didn't know that existed; everything that I've done has been prompted by a few failing tests. These changes are my attempt to deal with the fact that the new, cors-free "hiding" spot for the UA also needs to be given to the inspector.
Looking at requestWillBeSentExtraInfo, that seems like exactly what we want here, but this will require some tests to be updated - there's at least 8 tests failing in `web_tests/http/tests/inspector_protocol` that all are expecting a User-Agent to be present on the original `requestWillBeSent` event (which it won't be, anymore).
I could update those 8 tests to check requestWillBeSentExtraInfo, and then ensure the User-Agent is present there?
I think it would be better to not special-case the header when reporting from the renderer, to update the tests that no longer contain the header, and to make sure via a new test that ExtraInfo events from the network service actually contain the correct user agent header. I would suggest that the network reviewers review the non-DevTools changes first to make sure those are correct.
headers, std::make_optional(request.GetContextUserAgent())))Alex RudenkoI don't think CORS preflights go through this function, as CORS preflights still appear to be UA-less in Devtools (the actual request still has the header, just the inspector's `Network.requestWillBeSent` event is wrong.)
But also, this behaviour isn't tested ATM (I think?) - the existing tests don't appear to be checking the inspector's view of the preflight, just the network's view of it (in `http/tests/inspector-protocol/emulation/user-agent-override-cors-preflight.js`)
So, outstanding questions (the last of which I can answer myself, given time):
- Should there be a test for this behaviour (sending the right UA on a preflight, and making sure Devtools sees it too)?
- Where do preflights spawn into the inspector code? Preflights are spawned in the network process, as best I can tell (code lives in services/network, not third_party/blink), so I don't know how that information gets sent back to the render process's inspector stuff.
Andrew BrownI do not think we need to add extra headers in this event. We have requestWillBeSentExtraInfo that should report actual headers set by the network service. Could you please clarify why requestWillBeSentExtraInfo is not sufficient here?
Alex RudenkoI'll be honest, I am completely unfamiliar with DevTools and didn't know that existed; everything that I've done has been prompted by a few failing tests. These changes are my attempt to deal with the fact that the new, cors-free "hiding" spot for the UA also needs to be given to the inspector.
Looking at requestWillBeSentExtraInfo, that seems like exactly what we want here, but this will require some tests to be updated - there's at least 8 tests failing in `web_tests/http/tests/inspector_protocol` that all are expecting a User-Agent to be present on the original `requestWillBeSent` event (which it won't be, anymore).
I could update those 8 tests to check requestWillBeSentExtraInfo, and then ensure the User-Agent is present there?
I think it would be better to not special-case the header when reporting from the renderer, to update the tests that no longer contain the header, and to make sure via a new test that ExtraInfo events from the network service actually contain the correct user agent header. I would suggest that the network reviewers review the non-DevTools changes first to make sure those are correct.
I was reviewing it again and the user agent actually is set by the client code, right? I wonder why it is not present in `request.HttpHeaderFields()`? It does sound like it should contain it if it is provided by the client.
context_user_agent_header_ = user_agent;should this still write to the http header field via SetHttpHeaderField even if we need context_user_agent_header_ for the other logic?
headers, std::make_optional(request.GetContextUserAgent())))Alex RudenkoI don't think CORS preflights go through this function, as CORS preflights still appear to be UA-less in Devtools (the actual request still has the header, just the inspector's `Network.requestWillBeSent` event is wrong.)
But also, this behaviour isn't tested ATM (I think?) - the existing tests don't appear to be checking the inspector's view of the preflight, just the network's view of it (in `http/tests/inspector-protocol/emulation/user-agent-override-cors-preflight.js`)
So, outstanding questions (the last of which I can answer myself, given time):
- Should there be a test for this behaviour (sending the right UA on a preflight, and making sure Devtools sees it too)?
- Where do preflights spawn into the inspector code? Preflights are spawned in the network process, as best I can tell (code lives in services/network, not third_party/blink), so I don't know how that information gets sent back to the render process's inspector stuff.
Andrew BrownI do not think we need to add extra headers in this event. We have requestWillBeSentExtraInfo that should report actual headers set by the network service. Could you please clarify why requestWillBeSentExtraInfo is not sufficient here?
Alex RudenkoI'll be honest, I am completely unfamiliar with DevTools and didn't know that existed; everything that I've done has been prompted by a few failing tests. These changes are my attempt to deal with the fact that the new, cors-free "hiding" spot for the UA also needs to be given to the inspector.
Looking at requestWillBeSentExtraInfo, that seems like exactly what we want here, but this will require some tests to be updated - there's at least 8 tests failing in `web_tests/http/tests/inspector_protocol` that all are expecting a User-Agent to be present on the original `requestWillBeSent` event (which it won't be, anymore).
I could update those 8 tests to check requestWillBeSentExtraInfo, and then ensure the User-Agent is present there?
Alex RudenkoI think it would be better to not special-case the header when reporting from the renderer, to update the tests that no longer contain the header, and to make sure via a new test that ExtraInfo events from the network service actually contain the correct user agent header. I would suggest that the network reviewers review the non-DevTools changes first to make sure those are correct.
I was reviewing it again and the user agent actually is set by the client code, right? I wonder why it is not present in `request.HttpHeaderFields()`? It does sound like it should contain it if it is provided by the client.
I'm not sure if I'm interpreting this right, but I'll attempt to explain; let me know if this helps. (My terminology is crappy, sorry; it's hard to overcome my non-chromium intuition of the definitions of "browser" and "client".)
Firstly, you're right, the user-agent is set by JavaScript code onto the normal `request.HttpHeaderFields()`. But any User-Agent value that exists in these headers will cause a CORS OPTIONS request. (The reason it doesn't do this in real Chrome right now is because of the list in `/net/http/http_util.cc`, but that's, like, the actual spec bug we're fixing.)
If we put our internal value (the one that looks like `Mozilla/5.0 Chrome 140.0` etc etc) onto those headers, we'll end up triggering a CORS preflight request on every single web request we ever make.
So we have to put our internal value somewhere else, so that it can be all combined together later. That somewhere else is (for the moment) the `request.GetContextUserAgent()` (named because the internal value comes from the `FetchContext`).
Unfortunately, anywhere that we can put the UA that hides it from CORS, also hides it from a lot of other things (including DevTools).
The alternative approach was to send the context value straight to the network process later on (patchset 19), but I thought I had to make sure it was accessible through a blink::ResourceRequest somewhere in order for DevTools to see it... *but* now I've learned that it's ok for DevTools to just get it in requestWillBeSentExtraInfo later on. So I think I'm going back to the patchset 19 approach.
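In sketch form, the "hiding spot" flow described above is something like the following (names invented for illustration; the real plumbing would go through `network::ResourceRequest`):

```cpp
#include <map>
#include <optional>
#include <string>

using HeaderMap = std::map<std::string, std::string>;

// Illustrative sketch: the context UA rides alongside the request
// instead of living in its headers, and is only folded in after the
// CORS/preflight decisions have been made. A script-set User-Agent
// (already present in `headers`) still wins.
HeaderMap FinalizeHeaders(HeaderMap headers,
                          const std::optional<std::string>& context_ua) {
  if (context_ua)
    headers.insert({"User-Agent", *context_ua});  // No-op if user set one.
  return headers;
}
```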
context_user_agent_header_ = user_agent;should this still write to the http header field via SetHttpHeaderField even if we need context_user_agent_header_ for the other logic?
I attempted to explain in the other comment. In brief, we have to have a separate ContextUserAgent that isn't just an ordinary http header because if we put the special browser user-agent into the ordinary http headers, we end up making CORS preflights on every single request, which is bad.
Thanks to the suggestion you made about requestWillBeSentExtraInfo, I'm probably going to go back to the approach in patchset 19, where the browser/context/safe User-Agent goes straight onto the `network::ResourceRequest` and doesn't need any special-casing inside resource_request.h.
So the `SetContextUserAgent()` stuff is not long for this world - I'll get to it tomorrow (AEST), and once that's uploaded, I'll re-ping Adam & Takashi to get another do-over/more thoughts from the network/fetch side. The context-based approach was their suggestion to begin with, but it will inevitably need some tweaks for some weirder (& worse) preflight behaviour that the DevTools testing made me spot.
Andrew BrownHappy new year everyone! Apologies for dragging this up right as everyone returns from a well-deserved break :)
Figured I could use some of my summer free time to close this out for good. I worked out (some of, hopefully all) the failing devtools tests, which needed changes inside the preflight-summoning code - tagged Min Qin for review. I'm a little wary that I ended up needing to have weird case handling in the preflight code.
I have two outstanding questions:
- I've commented out the assertion in a test for frame_fetch_context. It's expecting that the User-Agent is set after calling PrepareRequest, but because this CL moves it to its own thing, that's no longer the case. Is this test checking something "deeper" that I should work out, or can it be safely removed? (I suspect this is the root cause of a few other test failures, but I never have been able to accurately run the full suite locally.)
- Do my cases for User-Agent handling make sense: a fetch-set User-Agent overrides a devtools User-Agent overrides the default? I think we've settled this, but it may be different when you consider which User-Agent should be present on preflight OPTIONS requests. At this point, it follows the same order.
Andrew BrownOk, not quite all the failing devtools tests :(
Some remaining failures I've traced back down to `ThrottlingURLLoader::StartNow()`, where it enters IPC/inheritance hell. I am almost willing to bet this will characterise all the remaining failures: issues in the various implementations of `URLLoaderFactory::CreateLoaderAndStart`.
Basically, I have reason to believe there are one or more `URLLoader` implementations that don't combine the `cors_exempt_headers` in with the normal `headers` on a request - similar to the change I've already made in `preflight_controller.cc`. There are different `URLLoader`s for different things, and I bet the one that deals with extensions will have a problem, and so will the one that deals with non-returning delayed requests for `fetchLater()`.
Do you have any tips for tracking down which one(s) live on the path below the DevTools (and extension) code? The debug docs are fantastic for working bottom-up, where you have a crash and you need to work out where, but not so helpful for working top-down, where I have correct behaviour and I don't know where it stops.
There are ~43 different types of `CreateLoaderAndStart` that could lead to my problem, each with their own very large stack of behaviour, and often calling into each other in hard-to-grasp ways. I can work through them eventually, but it's going to take a long time.
I was *almost* correct; I actually had to tell the parallel stack that the inspector protocol uses, that it still needs to go find the UA elsewhere.
This means unfortunately all the work to keep the UA inside the FetchContext is useless, because the inspector protocol has no idea what context the request was made in, so there's just no way that the context-based approach is compatible with the behaviour inspector needs. I've gone back to the *original* way I did it, with a new field on the ResourceRequest.
This lets us keep basically all of the original behaviour (stuff like PrepareRequest being the function call that actually sets the UA onto the blink::ResourceRequest, instead of it happening much later when populating the network::ResourceRequest). So a lot of this CL is now semi-superfluous, as there's still some remnants of the move-to-context (though these remnants are pretty ok, in my estimation. Maybe some string copying behaviour could be optimised, idk.)
If it works as expected, there should (fingers crossed) be only a few failing tests:
- ExtensionWebRequestApiPrerenderingTest.Load
- ExtensionWebRequestApiPrerenderingTest.LoadIntoNewTab (next goal/s)
- OutOfProcessPPAPITest.URLLoader3
- PPAPINaClNewlibTest.URLLoader3 (ChromeOS only, so I can't test these locally)
It may be time for a chromestatus.com entry? I feel like there's a lot of implementation-defined behaviour here that would be good to get more eyes on, and I'm planning to be able to work on this for the next 4-5 weeks straight.
Done
Added Alex Rudenko to review for devtools knowledge... first alphabetical owner, sorry, I get it, my name starts with A too - I have a few changes that touch devtools stuff (in `inspector_network_agent.cc`) that feel like they could be done in smarter ways. Thoughts welcome.
Also if someone could put it through a CQ dry run again that would be hugely appreciated - locally it looks like the devtools changes might have ended up fixing the remaining tests, but it would be good to check.
Done
Basically everything except a handful of Extension tests (discussed soon) should pass now -- @ Adam, CQ dry run please?
That said, I have a few outstanding questions that may need broader conversations to resolve. I don't know if these are answerable with the reviewers I already have, or whether this is time for a chromestatus.com entry and/or emails to other mailing lists for different areas.
Importantly, though:
Per spec, fetch & XMLHttpRequest should be able to set customAndrew Browncould you please link to the parts of the spec that define this behavior?
It all derives from https://fetch.spec.whatwg.org/#forbidden-request-header -- this list was updated to remove User-Agent in 2015.
Now User-Agent isn't forbidden, it should be settable by users, and should not be on `net/http/http_util.cc`'s forbidden header list.
Unfortunately, once you remove it from that list, it also starts always generating CORS preflights. Whether it should send preflights is not directly answered by the spec (it's ambiguous), but we are emulating Firefox's behaviour, as it seems unintuitive to send preflights for every single fetch request ever asking for "User-Agent". Plus, it will break large swathes of the web for things that aren't expecting to have to explicitly say that "User-Agent" is OK.
There's other discussion on this CL about this, but to summarise where we're at:
- user-agent set by user: preflight sent
- user-agent set by devtools or browser: no preflight sent
So the solution to the CORS issue is that we take the (DevTools or browser) UA from the FetchContext, and instead of putting it straight into the headers, we stash it away until we reach a point where it won't trigger a CORS preflight.
Unfortunately this means I've had to run around to a heap of random spots and make sure they're looking in the new spot for the UA, but eh. I probably have to update that description to summarise the new behaviour soon, as it seems to change every 2nd patchset.
Description is back to being accurate now...
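To make the precedence concrete, here is a minimal sketch (all names are mine, not the CL's; in the real code this logic is spread across the FetchContext and request-conversion paths) of the order summarised above: a fetch-set UA wins and is the only one that makes a request preflight-worthy, then a DevTools override, then the browser default.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical condensation of the precedence discussed above:
// fetch-set UA > DevTools UA > browser default, and only the
// fetch-set value should trigger a CORS preflight.
struct UaDecision {
  std::string user_agent;
  bool needs_preflight;
};

UaDecision ResolveUserAgent(const std::optional<std::string>& fetch_ua,
                            const std::optional<std::string>& devtools_ua,
                            const std::string& default_ua) {
  if (fetch_ua) {
    // Author-supplied header: honoured, but CORS-relevant.
    return {*fetch_ua, /*needs_preflight=*/true};
  }
  if (devtools_ua) {
    // DevTools override: browser-internal, so no preflight.
    return {*devtools_ua, /*needs_preflight=*/false};
  }
  return {default_ua, /*needs_preflight=*/false};
}
```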
`function checkUserAgent(headers) {`
This is the root cause of a couple remaining failures:
- ExtensionWebRequestApiPrerenderingTest.Load
- ExtensionWebRequestApiPrerenderingTest.LoadIntoNewTab
They're using User-Agent as a litmus test for whether the headers on a request are valid, which is fine, but User-Agent is not going to be present on a DevTools "requestWillBeSent" event anymore. So we need a new way to do this.
`EXPECT_EQ("hi", GetFetchContext()->GetDefaultUserAgent());`
I would resurrect my comments from patchset 14, but they don't link through to the new code.
Is this test sensible? I feel like it's not, because the name suggests to me that instead of checking the UA the new way, we should instead be looking at whether PrepareRequest does its job -- just using a different litmus test than User-Agent. Advice welcome.
Andrew Brown: should this still write to the http header field via SetHttpHeaderField even if we need context_user_agent_header_ for the other logic?
I attempted to explain in the other comment. In brief, we have to have a separate ContextUserAgent that isn't just an ordinary http header because if we put the special browser user-agent into the ordinary http headers, we end up making CORS preflights on every single request, which is bad.
Thanks to the suggestion you made about requestWillBeSentExtraInfo, I'm probably going to go back to the approach in patchset 19, where the browser/context/safe User-Agent goes straight onto the `network::ResourceRequest` and doesn't need any special-casing inside resource_request.h.
So the `SetContextUserAgent()` stuff is not long for this world - I'll get to it tomorrow (AEST), and once that's uploaded, I'll re-ping Adam & Takashi to get another do-over/more thoughts from the network/fetch side. The context-based approach was their suggestion to begin with, but it will inevitably need some tweaks for some weirder (& worse) preflight behaviour that the DevTools testing made me spot.
Now updated and removed.
See above comment in `preflight_controller.cc` - it's both these pieces of code together.
The key thing is at this point, we can't put the context UA onto the cors_exempt_headers without potentially accepting dodgy merge behaviour later when they get combined. So we have a `network::ResourceRequest` that has a user UA on the normal headers and *no* UA on the cors_exempt headers, which then goes to `preflight_controller`, which uses the user UA.
I guess we could work out where the merge happens and make sure we remove the context UA from the `cors_exempt_headers` *if* the user UA is set? But then we need to make sure preflights avoid copying the user UA, too.
So the state of affairs is thus:
- If the user provides a UA, it sits on the normal headers. In `request_conversion.cc`, we *can't* put the context UA onto the cors_exempt_headers, because that will actually clobber the user-provided one.
- If the user doesn't provide a UA, then this code does exactly what it says; there are no issues here about different handling of a DevTools-set value vs an intrinsic browser value.
But then, later (in `preflight_controller.cc`):
- If the user didn't provide a UA, then the context UA is present on the cors_exempt_headers, so we can use that and we're good.
- But if the user did provide a UA, it's present on the normal headers, and so there is no trace of the context UA. This is a big problem, because it means the only UA we have access to for our OPTIONS request is the user one, resulting in silly behaviour.
I can think of a few ways to resolve this. Thoughts from you as reviewers would be appreciated but also this is getting to be touching stuff outside blink, so if there's another mailing list I should go contact, I'm happy to do that too.
We could:
- induce magic behaviour for a User-Agent at the point of merging the two sets of headers (only take the cors_exempt_headers version if the normal isn't there)
- provide a different place for the context UA to live inside a `network::ResourceRequest` (so that it doesn't get auto-merged from the cors_exempt_headers)
- we could set this all the time, and never use the cors_exempt_headers, or
- we could set this only when the user also sets one, so the special-case to look for the new header only occurs in `preflight_controller.cc`
I'm going to do a dummy impl of the last option when I next have time, but I'm expecting it will need changes, so hopefully with your thoughts I can get ahead of the curve on a better approach.
I tried refreshing my memory, but I'm sorry if I forget the previous discussion and say something inconsistent.
`Fixed: 571722`
I think this is an ID for the old bug tracker, and now 40450316 is the right ID for this. Can you check that https://crbug.com/40450316 is the right link for your intended bug, and update this number?
----
Please fix this WARNING reported by Commit Message Validation: Did you intend to reference an old bug id? Please ensure this is a valid Buganizer ID and not an external system such as Monorail.
Analyzer Description: Validates that commit messages conform to a specific format.
Owner: ayeay...@google.com
Did you intend to reference an old bug id? Please ensure this is a valid Buganizer ID and not an external system such as Monorail.
To rerun the analyzer locally, run: `alint -- -c CommitMatch`
For more information, see go/alint.
`function checkUserAgent(headers) {`
This is the root cause of a couple remaining failures:
- ExtensionWebRequestApiPrerenderingTest.Load
- ExtensionWebRequestApiPrerenderingTest.LoadIntoNewTab
They're using User-Agent as a litmus test for whether the headers on a request are valid, which is fine, but User-Agent is not going to be present on a DevTools "requestWillBeSent" event anymore. So we need a new way to do this.
Does this need help from qinmin@ ?
IIUC, this means the webRequest API exposed header is changed, and we need to update the API document, and need an announcement for Extensions developers and enterprise users?
if (user_agent) {
preflight_request->cors_exempt_headers.SetHeader(
net::HttpRequestHeaders::kUserAgent, *user_agent);
}
This is a duplicated operation as we run the same code at line 220 eventually.
Can you remove?
`// we can't use cors_exempt_headers, so the User-Agent will appear here`
Maybe a typo? I think you mean the User-Agent in the normal headers here.
So, s/cors_exempt_headers/it as it is not trustworthy/?
Also expected behavior here is:
1. if there is a DevTools provided User-Agent (in cors_exempt_headers), we use it
2. if there is a JavaScript provided User-Agent (in headers), we ignore it, and use the context given User-Agent.
Implementation-wise, this can be done simply by
```
std::optional<std::string> user_agent = request.cors_exempt_headers.GetHeader(
net::HttpRequestHeaders::kUserAgent);
if (user_agent) {
// Set DevTools given User-Agent to the preflight request
preflight_request->headers.SetHeader(
net::HttpRequestHeaders::kUserAgent, *user_agent);
} else {
// Remove JavaScript given User-Agent if it exists, and delegate to //net to set
// a context given User-Agent
preflight_request->headers.RemoveHeader(
net::HttpRequestHeaders::kUserAgent);
}
```
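For clarity, the same branch can be restated self-contained (a plain std::map stands in for net::HttpRequestHeaders, and the helper name is mine, not existing code):

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for net::HttpRequestHeaders: header name -> value.
using Headers = std::map<std::string, std::string>;

// Mirrors the suggestion above: a DevTools-given UA from
// cors_exempt_headers wins; otherwise any JavaScript-set UA is
// stripped so the //net layer can fill in the context-given default.
void FixUpPreflightUserAgent(const Headers& cors_exempt_headers,
                             Headers& preflight_headers) {
  auto it = cors_exempt_headers.find("User-Agent");
  if (it != cors_exempt_headers.end()) {
    preflight_headers["User-Agent"] = it->second;
  } else {
    preflight_headers.erase("User-Agent");
  }
}
```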
`net::HttpRequestHeaders::kUserAgent, *user_agent);`
I think this code has a problem, which is that if the user has set a UA, we don't bother setting the context UA onto the cors_exempt_headers (see below comment in `request_conversion.cc`). So if there's a user UA, a preflight will be sent asking whether UA is an OK header, but *using* the custom UA that the user already set. Which is daft.
I don't know if we can fix it nicely, as I think we avoided setting UA onto both sets of headers in order to avoid merging issues with them later (header lists have no guaranteed merging behaviour, I think?). So we might have to add a special UA hiding spot to the `network::ResourceRequest` too, which is not great.
Can my comment above be an answer to this problem?
`// LINT.ThenChange(//tools/metrics/histograms/metadata/network/enums.xml:PrefetchMatchesResourceRequestField)`
Please fix this ERROR reported by If This Then That: Changes in the preceding block may need to be reflected in these files: /tools/metrics/histograms/metadata/network/enums.xml If this does not apply, add 'NO_IFTTT=some reason...' to your commit message
Analyzer Description: Reminders to change files when other files change
Owner: ayeay...@google.com
Changes in the preceding block may need to be reflected in these files: /tools/metrics/histograms/metadata/network/enums.xml If this does not apply, add 'NO_IFTTT=some reason...' to your commit message
To rerun the analyzer locally, run: `alint -- -c IfThisThenAnalyzer`
For more information, see go/alint.
`// Browser-intrinsic User-Agent string, to be used for preflight requests`
This might not be needed if my comment on the preflight_controller.cc works.
cc: tnak as this is a User-Agent related CL.
I tried refreshing my memory, but I'm sorry if I forget the previous discussion and say something inconsistent.
That's totally ok, this CL is so old & complex now I keep forgetting bits too.
Apologies, the next patchset will take some time for me to upload as `git-credential-luci` has logged me out and is now forcing me to update my system to glibc 2.38. I've been meaning to switch distros for a while now so I guess this is as good an excuse as any.
I think this is an ID for the old bug tracker, and now 40450316 is the right ID for this. Can you check that https://crbug.com/40450316 is the right link for your intended bug, and update this number?
----
Please fix this WARNING reported by Commit Message Validation: Did you intend to reference an old bug id? Please ensure this is a valid Buganizer ID and not an external system such as Monorail.
Analyzer Description: Validates that commit messages conform to a specific format.
Owner: ayeay...@google.com
Did you intend to reference an old bug id? Please ensure this is a valid Buganizer ID and not an external system such as Monorail.
To rerun the analyzer locally, run: `alint -- -c CommitMatch`
For more information, see go/alint.
Done :)
`function checkUserAgent(headers) {`
Takashi Toyoshima: This is the root cause of a couple remaining failures:
- ExtensionWebRequestApiPrerenderingTest.Load
- ExtensionWebRequestApiPrerenderingTest.LoadIntoNewTab
They're using User-Agent as a litmus test for whether the headers on a request are valid, which is fine, but User-Agent is not going to be present on a DevTools "requestWillBeSent" event anymore. So we need a new way to do this.
Does this need help from qinmin@ ?
IIUC, this means the webRequest API exposed header is changed, and we need to update the API document, and need an announcement for Extensions developers and enterprise users?
Hmm. I think at the moment, yes, but that should actually be fixable.
My first idea is to ensure that the extensions stack combines the `cors_exempt_headers` into any headers returned to the user. I don't know if this is acceptable from a duplication-of-code-behaviour standpoint -- is it ok to have a bunch of places scattered around the code that pseudo-implement the network process's header merging behaviour?
The thing is, I'm uncovering a lot of behaviour that's always been "wrong", but has always been "good enough" -- lots of places where the code only considers `network::ResourceRequest`'s |headers| without combining the |cors_exempt_headers|.
Until now, |cors_exempt_headers| were niche, so it didn't matter, but now, User-Agent is on there... and lots of things care about User-Agent.
I'll try to "fix" the tests by reintroducing the User-Agent to the webRequest API first, and then get qinmin@ 's thoughts on implementation detail.
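One way to avoid scattering pseudo-implementations of that merge would be a single shared helper that every observer-facing path calls. A sketch with plain maps (helper name and types are mine, not existing Chromium API):

```cpp
#include <cassert>
#include <map>
#include <string>

using Headers = std::map<std::string, std::string>;

// Reproduces the network service's merge rule discussed here:
// cors_exempt_headers override same-named entries in |headers|.
// Observer-facing code (e.g. webRequest plumbing) would call this
// instead of re-implementing the merge locally.
Headers MergeForObservers(const Headers& headers,
                          const Headers& cors_exempt_headers) {
  Headers merged = headers;
  for (const auto& [name, value] : cors_exempt_headers) {
    merged[name] = value;
  }
  return merged;
}
```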
if (user_agent) {
preflight_request->cors_exempt_headers.SetHeader(
net::HttpRequestHeaders::kUserAgent, *user_agent);
}
This is a duplicated operation as we run the same code at line 220 eventually.
Can you remove?
Yeah, I can condense that down. Good spot.
`// we can't use cors_exempt_headers, so the User-Agent will appear here`
Maybe a typo? I think you mean the User-Agent in the normal headers here.
So, s/cors_exempt_headers/it as it is not trustworthy/?
Also expected behavior here is:
1. if there is a DevTools provided User-Agent (in cors_exempt_headers), we use it
2. if there is a JavaScript provided User-Agent (in headers), we ignore it, and use the context given User-Agent.
Implementation-wise, this can be done simply by
```
std::optional<std::string> user_agent = request.cors_exempt_headers.GetHeader(
net::HttpRequestHeaders::kUserAgent);
if (user_agent) {
// Set DevTools given User-Agent to the preflight request
preflight_request->headers.SetHeader(
net::HttpRequestHeaders::kUserAgent, *user_agent);
} else {
// Remove JavaScript given User-Agent if it exists, and delegate to //net to set
// a context given User-Agent
preflight_request->headers.RemoveHeader(
net::HttpRequestHeaders::kUserAgent);
}
```
I think my comments here aren't quite clear about the obscure boxes that we have to tick. In this case, the problem is when we have *both* a UA from fetch and a UA from DevTools.
There are 4(+2) cases:
(The problem is case 4, and *not* either of the bonus cases - i.e. the problem only happens when a DevTools UA *and* a fetch UA are both present.)
The reason this is an issue is because in `request_conversion.cc`, there's already a fetch UA on the `headers`. We can't put the devtools UA on the `cors_exempt_headers`, because that overrides the fetch UA on the normal `headers`.
If we don't do anything extra (like the code you suggest above) then it's fine, but in case 4, what happens is blink simply doesn't have the ability to send through the DevTools-overridden User-Agent, so the preflight is made with the original, browser-intrinsic string.
My current (local) code correctly handles all of these cases, but needs to use the `context_user_agent_for_preflights` as a bonus field to hold the DevTools UA when a fetch UA is already on the normal |headers|. I think the name is not particularly clear, and will clean up comments & naming in the next patchset.
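A sketch of the routing for the four main cases (struct and function names are mine; `context_ua` plays the role of the extra field on `network::ResourceRequest` described above):

```cpp
#include <cassert>
#include <optional>
#include <string>

// Where each UA source lands on a network::ResourceRequest-like struct.
struct RequestSketch {
  std::optional<std::string> headers_ua;      // normal |headers|
  std::optional<std::string> cors_exempt_ua;  // |cors_exempt_headers|
  std::optional<std::string> context_ua;      // the extra "hiding" field
};

RequestSketch RouteUserAgent(const std::optional<std::string>& fetch_ua,
                             const std::optional<std::string>& devtools_ua) {
  RequestSketch out;
  out.headers_ua = fetch_ua;  // a JS-set UA stays on the normal headers
  if (devtools_ua && fetch_ua) {
    // Case 4: putting the DevTools UA on cors_exempt_headers would
    // clobber the fetch UA at merge time, so it rides in the extra field.
    out.context_ua = devtools_ua;
  } else if (devtools_ua) {
    out.cors_exempt_ua = devtools_ua;  // nothing to clobber
  }
  return out;
}
```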
`// LINT.ThenChange(//tools/metrics/histograms/metadata/network/enums.xml:PrefetchMatchesResourceRequestField)`
Please fix this ERROR reported by If This Then That: Changes in the preceding block may need to be reflected in these files: /tools/metrics/histograms/metadata/network/enums.xml If this does not apply, add 'NO_IFTTT=some reason...' to your commit message
Analyzer Description: Reminders to change files when other files change
Owner: ayeay...@google.com
Changes in the preceding block may need to be reflected in these files: /tools/metrics/histograms/metadata/network/enums.xml If this does not apply, add 'NO_IFTTT=some reason...' to your commit message
To rerun the analyzer locally, run: `alint -- -c IfThisThenAnalyzer`
For more information, see go/alint.
Ah, OK. I think you are talking about the MergeFrom() operations at ConfigureUrRequest in //services/network/url_loader_util.cc or URLLoader::FollowRedirect in //services/network/url_loader.cc, where we override the `headers` content with `cors_exempt_headers`.
So, in that case, how about making the User-Agent handling more explicit?
`// LINT.ThenChange(//tools/metrics/histograms/metadata/network/enums.xml:PrefetchMatchesResourceRequestField)`
Andrew Brown: Please fix this ERROR reported by If This Then That: Changes in the preceding block may need to be reflected in these files: /tools/metrics/histograms/metadata/network/enums.xml If this does not apply, add 'NO_IFTTT=some reason...' to your commit message
Analyzer Description: Reminders to change files when other files change
Owner: ayeay...@google.com
Changes in the preceding block may need to be reflected in these files: /tools/metrics/histograms/metadata/network/enums.xml If this does not apply, add 'NO_IFTTT=some reason...' to your commit message
To rerun the analyzer locally, run: `alint -- -c IfThisThenAnalyzer`
For more information, see go/alint.
Done.
Not yet?
We also need to update //tools/metrics/histograms/metadata/network/enums.xml, as the LINT error suggested.
`std::optional<std::string> context_user_agent_for_preflights = std::nullopt;`
I prefer to drop the "_for_preflights" suffix here, as this information itself isn't restricted to preflight usage.
Also, I think this doesn't work over IPC yet?
As ResourceRequest is mapped to mojom::URLRequest, we also need to update url_request.mojom and url_request_mojom_traits.*.
You can check services/network/public/mojom/BUILD.gn for the typemap for this class.
`virtual String GetDefaultUserAgent() const { NOTREACHED(); }`
Can you drop the default implementation instead of adding NOTREACHED?
We prefer a compile-time check to a runtime check for this.
`// The browser-supplied User-Agent string (comes from NetworkContext)`
Do you mean FetchContext?
dest->cors_exempt_headers.SetHeader(net::HttpRequestHeaders::kUserAgent,
context_default_user_agent.Utf8());
So, can we always set it to the context_user_agent_for_preflights?
Basically everything except a handful of Extension tests (discussed soon) should pass now -- @ Adam, CQ dry run please?
That said, I have a few outstanding questions that may need broader conversations to resolve. I don't know if these are answerable with the reviewers I already have, or whether this is time for a chromestatus.com entry and/or emails to other mailing lists for different areas.
- comments in `request_conversion.cc` are wondering how best to get the context UA over to the network process for OPTIONS requests to use
- comments in `framework.js` are wondering what to do about any tests that rely/relied on DevTools for stuff
- comments in `frame_fetch_context_test.cc` are wondering whether the new version of `PrepareRequestWhenDetached` is even remotely useful or sensible
Importantly, though:
- lots of tests have changed, especially those that relied on DevTools to see UAs in a `requestWillBeSent` event. How many new tests or modifications to old tests, are needed to make sure the UA is still present on a `requestWillBeSentExtraInfo`?
- followup, which cases are sufficiently distinct that they're worth different tests for this? I've edited tests that involve redirects, cross-site redirects, etc etc -- can we assume it's all good with one more test for "UA on ExtraInfo" or do we need "UA on ExtraInfo for redirects", "UA on ExtraInfo for preflights", etc?
Done
That sounds like a good idea to me -- implemented. A CQ run will probably identify a ton of places I've missed that need the new UA merged, but que sera.
`net::HttpRequestHeaders::kUserAgent, *user_agent);`
Takashi Toyoshima: I think this code has a problem, which is that if the user has set a UA, we don't bother setting the context UA onto the cors_exempt_headers (see below comment in `request_conversion.cc`). So if there's a user UA, a preflight will be sent asking whether UA is an OK header, but *using* the custom UA that the user already set. Which is daft.
I don't know if we can fix it nicely, as I think we avoided setting UA onto both sets of headers in order to avoid merging issues with them later (header lists have no guaranteed merging behaviour, I think?). So we might have to add a special UA hiding spot to the `network::ResourceRequest` too, which is not great.
Can my comment above be an answer to this problem?
Done via content_user_agent.
`// LINT.ThenChange(//tools/metrics/histograms/metadata/network/enums.xml:PrefetchMatchesResourceRequestField)`
Andrew Brown: Please fix this ERROR reported by If This Then That: Changes in the preceding block may need to be reflected in these files: /tools/metrics/histograms/metadata/network/enums.xml If this does not apply, add 'NO_IFTTT=some reason...' to your commit message
Analyzer Description: Reminders to change files when other files change
Owner: ayeay...@google.com
Changes in the preceding block may need to be reflected in these files: /tools/metrics/histograms/metadata/network/enums.xml If this does not apply, add 'NO_IFTTT=some reason...' to your commit message
To rerun the analyzer locally, run: `alint -- -c IfThisThenAnalyzer`
For more information, see go/alint.
Takashi Toyoshima: Done.
Not yet?
We also need to update //tools/metrics/histograms/metadata/network/enums.xml, as the LINT error suggested.
Sorry, yes, I had done it locally, but couldn't upload (glibc issues, now resolved).
`std::optional<std::string> context_user_agent_for_preflights = std::nullopt;`
I prefer to drop the "_for_preflights" suffix here, as this information itself isn't restricted to preflight usage.
Also, I think this doesn't work over IPC yet?
As ResourceRequest is mapped to mojom::URLRequest, we also need to update url_request.mojom and url_request_mojom_traits.*.
You can check services/network/public/mojom/BUILD.gn for the typemap for this class.
Yeah, I discovered that locally too. It should be fully implemented now.
`// Browser-intrinsic User-Agent string, to be used for preflight requests`
Andrew Brown: This might not be needed if my comment on the preflight_controller.cc works.
Done
`// already set a User-Agent via fetch.`
Is the terminology in this comment sensible?
`headers, std::make_optional(request.GetContextUserAgent()))`
Alex Rudenko: I don't think CORS preflights go through this function, as CORS preflights still appear to be UA-less in DevTools (the actual request still has the header; just the inspector's `Network.requestWillBeSent` event is wrong.)
But also, this behaviour isn't tested ATM (I think?) - the existing tests don't appear to be checking the inspector's view of the preflight, just the network's view of it (in `http/tests/inspector-protocol/emulation/user-agent-override-cors-preflight.js`)
So, outstanding questions (the last of which I can answer myself, given time):
- Should there be a test for this behaviour (sending the right UA on a preflight, and making sure Devtools sees it too)?
- Where do preflights spawn into the inspector code? Preflights are spawned in the network process, as best I can tell (code lives in services/network, not third_party/blink), so I don't know how that information gets sent back to the render process's inspector stuff.
Andrew Brown: I do not think we need to add extra headers in this event. We have requestWillBeSentExtraInfo that should report actual headers set by the network service. Could you please clarify why requestWillBeSentExtraInfo is not sufficient here?
Alex Rudenko: I'll be honest, I am completely unfamiliar with DevTools and didn't know that existed; everything I've done has been prompted by a few failing tests. My impression (and these changes) aim to solve the fact that the new, CORS-free "hiding" spot for the UA also needs to be given to the inspector.
Looking at requestWillBeSentExtraInfo, that seems like exactly what we want here, but it will require some tests to be updated - there are at least 8 tests failing in `web_tests/http/tests/inspector_protocol` that all expect a User-Agent to be present on the original `requestWillBeSent` event (which it won't be, anymore).
I could update those 8 tests to check requestWillBeSentExtraInfo, and then ensure the User-Agent is present there?
Alex Rudenko: I think it would be better to not special-case the header when reporting from the renderer, update the tests so they no longer expect the header there, and make sure via a new test that ExtraInfo events from the network service actually contain the correct User-Agent header. I would suggest the network reviewers review the non-DevTools changes first to make sure those are correct.
Andrew Brown: I was reviewing it again, and the user agent actually is set by the client code, right? I wonder why it is not present in `request.HttpHeaderFields()`? It sounds like it should contain it if it is provided by the client.
I'm not sure if I'm interpreting this right, but I'll attempt to explain; let me know if this helps. (My terminology is crappy, sorry; it's hard to overcome my non-chromium intuition of the definitions of "browser" and "client".)
Firstly, you're right, the user-agent is set by JavaScript code onto the normal `request.HttpHeaderFields()`. But any User-Agent value that exists in these headers will cause a CORS OPTIONS request. (The reason it doesn't do this in real Chrome right now is because of the list in `/net/http/http_util.cc`, but that's like, the actual spec bug we're fixing.)
If we put our internal value (the one that looks like `Mozilla/5.0 Chrome 140.0` etc etc) onto those headers, we'll end up triggering a CORS preflight request on every single web request we ever make.
So we have to put our internal value somewhere else, so that it can be all combined together later. That somewhere else is (for the moment) the `request.GetContextUserAgent()` (named because the internal value comes from the `FetchContext`).
Unfortunately, anywhere that we can put the UA that hides it from CORS, also hides it from a lot of other things (including DevTools).
The alternative approach was to send the context value straight to the network process later on (patchset 19), but I thought I had to make sure it was accessible through a blink::ResourceRequest somewhere in order for DevTools to see it... *but* now I've learned that it's ok for DevTools to just get it in requestWillBeSentExtraInfo later on. So I think I'm going back to the patchset 19 approach.
Done
`virtual String GetDefaultUserAgent() const { NOTREACHED(); }`
Can you drop the default implementation instead of adding NOTREACHED?
We prefer a compile-time check to a runtime check for this.
I don't think we can, unfortunately -- e.g. `GetFeatureContext()` or `GetPermissionsPolicy()` just below. I think there has to be a default implementation.
I'm not 100% confident on C++ internals, but I think the issue is that `fetch_context.h/cc` actually gets compiled into some real code at some point, so if you don't have a default implementation, the linker errors out.
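For what it's worth, the trade-off can be demonstrated in isolation. In this sketch (class names are hypothetical, and a throw stands in for NOTREACHED() so it stays runnable), a defaulted virtual lets subclasses that never need a UA compile without an override, whereas a pure virtual would force every subclass, including test fakes, to implement it:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// FetchContext-like base. The default implementation is a runtime trap
// (stand-in for NOTREACHED()), not a compile-time requirement.
struct FetchContextBase {
  virtual ~FetchContextBase() = default;
  virtual std::string GetDefaultUserAgent() const {
    throw std::logic_error("NOTREACHED");
  }
};

// Compiles fine without an override; this would be a compile error if
// GetDefaultUserAgent were pure virtual.
struct NullFetchContext : FetchContextBase {};

struct FrameFetchContextSketch : FetchContextBase {
  std::string GetDefaultUserAgent() const override {
    return "Mozilla/5.0 (sketch)";
  }
};
```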
`// The browser-supplied User-Agent string (comes from NetworkContext)`
Andrew Brown: Do you mean FetchContext?
Actually, this code isn't needed anymore -- removed entirely.
Andrew Brown: See above comment in `preflight_controller.cc` - it's both these pieces of code together.
The key thing is at this point, we can't put the context UA onto the cors_exempt_headers without potentially accepting dodgy merge behaviour later when they get combined. So we have a `network::ResourceRequest` that has a user UA on the normal headers and *no* UA on the cors_exempt headers, which then goes to `preflight_controller`, which uses the user UA.
I guess we could work out where the merge happens and make sure we remove the context UA from the `cors_exempt_headers` *if* the user UA is set? But then we need to make sure preflights avoid copying the user UA, too.
So the state of affairs is thus:
- If the user provides a UA, it sits on the normal headers. In `request_conversion.cc`, we *can't* put the context UA onto the cors_exempt_headers, because that will actually clobber the user-provided one.
- If the user doesn't provide a UA, then this code does exactly what it says; there are no issues here about different handling of a DevTools-set value vs an intrinsic browser value.
But then, later (in `preflight_controller.cc`):
- If the user didn't provide a UA, then the context UA is present on the cors_exempt_headers, so we can use that and we're good.
- But if the user did provide a UA, it's present on the normal headers, and so there is no trace of the context UA. This is a big problem, because it means the only UA we have access to for our OPTIONS request is the user one, resulting in silly behaviour.
I can think of a few ways to resolve this. Thoughts from you as reviewers would be appreciated but also this is getting to be touching stuff outside blink, so if there's another mailing list I should go contact, I'm happy to do that too.
We could:
- induce magic behaviour for a User-Agent at the point of merging the two sets of headers (only take the cors_exempt_headers version if the normal isn't there)
- provide a different place for the context UA to live inside a `network::ResourceRequest` (so that it doesn't get auto-merged from the cors_exempt_headers)
- we could set this all the time, and never use the cors_exempt_headers, or
- we could set this only when the user also sets one, so the special-case to look for the new header only occurs in `preflight_controller.cc`
I'm going to do a dummy impl of the last option when I next have time, but I'm expecting it will need changes, so hopefully with your thoughts I can get ahead of the curve on a better approach.
Done
dest->cors_exempt_headers.SetHeader(net::HttpRequestHeaders::kUserAgent,
context_default_user_agent.Utf8());
So, can we always set it to the context_user_agent_for_preflights?
Yeah, we can now just always put the context version on the content_user_agent; the actual value from the |headers| doesn't matter at this point.
Apologies, the next patchset will take some time for me to upload as `git-credential-luci` has logged me out and is now forcing me to update my system to glibc 2.38. I've been meaning to switch distros for a while now so I guess this is as good an excuse as any.
Done.
Quick fixes, should pass everything except `external/wpt/xhr/preserve-ua-header-on-redirect.htm` -- I'll look into that now.
Not quite, actually -- the extension issues are a bit more formidable than I anticipated (and I've also been suffering through openSUSE build incompatibilities). Still working on it, though I might end up needing to ask for an Extensions expert opinion sooner than I thought.
Adding dtapuska@ to review for `content/public/test/url_loader_interceptor.cc`
Hopefully ok, but interested to hear thoughts on performing header merges in different locations.
Going to defer to Alex here, since it is something related to CORS.
Hi @rdevlin...@chromium.org, could you take a look at this one? As described in [1], it sounds like this could impact whether the `User-Agent` header is exposed in the webRequest API.
Hi @rdevlin...@chromium.org, could you take a look at this one? As described in [1], it sounds like this could impact whether the `User-Agent` header is exposed in the webRequest API.
Thanks for tagging me in, Oliver! I agree that this is concerning for extensions, since it's a potentially large behavior change.
Would it be possible to preserve the existing behavior? The comment [here](https://chromium-review.googlesource.com/c/chromium/src/+/5273743/33/chrome/test/data/extensions/api_test/webrequest/framework.js) implies it is, with a bit of refactoring?
Devlin Cronin: Hi @rdevlin...@chromium.org, could you take a look at this one? As described in [1], it sounds like this could impact whether the `User-Agent` header is exposed in the webRequest API.
Thanks for tagging me in, Oliver! I agree that this is concerning for extensions, since it's a potentially large behavior change.
Would it be possible to preserve the existing behavior? The comment [here](https://chromium-review.googlesource.com/c/chromium/src/+/5273743/33/chrome/test/data/extensions/api_test/webrequest/framework.js) implies it is, with a bit of refactoring?
I wasn't sure when I posted to that mailing list, as the standard pattern for header merging isn't compatible with `OnBeforeChangeHeaders` taking a pointer to the headers, but upon further look, it appears possible to preserve the existing behaviour by just merging the new `content_user_agent` in the lead up to extensions triggering the `OnBeforeSendHeaders` event.
This feels suspect, as it appears the headers sent to the event get re-placed back onto the `network::ResourceRequest`, so I don't know if there are other implications of merging the UA back in at this point, but I'm unsure -- thoughts welcome.
Something in one of the rebase(s) is now causing a dangling ref when the browser closes :( so the tests are all failing. (Technically I can still run them and see if they pass the "test" section of the test, but they all error out immediately afterwards.) Got my weekend plans cut out for me.
Adding dtapuska@ to review for `content/public/test/url_loader_interceptor.cc`
Hopefully ok, but interested to hear thoughts on performing header merges in different locations.
I think this is a better question for network service and loading experts (some of whom are already reviewing this CL). That change looks good mechanically (I did wonder if it's beneficial at all to do the merging in a shared helper, so that this doesn't get out of sync if other headers are added in the future, but not sure how many other places actually end up doing this), but I don't really know much about the code involved here to review this deeper, so I'll have to defer to other reviewers. Happy to approve content/ once everybody's happy with the CL and outstanding questions are resolved.
Devlin Cronin
Hi @rdevlin...@chromium.org, could you take a look at this one? As described in [1], it sounds like this could impact whether the `User-Agent` header is exposed in the webRequest API.
Andrew Brown
Thanks for tagging me in, Oliver! I agree that this is concerning for extensions, since it's a potentially large behavior change.
Would it be possible to preserve the existing behavior? The comment [here](https://chromium-review.googlesource.com/c/chromium/src/+/5273743/33/chrome/test/data/extensions/api_test/webrequest/framework.js) implies it is, with a bit of refactoring?
I wasn't sure when I posted to that mailing list, as the standard pattern for header merging isn't compatible with `OnBeforeChangeHeaders` taking a pointer to the headers, but upon further look, it appears possible to preserve the existing behaviour by just merging the new `content_user_agent` in the lead up to extensions triggering the `OnBeforeSendHeaders` event.
This feels suspect, as it appears the headers sent to the event get re-placed back onto the `network::ResourceRequest`, so I don't know if there are other implications of merging the UA back in at this point, but I'm unsure -- thoughts welcome.
I think that's probably mostly a question for net experts -- maybe @ri...@chromium.org has thoughts?
Devlin Cronin
Hi @rdevlin...@chromium.org, could you take a look at this one? As described in [1], it sounds like this could impact whether the `User-Agent` header is exposed in the webRequest API.
Andrew Brown
Thanks for tagging me in, Oliver! I agree that this is concerning for extensions, since it's a potentially large behavior change.
Would it be possible to preserve the existing behavior? The comment [here](https://chromium-review.googlesource.com/c/chromium/src/+/5273743/33/chrome/test/data/extensions/api_test/webrequest/framework.js) implies it is, with a bit of refactoring?
Devlin Cronin
I wasn't sure when I posted to that mailing list, as the standard pattern for header merging isn't compatible with `OnBeforeChangeHeaders` taking a pointer to the headers, but upon further look, it appears possible to preserve the existing behaviour by just merging the new `content_user_agent` in the lead up to extensions triggering the `OnBeforeSendHeaders` event.
This feels suspect, as it appears the headers sent to the event get re-placed back onto the `network::ResourceRequest`, so I don't know if there are other implications of merging the UA back in at this point, but I'm unsure -- thoughts welcome.
I think that's probably mostly a question for net experts -- maybe @ri...@chromium.org has thoughts?
As far as I can tell from reading the code, the `request_` only changes the view of the world that extensions have and isn't passed back to the network stack, so changing it should be safe.
Alex Moshchuk
Adding dtapuska@ to review for `content/public/test/url_loader_interceptor.cc`
Hopefully ok, but interested to hear thoughts on performing header merges in different locations.
I think this is a better question for network service and loading experts (some of whom are already reviewing this CL). That change looks good mechanically (I did wonder if it's beneficial at all to do the merging in a shared helper, so that this doesn't get out of sync if other headers are added in the future, but not sure how many other places actually end up doing this), but I don't really know much about the code involved here to review this deeper, so I'll have to defer to other reviewers. Happy to approve content/ once everybody's happy with the CL and outstanding questions are resolved.
Excellent -- thanks for the assist.
Devlin Cronin
Hi @rdevlin...@chromium.org, could you take a look at this one? As described in [1], it sounds like this could impact whether the `User-Agent` header is exposed in the webRequest API.
Andrew Brown
Thanks for tagging me in, Oliver! I agree that this is concerning for extensions, since it's a potentially large behavior change.
Would it be possible to preserve the existing behavior? The comment [here](https://chromium-review.googlesource.com/c/chromium/src/+/5273743/33/chrome/test/data/extensions/api_test/webrequest/framework.js) implies it is, with a bit of refactoring?
Devlin Cronin
I wasn't sure when I posted to that mailing list, as the standard pattern for header merging isn't compatible with `OnBeforeChangeHeaders` taking a pointer to the headers, but upon further look, it appears possible to preserve the existing behaviour by just merging the new `content_user_agent` in the lead up to extensions triggering the `OnBeforeSendHeaders` event.
This feels suspect, as it appears the headers sent to the event get re-placed back onto the `network::ResourceRequest`, so I don't know if there are other implications of merging the UA back in at this point, but I'm unsure -- thoughts welcome.
Adam Rice
I think that's probably mostly a question for net experts -- maybe @ri...@chromium.org has thoughts?
As far as I can tell from reading the code, the `request_` only changes the view of the world that extensions have and isn't passed back to the network stack, so changing it should be safe.
Huh, that's excellent. I don't think I would have gotten that aspect, thanks for working that out!
Something in one of the rebase(s) is now causing a dangling ref when the browser closes :( so the tests are all failing. (Technically I can still run them and see if they pass the "test" section of the test, but they all error out immediately afterwards.) Got my weekend plans cut out for me.
@toyo...@chromium.org Could you put this through the bots/dry run again?
I'm still unable to test after a painful week of git tree corruption & completely impenetrable dangling pointer issues. Basically all I managed was to confirm that the dangling pointer occurs on `main` for me too, so it's clearly an issue with my build environment, not the actual contents of the CL.
If my understanding is correct, the only remaining failure should be `external/wpt/xhr/preserve-ua-header-on-redirect.htm`.
`function checkUserAgent(headers) {`
Takashi Toyoshima
This is the root cause of a couple remaining failures:
- ExtensionWebRequestApiPrerenderingTest.Load
- ExtensionWebRequestApiPrerenderingTest.LoadIntoNewTab
They're using User-Agent as a litmus test for whether the headers on a request are valid, which is fine, but User-Agent is not going to be present on a DevTools "requestWillBeSent" event anymore. So we need a new way to do this.
Andrew Brown
Does this need help from qinmin@?
IIUC, this means the headers exposed by the webRequest API change, so we need to update the API documentation and make an announcement for Extensions developers and enterprise users?
Hmm. I think at the moment, yes, but that should actually be fixable.
My first idea is to ensure that the extensions stack combines the `cors_exempt_headers` into any headers returned to the user. I don't know if this is acceptable from a duplication-of-code-behaviour standpoint -- is it ok to have a bunch of places scattered around the code that pseudo-implement the network process's header merging behaviour?
The thing is, I'm uncovering a lot of behaviour that's always been "wrong", but has always been "good enough" -- lots of places where the code only considers `network::ResourceRequest`'s |headers| without combining the |cors_exempt_headers|.
Until now, |cors_exempt_headers| were niche, so it didn't matter, but now, User-Agent is on there... and lots of things care about User-Agent.
I'll try to "fix" the tests by reintroducing the User-Agent to the webRequest API first, and then get qinmin@ 's thoughts on implementation detail.
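To make the failure mode concrete, here is a minimal sketch with stand-in types (a plain map instead of the real `network::ResourceRequest` / `net::HttpRequestHeaders`, and a hypothetical `CombinedHeaders` helper): any consumer that reads only `headers` now misses the User-Agent carried in `cors_exempt_headers`, unless it builds a combined view along these lines.

```cpp
#include <map>
#include <string>

// Stand-ins for illustration only; the real classes live in
// services/network/public/cpp and net/http.
using Headers = std::map<std::string, std::string>;

struct ResourceRequest {
  Headers headers;
  Headers cors_exempt_headers;
};

// Hypothetical helper: the combined view a consumer should use. A reader
// that looks only at `request.headers` no longer sees the User-Agent once
// it moves into `cors_exempt_headers`.
Headers CombinedHeaders(const ResourceRequest& request) {
  Headers combined = request.headers;
  for (const auto& [name, value] : request.cors_exempt_headers) {
    combined.emplace(name, value);  // Keep an explicit `headers` entry if set.
  }
  return combined;
}
```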
| Code-Review | +1 |
Thanks, Andrew! Extensions LGTM % nit, since there's now no behavior change.
`request_.headers.MergeAndAddUA(request_.cors_exempt_headers,`
nit: maybe add a brief comment about why we need to do this, since there's been a good amount of exploration / discussion?
Thanks, Andrew! Extensions LGTM % nit, since there's now no behavior change.
Unfortunately the current test results do make this seem like it's not yet fixed.
The few failures I've actually drilled into are *all* requests that have gone through the extensions stack, now needing a specific `User-Agent` CORS preflight response.
This means the User-Agent getting added onto the headers in the `OnBeforeSendHeaders` event is actually affecting the real network request somehow, causing it to spot a non-CORS-exempt header on the headers and send a preflight (which the tests don't correctly respond to).
So all requests that go through the Extensions stack are generating a CORS preflight for the User-Agent header, which most things won't expect and respond to. We need to find an alternative approach; merging here isn't suitable.
My first guess is that the associated callback function might be our solution, if it lets me pass the `content_user_agent` back up to the front-end consumer for merging outside of all of this business? I don't know if that's feasible/reasonable at all though, considering that goes back into the black hole of mojom; it looks like it would need a lot of modifications all over the place in unrelated contexts.
Other than that... I'm not sure.
`request_.headers.MergeAndAddUA(request_.cors_exempt_headers,`
nit: maybe add a brief comment about why we need to do this, since there's been a good amount of exploration / discussion?
Sorry, I missed this reply.
`virtual String GetDefaultUserAgent() const { NOTREACHED(); }`
Andrew Brown
Can you drop the default implementation instead of adding NOTREACHED?
We will prefer the compile time check rather than runtime check for this.
I don't think we can, unfortunately -- e.g. `GetFeatureContext()` or `GetPermissionsPolicy()` just below. I think there has to be a default implementation.
I'm not 100% confident on C++ internals, but I think the issue is that `fetch_context.h/cc` actually gets compiled into some real code at some point, so if you don't have a default implementation, the linker errors out.
Ah, you are right. Sorry, I forgot the fact that this class could be directly used too.
Andrew Brown
Thanks, Andrew! Extensions LGTM % nit, since there's now no behavior change.
Unfortunately the current test results do make this seem like it's not yet fixed.
The few failures I've actually drilled into are *all* requests that have gone through the extensions stack, now needing a specific `User-Agent` CORS preflight response.
This means the User-Agent getting added onto the headers in the `OnBeforeSendHeaders` event is actually affecting the real network request somehow, causing it to spot a non-CORS-exempt header on the headers and send a preflight (which the tests don't correctly respond to).
So all requests that go through the Extensions stack are generating a CORS preflight for the User-Agent header, which most things won't expect and respond to. We need to find an alternative approach; merging here isn't suitable.
My first guess is that the associated callback function might be our solution, if it lets me pass the `content_user_agent` back up to the front-end consumer for merging outside of all of this business? I don't know if that's feasible/reasonable at all though, considering that goes back into the black hole of mojom; it looks like it would need a lot of modifications all over the place in unrelated contexts.
Other than that... I'm not sure.
Thanks for the continued investigation!
Unfortunately, I'm not sure of what the right solution here is, either. Since this is mostly touching the net stack, I'd defer to one of those owners. @ri...@chromium.org?
Andrew Brown
Something in one of the rebase(s) is now causing a dangling ref when the browser closes :( so the tests are all failing. (Technically I can still run them and see if they pass the "test" section of the test, but they all error out immediately afterwards.) Got my weekend plans cut out for me.
@toyo...@chromium.org Could you put this through the bots/dry run again?
I'm still unable to test after a painful week of git tree corruption & completely impenetrable dangling pointer issues. Basically all I managed was to confirm that the dangling pointer occurs on `main` for me too, so it's clearly an issue with my build environment, not the actual contents of the CL.
If my understanding is correct, the only remaining failure should be `external/wpt/xhr/preserve-ua-header-on-redirect.htm`.
Done
Andrew Brown
Thanks, Andrew! Extensions LGTM % nit, since there's now no behavior change.
Devlin Cronin
Unfortunately the current test results do make this seem like it's not yet fixed.
The few failures I've actually drilled into are *all* requests that have gone through the extensions stack, now needing a specific `User-Agent` CORS preflight response.
This means the User-Agent getting added onto the headers in the `OnBeforeSendHeaders` event is actually affecting the real network request somehow, causing it to spot a non-CORS-exempt header on the headers and send a preflight (which the tests don't correctly respond to).
So all requests that go through the Extensions stack are generating a CORS preflight for the User-Agent header, which most things won't expect and respond to. We need to find an alternative approach; merging here isn't suitable.
My first guess is that the associated callback function might be our solution, if it lets me pass the `content_user_agent` back up to the front-end consumer for merging outside of all of this business? I don't know if that's feasible/reasonable at all though, considering that goes back into the black hole of mojom; it looks like it would need a lot of modifications all over the place in unrelated contexts.
Other than that... I'm not sure.
Thanks for the continued investigation!
Unfortunately, I'm not sure of what the right solution here is, either. Since this is mostly touching the net stack, I'd defer to one of those owners. @ri...@chromium.org?
I don't know if this is taboo to ping a second person for the same stuff, but I'll also add @cdu...@chromium.org for thoughts -- I was snooping the `OWNERS` of nearby code and he sounds like he might have expertise in this area (specifically the overlap between extensions & networking).
(Relevant file is `extensions/browser/api/web_request/web_request_proxying_url_loader_factory.cc` w/r/t "How do we get the `content_user_agent` into Extensions before `OnBeforeSendHeaders` fires, without merging it into the request's normal headers?")
I've temporarily commented out the mystery-tbd-Extensions-merging stuff that we're still working out -- could someone put this CL through a dryrun so I can check the redirect implementation is all working? I've made changes that fixed the last dubious test locally; just want to check it hasn't broken anything else in the process.
It should pass everything *but* the two Extensions tests.
`request_.headers.MergeAndAddUA(request_.cors_exempt_headers,`
Andrew Brown
nit: maybe add a brief comment about why we need to do this, since there's been a good amount of exploration / discussion?
Can do, but TBD.
I'll resolve this comment for now, but I'll make sure there's a good few comments around whatever the final implementation ends up being.
I'll just leave this one unresolved because it has the nicest summary of the whole setup we've got going here.
Andrew Brown
Thanks, Andrew! Extensions LGTM % nit, since there's now no behavior change.
Devlin Cronin
Unfortunately the current test results do make this seem like it's not yet fixed.
The few failures I've actually drilled into are *all* requests that have gone through the extensions stack, now needing a specific `User-Agent` CORS preflight response.
This means the User-Agent getting added onto the headers in the `OnBeforeSendHeaders` event is actually affecting the real network request somehow, causing it to spot a non-CORS-exempt header on the headers and send a preflight (which the tests don't correctly respond to).
So all requests that go through the Extensions stack are generating a CORS preflight for the User-Agent header, which most things won't expect and respond to. We need to find an alternative approach; merging here isn't suitable.
My first guess is that the associated callback function might be our solution, if it lets me pass the `content_user_agent` back up to the front-end consumer for merging outside of all of this business? I don't know if that's feasible/reasonable at all though, considering that goes back into the black hole of mojom; it looks like it would need a lot of modifications all over the place in unrelated contexts.
Other than that... I'm not sure.
Thanks for the continued investigation!
Unfortunately, I'm not sure of what the right solution here is, either. Since this is mostly touching the net stack, I'd defer to one of those owners. @ri...@chromium.org?
I misunderstood the code when I looked at it before. This is happening before we call `CreateLoaderAndStart()`, so any modifications to `request_` will affect the real request. So we can't modify `request_` here.
What I propose instead is to create a new `headers_with_user_agent_` member variable in the `InProgressRequest` class, which is a copy of `request_.headers` but with the User-Agent header added. Then anywhere we are currently calling a WebRequestEventRouter with a pointer to `request_.headers`, pass a pointer to `headers_with_user_agent_` instead.
Any code which modifies `request_.headers`, for example WebRequestProxyingURLLoaderFactory::InProgressRequest::FollowRedirect(), must recreate `headers_with_user_agent_` to make sure it remains correct. Possibly it will be sufficient to put that code in the UpdateRequestInfo() method?
The extra cost of copying the headers on the request fast path bothers me, but I can't see any way around it.
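A minimal sketch of that proposal, with stand-in types (a plain map instead of `net::HttpRequestHeaders`) and a drastically simplified class; the member names follow the suggestion above, and the `SetHeader` mutator here is hypothetical, standing in for the real mutation paths like `FollowRedirect()`:

```cpp
#include <map>
#include <string>
#include <utility>

// Stand-in for net::HttpRequestHeaders, for illustration only.
using Headers = std::map<std::string, std::string>;

class InProgressRequest {
 public:
  InProgressRequest(Headers request_headers, std::string user_agent)
      : request_headers_(std::move(request_headers)),
        user_agent_(std::move(user_agent)) {
    RebuildHeadersWithUserAgent();
  }

  // Anything that mutates the real headers must rebuild the copy so it
  // stays correct (as noted above, perhaps centralised in UpdateRequestInfo()).
  void SetHeader(const std::string& name, const std::string& value) {
    request_headers_[name] = value;
    RebuildHeadersWithUserAgent();
  }

  // What extension event listeners see: the real headers plus User-Agent.
  const Headers& headers_with_user_agent() const {
    return headers_with_user_agent_;
  }

  // What actually goes to the network; never gains the merged User-Agent,
  // so no spurious CORS preflight is triggered.
  const Headers& request_headers() const { return request_headers_; }

 private:
  void RebuildHeadersWithUserAgent() {
    headers_with_user_agent_ = request_headers_;  // The copy on the fast path.
    headers_with_user_agent_.emplace("User-Agent", user_agent_);
  }

  Headers request_headers_;
  std::string user_agent_;
  Headers headers_with_user_agent_;
};
```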
`void MergeAndAddUA(const HttpRequestHeaders& other,`
Since this method is rather specialised and doesn't require access to HttpRequestHeaders internals, I would prefer if it was not a method on the class. A static function in `net::HttpUtil` would be preferable. Something like
```
static HttpRequestHeaders MergeHeadersAndAddUserAgent(
net::HttpRequestHeaders info,
const net::HttpRequestHeaders& from,
std::optional<std::string> user_agent);
```
The asymmetry between the passing convention for the parameters is so you can do
```
existing =
net::HttpUtil::MergeHeadersAndAddUserAgent(std::move(existing), extra_headers, user_agent);
```
and it will efficiently move the `existing` object to the output, or
```
net::HttpRequestHeaders merged =
net::HttpUtil::MergeHeadersAndAddUserAgent(base, extra_headers, user_agent);
```
when you need a new object.
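The move-through calling convention can be sketched like so, with a plain map standing in for `net::HttpRequestHeaders`; the conflict rule (`extra` wins) and the User-Agent precedence (only added when not already present) are assumptions for illustration, not necessarily the CL's final behaviour:

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Stand-in for net::HttpRequestHeaders, for illustration only.
using Headers = std::map<std::string, std::string>;

// Taking `base` by value lets callers std::move() an existing object in and
// get it moved back out, avoiding a copy on the merge-in-place path.
Headers MergeHeadersAndAddUserAgent(Headers base,
                                    const Headers& extra,
                                    std::optional<std::string> user_agent) {
  for (const auto& [name, value] : extra) {
    base[name] = value;  // Assumed: `extra` wins on conflicts.
  }
  if (user_agent.has_value()) {
    // Assumed: an explicitly set User-Agent takes precedence over the default.
    base.emplace("User-Agent", *user_agent);
  }
  return base;
}
```

A caller merging in place would then write `existing = MergeHeadersAndAddUserAgent(std::move(existing), extra_headers, user_agent);`, which moves `existing` through rather than copying it.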
`// already set a User-Agent via fetch.`
Is the terminology in this comment sensible?
How about something like:
```
The correct value for the "User-Agent" header from the point-of-view of the
caller, after DevTools and other overrides have been applied. If this is not set, the "User-Agent" value configured on the NetworkContext will be used instead.
This can be overridden by setting a "User-Agent" in `headers`, but that will
require a CORS preflight for an untrusted client.
The `User-Agent` header must not also be set in `cors_exempt_headers`, as it
will not be merged correctly if it is set in `headers` as well.
```
`EXPECT_EQ("hi", GetFetchContext()->GetDefaultUserAgent());`
Adam Rice
I would resurrect my comments from patchset 14, but they don't link through to the new code.
Is this test sensible? I feel like it's not, because the name suggests to me that instead of checking UA the new way, we should instead be looking at whether PrepareRequest does its job -- just using a different litmus test than User-Agent. Advice welcome.
Looking at the original CL, there does seem to be a point to this:
https://chromium-review.googlesource.com/c/chromium/src/+/518743
Detaching the page causes the user agent string to be cached, and so the test is verifying that the code is doing this correctly.
Maybe add some comments, but keep it the same? Or add another test that does actually call `PrepareRequest()` and verifies something else it is supposed to be doing?
Andrew Brown
Thanks, Andrew! Extensions LGTM % nit, since there's now no behavior change.
Devlin Cronin
Unfortunately the current test results do make this seem like it's not yet fixed.
The few failures I've actually drilled into are *all* requests that have gone through the extensions stack, now needing a specific `User-Agent` CORS preflight response.
This means the User-Agent getting added onto the headers in the `OnBeforeSendHeaders` event is actually affecting the real network request somehow, causing it to spot a non-CORS-exempt header on the headers and send a preflight (which the tests don't correctly respond to).
So all requests that go through the Extensions stack are generating a CORS preflight for the User-Agent header, which most things won't expect and respond to. We need to find an alternative approach; merging here isn't suitable.
My first guess is that the associated callback function might be our solution, if it lets me pass the `content_user_agent` back up to the front-end consumer for merging outside of all of this business? I don't know if that's feasible/reasonable at all though, considering that goes back into the black hole of mojom; it looks like it would need a lot of modifications all over the place in unrelated contexts.
Other than that... I'm not sure.
Adam Rice
Thanks for the continued investigation!
Unfortunately, I'm not sure of what the right solution here is, either. Since this is mostly touching the net stack, I'd defer to one of those owners. @ri...@chromium.org?
I misunderstood the code when I looked at it before. This is happening before we call `CreateLoaderAndStart()`, so any modifications to `request_` will affect the real request. So we can't modify `request_` here.
What I propose instead is to create a new `headers_with_user_agent_` member variable in the `InProgressRequest` class, which is a copy of `request_.headers` but with the User-Agent header added. Then anywhere we are currently calling a WebRequestEventRouter with a pointer to `request_.headers`, pass a pointer to `headers_with_user_agent_` instead.
Any code which modifies `request_.headers`, for example WebRequestProxyingURLLoaderFactory::InProgressRequest::FollowRedirect(), must recreate `headers_with_user_agent_` to make sure it remains correct. Possibly it will be sufficient to put that code in the UpdateRequestInfo() method?
The extra cost of copying the headers on the request fast path bothers me, but I can't see any way around it.
I've implemented this (I think?) -- dryrun please.
I've called it `user_visible_headers_` with the logic that they're the headers that get sent back up to the "user" (i.e. extensions). Terminology check would be appreciated.
My first attempt (which is now uploaded) seemed to pass the tests that I ran locally, which scares me a little, but if it works it works.
Having now written it, I definitely can see where you're not keen on the extra copy... it feels bad to be making the `user_visible_headers_` in the constructor and then completely remaking them again in `ContinueToBeforeSendHeaders`. There might be a way around it if we try to ensure that `user_visible_headers_` are updated wherever the `headers` change? But that seems like a lot of code complexity, so eh.
I've temporarily commented out the mystery-tbd-Extensions-merging stuff that we're still working out -- could someone put this CL through a dryrun so I can check the redirect implementation is all working? I've made changes that fixed the last dubious test locally; just want to check it hasn't broken anything else in the process.
It should pass everything *but* the two Extensions tests.
Done
`void MergeAndAddUA(const HttpRequestHeaders& other,`
Since this method is rather specialised and doesn't require access to HttpRequestHeaders internals, I would prefer if it was not a method on the class. A static function in `net::HttpUtil` would be preferable. Something like
```
static HttpRequestHeaders MergeHeadersAndAddUserAgent(
net::HttpRequestHeaders info,
const net::HttpRequestHeaders& from,
std::optional<std::string> user_agent);
```
The asymmetry between the passing convention for the parameters is so you can do
```
existing =
net::HttpUtil::MergeHeadersAndAddUserAgent(std::move(existing), extra_headers, user_agent);
```
and it will efficiently move the `existing` object to the output, or
```
net::HttpRequestHeaders merged =
net::HttpUtil::MergeHeadersAndAddUserAgent(base, extra_headers, user_agent);
```
when you need a new object.
Huh, nifty. Done!
`// already set a User-Agent via fetch.`
Adam Rice
Is the terminology in this comment sensible?
How about something like:
```
The correct value for the "User-Agent" header from the point-of-view of the
caller, after DevTools and other overrides have been applied. If this is not set, the "User-Agent" value configured on the NetworkContext will be used instead.
This can be overridden by setting a "User-Agent" in `headers`, but that will
require a CORS preflight for an untrusted client.
The `User-Agent` header must not also be set in `cors_exempt_headers`, as it
will not be merged correctly if it is set in `headers` as well.
```
Added with a few slight changes.
`EXPECT_EQ("hi", GetFetchContext()->GetDefaultUserAgent());`
Adam Rice
I would resurrect my comments from patchset 14, but they don't link through to the new code.
Is this test sensible? I feel like it's not, because the name suggests to me that instead of checking UA the new way, we should instead be looking at whether PrepareRequest does its job -- just using a different litmus test than User-Agent. Advice welcome.
Looking at the original CL, there does seem to be a point to this:
https://chromium-review.googlesource.com/c/chromium/src/+/518743
Detaching the page causes the user agent string to be cached, and so the test is verifying that the code is doing this correctly.
Maybe add some comments, but keep it the same? Or add another test that does actually call `PrepareRequest()` and verifies something else it is supposed to be doing?
I've added a new `PrepareRequestWhenDetached` that checks the UkmSourceID.
I think that should be equivalent(?), as far as the old test was checking that you could call PrepareRequest while detached and it would still do some sort of stuff.
| Commit-Queue | +1 |
`net::HttpRequestHeaders user_visible_headers_;`
How about `script_exposed_headers_` as an alternative name?
`// that has to happen further down (after CORS checks).`
I suggest "has to happen in the network service" would be clearer.
`user_visible_headers_ = net::HttpUtil::MergeHeadersAndAddUserAgent(`
I think this should go in the member initialiser section above. Also in the other constructor below.
`*(user_agent));`
Nit: `()` around `user_agent` are not needed.
`net::HttpRequestHeaders::kUserAgent, *(content_user_agent_));`
Nit: `()` not needed around `content_user_agent_`.
`EXPECT_EQ(ukm::kInvalidSourceId, request.GetUkmSourceId());`
Unfortunately `ukm::kInvalidSourceId` is the default value for this field, so it doesn't really prove that `PrepareRequest()` did anything.
`EXPECT_EQ(ukm::kInvalidSourceId, request.GetUkmSourceId());`
Unfortunately `ukm::kInvalidSourceId` is the default value for this field, so it doesn't really prove that `PrepareRequest()` did anything.
Ah, so it is. I spotted in `t/b/r/core/testing/dummy_page_holder.cc:134` that the document source ID for these tests was initialised specifically as `kInvalidSourceId`, so I figured it would be a decent test example, but didn't check the ResourceRequest default.
Would it be possible to add a quick constant to ukm, call it `ukm::kTestingSourceID`, that could be set at `dummy_page_holder.cc:134` and used for this test? I'm a little surprised there's not one already, but I guess it would be a bit niche.
Otherwise the next option would be looking into `request.SetStorageApiAccess` in `FrameFetchContext::PrepareRequest` to see if it sets a value we can test.
`EXPECT_EQ(ukm::kInvalidSourceId, request.GetUkmSourceId());`
Andrew Brown
Unfortunately `ukm::kInvalidSourceId` is the default value for this field, so it doesn't really prove that `PrepareRequest()` did anything.
Ah, so it is. I spotted in `t/b/r/core/testing/dummy_page_holder.cc:134` that the document source ID for these tests was initialised specifically as `kInvalidSourceId`, so I figured it would be a decent test example, but didn't check the ResourceRequest default.
Would it be possible to add a quick constant to ukm, call it `ukm::kTestingSourceID`, that could be set at `dummy_page_holder.cc:134` and used for this test? I'm a little surprised there's not one already, but I guess it would be a bit niche.
Otherwise the next option would be looking into `request.SetStorageApiAccess` in `FrameFetchContext::PrepareRequest` to see if it sets a value we can test.
On further inspection, PrepareRequest() only does two things when detached, and the only one that has a side effect is setting the user agent. So there's nothing here we can test. I think we should just delete this "PrepareRequestWhenDetached" test because it isn't adding any value.
Andrew Brown
Thanks, Andrew! Extensions LGTM % nit, since there's now no behavior change.
Devlin Cronin
Unfortunately the current test results do make this seem like it's not yet fixed.
The few failures I've actually drilled into are *all* requests that have gone through the extensions stack, now needing a specific `User-Agent` CORS preflight response.
This means the User-Agent getting added onto the headers in the `OnBeforeSendHeaders` event is actually affecting the real network request somehow, causing it to spot a non-CORS-exempt header on the headers and send a preflight (which the tests don't correctly respond to).
So all requests that go through the Extensions stack are generating a CORS preflight for the User-Agent header, which most things won't expect and respond to. We need to find an alternative approach; merging here isn't suitable.
My first guess is that the associated callback function might be our solution, if it lets me pass the `content_user_agent` back up to the front-end consumer for merging outside of all of this business? I don't know if that's feasible/reasonable at all though, considering that goes back into the black hole of mojom; it looks like it would need a lot of modifications all over the place in unrelated contexts.
Other than that... I'm not sure.
Adam RiceThanks for the continued investigation!
Unfortunately, I'm not sure what the right solution is here, either. Since this is mostly touching the net stack, I'd defer to one of those owners. @ri...@chromium.org?
Andrew BrownI misunderstood the code when I looked at it before. This is happening before we call `CreateLoaderAndStart()`, so any modifications to `request_` will affect the real request. So we can't modify `request_` here.
What I propose instead is to create a new `headers_with_user_agent_` member variable in the `InProgressRequest` class, which is a copy of `request_.headers` but with the User-Agent header added. Then anywhere we are currently calling a WebRequestEventRouter with a pointer to `request_.headers`, pass a pointer to `headers_with_user_agent_` instead.
Any code which modifies `request_.headers`, for example WebRequestProxyingURLLoaderFactory::InProgressRequest::FollowRedirect(), must recreate `headers_with_user_agent_` to make sure it remains correct. Possibly it will be sufficient to put that code in the UpdateRequestInfo() method?
The extra cost of copying the headers on the request fast path bothers me, but I can't see any way around it.
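A minimal sketch of the copy approach described above, modelling headers as a plain `std::map` rather than the real `net::HttpRequestHeaders` (the function name and types here are illustrative, not Chromium code):

```cpp
#include <cassert>
#include <map>
#include <string>

using Headers = std::map<std::string, std::string>;

// Build the copy of the real request headers that also carries the
// User-Agent, for reporting to extensions. A User-Agent already on the
// request (e.g. set by fetch()) wins over the content default, since
// emplace() is a no-op when the key already exists.
Headers BuildHeadersWithUserAgent(const Headers& request_headers,
                                  const std::string& content_user_agent) {
  Headers copy = request_headers;
  copy.emplace("User-Agent", content_user_agent);
  return copy;
}
```

The real request headers are left untouched; only the extension-visible copy pays the cost of the merge.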
I've implemented this (I think?) -- dryrun please.
I've called it `user_visible_headers_` with the logic that they're the headers that get sent back up to the "user" (i.e. extensions). Terminology check would be appreciated.
My first attempt (which is now uploaded) seemed to pass the tests that I ran locally, which scares me a little, but if it works it works.
Having now written it, I definitely can see where you're not keen on the extra copy... it feels bad to be making the `user_visible_headers_` in the constructor and then completely remaking them again in `ContinueToBeforeSendHeaders`. There might be a way around it if we try to ensure that `user_visible_headers_` are updated wherever the `headers` change? But that seems like a lot of code complexity, so eh.
Continuing this discussion, I think we have further problems.
The entire extensions stack seems to pass around the one headers object for all modifications, including the actual modifications that extensions could make (e.g. via DeclarativeNetRequest). So we have to get these modifications back to the real request somehow (*without* passing the content_user_agent and triggering CORS).
I can think of two ways forward to try to maintain the existing behaviour:
- Literally just check the value of the User-Agent header when the `proxying_url_loader` gets the headers back, and if it's an exact match for the `content_user_agent`, remove it.
Problems: 1. We would have to check string equality on the header value, which has a bit of a bad code smell; and 2. what happens in the pathological case where an extension manually sets the header to the same value?
- Somehow special-case the User-Agent *inside* net::HttpRequestHeaders. Basically, store some sort of flag for "has the UA been modified," and only remove the UA if the headers object reports it as unchanged. (I attempted to originally do this by making an actual subclass of `net::HttpRequestHeaders` inside the `InProgressRequest` that had special user-agent-tracking behaviour. I was probably messing up the inheritance somehow but it didn't seem like a good solution.)
Problems: modifying the actual `net::HttpRequestHeaders` for this seems like a bad idea.
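To make the string-equality concern in option 1 concrete, here's a minimal sketch of what that check would look like, using a plain `std::map` in place of `net::HttpRequestHeaders` (names are illustrative):

```cpp
#include <cassert>
#include <map>
#include <string>

using Headers = std::map<std::string, std::string>;

// "Option 1": after extensions have run, strip the User-Agent from the real
// headers only if it still exactly matches the content_user_agent we injected.
// The pathological case: an extension that manually sets the header to the
// same string is indistinguishable from our injected value, so its header is
// also removed here.
void StripInjectedUserAgent(Headers& headers,
                            const std::string& content_user_agent) {
  auto it = headers.find("User-Agent");
  if (it != headers.end() && it->second == content_user_agent)
    headers.erase(it);
}
```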
How about `script_exposed_headers_` as an alternative name?
Done
I suggest "has to happen in the network service" would be clearer.
Done
user_visible_headers_ = net::HttpUtil::MergeHeadersAndAddUserAgent(I think this should go in the member initialiser section above. Also in the other constructor below.
Done
Nit: `()` around `user_agent` are not needed.
Done
net::HttpRequestHeaders::kUserAgent, *(content_user_agent_));Nit: `()` not needed around `content_user_agent_`.
Done
EXPECT_EQ("hi", GetFetchContext()->GetDefaultUserAgent());Adam RiceI would resurrect my comments from patchset 14, but they don't link through to the new code.
Is this test sensible? I feel like it's not, because the name suggests to me that rather than checking the UA the new way, we should be looking at whether PrepareRequest does its job -- just using a different litmus test than User-Agent. Advice welcome.
Andrew BrownLooking at the original CL, there does seem to be a point to this:
https://chromium-review.googlesource.com/c/chromium/src/+/518743
Detaching the page causes the user agent string to be cached, and so the test is verifying that the code is doing this correctly.
Maybe add some comments, but keep it the same? Or add another test that does actually call `PrepareRequest()` and verifies something else it is supposed to be doing?
I've added a new `PrepareRequestWhenDetached` that checks the UkmSourceId.
I think that should be equivalent(?), as far as the old test was checking that you could call PrepareRequest while detached and it would still do some sort of stuff.
Done
EXPECT_EQ(ukm::kInvalidSourceId, request.GetUkmSourceId());Andrew BrownUnfortunately `ukm::kInvalidSourceId` is the default value for this field, so it doesn't really prove that `PrepareRequest()` did anything.
Adam RiceAh, so it is. I spotted in `t/b/r/core/testing/dummy_page_holder.cc:134` that the document source ID for these tests was initialised specifically as `kInvalidSourceId`, so I figured it would be a decent test example, but didn't check the ResourceRequest default.
Would it be possible to add a quick constant to ukm, call it `ukm::kTestingSourceID`, that could be set at `dummy_page_holder.cc:134` and used for this test? I'm a little surprised there's not one already, but I guess it would be a bit niche.
Otherwise the next option would be looking into `request.SetStorageApiAccess` in `FrameFetchContext::PrepareRequest` to see if it sets a value we can test.
On further inspection, PrepareRequest() only does two things when detached, and the only one that has a side effect is setting the user agent. So there's nothing here we can test. I think we should just delete this "PrepareRequestWhenDetached" test because it isn't adding any value.
You are right, I hadn't thought it through properly.
I think we should take your first option. We don't really have an accurate way to detect "has been set by extension JavaScript" anyway, so however we do it we end up relying on a string comparison.
There's actually already a horrible function called CalculateOnBeforeSendHeadersDelta() that does pretty much what we need, and MergeOnBeforeSendHeadersResponses() to merge back the results. I think instead of MergeOnBeforeSendHeadersResponses() taking one pointer to a `net::HttpRequestHeaders` object, it will have to take two, one for the real request headers that don't include `User-Agent`, and one for the internal version that does. Then it should just apply exactly the same changes to both. This means the cost when nothing is actually modified will be small.
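A minimal sketch of that two-headers merge, modelling headers as a plain `std::map` (the `Delta` shape and function name are illustrative, not the real `MergeOnBeforeSendHeadersResponses()` signature):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Headers = std::map<std::string, std::string>;

// A simplified stand-in for the deltas computed from extension responses.
struct Delta {
  std::vector<std::pair<std::string, std::string>> set;
  std::vector<std::string> remove;
};

// Apply identical changes to both the real request headers (no injected
// User-Agent) and the extension-visible headers, so the two stay in sync.
// When nothing is modified, the loops are empty and the cost is negligible.
void ApplyDeltaToBoth(const Delta& delta, Headers& request_headers,
                      Headers& exposed_headers) {
  for (const auto& [name, value] : delta.set) {
    request_headers[name] = value;
    exposed_headers[name] = value;
  }
  for (const auto& name : delta.remove) {
    request_headers.erase(name);
    exposed_headers.erase(name);
  }
}
```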
Ok, sort of "option 1.5" -- actually passing *both* headers objects through everywhere, but then whenever we display to the user, take the `script_exposed_headers`, and whenever we modify, modify both. That sounds like it should definitely work.
I'm not confident this is fully implemented yet; I'm almost certain there will be more to it, but I've been losing momentum in trying to understand the weeds of the extensions code, so in the interest of keeping things moving, here's a new patchset for dry-running. Thanks :)
Another dry-run please :)
I think I sort of understand what's happening now between the factory and the router; I've undone the changes I made around the `WebRequestInfo.extra_request_headers` and I think I'm now passing everything around correctly. There might be some exceptional behaviour around the handling of `nullptr` in a few of the functions that's worth commenting? Thoughts welcome.
Done!
Well, I guess now I need +1's from all the different owners -- re-pinged Takashi, as I feel like yourself and Adam should be the first eyes over it, given this is a primarily network-focused change.
Sorry to be a pest, but re-pinging, as it's been a week.
I will need reviews from both of you as Network People, eventually, in order to cover all the files (plus this is a complex & large change that can't hurt to have more eyes) so please dw about duplicate reviews. (quoting the "contributing to chromium" guide)
| Code-Review | +1 |
lgtm, thank you for all your hard work on this.
net::HttpRequestHeaders original,Nit: `net::` prefix should not be used inside the `net` namespace.
#include "net/http/http_request_headers.h"Nit: you could just predeclare `class HttpRequestHeaders;` as with `HttpResponseHeaders` below, as the compiler doesn't need to know the size of `HttpRequestHeaders` to compile this file. Only worth fixing if you're doing another upload anyway.
net::HttpRequestHeaders HttpUtil::MergeHeadersAndAddUserAgent(As in the header file, `net::` not needed.
lgtm, thank you for all your hard work on this.
Thanks in return to you too -- I couldn't have done it without your support.
Nit: `net::` prefix should not be used inside the `net` namespace.
Done
Nit: you could just predeclare `class HttpRequestHeaders;` as with `HttpResponseHeaders` below, as the compiler doesn't need to know the size of `HttpRequestHeaders` to compile this file. Only worth fixing if you're doing another upload anyway.
I'm already not keeping the CL dryrun around (the 24hrs has long since passed), it's old enough to be worth rebasing, and it's an easy fix, so I'll just do it.
net::HttpRequestHeaders HttpUtil::MergeHeadersAndAddUserAgent(As in the header file, `net::` not needed.
Done
// we can't use cors_exempt_headers, so the User-Agent will appear here
I'm not suuuuper sure what's up with the ios-simulator test stuff? I don't recognise any of the calls in the stack trace (i.e. it shouldn't go anywhere near this CL, let alone the actual changes between patchset 44 and 45) and the test title implies it's a potentially flaky test. Do we just re-run and hope it works?
#include "net/http/http_request_headers.h"Andrew BrownNit: you could just predeclare `class HttpRequestHeaders;` as with `HttpResponseHeaders` below, as the compiler doesn't need to know the size of `HttpRequestHeaders` to compile this file. Only worth fixing if you're doing another upload anyway.
I'm already not keeping the CL dryrun around (the 24hrs has long since passed), it's old enough to be worth rebasing, and it's an easy fix, so I'll just do it.
Hm, I think I misunderstood what you meant by this. Re-done, still works, but had to slightly fiddle the includes on the net unittests as well.
I'm not suuuuper sure what's up with the ios-simulator test stuff? I don't recognise any of the calls in the stack trace (i.e. it shouldn't go anywhere near this CL, let alone the actual changes between patchset 44 and 45) and the test title implies it's a potentially flaky test. Do we just re-run and hope it works?
Yes. If a failure seems unrelated, most of the time it is just a flaky test.
Andrew BrownWell, I guess now I need +1's from all the different owners -- re-pinged Takashi, as I feel like yourself and Adam should be the first eyes over it, given this is a primarily network-focused change.
Sorry to be a pest, but re-pinging, as it's been a week.
I will need reviews from both of you as Network People, eventually, in order to cover all the files (plus this is a complex & large change that can't hurt to have more eyes) so please dw about duplicate reviews. (quoting the "contributing to chromium" guide)
Sorry for overlooking these pings. It still LGTM.
Andrew BrownWell, I guess now I need +1's from all the different owners -- re-pinged Takashi, as I feel like yourself and Adam should be the first eyes over it, given this is a primarily network-focused change.
Takashi ToyoshimaSorry to be a pest, but re-pinging, as it's been a week.
I will need reviews from both of you as Network People, eventually, in order to cover all the files (plus this is a complex & large change that can't hurt to have more eyes) so please dw about duplicate reviews. (quoting the "contributing to chromium" guide)
Sorry for overlooking these pings. It still LGTM.
By the way, I think you still need a reviewer's approval on Extensions related directory.
Andrew BrownWell, I guess now I need +1's from all the different owners -- re-pinged Takashi, as I feel like yourself and Adam should be the first eyes over it, given this is a primarily network-focused change.
Takashi ToyoshimaSorry to be a pest, but re-pinging, as it's been a week.
I will need reviews from both of you as Network People, eventually, in order to cover all the files (plus this is a complex & large change that can't hurt to have more eyes) so please dw about duplicate reviews. (quoting the "contributing to chromium" guide)
Takashi ToyoshimaSorry for overlooking these pings. It still LGTM.
By the way, I think you still need a reviewer's approval on Extensions related directory.
All good, appreciate the vote :)
Pinging Devlin now for extensions approval -- happy to go bother others if there's specific internal complexity to be looked at (there is a bit of weird stuff in there), but there's no visible API change for extensions anymore.
Figure I might as well also ping Alex for owner review on content/public/test/url_loader_interceptor.cc to round it all off.
| Code-Review | +1 |
Figure I might as well also ping Alex for owner review on content/public/test/url_loader_interceptor.cc to round it all off.
Andrew BrownWell, I guess now I need +1's from all the different owners -- re-pinged Takashi, as I feel like yourself and Adam should be the first eyes over it, given this is a primarily network-focused change.
Takashi ToyoshimaSorry to be a pest, but re-pinging, as it's been a week.
I will need reviews from both of you as Network People, eventually, in order to cover all the files (plus this is a complex & large change that can't hurt to have more eyes) so please dw about duplicate reviews. (quoting the "contributing to chromium" guide)
Takashi ToyoshimaSorry for overlooking these pings. It still LGTM.
Andrew BrownBy the way, I think you still need a reviewer's approval on Extensions related directory.
All good, appreciate the vote :)
Pinging Devlin now for extensions approval -- happy to go bother others if there's specific internal complexity to be looked at (there is a bit of weird stuff in there), but there's no visible API change for extensions anymore.
Thanks, Andrew! If there's no API change, I'm okay with this at a high level (though see the one comment I added). I've added @kelvi...@chromium.org for a more thorough review, as an owner of the webRequest API.
"usER-agENt",If there's no visible API change for extensions, is this change still needed?
"usER-agENt",If there's no visible API change for extensions, is this change still needed?
Hm, true, I guess there technically still is a "visible API change" here, but that's because chromium currently doesn't follow the fetch spec w/r/t User-Agent.
The extensions Downloads API docs say that additional headers can be provided for the request if XMLHttpRequest allows it. Until now, XMLHttpRequest couldn't set User-Agents, but that's the core bug we're fixing. Technically I guess we're not actually changing the documented behaviour, because the docs rely on XMLHttpRequest, and we're just changing that.
Do we need an announcement/chromestatus.com entry/whatever for this change, then? I'm not super familiar with the actual "release process" for how this code change actually ships, so there might be steps there that I'm missing.
"usER-agENt",Andrew BrownIf there's no visible API change for extensions, is this change still needed?
Hm, true, I guess there technically still is a "visible API change" here, but that's because chromium currently doesn't follow the fetch spec w/r/t User-Agent.
The extensions Downloads API docs say that additional headers can be provided for the request if XMLHttpRequest allows it. Until now, XMLHttpRequest couldn't set User-Agents, but that's the core bug we're fixing. Technically I guess we're not actually changing the documented behaviour, because the docs rely on XMLHttpRequest, and we're just changing that.
Do we need an announcement/chromestatus.com entry/whatever for this change, then? I'm not super familiar with the actual "release process" for how this code change actually ships, so there might be steps there that I'm missing.
I think I'm okay with this, but I'll defer to @olive...@chromium.org on if he thinks this needs a separate announcement or discussion in the WECG.
Will finish reviewing today, sorry for the delay
I have a lot of questions and I might ask more in another pass:
With respect to extensions that can modify network requests, what happens here?
1. Before the request is sent, compute the value of `content_user_agent` which will be added to headers later?
2. Request headers are set, and modifications from extensions are made?
3. CORS checks happen but don't merge headers yet because `content_user_agent` is exempt?
4. Now merge `content_user_agent` into headers?
net::HttpRequestHeaders* exposed_headers);nit: this is "script_exposed_headers" in the .cc file, why the "script_" part? and can we add a comment here describing what it is?
// User-Agent.by this you mean the CORS exempt user agent that should be merged into request headers later?
// Report the script_exposed_headers if givenstyle nit: comments should end with a period and start with a capital letter.
// Modifies the headers in `request_headers` according to `deltas`. Conflictsnit: "Modifies the headers in `request_headers` and `script_exposed_request_headers` according to `deltas`."
// |script_exposed_headers| may have a (User-Agent) header whennit: backticks around variable names instead of pipes. I also think this comment is better added before the function name in web_request_api_helpers.h as an example of why we use `script_exposed_headers` in addition to `request_headers` ?
// Parse old cookie line. Use request_headers for simplicity as
// script_exposed_headers should never differ on cookies.nit: if you want to guarantee this, add a DCHECK
// that has to happen in the network stack (after CORS checks).dumb question: merging before would fail CORS checks?
request_.headers,nit: just use `request` instead of `request_` here. No difference in functionality but it reads a bit cleaner
// We have to reconstruct the script_exposed_headers_ here in case somethingstyle nit: surround variable names in backticks if you can, or pipes to be consistent with the rest of the file
// has changedThis implies that one of the `request_`'s headers, cors_exempt_headers or content_user_agent has changed? I'm curious where this could happen between when we first initialize script_exposed_headers_ and now.
&request_headers_, nullptr);just curious: why is it "safe" to pass in null here?
I have a lot of questions and I might ask more in another pass:
With respect to extensions that can modify network requests, what happens here?
1. Before the request is sent, compute the value of `content_user_agent` which will be added to headers later?
2. Request headers are set, and modifications from extensions are made?
3. CORS checks happen but don't merge headers yet because `content_user_agent` is exempt?
4. Now merge `content_user_agent` into headers?
No worries, thanks for your review! Happy to answer questions, fire away. This might be too granular of a breakdown, or there might be more aspects to ask about, but this is how I'm conceptualising what happens:
1. Start with a `network::ResourceRequest` with `headers`, `cors_exempt_headers`, and `content_user_agent` which comes from DevTools or FetchContext. (All three of these must be kept separate until after CORS checks.)
2. Extensions code receives this request.
3. Extensions code has to invent the `script_exposed_headers`, which is the headers that will be *really* sent on this request, after CORS. This has to be a separate object because again we can't merge them all into the `headers` until after CORS.
4. Extensions code tells a running extension that the `script_exposed_headers` are the headers for "this request", so the extension sees the "real" final state of the request headers, including `content_user_agent`.
5. All extensions modifications are applied to *both* the real `headers` and the user-visible `script_exposed_headers`. This keeps the `script_exposed_headers` that the user can see in-sync with the real `headers` but enables us to make sure they stay separate.
6. After all extensions have done modifications/looked at things/etc, toss away the `script_exposed_headers`. We go back to having a `network::ResourceRequest` with `headers`, `cors_exempt_headers`, and `content_user_agent`, all separate, but with all the extensions modifications applied to `headers` as requested.
7. CORS checks happen on `headers`. A CORS preflight may be sent for User-Agent here if the request has a User-Agent specified on the `headers`, as it must have come from an untrusted source.
8. After CORS, if the preflight said it's fine, then we can use the `headers` User-Agent. If it's not fine, or if there wasn't one, we can merge the `content_user_agent` onto the headers to actually make the request.
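A minimal sketch of the precedence decision in steps 7-8 (the function name is illustrative; this is a model of the intended behaviour, not Chromium code):

```cpp
#include <cassert>
#include <optional>
#include <string>

// Pick which User-Agent the wire request ends up with. A User-Agent already
// on `headers` came from an untrusted source (fetch or DevTools) and has
// been preflighted by this point; otherwise fall back to the browser's
// content_user_agent, which is CORS-exempt and needs no preflight.
std::string FinalUserAgent(const std::optional<std::string>& header_ua,
                           const std::string& content_user_agent) {
  return header_ua.value_or(content_user_agent);
}
```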
net::HttpRequestHeaders* exposed_headers);nit: this is "script_exposed_headers" in the .cc file, why the "script_" part? and can we add a comment here describing what it is?
`script_exposed` vs `exposed` are basically arbitrary names. My original name was `user_visible_headers` but it's not clear who the *user* is (is it the extensions code, or page JS code, or the person sitting at the computer, or what), so I changed it.
As long as we all think the name indicates "headers that the extension will see", that's the goal. Possibly switch to use `exposed_headers` everywhere?
// User-Agent.by this you mean the CORS exempt user agent that should be merged into request headers later?
If the headers already have a User-Agent it will be that, e.g. if this request comes from someone doing `fetch('foo.com', {headers: {"User-Agent": "baz"}});`, then the exposed_headers will have "User-Agent: baz". The content_user_agent shouldn't override any other User-Agent.
So it's not guaranteed that these headers have the CORS exempt User-Agent, just that they have *a* User-Agent.
// Report the script_exposed_headers if givenstyle nit: period at end of comments, and that comments start with a capital letter.
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
// Modifies the headers in `request_headers` according to `deltas`. Conflictsnit: "Modifies the headers in `request_headers` and `script_exposed_request_headers` according to `deltas`.
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
// |script_exposed_headers| may have a (User-Agent) header whennit: backticks around variable names instead of pipes. I also think this comment is better added before the function name in web_request_api_helpers.h as an example of why we use `script_exposed_headers` in addition to `request_headers` ?
OK to comment in both locations? (leave this comment but also add it to the function comment in the .h file?)
// Parse old cookie line. Use request_headers for simplicity as
// script_exposed_headers should never differ on cookies.nit: if you want to guarantee this, add a DCHECK
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
// that has to happen in the network stack (after CORS checks).dumb question: merging before would fail CORS checks?
Unfortunately yeah, because this all happens before the CORS checks, if we put a User-Agent on the `headers`, it triggers a CORS preflight. (Currently CORS ignores "User-Agent" because it's on the "banned" list in `net/http/http_util.cc` but this is the actual bug we're fixing -- it shouldn't be there.)
Lots of websites aren't expecting to receive a CORS preflight for "User-Agent" (because it's already on every request ever!), so they tell CORS "no User-Agent please" and then the request fails when it shouldn't.
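To illustrate why the header's placement matters: per the fetch spec, any request header that isn't CORS-safelisted forces a preflight, while anything on `cors_exempt_headers` bypasses the check entirely. A minimal sketch (the safelist here is abbreviated; not the real Chromium check):

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <string>

// Returns true if putting this header on the ordinary `headers` list would
// force a CORS preflight. Header-name matching is case-insensitive.
// Note: User-Agent is NOT safelisted in the fetch spec, which is why it
// triggers a preflight once it stops being CORS-exempt.
bool NeedsPreflight(const std::string& header_name) {
  static constexpr std::array<const char*, 3> kSafelisted = {
      "accept", "accept-language", "content-language"};
  std::string lower = header_name;
  std::transform(lower.begin(), lower.end(), lower.begin(), ::tolower);
  return std::find(kSafelisted.begin(), kSafelisted.end(), lower) ==
         kSafelisted.end();
}
```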
request_.headers,nit: just use `request` instead of `request_` here. No difference in functionality but it reads a bit cleaner
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
// We have to reconstruct the script_exposed_headers_ here in case somethingstyle nit: surround variable names in backticks if you can, or pipes to be consistent with the rest of the file
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
// has changedThis implies that one of the `request_`'s headers, cors_exempt_headers or content_user_agent has changed? I'm curious where this could happen between when we first initialize script_exposed_headers_ and now.
Honestly I have no idea whether that could happen; in *theory* they should always be maintained the same, so this isn't necessary. I'll try removing this (in the next upload) and we'll see what happens with the trybots.
&request_headers_, nullptr);just curious: why is it "safe" to pass in null here?
Websocket and Webtransport behave differently w/r/t User-Agent (afaict) later in the network stack, so they don't need special handling to maintain & report the `script_exposed_request_headers` -- they can just report the usual `request_headers`.
`extensions_web_request_event_router.cc:1114` supports these alternatives by not requiring `script_exposed_request_headers` to be given -- it checks for nullptr before passing it off to the actual event. Maybe prefer `std::optional` to make this more explicit?
&request_headers_, nullptr);Andrew Brownjust curious: why is it "safe" to pass in null here?
Websocket and Webtransport behave differently w/r/t User-Agent (afaict) later in the network stack, so they don't need special handling to maintain & report the `script_exposed_request_headers` -- they can just report the usual `request_headers`.
`extensions_web_request_event_router.cc:1114` supports these alternatives by not requiring `script_exposed_request_headers` to be given -- it checks for nullptr before passing it off to the actual event. Maybe prefer `std::optional` to make this more explicit?
Looking at this again, it occurs to me that in these cases we should just also pass `&request_headers_` as the `script_exposed_request_headers` to be reported, rather than mucking around with `std::optional` pointers or other silly `nullptr` tricks.
I've done this locally, will upload with the rest of the nits, etc.
Going to take a more detailed pass tomorrow; mind addressing the nits in the meantime, since those shouldn't be blocking?
&request_headers_, nullptr);Andrew Brownjust curious: why is it "safe" to pass in null here?
Andrew BrownWebsocket and Webtransport behave differently w/r/t User-Agent (afaict) later in the network stack, so they don't need special handling to maintain & report the `script_exposed_request_headers` -- they can just report the usual `request_headers`.
`extensions_web_request_event_router.cc:1114` supports these alternatives by not requiring `script_exposed_request_headers` to be given -- it checks for nullptr before passing it off to the actual event. Maybe prefer `std::optional` to make this more explicit?
Looking at this again, it occurs to me that in these cases we should just also pass `&request_headers_` as the `script_exposed_request_headers` to be reported, rather than mucking around with `std::optional` pointers or other silly `nullptr` tricks.
I've done this locally, will upload with the rest of the nits, etc.
Going to take a more detailed pass tomorrow; mind addressing the nits in the meantime, since those shouldn't be blocking?
Yeah, no worries, done.
I also rebased (mmm, 2.5 hour compile, yippee), and made one other hopefully-non-behaviour-affecting change: the new `exposed_headers` on the `BlockedRequest` should follow the same usage pattern as the other `headers`, so instead of null-checking it everywhere, I switched to just `DCHECK()` once at the very start.
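A minimal sketch of that pattern, asserting the invariant once at entry instead of null-checking at every use site (the struct is a simplified model, not the real `BlockedRequest`, and `assert` stands in for Chromium's `DCHECK`):

```cpp
#include <cassert>
#include <map>
#include <string>

using Headers = std::map<std::string, std::string>;

struct BlockedRequest {
  Headers* request_headers = nullptr;
  Headers* exposed_headers = nullptr;
};

// Both header sets must be supplied; callers with no separate exposed set
// (e.g. websockets in the discussion above) pass the same pointer twice.
void ProcessBlockedRequest(BlockedRequest& request) {
  assert(request.request_headers && request.exposed_headers);
  // Every subsequent use can dereference without further checks.
  (*request.request_headers)["X-Processed"] = "1";
  (*request.exposed_headers)["X-Processed"] = "1";
}
```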
net::HttpRequestHeaders* exposed_headers);Andrew Brownnit: this is "script_exposed_headers" in the .cc file, why the "script_" part? and can we add a comment here describing what it is?
`script_exposed` vs `exposed` are basically arbitrary names. My original name was `user_visible_headers` but it's not clear who the *user* is (is it the extensions code, or page JS code, or the person sitting at the computer, or what), so I changed it.
As long as we all think the name indicates "headers that the extension will see", that's the goal. Possibly switch to use `exposed_headers` everywhere?
Switched to use `exposed_headers` most places, except for places where we already have a variable `request_headers`, then it's called `exposed_request_headers`.
I think this is mostly clear, and avoids questions about what "script" means? Thoughts welcome.
// User-Agent.Andrew Brownby this you mean the CORS exempt user agent that should be merged into request headers later?
If the headers already have a User-Agent it will be that, e.g. if this request comes from someone doing `fetch('foo.com', {headers: {"User-Agent": "baz"}});`, then the exposed_headers will have "User-Agent: baz". The content_user_agent shouldn't override any other User-Agent.
So it's not guaranteed that these headers have the CORS exempt User-Agent, just that they have *a* User-Agent.
I've updated this comment, hopefully it's clearer now?
Andrew Brownstyle nit: comments should end with a period and start with a capital letter.
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
Done
// Modifies the headers in `request_headers` according to `deltas`. ConflictsAndrew Brownnit: "Modifies the headers in `request_headers` and `script_exposed_request_headers` according to `deltas`.
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
Done
// |script_exposed_headers| may have a (User-Agent) header whenAndrew Brownnit: backticks around variable names instead of pipes. I also think this comment is better added before the function name in web_request_api_helpers.h as an example of why we use `script_exposed_headers` in addition to `request_headers` ?
OK to comment in both locations? (leave this comment but also add it to the function comment in the .h file?)
Commented in both locations.
// Parse old cookie line. Use request_headers for simplicity as
// script_exposed_headers should never differ on cookies.
Andrew Brown
nit: if you want to guarantee this, add a DCHECK.
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
Actually I looked more at this and while I haven't *noticed* anything that would cause `exposed_headers` to have a different Cookie, I don't think it's explicitly impossible (the `Cookie` header is usually CORS-exempt, so it's possible it could end up on the `cors_exempt_headers`).
It's actually not "worse" to check the `exposed_headers` all the time anymore, because the code expects them to be non-null, and therefore you don't have to check that here, so I'll just do that instead.
Andrew Brownnit: just use `request` instead of `request_` here. No difference in functionality but it reads a bit cleaner
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
Done
// We have to reconstruct the script_exposed_headers_ here in case something
Andrew Brown
style nit: surround variable names in backticks if you can, or pipes to be consistent with the rest of the file.
Fixed nit locally, will upload once other comments are also resolved (e.g. "script_exposed_headers" vs "exposed_headers" name)
Done
Andrew Brown
This implies that one of the `request_`'s headers, cors_exempt_headers or content_user_agent has changed? I'm curious where this could happen between when we first initialize script_exposed_headers_ and now.
Honestly I have no idea whether that could happen; in *theory* they should always be maintained the same, so this isn't necessary. I'll try removing this (in the next upload) and we'll see what happens with the trybots.
Done
Andrew Brown
just curious: why is it "safe" to pass in null here?
Andrew Brown
Websocket and Webtransport behave differently w/r/t User-Agent (afaict) later in the network stack, so they don't need special handling to maintain & report the `script_exposed_request_headers` -- they can just report the usual `request_headers`.
`extensions_web_request_event_router.cc:1114` supports these alternatives by not requiring `script_exposed_request_headers` to be given -- it checks for nullptr before passing it off to the actual event. Maybe prefer `std::optional` to make this more explicit?
Kelvin Jiang
Looking at this again, it occurs to me that in these cases we should just also pass `&request_headers_` as the `script_exposed_request_headers` to be reported, rather than mucking around with `std::optional` pointers or other silly `nullptr` tricks.
I've done this locally, will upload with the rest of the nits, etc.
Ack
Oh also, could someone please put this through the trybots again for me?
There's a couple of new `DCHECK()`s that I'm not 100% confident in, so it'd be good to send it through the bots to be sure.
Thanks for resolving most of the nits, assuming your approach works, most of my review is geared towards:
one more question:
at any point inside the extensions webRequest code (before the extension makes changes to headers), since we set some CORS exempt headers in "exposed_headers", will any of those headers be present in the raw "request_headers"?
net::HttpRequestHeaders* request_headers,
net::HttpRequestHeaders* exposed_headers);
What do you think is cleaner and more readable? Having two header objects like now, or wrapping these in a struct, since I see many places where these two are side by side? Also we can add struct methods that, when called once, will modify both sets of headers!
Also, I assume we need to use a `net::HttpRequestHeaders*` for headers that need to skip CORS and get applied afterwards because extensions should be able to modify these?
net::HttpRequestHeaders* exposed_headers);
Andrew Brown
nit: this is "script_exposed_headers" in the .cc file, why the "script_" part? and can we add a comment here describing what it is?
Andrew Brown
`script_exposed` vs `exposed` are basically arbitrary names. My original name was `user_visible_headers` but it's not clear who the *user* is (is it the extensions code, or page JS code, or the person sitting at the computer, or what), so I changed it.
As long as we all think the name indicates "headers that the extension will see", that's the goal. Possibly switch to use `exposed_headers` everywhere?
Switched to use `exposed_headers` most places, except for places where we already have a variable `request_headers`, then it's called `exposed_request_headers`.
I think this is mostly clear, and avoids questions about what "script" means? Thoughts welcome.
how about `extension_visible_headers` ?
as in: "these are the headers that will be visible to extensions"
// The request headers that should be exposed to extensions for this request.
reword nit: "the request headers that are shown to extensions for this request."
"expose" kind of hints that there are headers that we hide
// according to `deltas`. Conflicts are tried to be resolved.
Grammar is tried to make sense.
wording nit: "modifies the <etc> according to deltas and attempts to resolve conflicts".
CreateRequestBodyData(method, exposed_headers, data_sources);
tiny optional nit: no reason for this LOC change, I just go with whatever's more readable if it adheres to the style guide.
// modification
styling nit: period at end of comment.
&request_headers_, &request_headers_);
nit: if we don't need special "exposed" headers then we should just pass null here. This would prevent bugs where doing something to a header object twice has an unintended effect (since both of these point to the same object).
That probably means that elsewhere, if "exposed headers" (or whatever we rename it to) is null, fall back to the original headers.
&request_headers_, &request_headers_);
ditto comment from web_request_proxying_websocket.cc
So yes, let me know if using a struct in certain places would make things cleaner or more readable too, and thanks for all the effort!
Thanks for resolving most of the nits, assuming your approach works, most of my review is geared towards:
- making sure I understand what's happening
- evaluating how we can implement this solution while keeping the code clean and readable
one more question:
at any point inside the extensions webRequest code (before the extension makes changes to headers), since we set some CORS exempt headers in "exposed_headers", will any of those headers be present in the raw "request_headers"?
It's possible that the raw `request_headers` has "X-Header-Name: foo" and the `cors_exempt_headers` has "X-Header-Name: bar", yes. In this case, on the final request, the `cors_exempt_headers` value will override the `request_headers` value. This generally shouldn't be relied on, but the CL slightly standardises the header merging behaviour by implementing `net::HttpUtil::MergeAndAddUserAgent` to combine everything together, and now all places that do single merges use this. (Unfortunately extensions is a weird edge case because it has to keep things up-to-date all the time, so it is substantially cheaper to modify everything, rather than doing a whole new `MergeAndAddUserAgent` every single time.)
Possibly there is bad behaviour here where extensions will see a header on the `exposed_headers`, modify it (which will go to the to-be-discarded `exposed_headers` and the request's ordinary `headers`), but then the final request will still use the unmodified `cors_exempt_headers` value.
I'm not sure whether this is worth worrying about in light of the *current* behaviour, which is that extensions never gets access to the `cors_exempt_headers` at all? The choice is between:
net::HttpRequestHeaders* request_headers,
net::HttpRequestHeaders* exposed_headers);
What do you think is cleaner and more readable? Having two header objects like now, or wrapping these in a struct, since I see many places where these two are side by side? Also we can add struct methods that, when called once, will modify both sets of headers!
Also, I assume we need to use a `net::HttpRequestHeaders*` for headers that need to skip CORS and get applied afterwards because extensions should be able to modify these?
Hmm, not sure. I don't think there is much benefit to combining them because most functions will still need to do things with one or the other individually (i.e. it's not like we do `foo(both)` which calls `bar(both)` which calls `baz(both)` that splits them up, where `foo` and `bar` never need to go inside the `both` struct -- basically every level will need to go inside the `both` struct). The only super-relevant benefit would be that the last level doesn't need to do two `headers->SetHeader(key, val)`; it can do a special `SetBoth(both, key, val)`, but I think that's maybe less clear than just doing two SetHeaders. Also, there are a lot of places (e.g. WebRequestInfo) that only have the `exposed_headers`, and implicitly receive the `headers` via access to the ResourceRequest or similar. So none of those places would benefit from some sort of `TwoHeaders` struct.
If you would prefer to combine the two into some struct I'm happy to do that, but my thought is probably that keeping them separate looks better. You're the owner who'll have to deal with things long-term though, so please override me if you want.
---
I'm not sure what you mean about headers that skip CORS and get applied afterwards. Are you talking about cases where an extension modifies "Not-Cors-Exempt: x" to be "Not-Cors-Exempt: y", and this modification would go onto `headers` and cause a CORS check?
This doesn't need special handling (as of right now) because these headers are all on the list in `net/http/http_util.cc`, and headers from that list are assumed to always be CORS-safe no matter where they are. This is because javascript APIs do not allow setting values for any of those headers, so if they have a value set, it must have come from somewhere else. This CL will not change any of that behaviour, except for User-Agent.
Does this mean, then, that extensions-set User-Agent values should be exempt from CORS checks? If so, we need special logic to ensure that if an extension modifies User-Agent, that change is reflected in the `content_user_agent` value, not just set onto `headers`, because a User-Agent value on the `headers` will trigger CORS.
net::HttpRequestHeaders* exposed_headers);
Andrew Brown
nit: this is "script_exposed_headers" in the .cc file, why the "script_" part? and can we add a comment here describing what it is?
Andrew Brown
`script_exposed` vs `exposed` are basically arbitrary names. My original name was `user_visible_headers` but it's not clear who the *user* is (is it the extensions code, or page JS code, or the person sitting at the computer, or what), so I changed it.
As long as we all think the name indicates "headers that the extension will see", that's the goal. Possibly switch to use `exposed_headers` everywhere?
Kelvin Jiang
Switched to use `exposed_headers` most places, except for places where we already have a variable `request_headers`, then it's called `exposed_request_headers`.
I think this is mostly clear, and avoids questions about what "script" means? Thoughts welcome.
how about `extension_visible_headers` ?
as in: "these are the headers that will be visible to extensions"
Sounds good to me :) Done.
// The request headers that should be exposed to extensions for this request.
reword nit: "the request headers that are shown to extensions for this request."
"expose" kind of hints that there are headers that we hide
Done
// according to `deltas`. Conflicts are tried to be resolved.
Grammar is tried to make sense.
wording nit: "modifies the <etc> according to deltas and attempts to resolve conflicts".
Yeah, that read strangely to me too. Done.
CreateRequestBodyData(method, exposed_headers, data_sources);
tiny optional nit: no reason for this LOC change, I just go with whatever's more readable if it adheres to the style guide.
I think the line is slightly too long; `git cl upload` yells at me and says I need to autoformat :(
Anyway, with the new `extensions_visible` name, it's definitely too long, so it has to stay.
request_.content_user_agent);
I re-added this because I think it was what caused the test failures. I have no idea how or why, but I was able to pass header-modifying tests locally with this line added, and they failed without it.
Still needs another trybot run to confirm.
styling nit: period at end of comment
Agh, I knew I'd miss one. Fixed.
nit: if we don't need special "exposed" headers then we should just pass null here. This would prevent bugs where doing something to a header object twice has an unintended effect (since both of these point to the same object.)
That probably means that elsewhere, if "exposed headers" (or whatever we rename it to) is null, fall back to the original headers.
Ok yeah, so not worth passing both, return to the original impl where webtransport and websocket pass in nullptr and the OnBeforeSendHeaders has to check which one to use. Can do.
I also added a comment to `extension_web_request_event_router.h` for `OnBeforeSendHeaders()` to explain that it handles `nullptr` in the `extensions_visible_request_headers`.
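If it helps, the nullptr convention being settled here could look roughly like this, with `std::map` standing in for `net::HttpRequestHeaders` and `SetHeaderEverywhere` as an invented helper (not the real event-router code):

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-in for net::HttpRequestHeaders.
using Headers = std::map<std::string, std::string>;

// Sketch of the nullptr-fallback convention: callers that don't need
// separate extension-visible headers (websocket/webtransport) pass
// nullptr, and each modification is applied exactly once per object.
void SetHeaderEverywhere(Headers* request_headers,
                         Headers* extension_visible_headers,
                         const std::string& key,
                         const std::string& value) {
  (*request_headers)[key] = value;
  // Only touch the second object when it is genuinely distinct; nullptr
  // means "request_headers is all there is", so nothing runs twice.
  if (extension_visible_headers &&
      extension_visible_headers != request_headers) {
    (*extension_visible_headers)[key] = value;
  }
}
```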
&request_headers_, &request_headers_);
Andrew Brown
ditto comment from web_request_proxying_websocket.cc
Fixed as above.
// `extension_visible_request_headers` according to `deltas, and attempts to
`deltas` missing a backtick.
DCHECK(extension_visible_request_headers);
Elsewhere you are passing `nullptr` to `OnBeforeSendHeaders` for this parameter. Are you sure we can't end up with a `nullptr` here?
/*extension_visible_headers=*/request.headers)),
Are these the extension visible headers? These haven't been merged? Or am I missing something?
&request_headers_, nullptr);
This is the `nullptr` I'm referring to. I noticed you used `&request_headers_` in `web_request_proxying_webtransport.cc` instead of `nullptr`. Can't you do the same here?
DCHECK(extension_visible_request_headers);
Elsewhere you are passing `nullptr` to `OnBeforeSendHeaders` for this parameter. Are you sure we can't end up with a `nullptr` here?
Looks from the failed tests that we can.
DCHECK(extension_visible_request_headers);
Andrea Orru
Elsewhere you are passing `nullptr` to `OnBeforeSendHeaders` for this parameter. Are you sure we can't end up with a `nullptr` here?
Looks from the failed tests that we can.
Yeah, I spotted dcheck failures from the websocket tests.
My implementation that passed, 2 patchsets ago, handled this by checking `extension_visible_request_headers` every single time it was going to be used, but I figured that's a lot of extra code, so worth switching to dcheck to see if that might not be necessary. (And it wouldn't have been a problem if we'd stayed with the "just pass request headers twice" approach for websockets & webtransport, but I guess that might have had other issues if every action actually was executed twice.)
Anyway, it seems like we do need to check everywhere. I'll revert that change and reupload, hopefully in the next 15 min or so.
// `extension_visible_request_headers` according to `deltas, and attempts to
Andrew Brown
`deltas` missing a backtick.
Done
DCHECK(extension_visible_request_headers);
Andrea Orru
Elsewhere you are passing `nullptr` to `OnBeforeSendHeaders` for this parameter. Are you sure we can't end up with a `nullptr` here?
Andrew Brown
Looks from the failed tests that we can.
Yeah, I spotted dcheck failures from the websocket tests.
My implementation that passed, 2 patchsets ago, handled this by checking `extension_visible_request_headers` every single time it was going to be used, but I figured that's a lot of extra code, so worth switching to dcheck to see if that might not be necessary. (And it wouldn't have been a problem if we'd stayed with the "just pass request headers twice" approach for websockets & webtransport, but I guess that might have had other issues if every action actually was executed twice.)
Anyway, it seems like we do need to check everywhere. I'll revert that change and reupload, hopefully in the next 15 min or so.
Hopefully done but needs a new dry run.
...so much for 15 minutes, lol.
/*extension_visible_headers=*/request.headers)),
Are these the extension visible headers? These haven't been merged? Or am I missing something?
Websocket requests are immune to CORS (at least by my reading of https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CORS). This is a huge relief because otherwise I would need to build a completely different way for them to pass User-Agents down through mojo -- see `WebRequestProxyingWebSocket::StartProxying()`, where User-Agent is actually added to the headers there.
So AFAICT they don't actually need any special `extension_visible_headers` behaviour, because they're safe to put all headers onto the ordinary `headers`, including User-Agent, which they already do, without triggering CORS stuff later on.
I've added a comment to explain this -- wording suggestions welcome to make sure it's clear.
This is the `nullptr` I'm referring to. I noticed you used `&request_headers_` in `web_request_proxying_webtransport.cc` instead of `nullptr`. Can't you do the same here?
Mm, I must have missed changing that, my bad. I had thought to pass `&request_headers_` twice, with no weird nullptr handling necessary, but the preference of the webRequest API owners was to pass `nullptr` and check for it, to avoid any weird behaviour from potentially handling the same object twice.
https://chromium-review.googlesource.com/c/chromium/src/+/5273743/comment/b9d31962_44f08fa2/
IIRC I commented about `nullptr` handling somewhere in the event_router file, but I've also added comments to `web_request_api_helpers.h` to explain that they need to handle `nullptr` in some cases too.
/*extension_visible_headers=*/request.headers)),
Andrew Brown
Are these the extension visible headers? These haven't been merged? Or am I missing something?
Andrew Brown
Websocket requests are immune to CORS (at least by my reading of https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CORS). This is a huge relief because otherwise I would need to build a completely different way for them to pass User-Agents down through mojo -- see `WebRequestProxyingWebSocket::StartProxying()`, where User-Agent is actually added to the headers there.
So AFAICT they don't actually need any special `extension_visible_headers` behaviour, because they're safe to put all headers onto the ordinary `headers`, including User-Agent, which they already do, without triggering CORS stuff later on.
I've added a comment to explain this -- wording suggestions welcome to make sure it's clear.
Well, I guess, not immune to CORS altogether, but immune to the `access-control-allow-headers` part of CORS, which is the bit I care about here.
It looks like for some reason something was wrong with shard 2 on android-desktop-x64-rel? If I pull up the logs, I can see 52 tests being run on that shard, and there are 52 corresponding `[ RUN ]` messages in the logs. However, there's also 52 corresponding `[ PASSED ]` messages in the same log, so it looks like the actual chromium code is fine?
Is it possible that shard just had a bug in the instrumentation scripts that surround the tests, which caused the timeouts?
I'm wary because the failing shard definitely seemed to be doing a bunch of extensions-related stuff, which is within my area. But the logs just don't make sense to me; if all 52 tests printed `[ PASSED ]` then what was actually timing out?
"usER-agENt",Andrew BrownIf there's no visible API change for extensions, is this change still needed?
Devlin CroninHm, true, I guess there technically still is a "visible API change" here, but that's because chromium currently doesn't follow the fetch spec w/r/t User-Agent.
The extensions Downloads API docs say that additional headers can be provided for the request if XMLHttpRequest allows it. Until now, XMLHttpRequest couldn't set User-Agents, but that's the core bug we're fixing. Technically I guess we're not actually changing the documented behaviour, because the docs rely on XMLHttpRequest, and we're just changing that.
Do we need an announcement/chromestatus.com entry/whatever for this change, then? I'm not super familiar with the actual "release process" for how this code change actually ships, so there might be steps there that I'm missing.
I think I'm okay with this, but I'll defer to @olive...@chromium.org on if he thinks this needs a separate announcement or discussion in the WECG.
I don't think this needs separate discussion / an announcement. Thanks for the heads up and checking regardless.
Oh also, could someone please put this through the trybots again for me?
There's a couple of new `DCHECK()`s that I'm not 100% confident in, so it'd be good to send it through the bots to be sure.
Done
It looks like for some reason something was wrong with shard 2 on android-desktop-x64-rel? If I pull up the logs, I can see 52 tests being run on that shard, and there are 52 corresponding `[ RUN ]` messages in the logs. However, there's also 52 corresponding `[ PASSED ]` messages in the same log, so it looks like the actual chromium code is fine?
Is it possible that shard just had a bug in the instrumentation scripts that surround the tests, which caused the timeouts?
I'm wary because the failing shard definitely seemed to be doing a bunch of extensions-related stuff, which is within my area. But the logs just don't make sense to me; if all 52 tests printed `[ PASSED ]` then what was actually timing out?
DCHECK(extension_visible_request_headers);
Andrea Orru
Elsewhere you are passing `nullptr` to `OnBeforeSendHeaders` for this parameter. Are you sure we can't end up with a `nullptr` here?
Andrew Brown
Looks from the failed tests that we can.
Andrew Brown
Yeah, I spotted dcheck failures from the websocket tests.
My implementation that passed, 2 patchsets ago, handled this by checking `extension_visible_request_headers` every single time it was going to be used, but I figured that's a lot of extra code, so worth switching to dcheck to see if that might not be necessary. (And it wouldn't have been a problem if we'd stayed with the "just pass request headers twice" approach for websockets & webtransport, but I guess that might have had other issues if every action actually was executed twice.)
Anyway, it seems like we do need to check everywhere. I'll revert that change and reupload, hopefully in the next 15 min or so.
Hopefully done but needs a new dry run.
...so much for 15 minutes, lol.
Done
/*extension_visible_headers=*/request.headers)),
Andrew Brown
Are these the extension visible headers? These haven't been merged? Or am I missing something?
Andrew Brown
Websocket requests are immune to CORS (at least by my reading of https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CORS). This is a huge relief because otherwise I would need to build a completely different way for them to pass User-Agents down through mojo -- see `WebRequestProxyingWebSocket::StartProxying()`, where User-Agent is actually added to the headers there.
So AFAICT they don't actually need any special `extension_visible_headers` behaviour, because they're safe to put all headers onto the ordinary `headers`, including User-Agent, which they already do, without triggering CORS stuff later on.
I've added a comment to explain this -- wording suggestions welcome to make sure it's clear.
Well, I guess, not immune to CORS altogether, but immune to the `access-control-allow-headers` part of CORS, which is the bit I care about here.
Thanks for adding the comment.
std::optional<std::string> extension_modified_user_agent =
Uhm... How do you know this header was actually set by an extension and not by arbitrary JavaScript?
I re-added this because I think it was what caused the test failures. I have no idea how or why, but I was able to pass header-modifying tests locally with this line added, and they failed without it.
Still needs another trybot run to confirm.
Looks like this was the cause of the test failures, yeah.
std::optional<std::string> extension_modified_user_agent =
Uhm... How do you know this header was actually set by an extension and not by arbitrary JavaScript?
Ohhhhh. D'oh, yeah, that code is wrong.
Practically speaking, this depends on whether extensions-given values are CORS exempt (which I never specifically got an answer for, though I strongly suspect they are).
If they're not I can remove this and go back to the easy approach, but if they are then I need to go back to the drawing board and devise another way to track User-Agent modifications across the webRequest api.
Andrew Brown
Well, I guess now I need +1's from all the different owners -- re-pinged Takashi, as I feel like yourself and Adam should be the first eyes over it, given this is a primarily network-focused change.
Takashi Toyoshima
Sorry to be a pest, but re-pinging, as it's been a week.
I will need reviews from both of you as Network People, eventually, in order to cover all the files (plus this is a complex & large change that can't hurt to have more eyes), so please dw about duplicate reviews. (quoting the "contributing to chromium" guide)
Takashi Toyoshima
Sorry for overlooking these pings. It still LGTM.
Andrew Brown
By the way, I think you still need a reviewer's approval on the Extensions-related directory.
Devlin Cronin
All good, appreciate the vote :)
Pinging Devlin now for extensions approval -- happy to go bother others if there's specific internal complexity to be looked at (there is a bit of weird stuff in there), but there's no visible API change for extensions anymore.
Thanks, Andrew! If there's no API change, I'm okay with this at a high level (though see the one comment I added). I've added @kelvi...@chromium.org for a more thorough review, as an owner of the webRequest API.
Done
I have a lot of questions and I might ask more in another pass:
With respect to extensions that can modify network requests, what happens here?
1. Before the request is sent, compute the value of `content_user_agent` which will be added to headers later?
2. Request headers are set, and modifications from extensions are made?
3. CORS checks happen but don't merge headers yet because `content_user_agent` is exempt?
4. Now merge `content_user_agent` into headers?
No worries, thanks for your review! Happy to answer questions, fire away. This might be too granular of a breakdown, or there might be more aspects to ask about, but this is how I'm conceptualising what happens:
1. Start with a `network::ResourceRequest` with `headers`, `cors_exempt_headers`, and `content_user_agent` which comes from DevTools or FetchContext. (All three of these must be kept separate until after CORS checks.)
2. Extensions code receives this request.
3. Extensions code has to invent the `script_exposed_headers`, which are the headers that will *really* be sent on this request, after CORS. This has to be a separate object because again we can't merge them all into the `headers` until after CORS.
4. Extensions code tells a running extension that the `script_exposed_headers` are the headers for "this request", so the extension sees the "real" final state of the request headers, including `content_user_agent`.
5. All extensions modifications are applied to *both* the real `headers` and the user-visible `script_exposed_headers`. This keeps the `script_exposed_headers` that the user can see in-sync with the real `headers` but enables us to make sure they stay separate.
6. After all extensions have done modifications/looked at things/etc, toss away the `script_exposed_headers`. We go back to having a `network::ResourceRequest` with `headers`, `cors_exempt_headers`, and `content_user_agent`, all separate, but with all the extensions modifications applied to `headers` as requested.
7. CORS checks happen on `headers`. A CORS preflight may be sent for User-Agent here if the request has a User-Agent specified on the `headers`, as it must have come from an untrusted source.
8. After CORS, if the preflight said it's fine, then we can use the `headers` User-Agent. If it's not fine, or if there wasn't one, we can merge the `content_user_agent` onto the headers to actually make the request.
Done
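The final merge in steps 7-8 of that walkthrough might be sketched like this, with plain `std::map` standing in for `net::HttpRequestHeaders` and `BuildFinalHeaders` as a hypothetical helper, not the actual network-stack code:

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-in for net::HttpRequestHeaders.
using Headers = std::map<std::string, std::string>;

// Assumes the CORS check has already run on `headers`; this is
// illustrative only, not the real merging code.
Headers BuildFinalHeaders(Headers headers,
                          const Headers& cors_exempt_headers,
                          const std::string& content_user_agent) {
  // CORS-exempt values win over same-named entries on `headers`.
  for (const auto& [key, value] : cors_exempt_headers)
    headers[key] = value;
  // content_user_agent is only a fallback: a User-Agent already on
  // `headers` (e.g. fetch-set, and preflight-approved) is kept.
  if (!headers.count("User-Agent"))
    headers["User-Agent"] = content_user_agent;
  return headers;
}
```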
Alright, well. Two months of trying to understand webRequest code and I think this approach should actually be much nicer (and is also ~200 lines shorter). Sorry for all the work you put in to understand the original system :(
The new idea is just that we can put the `content_user_agent` onto the headers, and then right at the end, before we actually `CreateLoaderAndStart()` to pass the request back to the network, we remove the header again. This is also substantially less involved of a change, because we only do fiddling of the User-Agent and only when necessary.
The only mildly dodgy thing is I had to add a `mutable bool` to `WebRequestInfo` to track whether the User-Agent was modified by an extension.
I've run this through a few bits of `browser_tests` and `extensions_unittests` locally but a full dryrun might still pick up other issues.
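A very rough sketch of that new approach, with `std::map` standing in for `net::HttpRequestHeaders` and every struct/function name invented for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-in for the request state around WebRequestInfo.
using Headers = std::map<std::string, std::string>;

struct RequestState {
  Headers headers;
  std::string content_user_agent;
  // Stand-in for the `mutable bool` added to WebRequestInfo.
  bool user_agent_modified_by_extension = false;
};

// Temporarily expose content_user_agent on the ordinary headers so
// extensions see the User-Agent that will really be sent.
void BeforeExtensionsRun(RequestState& r) {
  if (!r.headers.count("User-Agent"))
    r.headers["User-Agent"] = r.content_user_agent;
}

// Right before CreateLoaderAndStart(), strip the header again if no
// extension touched it, so it cannot trigger a CORS preflight; the
// network stack re-applies content_user_agent later.
void BeforeCreateLoaderAndStart(RequestState& r) {
  if (!r.user_agent_modified_by_extension &&
      r.headers["User-Agent"] == r.content_user_agent) {
    r.headers.erase("User-Agent");
  }
}
```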
net::HttpRequestHeaders* request_headers,
net::HttpRequestHeaders* exposed_headers);
Andrew Brown
What do you think is cleaner and more readable? Having two header objects like now, or wrapping these in a struct, since I see many places where these two are side by side? Also we can add struct methods that, when called once, will modify both sets of headers!
Also, I assume we need to use a `net::HttpRequestHeaders*` for headers that need to skip CORS and get applied afterwards because extensions should be able to modify these?
Hmm, not sure. I don't think there is much benefit to combining them because most functions will still need to do things with one or the other individually (i.e. it's not like we do `foo(both)` which calls `bar(both)` which calls `baz(both)` that splits them up, where `foo` and `bar` never need to go inside the `both` struct -- basically every level will need to go inside the `both` struct). The only super-relevant benefit would be that the last level doesn't need to do two `headers->SetHeader(key, val)`; it can do a special `SetBoth(both, key, val)`, but I think that's maybe less clear than just doing two SetHeaders. Also, there are a lot of places (e.g. WebRequestInfo) that only have the `exposed_headers`, and implicitly receive the `headers` via access to the ResourceRequest or similar. So none of those places would benefit from some sort of `TwoHeaders` struct.
If you would prefer to combine the two into some struct I'm happy to do that, but my thought is probably that keeping them separate looks better. You're the owner who'll have to deal with things long-term though, so please override me if you want.
---
I'm not sure what you mean about headers that skip CORS and get applied afterwards. Are you talking about cases where the extensions modifies "Not-Cors-Exempt: x" to be "Not-Cors-Exempt: y", and this modification would go onto `headers` and cause a CORS check?
This doesn't need special handling (as of right now) because these headers are all on the list in `net/http/http_util.cc`, and headers from that list are assumed to always be CORS-safe no matter where they are. This is because javascript APIs do not allow setting values for any of those headers, so if they have a value set, it must have come from somewhere else. This CL will not change any of that behaviour, except for User-Agent.
Does this mean, then, that extensions-set User-Agent values should be exempt from CORS checks? If so, we need special logic to ensure that if an extension modifies User-Agent, that change is reflected in the `content_user_agent` value, not just set onto `headers`, because a User-Agent value on the `headers` will trigger CORS.
I've assumed an extensions-set User-Agent is immune to CORS, as that's the current behaviour.
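To make that assumption concrete, here is a toy illustration (names and list contents invented here, not the real `net/http/http_util.cc` list): headers that only the browser can set are treated as CORS-safe wherever they appear, and this CL's premise is that User-Agent no longer qualifies.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical sketch: a header is assumed CORS-safe if page JavaScript
// cannot set it. The real list lives in net/http/http_util.cc; the
// entries below are illustrative only.
bool IsAssumedCorsSafe(const std::string& lowercase_name) {
  static const std::set<std::string> kBrowserControlledHeaders = {
      "accept-charset", "accept-encoding", "connection", "host",
      // "user-agent" used to be here; this CL's point is that it no
      // longer belongs, since fetch()/extensions/devtools can set it.
  };
  return kBrowserControlledHeaders.count(lowercase_name) > 0;
}
```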
// User-Agent.
Andrew Brown: by this you mean the CORS exempt user agent that should be merged into request headers later?
Andrew Brown: If the headers already have a User-Agent it will be that, e.g. if this request comes from someone doing `fetch('foo.com', {headers: {"User-Agent": "baz"}});`, then the exposed_headers will have "User-Agent: baz". The content_user_agent shouldn't override any other User-Agent.
So it's not guaranteed that these headers have the CORS exempt User-Agent, just that they have *a* User-Agent.
I've updated this comment, hopefully it's clearer now?
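In other words, the precedence reduces to a simple chain. A hedged sketch (function and parameter names invented, not Chromium code): a fetch()-set header wins over the CORS-exempt `content_user_agent` (e.g. a devtools override), which wins over the browser default.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Sketch of the User-Agent precedence discussed above. All names are
// invented for illustration.
std::string ResolveUserAgent(
    const std::optional<std::string>& fetch_set_user_agent,
    const std::optional<std::string>& content_user_agent,
    const std::string& default_user_agent) {
  if (fetch_set_user_agent) return *fetch_set_user_agent;
  if (content_user_agent) return *content_user_agent;
  return default_user_agent;
}
```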
Done
// that has to happen in the network stack (after CORS checks).
Andrew Brown: dumb question: merging before would fail CORS checks?
Unfortunately yeah, because this all happens before the CORS checks, if we put a User-Agent on the `headers`, it triggers a CORS preflight. (Currently CORS ignores "User-Agent" because it's on the "banned" list in `net/http/http_util.cc` but this is the actual bug we're fixing -- it shouldn't be there.)
Lots of websites aren't expecting to receive a CORS preflight for "User-Agent" (because it's already on every request ever!), so they tell CORS "no User-Agent please" and then the request fails when it shouldn't.
Done
std::optional<std::string> extension_modified_user_agent =
Andrew Brown: Uhm... How do you know this header was actually set by an extension and not by arbitrary JavaScript?
Ohhhhh. D'oh, yeah, that code is wrong.
Practically speaking, this depends on whether extensions-given values are CORS exempt (which I never specifically got an answer for, though I strongly suspect they are).
If they're not I can remove this and go back to the easy approach, but if they are then I need to go back to the drawing board and devise another way to track User-Agent modifications across the webRequest api.
"back to the drawing board" indeed
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Alright, well. Two months of trying to understand webRequest code and I think this approach should actually be much nicer (and is also ~200 lines shorter). Sorry for all the work you put in to understand the original system :(
The new idea is just that we can put the `content_user_agent` onto the headers, and then right at the end, before we actually `CreateLoaderAndStart()` to pass the request back to the network, we remove the header again. This is also substantially less involved of a change, because we only do fiddling of the User-Agent and only when necessary.
The only mildly dodgy thing is I had to add a `mutable bool` to `WebRequestInfo` to track whether the User-Agent was modified by an extension.
I've run this through a few bits of `browser_tests` and `extensions_unittests` locally but a full dryrun might still pick up other issues.
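To make the idea concrete, here's a toy model of the flow (invented types, not the actual Chromium code): surface the exempt user agent before webRequest extensions run, then pull it back off the CORS-visible headers right before the request starts.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy model of the approach above (invented types, not Chromium code).
struct ToyRequest {
  std::map<std::string, std::string> headers;     // CORS-visible
  std::optional<std::string> content_user_agent;  // CORS-exempt
};

// Before webRequest extensions run: surface the exempt UA so extensions
// can see and modify it. Returns true if we added it (i.e. it wasn't
// already present from e.g. fetch(..., {headers: {"User-Agent": ...}})).
bool ExposeUserAgentToExtensions(ToyRequest& r) {
  if (!r.content_user_agent || r.headers.count("User-Agent")) return false;
  r.headers["User-Agent"] = *r.content_user_agent;
  return true;
}

// After extensions, right before CreateLoaderAndStart(): move any
// extension-modified value back to the exempt slot and take the header
// off the CORS-visible set so it can't trigger a preflight.
void RemoveUserAgentBeforeStart(ToyRequest& r, bool added_for_extensions,
                                bool modified_by_extension) {
  if (modified_by_extension) {
    auto it = r.headers.find("User-Agent");
    if (it != r.headers.end()) {
      r.content_user_agent = it->second;
    }
    // If the extension deleted the header entirely, keep the existing
    // content_user_agent so the request still goes out with *a* UA.
  }
  if (added_for_extensions || modified_by_extension) {
    r.headers.erase("User-Agent");
  }
}
```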
So: let the extension modify the content user agent as needed but remove it after the request goes through extensions WebRequest to make sure we don't trigger a CORS preflight unintentionally?
I assume this should work the same (pipe the user agent down with the request but only apply it when the time is right) if there are no webRequest extensions that would modify the request?
Can we add a WebRequest browser test for this? i.e. make sure that an extension can modify the user-agent but it won't cause a CORS preflight?
// `request_.content_user_agent` -- if so, it has to be cleared before the
optional nit: pipes for consistency with `|has_any_extra_headers_listeners_|` below
// Whether this request had it's User-Agent temporarily set based on the
its
Kelvin Jiang: Alright, well. Two months of trying to understand webRequest code and I think this approach should actually be much nicer (and is also ~200 lines shorter). Sorry for all the work you put in to understand the original system :(
The new idea is just that we can put the `content_user_agent` onto the headers, and then right at the end, before we actually `CreateLoaderAndStart()` to pass the request back to the network, we remove the header again. This is also substantially less involved of a change, because we only do fiddling of the User-Agent and only when necessary.
The only mildly dodgy thing is I had to add a `mutable bool` to `WebRequestInfo` to track whether the User-Agent was modified by an extension.
I've run this through a few bits of `browser_tests` and `extensions_unittests` locally but a full dryrun might still pick up other issues.
So: let the extension modify the content user agent as needed but remove it after the request goes through extensions WebRequest to make sure we don't trigger a CORS preflight unintentionally?
I assume this should work the same (pipe the user agent down with the request but only apply it when the time is right) if there are no webRequest extensions that would modify the request?
Yeah pretty much. Move the `content_user_agent` over to the `headers` for extensions to modify, then move it back to the `content_user_agent` if needed.
If there are no extensions then this code doesn't trigger at all (I think?).
If there are extensions, but they don't modify the headers, then in `web_request_proxying_url_loader_factory.cc` line 767, we add the headers and set our `user_agent_added_for_extensions_` flag, then on line 886 the flag will trigger removal of the User-Agent again. The other flag for "extensions have modified these headers" will never be set so we won't move the value back over, we'll just get rid of it.
At the end, we should only have a `User-Agent` in the `headers` if it was there to begin with (i.e. `fetch(... {headers: {"user-agent": "foo"}});`), *and* no extension changed this value.
Test failures look like `ContinueToStartRequest` is already "too late" to avoid CORS checks, I'll look for a new spot to move the User-Agent back to `content_user_agent`.
Can we add a WebRequest browser test for this? i.e. make sure that an extension can modify the user-agent but it won't cause a CORS preflight?
That sounds like a good idea. If the current code all works then I'll look into it.
Test failures look like `ContinueToStartRequest` is already "too late" to avoid CORS checks, I'll look for a new spot to move the User-Agent back to `content_user_agent`.
...my logic was just wrong, now (hopefully) fixed.
I did move the user-agent fiddling block *slightly* earlier in `ContinueToStartRequest` as it appears there are a few blockers that get removed before `CreateLoaderAndStart()` gets called, and I don't want race conditions.
// `request_.content_user_agent` -- if so, it has to be cleared before the
optional nit: pipes for consistency with `|has_any_extra_headers_listeners_|` below
Done
// Whether this request had it's User-Agent temporarily set based on the
Andrew Brown: its
Done
| Commit-Queue | +1 |
oh yeah could you rebase and resolve merge conflicts before we see if it passes the dry run?
oh yeah could you rebase and resolve merge conflicts before we see if it passes the dry run?
I'm actually quite impressed ASAN picked that up, that's a rather nifty system.
My money is that some part of the bottom of `ExecuteDeltas()` triggers the `WebRequestProxyingURLLoader()` to continue, which for a failed request, will then race on and free the `WebRequestInfo` *before* the new code sets the `user_agent_modified_by_extension` flag, making it a UAF. I'm recompiling to check this locally now.
I'll get back to this next week, apologies -- my thesis is due Friday.
I'll get back to this next week, apologies -- my thesis is due Friday.
No worries. Thank you for your work and good luck with the thesis!
Andrew Brown: Can we add a WebRequest browser test for this? i.e. make sure that an extension can modify the user-agent but it won't cause a CORS preflight?
That sounds like a good idea. If the current code all works then I'll look into it.
Looking at this now. Is there a preference for whether I make a new "series" of extension tests (i.e. add a new folder & testing-extension `c/t/d/e/api_test/webrequest/test_user_agent_cors` or something) to test this, or should I just tack some new user-agent tests onto the end of `c/t/d/e/api_test/webrequest/test_cors/test_cors.js`?
I'm probably leaning towards the former, but better to check your preference too.
Andrea Orru: I'll get back to this next week, apologies -- my thesis is due Friday.
No worries. Thank you for your work and good luck with the thesis!
Done -- phew!
net::HttpRequestHeaders* request_headers,
net::HttpRequestHeaders* exposed_headers);
Andrew Brown: What do you think is cleaner and more readable? having two header objects like now or wrapping these in a struct since I see many places where these two are side by side? Also we can add struct methods that when called once, will modify both sets of headers!
Also, I assume we need to use a `net::HttpRequestHeaders*` for headers that need to skip CORS and get applied afterwards because extensions should be able to modify these?
Andrew Brown: Hmm, not sure. I don't think there is much benefit to combining them, because most functions will still need to work with one or the other individually (i.e. it's not like we do `foo(both)` which calls `bar(both)` which calls `baz(both)` that splits them up, where `foo` and `bar` never need to look inside the `both` struct) -- basically every level will need to go inside the `both` struct. The only really relevant benefit is that the last level doesn't need to do two `headers->SetHeader(key, val)` calls; it can do a special `SetBoth(both, key, val)`, but I think that's maybe less clear than just doing two SetHeaders. Also, there are a lot of places (e.g. WebRequestInfo) that only have the `exposed_headers`, and implicitly receive the `headers` via access to the ResourceRequest or similar. So none of those places would benefit from some sort of `TwoHeaders` struct.
If you would prefer to combine the two into some struct I'm happy to do that, but my thought is probably that keeping them separate looks better. You're the owner who'll have to deal with things long-term though, so please override me if you want.
---
I'm not sure what you mean about headers that skip CORS and get applied afterwards. Are you talking about cases where an extension modifies "Not-Cors-Exempt: x" to be "Not-Cors-Exempt: y", and this modification would go onto `headers` and cause a CORS check?
This doesn't need special handling (as of right now) because these headers are all on the list in `net/http/http_util.cc`, and headers from that list are assumed to always be CORS-safe no matter where they are. This is because javascript APIs do not allow setting values for any of those headers, so if they have a value set, it must have come from somewhere else. This CL will not change any of that behaviour, except for User-Agent.

Does this mean, then, that extensions-set User-Agent values should be exempt from CORS checks? If so, we need special logic to ensure that if an extension modifies User-Agent, that change is reflected in the `content_user_agent` value, not just set onto `headers`, because a User-Agent value on the `headers` will trigger CORS.
I've assumed an extensions-set User-Agent is immune to CORS, as that's the current behaviour.
Done, and will ensure this behaviour is included in the new test(s).
| Commit-Queue | +1 |
Andrew Brown: Can we add a WebRequest browser test for this? i.e. make sure that an extension can modify the user-agent but it won't cause a CORS preflight?
Andrew Brown: That sounds like a good idea. If the current code all works then I'll look into it.
Looking at this now. Is there a preference for whether I make a new "series" of extension tests (i.e. add a new folder & testing-extension `c/t/d/e/api_test/webrequest/test_user_agent_cors` or something) to test this, or should I just tack some new user-agent tests onto the end of `c/t/d/e/api_test/webrequest/test_cors/test_cors.js`?
I'm probably leaning towards the former, but better to check your preference too.
would prefer a series of new tests to prevent test extension bloat; also, some extension tests carry state from one test to the next, so if we can move away from that too it'd be great :)
thanks Andrew! Some questions
// |request_.content_user_agent| -- if so, it has to be cleared before the
is this AI?
nit: `user_agent|. If so ...` (just use a period)
!(request_.headers.HasHeader(net::HttpRequestHeaders::kUserAgent));
nit: extra parentheses?
std::optional<std::string> extension_modified_user_agent =
request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent);
nit: rename to user_agent since this may not have been actually modified by the extension (we'll know if it is in the below if statement)
request_.content_user_agent = *extension_modified_user_agent;
I'm going to assume that `extension_modified_user_agent` should always be populated if `info_->user_agent_modified_by_extension` is true right? if so we can probably just `request_.content_user_agent = request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent);`
(and yeah add a `CHECK(request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent));` here too)
if (user_agent_added_for_extensions_) {
For my understanding: what's the difference between this and `info_->user_agent_modified_by_extension` ? This reads a bit confusing
Andrew Brown: Can we add a WebRequest browser test for this? i.e. make sure that an extension can modify the user-agent but it won't cause a CORS preflight?
Andrew Brown: That sounds like a good idea. If the current code all works then I'll look into it.
Kelvin Jiang: Looking at this now. Is there a preference for whether I make a new "series" of extension tests (i.e. add a new folder & testing-extension `c/t/d/e/api_test/webrequest/test_user_agent_cors` or something) to test this, or should I just tack some new user-agent tests onto the end of `c/t/d/e/api_test/webrequest/test_cors/test_cors.js`?
I'm probably leaning towards the former, but better to check your preference too.
would prefer a series of new tests to prevent test extension bloat; also, some extension tests carry state from one test to the next, so if we can move away from that too it'd be great :)
Alright, can do.
I'm struggling a bit with some rather frustrating `Media load rejected by URL safety check` problem that seems to just be an inconsistent flake caused by blink? It's hitting me every time I run my new test locally, so I'll have to work out what's making the new test trigger that system. (Not uploaded yet, because I'm not confident it works at all.)
// |request_.content_user_agent| -- if so, it has to be cleared before the
is this AI?
nit: `user_agent|. If so ...` (just use a period)
Hah, nah, I used em-dashes before it was cool :( Happy to switch to a full stop.
!(request_.headers.HasHeader(net::HttpRequestHeaders::kUserAgent));
Andrew Brown: nit: extra parentheses?
Done
std::optional<std::string> extension_modified_user_agent =
request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent);
nit: rename to user_agent since this may not have been actually modified by the extension (we'll know if it is in the below if statement)
Done
request_.content_user_agent = *extension_modified_user_agent;
I'm going to assume that `extension_modified_user_agent` should always be populated if `info_->user_agent_modified_by_extension` is true right? if so we can probably just `request_.content_user_agent = request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent);`
(and yeah add a `CHECK(request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent));` here too)
In theory an extension could remove the user-agent altogether, in which case we shouldn't clear the `request_.content_user_agent`, because the request should still go out with some user-agent on it. So we still need to check it, we can't just assume it's set.
if (user_agent_added_for_extensions_) {
For my understanding: what's the difference between this and `info_->user_agent_modified_by_extension` ? This reads a bit confusing
The `info_->user_agent_modified_by_extension` flag tracks whether an extension has actually changed the User-Agent (so we move the value over to the `content_user_agent`). Basically, "an extension touched this, so we need to make sure it stays CORS-exempt".
The `user_agent_added_for_extensions_` flag tracks whether we placed the `content_user_agent` onto the `request_.headers` before anything happened, so we should remove it so it's not visible to CORS. Basically, "we put the `User-Agent` on the non-CORS-exempt headers, so we need to take it back off again".
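Concretely, the end state reduces to a small truth table, sketched here with an invented helper (not real Chromium code): per the summary earlier in the thread, a User-Agent stays on the CORS-visible headers only if it was there to begin with (e.g. set via fetch()) and no extension changed it.

```cpp
#include <cassert>

// Invented helper summarising the two flags discussed above: whether a
// User-Agent should remain on the CORS-visible headers when the request
// finally starts.
bool UserAgentStaysOnHeaders(bool was_on_headers_originally,
                             bool added_for_extensions,
                             bool modified_by_extension) {
  return was_on_headers_originally && !added_for_extensions &&
         !modified_by_extension;
}
```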
To unsubscribe from this group and stop receiving emails from it, send an email to devtools-revie...@chromium.org.
| Commit-Queue | +1 |
Andrew Brown: Can we add a WebRequest browser test for this? i.e. make sure that an extension can modify the user-agent but it won't cause a CORS preflight?
Andrew Brown: That sounds like a good idea. If the current code all works then I'll look into it.
Kelvin Jiang: Looking at this now. Is there a preference for whether I make a new "series" of extension tests (i.e. add a new folder & testing-extension `c/t/d/e/api_test/webrequest/test_user_agent_cors` or something) to test this, or should I just tack some new user-agent tests onto the end of `c/t/d/e/api_test/webrequest/test_cors/test_cors.js`?
I'm probably leaning towards the former, but better to check your preference too.
Andrew Brown: would prefer a series of new tests to prevent test extension bloat; also, some extension tests carry state from one test to the next, so if we can move away from that too it'd be great :)
Alright, can do.
I'm struggling a bit with some rather frustrating `Media load rejected by URL safety check` problem that seems to just be an inconsistent flake caused by blink? It's hitting me every time I run my new test locally, so I'll have to work out what's making the new test trigger that system. (Not uploaded yet, because I'm not confident it works at all.)
Acknowledged
thanks Andrew! will wait for the new test
request_.content_user_agent = *extension_modified_user_agent;
Andrew Brown: I'm going to assume that `extension_modified_user_agent` should always be populated if `info_->user_agent_modified_by_extension` is true right? if so we can probably just `request_.content_user_agent = request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent);`
(and yeah add a `CHECK(request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent));` here too)
In theory an extension could remove the user-agent altogether, in which case we shouldn't clear the `request_.content_user_agent`, because the request should still go out with some user-agent on it. So we still need to check it, we can't just assume it's set.
oh forgot about that, noted! (a small test case here would be great too)
but I assume the "user agent" is not actually removed only if it's `request_.content_user_agent` ?
if (user_agent_added_for_extensions_) {
Andrew Brown: For my understanding: what's the difference between this and `info_->user_agent_modified_by_extension` ? This reads a bit confusing
The `info_->user_agent_modified_by_extension` flag tracks whether an extension has actually changed the User-Agent (so we move the value over to the `content_user_agent`). Basically, "an extension touched this, so we need to make sure it stays CORS-exempt".
The `user_agent_added_for_extensions_` flag tracks whether we placed the `content_user_agent` onto the `request_.headers` before anything happened, so we should remove it so it's not visible to CORS. Basically, "we put the `User-Agent` on the non-CORS-exempt headers, so we need to take it back off again".
in that case should we rename `user_agent_added_for_extensions_` to something else: maybe something that reflects more about why it was set (based on your comment above?)
Andrew Brown: Can we add a WebRequest browser test for this? i.e. make sure that an extension can modify the user-agent but it won't cause a CORS preflight?
Andrew Brown: That sounds like a good idea. If the current code all works then I'll look into it.
Kelvin Jiang: Looking at this now. Is there a preference for whether I make a new "series" of extension tests (i.e. add a new folder & testing-extension `c/t/d/e/api_test/webrequest/test_user_agent_cors` or something) to test this, or should I just tack some new user-agent tests onto the end of `c/t/d/e/api_test/webrequest/test_cors/test_cors.js`?
I'm probably leaning towards the former, but better to check your preference too.
Andrew Brown: would prefer a series of new tests to prevent test extension bloat; also, some extension tests carry state from one test to the next, so if we can move away from that too it'd be great :)
Kelvin Jiang: Alright, can do.
I'm struggling a bit with some rather frustrating `Media load rejected by URL safety check` problem that seems to just be an inconsistent flake caused by blink? It's hitting me every time I run my new test locally, so I'll have to work out what's making the new test trigger that system. (Not uploaded yet, because I'm not confident it works at all.)
Acknowledged
Update: I think I've worked that issue out. Some of the behaviour is still a bit weird (e.g. the extension isn't actually setting the user-agent, even though I've copied code from `test_extra_headers`, which passes with this CL...), so I'm rebasing my test onto main now in order to check what's *supposed* to happen.
Is it a problem that basically every extensions test uses the deprecated manifest V2 APIs? I tried using manifest V3 for this new test but I need access to the `framework.js` script (the one that `test_cors` and `test_extra_headers` uses), and it uses a few V2-only APIs.
request_.content_user_agent = *extension_modified_user_agent;
Andrew Brown: I'm going to assume that `extension_modified_user_agent` should always be populated if `info_->user_agent_modified_by_extension` is true right? if so we can probably just `request_.content_user_agent = request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent);`
(and yeah add a `CHECK(request_.headers.GetHeader(net::HttpRequestHeaders::kUserAgent));` here too)
Kelvin Jiang: In theory an extension could remove the user-agent altogether, in which case we shouldn't clear the `request_.content_user_agent`, because the request should still go out with some user-agent on it. So we still need to check it, we can't just assume it's set.
oh forgot about that, noted! (a small test case here would be great too)
but I assume the "user agent" is not actually removed only if it's `request_.content_user_agent` ?
The flow we have to be careful about is:
1. We copy the `content_user_agent` onto the `headers` for extensions to see.
2. An extension removes the User-Agent header entirely.
3. Seeing no User-Agent on the `headers`, we clear `content_user_agent` as well.
If this happens we have a request going through with no user-agent at all.
I have experienced that there is another fallback somewhere deep in the net// internals that will still add the default-user-agent back to the request, if it somehow gets that far and doesn't have one. But the problem is Devtools may have set an override, which is reflected in `content_user_agent`, but not in this very-last-fallback in net//.
So the super obnoxious edge case is:
1. Devtools sets a User-Agent override, which lands in `content_user_agent`.
2. An extension removes the User-Agent header.
3. If we wrongly clear `content_user_agent`, the net// fallback re-adds the *default* user-agent, silently dropping the devtools override.
I'll add a test to check that extensions removing a UA doesn't cause problems. I'm not sure about testing this specific flow because it involves devtools setting an override and that's somewhat out-of-scope for an extensions test, but I'll look into it.
if (user_agent_added_for_extensions_) {
Andrew Brown: For my understanding: what's the difference between this and `info_->user_agent_modified_by_extension` ? This reads a bit confusing
Kelvin Jiang: The `info_->user_agent_modified_by_extension` flag tracks whether an extension has actually changed the User-Agent (so we move the value over to the `content_user_agent`). Basically, "an extension touched this, so we need to make sure it stays CORS-exempt".
The `user_agent_added_for_extensions_` flag tracks whether we placed the `content_user_agent` onto the `request_.headers` before anything happened, so we should remove it so it's not visible to CORS. Basically, "we put the `User-Agent` on the non-CORS-exempt headers, so we need to take it back off again".
in that case should we rename `user_agent_added_for_extensions_` to something else: maybe something that reflects more about why it was set (based on your comment above?)
Sounds good, though I'm not exactly sure how I'd express that in a different way than the current name. `added_extension_visible_user_agent_`?