| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thanks for diving into this! Redirects (and how existing code depends on them) can be tricky, and it's good that you've identified a few places that would need to be updated to allow the sanitization.
I think we may want to use those cases to think about how to structure the change, per the comments below. It's good to know that the current approaches are enough to pass the try jobs, but I'm a bit nervous about adding the state and how the silent rewriting is done, which might make the redirect code (even) harder to reason about. :)
This CL sanitizes those fields.I think it's worth a step back here. Many of the changes below seem like they could introduce races, edge cases, or other bugs, and it's not obvious what invariant change we're aiming for.
From skimming the bug and CL, it sounds like we pass full URLs (in location headers and original URL params) from browser to renderer and back again, today. IIUC, it seems like the goal is to only send origins to the renderer, and the approach to keep things working is to preserve the full URLs on the browser side for when they are needed later in the navigation. Is that right?
If so, I'm wondering if we can be more explicit about what we are sending to the renderer going forward. Can we make it clear it's an origin and not a full URL, rather than silently converting between the two types of values, to avoid confusion of what's in the param?
(For that matter, does the renderer actually need the origin, or could the whole thing just be a placeholder/token or even be removed entirely from commit params if the real value is only needed back in the browser process?)
navigation_id to the renderer and maps them to full URLs in the parentAs you noted, I think we'd probably prefer an unguessable token than sending an actual unique ID from the browser process to the renderer. (That's if we conclude that we do need the state; we may want to discuss that more first.)
in for sanitizing `original_url` to prevent the non-standardnit: Drop "in"? (I'm not sure, but it seems like there's a typo here.)
This does seem like a fairly high risk change, so I imagine we'll want to flag guard it. I'm not sure how easy that is, so we can try to settle on the approach before adding all of that complexity.
// Helper to get the origin of a URL for sanitization purposes.We should probably make these helpers more clearly about redirect sanitization, since sanitization is ambiguous and sounds like it might need to be used in more places.
This collection of helpers might also be good to move somewhere more specific to IPC security than NavigationRequest, though I'm not sure. We do have ipc_utils.cc for somewhat similar verification code, as one possibiliy.
// DeprecatedGetOriginAsURL() returns an empty URL for non-standard schemes.Is there a way to give an example? At first glance it's hard to know why this is needed and whether it's sufficient.
/*navigation_id=*/0);I'm nervous about this. :) It seems hard to reason about having to pass this parameter around when many places don't assign it.
->SetNavigationIdForChild(I'd really like to avoid storing this as state in the renderer if possible. That creates chances for the state to get out of sync or be stale (or lied about), on top of all the plumbing it requires. Let's chat about possible approaches to see if we can avoid it.
GetRenderFrameHost()->set_navigation_original_url(This doesn't really fit with the comment above. Should it move into the helper function, if all the callers have to do it? (That's assuming we keep the value, which I'm separately wondering about.)
if (!AreSameOriginForSanitization(commit_params->original_url,I'm curious: Nasko's previous code cropped everything down to an origin, without a same-origin check. Should we be consistent across the cases?
const GURL& next_url = commit_params->redirects[i + 1];I'm not sure I follow the `i+1` part at first glance, so it might be worth having comments about why it's needed.
GURL navigation_original_url_;It's generally not great to store navigation state on RFH-- that's what NavigationRequest is for. Is it actually needed after the navigation completes, and how does that work today?
Also, is it actually test-specific, or do real observers depend on the value? If it's only for tests, maybe we can find a fix in the test framework that doesn't require new state on RFH.
resource_load_info->original_url = navigation_original_url_;This is where I'm wondering if we can be more explicit about what's sent to the renderer and what's stored in the browser. It seems confusing to have one named variable with different types of values depending on where we are in the navigation flow.
If the authoritative copy lives in the browser and we can just send either an Origin or a placeholder to the renderer under a different name, then it might help to be explicit about that in the data structures.
KURL pending_navigation_url_;I'm also concerned about storing the pending navigation URL (along with the previous comment about the ID), since these both seem easy to get wrong or use incorrectly.
// TODO(crbug.com/1410705): fix this properly by moving IFrame reporting toInteresting-- is the code below still needed at all? The bug in this TODO is marked fixed, so I'm curious if we can remove it instead of adding the state.
pending_navigation_url_ = KURL(fallback_timing_info_->name);I'm not sure I understand how `name` and `url` are related here.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Liam Brady abandoned this change.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thanks! I'm happy with the approach to land the non-working feature in a disabled state, if we're reasonably confident that we've caught the places that need sanitizing. Basically looks good with some comment suggestions and a few questions.
Some features will break as a result of this change and will need to beLet's clarify that this CL isn't actively breaking anything-- it's just that the disabled-by-default feature introduced here can't be enabled until these features have been updated.
`original_origin` field, which will happen once all features are workingMight be worth mentioning your plan to make it an Origin type after the conversion as well, since it's still a GURL for now.
Out of curiosity, have you attempted a draft CL that changes the type (e.g., without bothering with a feature flag), just to see the extent of the changes and have a better sense that the conversion is going to work?
(The approach in this CL has a nice small delta, but it doesn't give a lot of confidence that we've looked at the places that could be affected.)
? common_params->url.DeprecatedGetOriginAsURL()This is generally not a safe function to call, but I think it's consistent with what Nasko was doing. Can you include a comment above this block similar to https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/navigation_request.cc;drc=dbf362a812f4e3822fac968906d87a77d07d6a84;l=7950-7953 to explain why it's ok?
Probably the same in the other places where we call this.
// See https://crbug.com/40095391 https://crbug.com/495463654.Should we revert these changes? It looks like the other flags are defined elsewhere now, and we're not modifying this flag or how it's used.
commit_params_->original_url = common_params_->url;All of these places seem reasonable to me, though I'm not sure how to tell that we caught them all. (Maybe the earlier idea of a draft CL where we change the type to Origin would help with that?)
url::Origin final_origin = url::Origin::Create(common_params_->url);This is also potentially a bit fraught:
1) Are we ok not handling any cases that inherit an origin, like about:blank? Maybe those will never have Location headers as part of a redirect chain?
2) Should we be sanitizing if we cross a CSP sandbox boundary?
It's possible we could get by with a comment like the other cases, but I'm curious if a TODO is needed.
}Can these new tests also be checking original_url?
url::AddStandardScheme("chrome-native", url::SCHEME_WITH_HOST);Does this depend on actually being chrome-native (which I didn't think had much support in content/, though it looks like some [metrics code](https://chromium-review.googlesource.com/c/chromium/src/+/6173196) knows about it), or can we make something up like chrome-foo?
// 1. Redirect to data: URL.(Technically only extensions can do this, but it's a good test. Probably better not to mention extensions in the comment since this test is in content/.)
/*disabled_features=*/{});Should this test verify anything about Location headers or original_url? Otherwise I'm not sure why these features need to be enabled.
// to only include the origin when cross-origin to the final URL.Let's include the bug number here and below.
// original URL).I wonder if we can clarify this at all. I think the second sentence is correct and that we always have the full final URL in non-redirecting cases. Is the first sentence wrong if the feature is enabled, since we're using just an original origin in that case?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Some features will break as a result of this change and will need to beLet's clarify that this CL isn't actively breaking anything-- it's just that the disabled-by-default feature introduced here can't be enabled until these features have been updated.
Done
`original_origin` field, which will happen once all features are workingMight be worth mentioning your plan to make it an Origin type after the conversion as well, since it's still a GURL for now.
Out of curiosity, have you attempted a draft CL that changes the type (e.g., without bothering with a feature flag), just to see the extent of the changes and have a better sense that the conversion is going to work?
(The approach in this CL has a nice small delta, but it doesn't give a lot of confidence that we've looked at the places that could be affected.)
Done.
From the [abandoned CL that tries to do everything at once](https://chromium-review.git.corp.google.com/c/chromium/src/+/7765993), it's primarily resource timing, observer notifications, and Android chrome-native:// navigations.
I have 2 do-not-submit CLs up to get a sense of what will break:
1. https://chromium-review.git.corp.google.com/c/chromium/src/+/7870029 CL to turn `original_url` into `original_origin` and see what breaks.
2. https://chromium-review.git.corp.google.com/c/chromium/src/+/7869869?tab=checks CL to enable the 2 features introduced in this CL to see what breaks.
This is generally not a safe function to call, but I think it's consistent with what Nasko was doing. Can you include a comment above this block similar to https://source.chromium.org/chromium/chromium/src/+/main:content/browser/renderer_host/navigation_request.cc;drc=dbf362a812f4e3822fac968906d87a77d07d6a84;l=7950-7953 to explain why it's ok?
Probably the same in the other places where we call this.
Done
// See https://crbug.com/40095391 https://crbug.com/495463654.Should we revert these changes? It looks like the other flags are defined elsewhere now, and we're not modifying this flag or how it's used.
Done
All of these places seem reasonable to me, though I'm not sure how to tell that we caught them all. (Maybe the earlier idea of a draft CL where we change the type to Origin would help with that?)
I'll get a CL up for that.
url::Origin final_origin = url::Origin::Create(common_params_->url);This is also potentially a bit fraught:
1) Are we ok not handling any cases that inherit an origin, like about:blank? Maybe those will never have Location headers as part of a redirect chain?
2) Should we be sanitizing if we cross a CSP sandbox boundary?
It's possible we could get by with a comment like the other cases, but I'm curious if a TODO is needed.
I've added a TODO comment to look at this as a follow-up.
Can these new tests also be checking original_url?
Done
url::AddStandardScheme("chrome-native", url::SCHEME_WITH_HOST);Does this depend on actually being chrome-native (which I didn't think had much support in content/, though it looks like some [metrics code](https://chromium-review.googlesource.com/c/chromium/src/+/6173196) knows about it), or can we make something up like chrome-foo?
I think we can get away with `chrome-foo`. The `chrome-native` stuff that this bug fix needs to be careful around is tested elsewhere so I think this should be fine.
Should this test verify anything about Location headers or original_url? Otherwise I'm not sure why these features need to be enabled.
I'll keep just `kSanitizeOriginalUrlDuringNavigation` enabled and check the `original_url` value.
// to only include the origin when cross-origin to the final URL.Let's include the bug number here and below.
Done
I wonder if we can clarify this at all. I think the second sentence is correct and that we always have the full final URL in non-redirecting cases. Is the first sentence wrong if the feature is enabled, since we're using just an original origin in that case?
For correctness, I've updated this comment to mention redirect sanitization as a possibility.
// the same as the original URL).Liam BradySame here.
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
Sanitize more commit params for cross-origin redirects.nit: Just to make it clearer from the CL summary, maybe prefix this with something like "[Disabled] " so that it doesn't look like we've fixed the bug when this lands?
`original_origin` field, which will happen once all features are workingLiam BradyMight be worth mentioning your plan to make it an Origin type after the conversion as well, since it's still a GURL for now.
Out of curiosity, have you attempted a draft CL that changes the type (e.g., without bothering with a feature flag), just to see the extent of the changes and have a better sense that the conversion is going to work?
(The approach in this CL has a nice small delta, but it doesn't give a lot of confidence that we've looked at the places that could be affected.)
Done.
From the [abandoned CL that tries to do everything at once](https://chromium-review.git.corp.google.com/c/chromium/src/+/7765993), it's primarily resource timing, observer notifications, and Android chrome-native:// navigations.
I have 2 do-not-submit CLs up to get a sense of what will break:
1. https://chromium-review.git.corp.google.com/c/chromium/src/+/7870029 CL to turn `original_url` into `original_origin` and see what breaks.
2. https://chromium-review.git.corp.google.com/c/chromium/src/+/7869869?tab=checks CL to enable the 2 features introduced in this CL to see what breaks.
Fantastic, thanks! That CL was really useful to step through alongside this one to make sure we see all the implications, and I found only one case in this CL that we missed.
*frame_entry, common_params->url, common_params->method,Looks like we need to update this as well, based on https://chromium-review.googlesource.com/c/chromium/src/+/7870029/2/content/browser/renderer_host/navigation_controller_impl.cc#4870?
// original URL).Liam BradyI wonder if we can clarify this at all. I think the second sentence is correct and that we always have the full final URL in non-redirecting cases. Is the first sentence wrong if the feature is enabled, since we're using just an original origin in that case?
For correctness, I've updated this comment to mention redirect sanitization as a possibility.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
dcheng@: PTAL at:
Thanks!
nit: Just to make it clearer from the CL summary, maybe prefix this with something like "[Disabled] " so that it doesn't look like we've fixed the bug when this lands?
Done
*frame_entry, common_params->url, common_params->method,Looks like we need to update this as well, based on https://chromium-review.googlesource.com/c/chromium/src/+/7870029/2/content/browser/renderer_host/navigation_controller_impl.cc#4870?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
// Use the original navigation URL (or its sanitized origin, if redirectAt least here, it's not entirely clear where the origin woudl come from; the next few lines only reference "original_url". Maybe the comment could give a bit more context?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// Use the original navigation URL (or its sanitized origin, if redirectAt least here, it's not entirely clear where the origin woudl come from; the next few lines only reference "original_url". Maybe the comment could give a bit more context?
I've expanded on the comment to be clear that `commit_params->original_url` is sanitized by the time it reaches this point.
6 is the latest approved patch-set.
The change was submitted with unreviewed changes in the following files:
```
The name of the file: content/renderer/render_frame_impl.cc
Insertions: 8, Deletions: 5.
@@ -490,11 +490,14 @@
const blink::mojom::CommonNavigationParams& common_params,
const blink::mojom::CommitNavigationParams& commit_params,
blink::WebNavigationParams* navigation_params) {
- // Use the original navigation URL (or its sanitized origin, if redirect
- // sanitization is enabled) to start with. We'll replay the redirects
- // afterwards and will eventually arrive at the final URL. For non-redirecting
- // navigations, use the final URL to be committed (as that is the same as the
- // original URL).
+ // Use the original navigation URL to start with. Note that when the browser
+ // enables redirect sanitization (via
+ // `features::kSanitizeOriginalUrlDuringNavigation`), it populates
+ // `commit_params->original_url` with only the sanitized origin of the
+ // original URL instead of the full URL by calling DeprecatedGetOriginAsURL().
+ // We'll replay the redirects afterwards and will eventually arrive at the
+ // final URL. For non-redirecting navigations, use the final URL to be
+ // committed (as that is the same as the original URL).
const bool should_use_original_url = !commit_params.redirect_infos.empty() &&
!commit_params.original_url.is_empty();
navigation_params->url =
```
```
The name of the file: third_party/blink/renderer/platform/loader/fetch/url_loader/navigation_body_loader.cc
Insertions: 8, Deletions: 5.
@@ -620,11 +620,14 @@
bool is_main_frame,
WebNavigationParams* navigation_params,
bool is_ad_frame) {
- // Use the original navigation URL (or its sanitized origin, if redirect
- // sanitization is enabled) to start with. We'll replay the redirects
- // afterwards and will eventually arrive at the final URL. For non-redirecting
- // navigations, use the final URL to be committed (as that is the same as the
- // original URL).
+ // Use the original navigation URL to start with. Note that when the browser
+ // enables redirect sanitization (via
+ // `features::kSanitizeOriginalUrlDuringNavigation`), it populates
+ // `commit_params->original_url` with only the sanitized origin of the
+ // original URL instead of the full URL by calling DeprecatedGetOriginAsURL().
+ // We'll replay the redirects afterwards and will eventually arrive at the
+ // final URL. For non-redirecting navigations, use the final URL to be
+ // committed (as that is the same as the original URL).
const bool should_use_original_url = !commit_params->redirect_infos.empty() &&
!commit_params->original_url.is_empty();
const KURL original_url = should_use_original_url
```
[Disabled] Sanitize more commit params for cross-origin redirects.
The `NavigationRequest::SanitizeRedirectsForCommit` method was
introduced to prevent sensitive cross-site URL information (such as
query parameters) from making it to a committed navigation's renderer if
the navigation was the result of a redirect or redirect chain. This
method sanitized the commit params' `redirects` and `redirect_infos`
fields. However, it did not account for information being in either the
`original_url` field or in "Location" headers in `redirect_response`.
This CL sanitizes those fields.
This CL is behind two new default-disabled feature flags, so this will
essentially be a no-op. This is done because we expect some features to
break with the feature enabled. They will need to be modified to work
with these changes. That will be done in follow-up CLs. Landing this CL
will not actively break anything since the features will be turned off.
The end goal is to fully replace `original_url` with an Origin type
`original_origin` field, which will happen once all features are working
as expected with this change.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
posting draft comments just to make this not be on the top of my gerrit dashboard anymore
This CL sanitizes those fields.I think it's worth a step back here. Many of the changes below seem like they could introduce races, edge cases, or other bugs, and it's not obvious what invariant change we're aiming for.
From skimming the bug and CL, it sounds like we pass full URLs (in location headers and original URL params) from browser to renderer and back again, today. IIUC, it seems like the goal is to only send origins to the renderer, and the approach to keep things working is to preserve the full URLs on the browser side for when they are needed later in the navigation. Is that right?
If so, I'm wondering if we can be more explicit about what we are sending to the renderer going forward. Can we make it clear it's an origin and not a full URL, rather than silently converting between the two types of values, to avoid confusion of what's in the param?
(For that matter, does the renderer actually need the origin, or could the whole thing just be a placeholder/token or even be removed entirely from commit params if the real value is only needed back in the browser process?)
My current plan is to turn original_url in the commit params into original_origin. That way, we're explicit about sanitizing this value *unconditionally* for anything passed to the renderer. I'll work through features that break as the CQ finds them.
// Helper to get the origin of a URL for sanitization purposes.We should probably make these helpers more clearly about redirect sanitization, since sanitization is ambiguous and sounds like it might need to be used in more places.
This collection of helpers might also be good to move somewhere more specific to IPC security than NavigationRequest, though I'm not sure. We do have ipc_utils.cc for somewhat similar verification code, as one possibiliy.
These helper functions are no longer needed and are being removed, so I'll resolve this comment.
// DeprecatedGetOriginAsURL() returns an empty URL for non-standard schemes.Is there a way to give an example? At first glance it's hard to know why this is needed and whether it's sufficient.
These helper functions are no longer needed and are being removed, so I'll resolve this comment. But for documentation, this was running into issues with `chrome-native://` URLs on Android failing `IsSameOriginWith()` checks even if they are same origin, since they are marked especially as always being cross-origin.
/*navigation_id=*/0);Liam BradyI'm nervous about this. :) It seems hard to reason about having to pass this parameter around when many places don't assign it.
With our new solution of grabbing the original URL/name from the `fallback_timing_info_` in the `HTMLFrameOwnerElement`, there's no need to track any navigation ID or token, so this can be removed.
->SetNavigationIdForChild(Liam BradyI'd really like to avoid storing this as state in the renderer if possible. That creates chances for the state to get out of sync or be stale (or lied about), on top of all the plumbing it requires. Let's chat about possible approaches to see if we can avoid it.
Because of the new solution of getting the information directly from the embedder's HTMLFrameOwnerElement, this is no longer needed.
GetRenderFrameHost()->set_navigation_original_url(Liam BradyThis doesn't really fit with the comment above. Should it move into the helper function, if all the callers have to do it? (That's assuming we keep the value, which I'm separately wondering about.)
Resolving as this part is being removed because of the new renderer-specific solution.
if (!AreSameOriginForSanitization(commit_params->original_url,Liam BradyI'm curious: Nasko's previous code cropped everything down to an origin, without a same-origin check. Should we be consistent across the cases?
That would be simpler. We'd get no security benefit for the same origin case, but this becomes easier to reason about.
const GURL& next_url = commit_params->redirects[i + 1];Liam BradyI'm not sure I follow the `i+1` part at first glance, so it might be worth having comments about why it's needed.
The original intention of this was to do a sanitization if there was some cross-origin hop. But I'm realizing that we really only want to sanitize if it's cross-origin to the final URL that will be committed in the renderer process, so all of this `i+1` complexity is unnecessary and functionally not what we want. I've rewritten this to sanitize the `Location` header iff it's cross-origin to the final origin to be committed.
It's generally not great to store navigation state on RFH-- that's what NavigationRequest is for. Is it actually needed after the navigation completes, and how does that work today?
Also, is it actually test-specific, or do real observers depend on the value? If it's only for tests, maybe we can find a fix in the test framework that doesn't require new state on RFH.
Resolving as this has been removed with the latest iteration.
KURL pending_navigation_url_;Liam BradyI'm also concerned about storing the pending navigation URL (along with the previous comment about the ID), since these both seem easy to get wrong or use incorrectly.
Resolving as we already have the original navigation URL stored in the `fallback_timing_info_`'s `name`.
KURL pending_navigation_url_;Liam BradyI'm also concerned about storing the pending navigation URL (along with the previous comment about the ID), since these both seem easy to get wrong or use incorrectly.
With the new approach, this is no longer needed, so I'll remove this.
// TODO(crbug.com/1410705): fix this properly by moving IFrame reporting toInteresting-- is the code below still needed at all? The bug in this TODO is marked fixed, so I'm curious if we can remove it instead of adding the state.
So the [CL that closed out the bug](https://chromium-review.git.corp.google.com/c/chromium/src/+/4299070) isn't fully moving the reporting to the browser. Let me check with nrosenthal@ to see if this is still needed.
Update: He thinks this is still needed. The linked bug ended up simply consolidating reporting on the renderer side, and I guess the race is still possible.
Second update: He took another look and thinks this is safe to remove after all. I'm going to remove this and see if anything breaks.
pending_navigation_url_ = KURL(fallback_timing_info_->name);Liam BradyI'm not sure I understand how `name` and `url` are related here.
This block is going away with our new design, but for documentation, the timing info's `name` is essentially the "key" that's used to associate info. That name is really just the [original URL for the navigation](https://developer.mozilla.org/en-US/docs/Web/API/PerformanceNavigationTiming#:~:text=Returns%20the%20document%27s%20URL).
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (base::FeatureList::IsEnabled(
features::kSanitizeOriginalUrlDuringNavigation)) {
CHECK_EQ(common_params_->url.DeprecatedGetOriginAsURL(),
commit_params_->original_url);
} else {
CHECK_EQ(common_params_->url, commit_params_->original_url);
}Note: @cr...@chromium.org and @lbr...@google.com
I have to revert the CHECK back to a DCHECK, but this conflict with this change that split it in two. I resolved assuming both are failing.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (base::FeatureList::IsEnabled(
features::kSanitizeOriginalUrlDuringNavigation)) {
CHECK_EQ(common_params_->url.DeprecatedGetOriginAsURL(),
commit_params_->original_url);
} else {
CHECK_EQ(common_params_->url, commit_params_->original_url);
}Note: @cr...@chromium.org and @lbr...@google.com
I have to revert the CHECK back to a DCHECK, but this conflict with this change that split it in two. I resolved assuming both are failing.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |