Redid this cl so now it uses a token to tie soft navs and same document navs together so that the correct URL gets logged. And Cl description now describes motivation, fix, and details. Ready for review. :-)
void SoftNavigationHeuristics::SameDocumentNavigationCommitted(Johannes HenkelNote to myself: This needs to be called at least for all committed same document navs in the main frame. I think at the moment we filter some out at the call site, e.g. the replacestate, we'll need to do the filtering in here so we can still get the count correctly... or do the count in the caller.
Done
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (ukm_source_id == ukm::kInvalidSourceId) {In what cases will this happen?
if (token.has_value() && !token->is_empty()) {In what cases will this fail?
return ukm::kInvalidSourceId;Similarly, is this expected to happen?
/*same_document_navigation_token*/ std::nullopt);nit: to match the rest:
```suggestion
/*same_document_navigation_token=*/std::nullopt);
```
std::optional<base::UnguessableToken> same_document_navigation_token;
if (same_document_params) {
same_document_navigation_token =
same_document_params->same_document_navigation_token;
}Can you lift this block above the comment, to keep that with the relevant code?
Also, maybe check `is_same_document_navigation` instead, like the other checks above (I guess it's equivalent, but maybe for consistency)?
(and maybe match the ternary expression style, if you want.)
base::UnguessableToken same_document_navigation_token_;nit/IWYU: #include "base/unguessable_token.h" (here and elsewhere)?
(https://google.github.io/styleguide/cppguide.html#Include_What_You_Use)
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Does this also work for browser-inititated navs, like back button?
| Commit-Queue | +1 |
Main update: I'm no longer adding the token in navigation_params.mojom, but rather, have it be a simple field in navigation_request.cc, and setting it in the browser just before the request gets destroyed. This way, the single location where the token gets created (in the renderer at commit time) can also cover navigations that originate in the browser, and the thing works for back/forward buttons now.
PTAL
if (ukm_source_id == ukm::kInvalidSourceId) {In what cases will this happen?
I'm thinking it's more robust, since the PageLoadTracker could return this invalid source id, if the renderer sent a corrupt soft_navigation_metrics message, or if the timing is way to early (also due to corruption). I don't think these are likely / I haven't exercised this, it's just based on the idea to not trust stuff coming from the renderer.
if (token.has_value() && !token->is_empty()) {In what cases will this fail?
Since I'm now setting the thingy unconditionally for navigation_request objects that are about to get destroyed, I'm expecting this to not fail, so converted it to checks.
return ukm::kInvalidSourceId;Similarly, is this expected to happen?
This could happen if a corrupted soft_navigation_metrics message is sent from a renderer. So, I think it's best to be graceful here.
Previously, {potential,previous}_soft_navigation_id_ were initialized with ukm::kInvalidSourceId by default, so I think a corrupt renderer could also have triggered this return value (e.g. if the soft navs metrics message was received unexpectedly early).
I don't think these are likely / I haven't exercised this, it's just based on the idea to not trust stuff coming from the renderer.
/*same_document_navigation_token*/ std::nullopt);nit: to match the rest:
```suggestion
/*same_document_navigation_token=*/std::nullopt);
```
parameter is gone
std::optional<base::UnguessableToken> same_document_navigation_token;
if (same_document_params) {
same_document_navigation_token =
same_document_params->same_document_navigation_token;
}Can you lift this block above the comment, to keep that with the relevant code?
Also, maybe check `is_same_document_navigation` instead, like the other checks above (I guess it's equivalent, but maybe for consistency)?
(and maybe match the ternary expression style, if you want.)
I'm using the is_same_document_navigation check now, but instead of
trying to set the parameter at construction time here, I'm now
tagging the navigation_request after this section with
the token; this way it covers both the newly constructed
requests and those that already existed (because they were
started in the browser like the back/forward buttons.
base::UnguessableToken same_document_navigation_token_;nit/IWYU: #include "base/unguessable_token.h" (here and elsewhere)?
(https://google.github.io/styleguide/cppguide.html#Include_What_You_Use)
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I think this works, but I'll take a closer look once the bots are green.
if (ukm_source_id == ukm::kInvalidSourceId) {Johannes HenkelIn what cases will this happen?
I'm thinking it's more robust, since the PageLoadTracker could return this invalid source id, if the renderer sent a corrupt soft_navigation_metrics message, or if the timing is way to early (also due to corruption). I don't think these are likely / I haven't exercised this, it's just based on the idea to not trust stuff coming from the renderer.
ACK, makes sense, and crashing the browser would be bad :). Just making sure it's needed, since otherwise it could mask a bug, i.e. a case we didn't anticipate, that we won't find out about.
if (token.has_value() && !token->is_empty()) {Johannes HenkelIn what cases will this fail?
Since I'm now setting the thingy unconditionally for navigation_request objects that are about to get destroyed, I'm expecting this to not fail, so converted it to checks.
Looks like maybe it can still be null in tests, so maybe it's okay to document that and ignore it? Also, to your other point, should you not trust this either since it's coming from the renderer (I'm not sure the what's typical to do for this)?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (token.has_value() && !token->is_empty()) {Johannes HenkelIn what cases will this fail?
Scott HaseleySince I'm now setting the thingy unconditionally for navigation_request objects that are about to get destroyed, I'm expecting this to not fail, so converted it to checks.
Looks like maybe it can still be null in tests, so maybe it's okay to document that and ignore it? Also, to your other point, should you not trust this either since it's coming from the renderer (I'm not sure the what's typical to do for this)?
I think it's OK to trust this one, reasons being:
(1) if we get into this code path, there must be a DidCommitSameDocumentNavigationParams. Even though this thing came from the renderer in practice, I think since the field is non-optional in these params, I can also safely assume that it's not null, and further, I can assume that it's not empty (UnguesableToken::is_empty() check) because empty unguessable tokens wouldn't serialize and I think verifying that sort of thing is up to the IPC layer.
(2) On the other hand, getting the token into the navigation_handle object, which is under the hood a NavigationRequest, that happens in the browser (in render_frame_host_impl.cc), so that's why I think it's OK to check-fail if that went wrong.
The situation with the other place in this change where I want to be robust and allow the invalid ukm source id to be passed is different, since that condition would not be caught by the IPC layer, in a sense there'd be structurally valid messages but at the IPC layer but we'd fail to match up the soft nav with the same document nav due to timing or the wrong token or stuff like that.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
Nice! I like this approach; this feels pretty clean. Hopefully navigation folks agree :). LGTM % test for back/forward, which we should probably make sure we have coverage on.
- For each soft navigation we detect, there's exactly one corresponding same-document navigation: it's the initial URL change caused by the interaction that triggered the soft navigation.nit: can you format the CL description? There are a bunch of long lines > 72 char.
Pointers for reading the cl:nit: for the commit message, it might be better to give a shorter/concise overview of the plumbing/flow?
covers a portion of this behavior and continues to pass.\o/
It sounds like you found some holes in testing with back/forward though? It would be probably be good to exercise that path as well here, or in a prequel CL?
that UKM would attribute to the soft navigations, and surfed a few sitesnit: this bit might not be needed in the commit description?
if (token.has_value() && !token->is_empty()) {Johannes HenkelIn what cases will this fail?
Scott HaseleySince I'm now setting the thingy unconditionally for navigation_request objects that are about to get destroyed, I'm expecting this to not fail, so converted it to checks.
Johannes HenkelLooks like maybe it can still be null in tests, so maybe it's okay to document that and ignore it? Also, to your other point, should you not trust this either since it's coming from the renderer (I'm not sure the what's typical to do for this)?
I think it's OK to trust this one, reasons being:
(1) if we get into this code path, there must be a DidCommitSameDocumentNavigationParams. Even though this thing came from the renderer in practice, I think since the field is non-optional in these params, I can also safely assume that it's not null, and further, I can assume that it's not empty (UnguesableToken::is_empty() check) because empty unguessable tokens wouldn't serialize and I think verifying that sort of thing is up to the IPC layer.
(2) On the other hand, getting the token into the navigation_handle object, which is under the hood a NavigationRequest, that happens in the browser (in render_frame_host_impl.cc), so that's why I think it's OK to check-fail if that went wrong.The situation with the other place in this change where I want to be robust and allow the invalid ukm source id to be passed is different, since that condition would not be caught by the IPC layer, in a sense there'd be structurally valid messages but at the IPC layer but we'd fail to match up the soft nav with the same document nav due to timing or the wrong token or stuff like that.
Done
source_id_by_same_document_navigation_.try_emplace(Do we need to handle the case where try_emplace fails? Probably not? Under normal circumstances this will never happen, but wondering how you want to handle the untrusted renderer case, especially when that's the thing generating the token?
return ukm::kInvalidSourceId;Johannes HenkelSimilarly, is this expected to happen?
This could happen if a corrupted soft_navigation_metrics message is sent from a renderer. So, I think it's best to be graceful here.
Previously, {potential,previous}_soft_navigation_id_ were initialized with ukm::kInvalidSourceId by default, so I think a corrupt renderer could also have triggered this return value (e.g. if the soft navs metrics message was received unexpectedly early).
Scott HaseleyI don't think these are likely / I haven't exercised this, it's just based on the idea to not trust stuff coming from the renderer.
Acknowledged
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |