I think it's a good idea to conform to the spec here, so thanks for putting this patch together. Two things:
bookmark =
tree_.ActiveFormattingElements()->BookmarkFor(formatting_element);Why is this `bookmark` reassignment line here? I think it's likely a bug.
// <aside> should contain a new <b> element, because <sdide> is not anit: `aside`
How come this one is being deleted?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I think it's a good idea to conform to the spec here, so thanks for putting this patch together. Two things:
- Since it's a behavior change, it should definitely be gated behind a runtime enabled feature flag (default "stable" is ok). Just in case.
- I'd like a security person to take a look too, since changes in this area can expose MXSS or sanitizer holes. I added a few folks.
Acknowledged
Thanks for the review.
I addressed some of the suggested changes and replied inline to the questions.
bookmark =
tree_.ActiveFormattingElements()->BookmarkFor(formatting_element);Why is this `bookmark` reassignment line here? I think it's likely a bug.
Given that we have removed an element from the ActiveFormattingElements list, the bookmark contains an index that is no longer valid.
// <aside> should contain a new <b> element, because <sdide> is not aJavier Fernandeznit: `aside`
Done
How come this one is being deleted?
This test has been added in the CL 1389333005 [1] as a regression tests for the crbug.com/40356377. The bug was about a change in how the styles are being added to the DOM elements. However, there was a case where the some elements in the HTML tree where re-parented, precisely as part of the Agency Adoption algorithm.
After this change, the test case defined in the test doesn't trigger such re-parenting process. I tried to find another case which produced similar behavior, but I couldn't find it.
I don't think that test cover an actual use case, but some unexpected behavior due to the Agency Algorithm's behavior implemented back then. However, I admit my thesis isn't solid enough; the fact that I couldn't find a case where a similar reparenting happens doesn't mean it doesn't exists. Hence, I appreciate any feedback on this issue.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
bookmark =
tree_.ActiveFormattingElements()->BookmarkFor(formatting_element);Javier FernandezWhy is this `bookmark` reassignment line here? I think it's likely a bug.
Given that we have removed an element from the ActiveFormattingElements list, the bookmark contains an index that is no longer valid.
During the first iteration of the inner loop, if `last_node == furthest_block`, the algorithm correctly moves the bookmark (Step 4.13.7: `bookmark.MoveToAfter(node_entry);`). If a subsequent iteration triggers the `inner_loop_counter > 3` condition, resetting the bookmark to `BookmarkFor(formatting_element)` overwrites the MoveToAfter adjustment made in the earlier iteration. This will cause the adopted elements to be inserted in the wrong position in the formatting elements list.
Since `Remove()` invalidates the existing bookmark index, we might need to handle that invalidation gracefully somehow, rather than resetting the bookmark to the original formatting_element.
Test case:
`<b><i><s><u><em><div></b><p>X</p>`
Javier FernandezHow come this one is being deleted?
This test has been added in the CL 1389333005 [1] as a regression tests for the crbug.com/40356377. The bug was about a change in how the styles are being added to the DOM elements. However, there was a case where the some elements in the HTML tree where re-parented, precisely as part of the Agency Adoption algorithm.
After this change, the test case defined in the test doesn't trigger such re-parenting process. I tried to find another case which produced similar behavior, but I couldn't find it.
I don't think that test cover an actual use case, but some unexpected behavior due to the Agency Algorithm's behavior implemented back then. However, I admit my thesis isn't solid enough; the fact that I couldn't find a case where a similar reparenting happens doesn't mean it doesn't exists. Hence, I appreciate any feedback on this issue.
Thanks for the archeology. I'm ok deleting it, given what you found.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
Javier FernandezI think it's a good idea to conform to the spec here, so thanks for putting this patch together. Two things:
- Since it's a behavior change, it should definitely be gated behind a runtime enabled feature flag (default "stable" is ok). Just in case.
- I'd like a security person to take a look too, since changes in this area can expose MXSS or sanitizer holes. I added a few folks.
Acknowledged
I'd like a security person to take a look too, since changes in this area can expose MXSS or sanitizer holes. I added a few folks.
Thanks. I don't see an additional risk here.
Sanitizer: Sanitizer deliberately runs _after_ the full parser - and thus after any re-parenting - and thus shouldn't be affected. So I don't think Sanitizer would be affected. (The only Sanitizer construct of relevance here is replace-with-children, which would see whatever final tree the HTML parser produces.)
mXSS: I think the whole re-parenting is indeed mXSS-sensitive, but I don't think this opens any opportunities that weren't there before. At the worst, people might have to adopt their POCs a little... :-D
----
If we want to be fancy, we could include an additional test in third_party/blink/web_tests/external/wpt/sanitizer-api/sethtml-tree-construction.sub.dat that ensures Sanitizer only looks at it after the full parse.
E.g. something like this, adopted from run-adoption01-data.
```
#data
<div><b><em><foo><foob><fooc><aside></b></em></div>
#config
{ "removeElements": ["em"] }
#document
| <div>
| <b>
| <aside>
| <b>
```
lgtm (from Sanitizer perspective; please still wait for other reviewers)
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Hi @masonf
The last submitted patch should have fixed the issue you mentioned in the last review. Could you please take a look ?
Thanks.
bookmark =
tree_.ActiveFormattingElements()->BookmarkFor(formatting_element);Javier FernandezWhy is this `bookmark` reassignment line here? I think it's likely a bug.
Mason FreedGiven that we have removed an element from the ActiveFormattingElements list, the bookmark contains an index that is no longer valid.
During the first iteration of the inner loop, if `last_node == furthest_block`, the algorithm correctly moves the bookmark (Step 4.13.7: `bookmark.MoveToAfter(node_entry);`). If a subsequent iteration triggers the `inner_loop_counter > 3` condition, resetting the bookmark to `BookmarkFor(formatting_element)` overwrites the MoveToAfter adjustment made in the earlier iteration. This will cause the adopted elements to be inserted in the wrong position in the formatting elements list.
Since `Remove()` invalidates the existing bookmark index, we might need to handle that invalidation gracefully somehow, rather than resetting the bookmark to the original formatting_element.
Test case:
`<b><i><s><u><em><div></b><p>X</p>`
Yeah, the way I overrode the bookmark was indeed buggy. I defined a new unit test with that case.
The last patch fixed the bug and I think it regenerates the bookmark correctly after removing the element form the formatting list.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
LGTM, while I agree this code is in a potentially hot mXSS area, I don't think it gives any new capabilities that weren't there before.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
LGTM, while I agree this code is in a potentially hot mXSS area, I don't think it gives any new capabilities that weren't there before.
Also I moved myself to CC, since I'm not actually permitted to vote.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thanks! I think that looks like the right logic.
Given the potential for bugs in this area of the code, can you please add a feature flag that guards the new behavior? So that if there are issues, we can roll it back easily? Be sure to add a comment about what the flag does, and a note that it will land in M147 and can be removed in M149.
Element* bookmark_element = bookmark.Mark()->GetElement();Can you add this, just above this line?
```
DCHECK(bookmark.Mark());
```
node_in_active_formatting_elements = false;Can you add a comment above this line:
```
// Set to false so the subsequent step (4.13.5) removes the node from
// the stack of open elements and continues the inner loop.
```
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thanks! I think that looks like the right logic.
Given the potential for bugs in this area of the code, can you please add a feature flag that guards the new behavior? So that if there are issues, we can roll it back easily? Be sure to add a comment about what the flag does, and a note that it will land in M147 and can be removed in M149.
Done
Element* bookmark_element = bookmark.Mark()->GetElement();Can you add this, just above this line?
```
DCHECK(bookmark.Mark());
```
Done
bookmark =
tree_.ActiveFormattingElements()->BookmarkFor(formatting_element);Javier FernandezWhy is this `bookmark` reassignment line here? I think it's likely a bug.
Mason FreedGiven that we have removed an element from the ActiveFormattingElements list, the bookmark contains an index that is no longer valid.
Javier FernandezDuring the first iteration of the inner loop, if `last_node == furthest_block`, the algorithm correctly moves the bookmark (Step 4.13.7: `bookmark.MoveToAfter(node_entry);`). If a subsequent iteration triggers the `inner_loop_counter > 3` condition, resetting the bookmark to `BookmarkFor(formatting_element)` overwrites the MoveToAfter adjustment made in the earlier iteration. This will cause the adopted elements to be inserted in the wrong position in the formatting elements list.
Since `Remove()` invalidates the existing bookmark index, we might need to handle that invalidation gracefully somehow, rather than resetting the bookmark to the original formatting_element.
Test case:
`<b><i><s><u><em><div></b><p>X</p>`
Yeah, the way I overrode the bookmark was indeed buggy. I defined a new unit test with that case.
The last patch fixed the bug and I think it regenerates the bookmark correctly after removing the element form the formatting list.
Done
Can you add a comment above this line:
```
// Set to false so the subsequent step (4.13.5) removes the node from
// the stack of open elements and continues the inner loop.
```
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (node == formatting_element_item) {
break;
}Thanks for adding the flag, but this branch still changes position relative to the old code.
Also, and more critically, I think the new logic (with the flag disabled) just doesn't check `inner_loop_counter` vs. `kInnerIterationLimit` at all. I think that's a potential infinite loop.
It might be a good idea to just add a big if/else block for the runtime feature, with exactly the old code in the `else` block. That way it's easy to see from inspection that the flag-disabled behavior is exactly the same.
// formatting list when the internal loop reached the maximum number of iterations.nit: line wrap this please.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |