Noam Rosenthal (Gerrit)

Feb 3, 2026, 9:09:59 AM

to Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal

Message from Noam Rosenthal

Set Ready For Review

Open in Gerrit

Related details

Attention is currently required from:

Noam Rosenthal

Submit Requirements:

Code-Coverage
Code-Owners
Code-Review
Review-Enforcement

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

satisfied_requirement

unsatisfied_requirement

open

diffy

Philip Jägenstedt (Gerrit)

Feb 6, 2026, 7:15:05 AM

to Noam Rosenthal, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal

Philip Jägenstedt added 5 comments

Patchset-level comments

File-level comment, Patchset 17 (Latest):

Philip Jägenstedt . resolved

Currently trying to figure out how to handle xml* and lit$ targets without unbounded buffering of both original + lowercased data. But I think we can land something and then update it.

File third_party/blink/renderer/core/html/parser/html_tokenizer.cc

Line 1164, Patchset 17 (Latest): // TODO(nrosenthal): allow the full unicode range of ID_Start?

Philip Jägenstedt . unresolved

Consensus says no: https://github.com/whatwg/html/pull/12118#issuecomment-3847775588

Line 1166, Patchset 17 (Latest): token_.BeginProcessingInstruction();

Philip Jägenstedt . unresolved

I made an editorial change to the spec to not create a PI token that we might later discard, which I also think made the handling of "xml" and "xml-stylesheet" easier to follow. This step should just be to clear the temp buffer and reconsume (below).

However, I now see that both the spec and the implementation only use the temp buffer to compare to the lowercase string "script", and don't use the buffer to "back up" and get the original characters as I've attempted.

I'm leaning towards changing the spec again to not use the temp buffer in this novel way, WDYT?

Line 1206, Patchset 17 (Latest): ParseError();

Philip Jägenstedt . unresolved

Does it matter what character we're at when this is called, does this end up pointing to a specific location in DevTools or something?

I ask because the right location to point to for the "xml" and "xml-stylesheet" cases is probably the first "x", but we're way past that when we can see the error.

Line 1207, Patchset 17 (Latest): token_.Clear();

Philip Jägenstedt . unresolved

I see that `token_.Clear()` is never called outside of HTMLTokenizer::Reset() before this change, just like the spec never said "discard the token".

Philip Jägenstedt (Gerrit)

Feb 6, 2026, 7:47:41 AM

to Noam Rosenthal, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal

Philip Jägenstedt added 6 comments

File third_party/blink/renderer/core/editing/testing/selection_sample_test.cc

Line 191, Patchset 17 (Latest): EXPECT_EQ("<?foo ba|r ?>", SetAndGetSelectionText("<?foo ba|r ?>"))

Philip Jägenstedt . unresolved

I guess this is in an XML document or https://chromium-review.googlesource.com/c/chromium/src/+/7531667 would have failed this test.

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17 (Latest): String processing_instruction_data_;

Philip Jägenstedt . unresolved

Why can't this reuse data_? We don't need to use both at the same time, right?

File third_party/blink/renderer/core/html/parser/html_document_parser_test.cc

Line 226, Patchset 17 (Latest): TEST_P(HTMLDocumentParserTest, ProcessingInstructionNoQuestionMark) {

Philip Jägenstedt . unresolved

Do you think these tests should be kept fairly minimal, or what goes here vs. WPT? Do these tests get the code coverage to 100%?

File third_party/blink/web_tests/external/wpt/html/syntax/parsing/parse-processing-instruction.tentative.html

Line 151, Patchset 17 (Latest): processing_instruction_test_equivalent("<?hey there?>", "<?hey there>");

Philip Jägenstedt . unresolved

A few more processing_instruction_test_equivalent would be:

`<?hey?there>` equivalent to `<?hey ?there>` (with > or ?> as the closing syntax, doesn't matter)

`<?HEY THERE>` equivalent to `<?hey THERE>`.

Line 169, Patchset 17 (Latest): processing_instruction_test("<?something ? >", [

Philip Jägenstedt . unresolved

Can you also add a more basic trailing whitespace test like `<?something a >` to show that all trailing whitespace is preserved, not just following `?`?

Line 237, Patchset 17 (Latest): "price$value",

Philip Jägenstedt . unresolved

Throw in lit$123456789 in a compat category?

Noam Rosenthal (Gerrit)

Feb 8, 2026, 3:57:19 PM

to Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Philip Jägenstedt

Noam Rosenthal added 10 comments

File third_party/blink/renderer/core/editing/testing/selection_sample_test.cc

Line 191, Patchset 17: EXPECT_EQ("<?foo ba|r ?>", SetAndGetSelectionText("<?foo ba|r ?>"))

Philip Jägenstedt . resolved

I guess this is in an XML document or https://chromium-review.googlesource.com/c/chromium/src/+/7531667 would have failed this test.

Noam Rosenthal

Acknowledged

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17: String processing_instruction_data_;

Philip Jägenstedt . unresolved

Why can't this reuse data_? We don't need to use both at the same time, right?

Noam Rosenthal

data_ is used for the target, as we can't use "name" since it has all kinds of special rules.

File third_party/blink/renderer/core/html/parser/html_document_parser_test.cc

Line 226, Patchset 17: TEST_P(HTMLDocumentParserTest, ProcessingInstructionNoQuestionMark) {

Philip Jägenstedt . resolved

Do you think these tests should be kept fairly minimal, or what goes here vs. WPT? Do these tests get the code coverage to 100%?

Noam Rosenthal

Agreed, it was more about duplicating what was already there. Kept only one test for this.

File third_party/blink/renderer/core/html/parser/html_tokenizer.cc

Line 1164, Patchset 17: // TODO(nrosenthal): allow the full unicode range of ID_Start?

Philip Jägenstedt . resolved

Consensus says no: https://github.com/whatwg/html/pull/12118#issuecomment-3847775588

Noam Rosenthal

Acknowledged

Line 1166, Patchset 17: token_.BeginProcessingInstruction();

Philip Jägenstedt . unresolved

I made an editorial change to the spec to not create a PI token that we might later discard, which I also think made the handling of "xml" and "xml-stylesheet" easier to follow. This step should just be to clear the temp buffer and reconsume (below).

However, I now see that both the spec and the implementation only use the temp buffer to compare to the lowercase string "script", and don't use the buffer to "back up" and get the original characters as I've attempted.
I'm leaning towards changing the spec again to not use the temp buffer in this novel way, WDYT?

Noam Rosenthal

Seems like that's what you've done?

Line 1206, Patchset 17: ParseError();

Philip Jägenstedt . unresolved

Does it matter what character we're at when this is called, does this end up pointing to a specific location in DevTools or something?
I ask because the right location to point to for the "xml" and "xml-stylesheet" cases is probably the first "x", but we're way past that when we can see the error.

Noam Rosenthal

I don't think we do anything with ParseError.

Line 1207, Patchset 17: token_.Clear();

Philip Jägenstedt . unresolved

I see that `token_.Clear()` is never called outside of HTMLTokenizer::Reset() before this change, just like the spec never said "discard the token".

Noam Rosenthal

Yea but without that we have an initialized token and we call "BeginComment" and assert. Should I not have initialized the "ProcessingInstruction" state until we know we have a valid target?

File third_party/blink/web_tests/external/wpt/html/syntax/parsing/parse-processing-instruction.tentative.html

Line 151, Patchset 17: processing_instruction_test_equivalent("<?hey there?>", "<?hey there>");

Philip Jägenstedt . unresolved

A few more processing_instruction_test_equivalent would be:
`<?hey?there>` equivalent to `<?hey ?there>` (with > or ?> as the closing syntax, doesn't matter)
`<?HEY THERE>` equivalent to `<?hey THERE>`.

Noam Rosenthal

Done the first.
I think the second one is no longer valid?

Line 169, Patchset 17: processing_instruction_test("<?something ? >", [

Philip Jägenstedt . resolved

Can you also add a more basic trailing whitespace test like `<?something a >` to show that all trailing whitespace is preserved, not just following `?`?

Noam Rosenthal

Done

Line 237, Patchset 17: "price$value",

Philip Jägenstedt . resolved

Throw in lit$123456789 in a compat category?

Noam Rosenthal

Done

Philip Jägenstedt (Gerrit)

Feb 9, 2026, 5:42:58 AM

to Noam Rosenthal, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal and Philip Jägenstedt

Philip Jägenstedt added 5 comments

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17: String processing_instruction_data_;

Philip Jägenstedt . unresolved

Why can't this reuse data_? We don't need to use both at the same time, right?

Noam Rosenthal

data_ is used for the target, as we can't use "name" since it has all kinds of special rules.

Philip Jägenstedt

I see, the name confused me. Will we still need this if we create a PI token early and just add to its target data?

File third_party/blink/renderer/core/html/parser/html_tokenizer.cc

Line 1166, Patchset 17: token_.BeginProcessingInstruction();

Philip Jägenstedt . unresolved

I made an editorial change to the spec to not create a PI token that we might later discard, which I also think made the handling of "xml" and "xml-stylesheet" easier to follow. This step should just be to clear the temp buffer and reconsume (below).
However, I now see that both the spec and the implementation only use the temp buffer to compare to the lowercase string "script", and don't use the buffer to "back up" and get the original characters as I've attempted.
I'm leaning towards changing the spec again to not use the temp buffer in this novel way, WDYT?

Noam Rosenthal

Seems like that's what you've done?

Philip Jägenstedt

Yes, Henri thought we should just make target case-sensitive, so I did that. Now the spec creates a PI token as soon as seeing <? and if there's an error the target is read back and turned into a comment token instead. There's no observable difference between this and keeping a buffer until we know what token type it will be, but it made things more readable IMHO.
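
As a rough illustration of the flow described above (create the processing instruction token at `<?`, then turn it into a comment if an error is found), here is a minimal C++ sketch. `SketchToken`, `TokenType`, and `ConvertToComment` are hypothetical names for illustration only, not Blink's `HTMLToken` API. The sketch also assumes the resulting comment data is `?` followed by the target, mirroring how `<?` is treated as a bogus comment in HTML today.

```cpp
#include <cassert>
#include <string>

// Hypothetical token kinds, for illustration only.
enum class TokenType { kUninitialized, kComment, kProcessingInstruction };

// Hypothetical token type; this is not Blink's HTMLToken.
struct SketchToken {
  TokenType type = TokenType::kUninitialized;
  std::string target;  // Only meaningful for processing instructions.
  std::string data;

  // Called as soon as "<?" is seen, before the target is known to be valid.
  void BeginProcessingInstruction() {
    type = TokenType::kProcessingInstruction;
    target.clear();
    data.clear();
  }

  // On a parse error, read the target back and reuse it as the start of the
  // comment data. Assumption: the data keeps the leading '?'.
  void ConvertToComment() {
    assert(type == TokenType::kProcessingInstruction);
    type = TokenType::kComment;
    data = "?" + target;
    target.clear();
  }
};
```

Under these assumptions, a token whose target is `xml` ends up as a comment token whose data starts with `?xml`, which is the same result a buffer-based approach would produce.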

Line 1206, Patchset 17: ParseError();

Philip Jägenstedt . resolved

Does it matter what character we're at when this is called, does this end up pointing to a specific location in DevTools or something?
I ask because the right location to point to for the "xml" and "xml-stylesheet" cases is probably the first "x", but we're way past that when we can see the error.

Noam Rosenthal

I don't think we do anything with ParseError.

Philip Jägenstedt

That's what I thought.

Line 1207, Patchset 17: token_.Clear();

Philip Jägenstedt . resolved

I see that `token_.Clear()` is never called outside of HTMLTokenizer::Reset() before this change, just like the spec never said "discard the token".

Noam Rosenthal

Yea but without that we have an initialized token and we call "BeginComment" and assert. Should I not have initialized the "ProcessingInstruction" state until we know we have a valid target?

Philip Jägenstedt

That's one option, but in the end I made the spec throw away a token instead, just like you're doing here.

File third_party/blink/web_tests/external/wpt/html/syntax/parsing/parse-processing-instruction.tentative.html

Line 151, Patchset 17: processing_instruction_test_equivalent("<?hey there?>", "<?hey there>");

Philip Jägenstedt . resolved

A few more processing_instruction_test_equivalent would be:
`<?hey?there>` equivalent to `<?hey ?there>` (with > or ?> as the closing syntax, doesn't matter)
`<?HEY THERE>` equivalent to `<?hey THERE>`.

Noam Rosenthal

Done the first.
I think the second one is no longer valid?

Philip Jägenstedt

Yep, now we should instead test for case-sensitivity.

Noam Rosenthal (Gerrit)

Feb 9, 2026, 5:56:29 AM

to Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Philip Jägenstedt and Philip Jägenstedt

Noam Rosenthal added 2 comments

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17: String processing_instruction_data_;

Philip Jägenstedt . unresolved

Why can't this reuse data_? We don't need to use both at the same time, right?

Noam Rosenthal

data_ is used for the target, as we can't use "name" since it has all kinds of special rules.

Philip Jägenstedt

I see, the name confused me. Will we still need this if we create a PI token early and just add to its target data?

Noam Rosenthal

This is AtomicHTMLToken, it's only created when we emit the valid token to the parser.

I changed it to "processing_instruction_target_" instead, is that clearer?

File third_party/blink/renderer/core/html/parser/html_tokenizer.cc

Line 1166, Patchset 17: token_.BeginProcessingInstruction();

Philip Jägenstedt . resolved

I made an editorial change to the spec to not create a PI token that we might later discard, which I also think made the handling of "xml" and "xml-stylesheet" easier to follow. This step should just be to clear the temp buffer and reconsume (below).
However, I now see that both the spec and the implementation only use the temp buffer to compare to the lowercase string "script", and don't use the buffer to "back up" and get the original characters as I've attempted.
I'm leaning towards changing the spec again to not use the temp buffer in this novel way, WDYT?

Noam Rosenthal

Seems like that's what you've done?

Philip Jägenstedt

Yes, Henri thought we should just make target case-sensitive, so I did that. Now the spec creates a PI token as soon as seeing <? and if there's an error the target is read back and turned into a comment token instead. There's no observable difference between this and keeping a buffer until we know what token type it will be, but it made things more readable IMHO.

Noam Rosenthal

Acknowledged

Philip Jägenstedt (Gerrit)

Feb 9, 2026, 7:29:10 AM

to Noam Rosenthal, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal and Philip Jägenstedt

Philip Jägenstedt added 1 comment

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17: String processing_instruction_data_;

Philip Jägenstedt . unresolved

Why can't this reuse data_? We don't need to use both at the same time, right?

Noam Rosenthal

data_ is used for the target, as we can't use "name" since it has all kinds of special rules.

Philip Jägenstedt

I see, the name confused me. Will we still need this if we create a PI token early and just add to its target data?

Noam Rosenthal

This is AtomicHTMLToken, it's only created when we emit the valid token to the parser.
I changed it to "processing_instruction_target_" instead, is that clearer?

Philip Jägenstedt

Yep, that's clear. Just checking, now that the target can only be alphanumeric and hyphens, does name_ still have special rules that prevent reuse?
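
To make the restriction mentioned above concrete, here is a small C++ sketch of a check that a target consists only of ASCII alphanumerics and hyphens. The function names are hypothetical and are not taken from the patch. Rejecting an empty target is an assumption, and any additional rule about the first character is deliberately left out because the thread does not state one.

```cpp
#include <cassert>
#include <string_view>

// ASCII-only classification; non-ASCII characters are never accepted.
constexpr bool IsAsciiAlphanumeric(char c) {
  return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
         (c >= 'a' && c <= 'z');
}

// Hypothetical check: true if every character is an ASCII letter, an ASCII
// digit, or '-'. Assumption: an empty target is rejected.
constexpr bool TargetUsesOnlyAllowedChars(std::string_view target) {
  if (target.empty()) {
    return false;
  }
  for (char c : target) {
    if (!IsAsciiAlphanumeric(c) && c != '-') {
      return false;
    }
  }
  return true;
}
```

With this definition, `TargetUsesOnlyAllowedChars("xml-stylesheet")` is true, while a target containing any other punctuation is rejected.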

Noam Rosenthal (Gerrit)

Feb 9, 2026, 7:47:42 AM

to Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Philip Jägenstedt and Philip Jägenstedt

Noam Rosenthal added 1 comment

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17: String processing_instruction_data_;

Philip Jägenstedt . unresolved

Why can't this reuse data_? We don't need to use both at the same time, right?

Noam Rosenthal

data_ is used for the target, as we can't use "name" since it has all kinds of special rules.

Philip Jägenstedt

I see, the name confused me. Will we still need this if we create a PI token early and just add to its target data?

Noam Rosenthal

This is AtomicHTMLToken, it's only created when we emit the valid token to the parser.
I changed it to "processing_instruction_target_" instead, is that clearer?

Philip Jägenstedt

Yep, that's clear. Just checking, now that the target can only be alphanumeric and hyphens, does name_ still have special rules that prevent reuse?

Noam Rosenthal

Yea, it needs to be an HTML element name.

Philip Jägenstedt (Gerrit)

Feb 9, 2026, 8:30:10 AM

to Noam Rosenthal, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal and Philip Jägenstedt

Philip Jägenstedt added 1 comment

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17: String processing_instruction_data_;

Philip Jägenstedt . unresolved

Why can't this reuse data_? We don't need to use both at the same time, right?

Noam Rosenthal

data_ is used for the target, as we can't use "name" since it has all kinds of special rules.

Philip Jägenstedt

I see, the name confused me. Will we still need this if we create a PI token early and just add to its target data?

Noam Rosenthal

This is AtomicHTMLToken, it's only created when we emit the valid token to the parser.
I changed it to "processing_instruction_target_" instead, is that clearer?

Philip Jägenstedt

Yep, that's clear. Just checking, now that the target can only be alphanumeric and hyphens, does name_ still have special rules that prevent reuse?

Noam Rosenthal

Yea, it needs to be an HTML element name.

Philip Jägenstedt

Is something else used for <my-button> then?

Noam Rosenthal (Gerrit)

Feb 9, 2026, 9:03:40 AM

to Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Philip Jägenstedt and Philip Jägenstedt

Noam Rosenthal added 1 comment

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17: String processing_instruction_data_;

Philip Jägenstedt . unresolved

Why can't this reuse data_? We don't need to use both at the same time, right?

Noam Rosenthal

data_ is used for the target, as we can't use "name" since it has all kinds of special rules.

Philip Jägenstedt

I see, the name confused me. Will we still need this if we create a PI token early and just add to its target data?

Noam Rosenthal

This is AtomicHTMLToken, it's only created when we emit the valid token to the parser.
I changed it to "processing_instruction_target_" instead, is that clearer?

Philip Jägenstedt

Yep, that's clear. Just checking, now that the target can only be alphanumeric and hyphens, does name_ still have special rules that prevent reuse?

Noam Rosenthal

Yea, it needs to be an HTML element name.

Philip Jägenstedt

Is something else used for <my-button> then?

Noam Rosenthal

It gets the name "unknown" and then an additional string of "my-button". I find it a bit odd. Lots of asserts... we can use it awkwardly and say that it's an "unknown" with the target as the "real" name.

Philip Jägenstedt (Gerrit)

Feb 10, 2026, 6:53:47 AM

to Noam Rosenthal, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal and Philip Jägenstedt

Philip Jägenstedt voted and added 2 comments

Votes added by Philip Jägenstedt

Code-Review

+1

2 comments

File third_party/blink/renderer/core/html/parser/atomic_html_token.h

Line 342, Patchset 17: String processing_instruction_data_;

Philip Jägenstedt . resolved

Why can't this reuse data_? We don't need to use both at the same time, right?

Noam Rosenthal

data_ is used for the target, as we can't use "name" since it has all kinds of special rules.

Philip Jägenstedt

I see, the name confused me. Will we still need this if we create a PI token early and just add to its target data?

Noam Rosenthal

This is AtomicHTMLToken, it's only created when we emit the valid token to the parser.
I changed it to "processing_instruction_target_" instead, is that clearer?

Philip Jägenstedt

Yep, that's clear. Just checking, now that the target can only be alphanumeric and hyphens, does name_ still have special rules that prevent reuse?

Noam Rosenthal

Yea, it needs to be an HTML element name.

Philip Jägenstedt

Is something else used for <my-button> then?

Noam Rosenthal

It gets the name "unknown" and then an additional string of "my-button". I find it a bit odd. Lots of asserts... we can use it awkwardly and say that it's an "unknown" with the target as the "real" name.

Philip Jägenstedt

OK, let's not 😊

File third_party/blink/renderer/core/html/parser/html_tokenizer.cc

Line 1190, Patchset 19 (Latest): const AtomicString target_lower = target.AsAtomicString().LowerASCII();

Philip Jägenstedt . unresolved

This works, but can't we use an ASCII case-insensitive compare instead of lowercasing the target?
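
A minimal sketch of the suggestion above, written as standalone C++ rather than against Blink's string classes: compare character by character while folding only `A`-`Z`, so no lowercased copy of the target is allocated. `ToAsciiLower`, `EqualsIgnoringAsciiCase`, and `IsReservedXmlTarget` are hypothetical names, and treating `xml` and `xml-stylesheet` as the targets being checked is an assumption based on the earlier comments in this review.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Folds only ASCII 'A'-'Z'; every other character is returned unchanged.
constexpr char ToAsciiLower(char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + ('a' - 'A')) : c;
}

// Hypothetical helper: true if `a` and `b` are equal when ASCII case is
// ignored. Works in place, without building a lowercased string.
constexpr bool EqualsIgnoringAsciiCase(std::string_view a,
                                       std::string_view b) {
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (ToAsciiLower(a[i]) != ToAsciiLower(b[i])) {
      return false;
    }
  }
  return true;
}

// Assumption: these are the two targets that need special handling.
constexpr bool IsReservedXmlTarget(std::string_view target) {
  return EqualsIgnoringAsciiCase(target, "xml") ||
         EqualsIgnoringAsciiCase(target, "xml-stylesheet");
}
```

Compared with lowercasing first, this avoids creating a temporary string for every processing instruction, at the cost of one comparison pass per candidate target.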

Noam Rosenthal (Gerrit)

Feb 10, 2026, 7:13:10 AM

to Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Philip Jägenstedt

Noam Rosenthal added 1 comment

File third_party/blink/renderer/core/html/parser/html_tokenizer.cc

Line 1190, Patchset 19 (Latest): const AtomicString target_lower = target.AsAtomicString().LowerASCII();

Philip Jägenstedt . resolved

This works, but can't we use an ASCII case-insensitive compare instead of lowercasing the target?

Noam Rosenthal

Done

Noam Rosenthal (Gerrit)

Feb 10, 2026, 7:14:27 AM

to Dominic Farolino, Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Dominic Farolino and Philip Jägenstedt

Noam Rosenthal added 1 comment

Patchset-level comments

File-level comment, Patchset 19 (Latest):

Noam Rosenthal . resolved

Hi Dominic! VTS review plz?

Dominic Farolino (Gerrit)

10:44 AM

to Noam Rosenthal, Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal and Philip Jägenstedt

Dominic Farolino added 1 comment

Patchset-level comments

File-level comment, Patchset 20 (Latest):

Dominic Farolino . resolved

Seems like there are tons of unrelated indentation/styling changes to VTS, can those be reverted?

Noam Rosenthal (Gerrit)

11:06 AM

to Dominic Farolino, Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Dominic Farolino and Philip Jägenstedt

Noam Rosenthal added 1 comment

Patchset-level comments

File-level comment, Patchset 20:

Dominic Farolino . resolved

Seems like there are tons of unrelated indentation/styling changes to VTS, can those be reverted?

Noam Rosenthal

Done

Dominic Farolino (Gerrit)

11:22 AM

to Noam Rosenthal, Kouhei Ueno, Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal and Philip Jägenstedt

Dominic Farolino voted and added 1 comment

Votes added by Dominic Farolino

Code-Review

+1

1 comment

Patchset-level comments

File-level comment, Patchset 21 (Latest):

Dominic Farolino . resolved

VTS lgtm; cc'ing kouhei@ (non-blocking) since he's usually interested in this kind of change.

Blink W3C Test Autoroller (Gerrit)

11:36 AM

to Noam Rosenthal, Dominic Farolino, Kouhei Ueno, Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Noam Rosenthal and Philip Jägenstedt

Message from Blink W3C Test Autoroller

Exportable changes to web-platform-tests were detected in this CL and a pull request in the upstream repo has been made: https://github.com/web-platform-tests/wpt/pull/57716.

When this CL lands, the bot will automatically merge the PR on GitHub if the required GitHub checks pass; otherwise, ecosystem-infra@ team will triage the failures and may contact you.

WPT Export docs:
https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md#Automatic-export-process

Noam Rosenthal (Gerrit)

12:04 PM

to Blink W3C Test Autoroller, Dominic Farolino, Kouhei Ueno, Philip Jägenstedt, Philip Jägenstedt, Chromium LUCI CQ, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Attention needed from Philip Jägenstedt

Noam Rosenthal voted Commit-Queue+2

Commit-Queue

+2

Chromium LUCI CQ (Gerrit)

1:18 PM

to Noam Rosenthal, Blink W3C Test Autoroller, Dominic Farolino, Kouhei Ueno, Philip Jägenstedt, Philip Jägenstedt, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Chromium LUCI CQ submitted the change

Change information

Commit message:

Support processing instructions in the HTML parser

The parser now recognizes <?target data> as a ProcessingInstruction and
adds it to the DOM instead of a bogus comment.

As per spec PR:
- xml/xml-stylesheet are blocklisted, and stay a bogus comment.
  We can add more of these if there are compat issues.
- A PI can appear wherever a comment appears.
- ?> at the end ignores the ?

Currently in this CL, PI targets are constrained to
/^[A-Za-z][A-Za-z0-9-]*$/.

Added a VTS that keeps current behavior, so that we don't lose some of
the existing html5lib tests while this is in development.

See spec PR: https://github.com/whatwg/html/pull/12118

I2P: https://groups.google.com/a/chromium.org/d/msgid/blink-dev/6981ee47.050a0220.baa59.0100.GAE%40google.com

Bug: 481087638

Change-Id: I1dd22c09f0b2961d07e8d73a1de1c10c91655be0

Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/7532085

Commit-Queue: Noam Rosenthal <nrose...@google.com>

Reviewed-by: Philip Jägenstedt <foo...@chromium.org>

Reviewed-by: Dominic Farolino <d...@chromium.org>

Cr-Commit-Position: refs/heads/main@{#1583351}
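
The target rules in the commit message above can be sketched as a small standalone check. This is only an illustration under stated assumptions, not the Blink tokenizer code: the function names are hypothetical, the blocklist comparison is exact because the commit message does not say how case is handled, and the trailing `?` handling is a literal reading of "?> at the end ignores the ?".

```cpp
#include <string_view>

// Hypothetical helpers for illustration only; none of these names exist in Blink.

// Mirrors the pattern quoted in the commit message: /^[A-Za-z][A-Za-z0-9-]*$/.
bool IsAllowedTargetSyntax(std::string_view target) {
  auto is_alpha = [](char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
  };
  auto is_digit = [](char c) { return c >= '0' && c <= '9'; };
  if (target.empty() || !is_alpha(target[0])) {
    return false;
  }
  for (char c : target.substr(1)) {
    if (!is_alpha(c) && !is_digit(c) && c != '-') {
      return false;
    }
  }
  return true;
}

// The commit message blocklists "xml" and "xml-stylesheet".
// Assumption: exact comparison; case handling is not stated there.
bool IsBlocklistedTarget(std::string_view target) {
  return target == "xml" || target == "xml-stylesheet";
}

// True if <?target ...> would become a ProcessingInstruction under the
// rules above; false means it would stay a bogus comment.
bool WouldBecomeProcessingInstruction(std::string_view target) {
  return IsAllowedTargetSyntax(target) && !IsBlocklistedTarget(target);
}

// Assumption: a single '?' immediately before the closing '>' is dropped
// from the data, so <?foo bar?> carries the data "bar".
std::string_view StripTrailingQuestionMark(std::string_view data) {
  if (!data.empty() && data.back() == '?') {
    data.remove_suffix(1);
  }
  return data;
}
```

With these rules, a target such as `foo` or `my-target1` would qualify, while `xml`, `xml-stylesheet`, `1abc`, and an empty target would fall back to a bogus comment.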

Files:

M third_party/blink/renderer/core/editing/testing/selection_sample_test.cc
M third_party/blink/renderer/core/html/html_view_source_document.cc
M third_party/blink/renderer/core/html/parser/atomic_html_token.cc
M third_party/blink/renderer/core/html/parser/atomic_html_token.h
M third_party/blink/renderer/core/html/parser/html_construction_site.cc
M third_party/blink/renderer/core/html/parser/html_construction_site.h
M third_party/blink/renderer/core/html/parser/html_document_parser_test.cc
M third_party/blink/renderer/core/html/parser/html_token.h
M third_party/blink/renderer/core/html/parser/html_tokenizer.cc
M third_party/blink/renderer/core/html/parser/html_tokenizer.h
M third_party/blink/renderer/core/html/parser/html_tokenizer_test.cc
M third_party/blink/renderer/core/html/parser/html_tree_builder.cc
M third_party/blink/renderer/core/html/parser/html_tree_builder.h
M third_party/blink/renderer/platform/runtime_enabled_features.json5
M third_party/blink/web_tests/VirtualTestSuites
A third_party/blink/web_tests/external/wpt/dom/nodes/MutationObserver-characterData-expected.txt
A third_party/blink/web_tests/external/wpt/html/syntax/parsing/parse-processing-instruction.tentative.html
M third_party/blink/web_tests/fast/xpath/xpath-functional-test-expected.txt
A third_party/blink/web_tests/virtual/parse-processing-instructions-as-bogus-comments/README.md

Change size: L

Delta: 19 files changed, 607 insertions(+), 18 deletions(-)

Branch: refs/heads/main

Submit Requirements:

Code-Review: +1 by Dominic Farolino, +1 by Philip Jägenstedt

Blink W3C Test Autoroller (Gerrit)

1:56 PM

to Chromium LUCI CQ, Noam Rosenthal, Dominic Farolino, Kouhei Ueno, Philip Jägenstedt, Philip Jägenstedt, chromium...@chromium.org, blink-rev...@chromium.org, blink-revie...@chromium.org, blink-...@chromium.org, jmedle...@chromium.org, kinuko...@chromium.org, loading-rev...@chromium.org

Message from Blink W3C Test Autoroller

The WPT PR for this CL has been merged upstream! https://github.com/web-platform-tests/wpt/pull/57716
