[regexp] Move JSRegExp::source() to RegExpData::escaped_source() [v8/v8 : main]

0 views
Skip to first unread message

Patrick Thier (Gerrit)

unread,
Mar 10, 2026, 1:46:16 PMMar 10
to Jakob Linke, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org
Attention needed from Jakob Linke

Patrick Thier added 9 comments

Patchset-level comments
File-level comment, Patchset 1 (Latest):
Patrick Thier . resolved

PTAL

File src/builtins/base.tq
Line 937, Patchset 1 (Latest):extern macro LoadTrustedPointerFromObject(
Patrick Thier . unresolved

Moved here from regexp-match.tq

File src/builtins/regexp-match.tq
Line 15, Patchset 1 (Parent):const kRegExpDataIndirectPointerTag:
Patrick Thier . unresolved

Moved to js-regexp.tq

Line 18, Patchset 1 (Parent):extern macro LoadTrustedPointerFromObject(
Patrick Thier . unresolved

Moved to base.tq

File src/builtins/regexp.tq
Line 426, Patchset 1 (Parent):const kJSRegExpRegExpDataOffset:
Patrick Thier . unresolved

Moved to js-regexp.tq

File src/objects/js-regexp.cc
Line 184, Patchset 1 (Parent):bool IsLineTerminator(int c) {
Patrick Thier . unresolved

Moved without modifications to regexp.cc

File src/objects/js-regexp.tq
Line 79, Patchset 1 (Latest):const kJSRegExpRegExpDataOffset:
Patrick Thier . unresolved

Moved from regexp.tq

Line 81, Patchset 1 (Latest):const kRegExpDataIndirectPointerTag:
Patrick Thier . unresolved

Moved from regexp-match.tq

File src/regexp/regexp.cc
Line 211, Patchset 1 (Latest):bool IsLineTerminator(int c) {
Patrick Thier . unresolved

Moved from js-regexp.cc (without change).

Open in Gerrit

Related details

Attention is currently required from:
  • Jakob Linke
Submit Requirements:
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedNo-Unresolved-Comments
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: v8/v8
Gerrit-Branch: main
Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
Gerrit-Change-Number: 7654235
Gerrit-PatchSet: 1
Gerrit-Owner: Patrick Thier <pth...@chromium.org>
Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
Gerrit-CC: Hannes Payer <hpa...@chromium.org>
Gerrit-Attention: Jakob Linke <jgr...@chromium.org>
Gerrit-Comment-Date: Tue, 10 Mar 2026 17:46:11 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
unsatisfied_requirement
open
diffy

Jakob Linke (Gerrit)

unread,
Mar 11, 2026, 2:52:21 AMMar 11
to Patrick Thier, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org
Attention needed from Patrick Thier

Jakob Linke added 18 comments

Patchset-level comments
Jakob Linke . resolved

Thank you, this must've been tedious to fix! Comments below are mostly about confusion between unmodified/escaped source vars. My two main requests are to 1) clearly distinguish between the two in RegExpData, and 2) pass RegExpData (or a small string container struct) in regexp.cc methods so each local callsite can pick the appropriate variant to use.

File src/builtins/base.tq
Line 937, Patchset 1 (Latest):extern macro LoadTrustedPointerFromObject(
Patrick Thier . resolved

Moved here from regexp-match.tq

Jakob Linke

Acknowledged

Line 937, Patchset 1 (Latest):extern macro LoadTrustedPointerFromObject(
Patrick Thier . resolved

Moved here from regexp-match.tq

Jakob Linke

Acknowledged

File src/builtins/builtins-regexp-gen.cc
Line 1361, Patchset 1 (Latest): LoadObjectField<String>(data, RegExpData::kSourceOffset);
Jakob Linke . unresolved

Should this be escaped_source?

Line 1480, Patchset 1 (Latest): LoadObjectField<String>(data, RegExpData::kSourceOffset);
Jakob Linke . unresolved

Same here.

File src/builtins/regexp-match.tq
Line 15, Patchset 1 (Parent):const kRegExpDataIndirectPointerTag:
Patrick Thier . resolved

Moved to js-regexp.tq

Jakob Linke

Acknowledged

Line 18, Patchset 1 (Parent):extern macro LoadTrustedPointerFromObject(
Patrick Thier . resolved

Moved to base.tq

Jakob Linke

Acknowledged

File src/builtins/regexp.tq
Line 426, Patchset 1 (Parent):const kJSRegExpRegExpDataOffset:
Patrick Thier . resolved

Moved to js-regexp.tq

Jakob Linke

Acknowledged

Line 426, Patchset 1 (Parent):const kJSRegExpRegExpDataOffset:
Patrick Thier . resolved

Moved to js-regexp.tq

Jakob Linke

Acknowledged

File src/objects/js-regexp.h
Line 47, Patchset 1 (Latest): inline Tagged<String> source(IsolateForSandbox isolate) const;
Jakob Linke . unresolved

Please remind us here that this is the escaped version.

File src/objects/js-regexp.cc
Line 184, Patchset 1 (Parent):bool IsLineTerminator(int c) {
Patrick Thier . resolved

Moved without modifications to regexp.cc

Jakob Linke

Acknowledged

File src/objects/js-regexp.tq
Line 79, Patchset 1 (Latest):const kJSRegExpRegExpDataOffset:
Patrick Thier . resolved

Moved from regexp.tq

Jakob Linke

Acknowledged

Line 81, Patchset 1 (Latest):const kRegExpDataIndirectPointerTag:
Patrick Thier . resolved

Moved from regexp-match.tq

Jakob Linke

Acknowledged

Line 87, Patchset 1 (Latest): return data.escaped_source;
Jakob Linke . unresolved

It's confusing that RegExpData.escaped_source is "renamed" to source on JSRegExp, especially since RegExpData has another var named `source`. I think it would help a lot to make the names in RegExpData very explicit.

Wdyt about eg `unmodified_source`, `escaped_source`?

File src/regexp/regexp.cc
Line 211, Patchset 1 (Latest):bool IsLineTerminator(int c) {
Patrick Thier . resolved

Moved from js-regexp.cc (without change).

Jakob Linke

Acknowledged

Line 889, Patchset 1 (Latest): Compile(isolate, &zone, &compile_data, flags, escaped_pattern,
Jakob Linke . unresolved

This now passes the escaped version which ends up flowing into https://source.chromium.org/chromium/chromium/src/+/main:v8/src/regexp/regexp.cc;l=1206;drc=10b6082ae57b34c1d623275ed00dae987fb2a590

Was this on purpose? Not a huge change, but potentially confusing.

Line 1054, Patchset 1 (Latest): DirectHandle<String> pattern(re_data->escaped_source(), isolate);
Jakob Linke . unresolved

We now pass `source` when the callee uses the string for parsing, and `escaped_source` for logging etc. This is totally implicit though in the signatures, and not clear which is appropriate (especially since the parameters for some callees are still called just `source`).

Wdyt about passing re_data instead and using the appropriate string directly at the use site? Some callees may even want to use both.

Line 1055, Patchset 1 (Latest): DCHECK(pattern->IsFlat());
Jakob Linke . unresolved

This must hold for both variants, right? If so, we could dcheck that in the RegExpData getters.

Open in Gerrit

Related details

Attention is currently required from:
  • Patrick Thier
Submit Requirements:
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedNo-Unresolved-Comments
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: v8/v8
Gerrit-Branch: main
Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
Gerrit-Change-Number: 7654235
Gerrit-PatchSet: 1
Gerrit-Owner: Patrick Thier <pth...@chromium.org>
Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
Gerrit-CC: Hannes Payer <hpa...@chromium.org>
Gerrit-Attention: Patrick Thier <pth...@chromium.org>
Gerrit-Comment-Date: Wed, 11 Mar 2026 06:52:16 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: Patrick Thier <pth...@chromium.org>
unsatisfied_requirement
open
diffy

Patrick Thier (Gerrit)

unread,
Mar 11, 2026, 5:07:02 AMMar 11
to Jakob Linke, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org
Attention needed from Jakob Linke

Patrick Thier added 8 comments

Patchset-level comments
File-level comment, Patchset 2 (Latest):
Patrick Thier . resolved

Thanks for the quick review. PTAL again

File src/builtins/builtins-regexp-gen.cc
Line 1361, Patchset 1: LoadObjectField<String>(data, RegExpData::kSourceOffset);
Jakob Linke . unresolved

Should this be escaped_source?

Patrick Thier

Ah sorry forgot to leave a comment on the CL for this change.
It was intentional.
The behaviour is the same (We don't escape already escaped characters for the source and the parser handles both escaped/unescaped), but now the code matches the spec text (https://tc39.es/ecma262/#sec-regexp-pattern-flags `4.a Let P be pattern.[[OriginalSource]].`)
This also makes the intended behaviour more obvious in my opinion (I had to triple check that the previous version was actually working as intended).

Line 1480, Patchset 1: LoadObjectField<String>(data, RegExpData::kSourceOffset);
Jakob Linke . unresolved

Same here.

Patrick Thier

Same reasoning as above, matches the spec text (https://tc39.es/ecma262/#sec-regexp.prototype.compile `3.b Let P be pattern.[[OriginalSource]].`)

File src/objects/js-regexp.h
Line 47, Patchset 1: inline Tagged<String> source(IsolateForSandbox isolate) const;
Jakob Linke . resolved

Please remind us here that this is the escaped version.

Patrick Thier

Done

File src/objects/js-regexp.tq
Line 87, Patchset 1: return data.escaped_source;
Jakob Linke . unresolved

It's confusing that RegExpData.escaped_source is "renamed" to source on JSRegExp, especially since RegExpData has another var named `source`. I think it would help a lot to make the names in RegExpData very explicit.

Wdyt about eg `unmodified_source`, `escaped_source`?

Patrick Thier

This just matches the previous behaviour.
I would propose:
`JSRegExp.source` to match `RegExp.prototype.source`, `RegExpData.original_source` and `RegExpData.escaped_source` to distinguish them in `RegExpData`.
Since this is another pretty big change (I think) I would like to do it in a follow-up.

File src/regexp/regexp.cc
Line 889, Patchset 1: Compile(isolate, &zone, &compile_data, flags, escaped_pattern,
Jakob Linke . unresolved

This now passes the escaped version which ends up flowing into https://source.chromium.org/chromium/chromium/src/+/main:v8/src/regexp/regexp.cc;l=1206;drc=10b6082ae57b34c1d623275ed00dae987fb2a590

Was this on purpose? Not a huge change, but potentially confusing.

Patrick Thier

Yes. That matches the previous behaviour (before the introduction of `RegExpData`).

Line 1054, Patchset 1: DirectHandle<String> pattern(re_data->escaped_source(), isolate);
Jakob Linke . unresolved

We now pass `source` when the callee uses the string for parsing, and `escaped_source` for logging etc. This is totally implicit though in the signatures, and not clear which is appropriate (especially since the parameters for some callees are still called just `source`).

Wdyt about passing re_data instead and using the appropriate string directly at the use site? Some callees may even want to use both.

Patrick Thier

Yeah at least the names should be explicit (going to change this in the follow-up where I am going to rename `RegExpData.source`.
Will also evaluate where it makes sense to just pass `RegExpData` instead.

Line 1055, Patchset 1: DCHECK(pattern->IsFlat());
Jakob Linke . resolved

This must hold for both variants, right? If so, we could dcheck that in the RegExpData getters.

Patrick Thier

Done

Open in Gerrit

Related details

Attention is currently required from:
  • Jakob Linke
Submit Requirements:
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedNo-Unresolved-Comments
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: v8/v8
Gerrit-Branch: main
Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
Gerrit-Change-Number: 7654235
Gerrit-PatchSet: 2
Gerrit-Owner: Patrick Thier <pth...@chromium.org>
Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
Gerrit-CC: Hannes Payer <hpa...@chromium.org>
Gerrit-Attention: Jakob Linke <jgr...@chromium.org>
Gerrit-Comment-Date: Wed, 11 Mar 2026 09:06:57 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: Jakob Linke <jgr...@chromium.org>
unsatisfied_requirement
open
diffy

Jakob Linke (Gerrit)

unread,
Mar 11, 2026, 8:17:43 AMMar 11
to Patrick Thier, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org
Attention needed from Patrick Thier

Jakob Linke voted and added 5 comments

Votes added by Jakob Linke

Code-Review+1

5 comments

File src/builtins/builtins-regexp-gen.cc
Line 1361, Patchset 1: LoadObjectField<String>(data, RegExpData::kSourceOffset);
Jakob Linke . resolved

Should this be escaped_source?

Patrick Thier

Ah sorry forgot to leave a comment on the CL for this change.
It was intentional.
The behaviour is the same (We don't escape already escaped characters for the source and the parser handles both escaped/unescaped), but now the code matches the spec text (https://tc39.es/ecma262/#sec-regexp-pattern-flags `4.a Let P be pattern.[[OriginalSource]].`)
This also makes the intended behaviour more obvious in my opinion (I had to triple check that the previous version was actually working as intended).

Jakob Linke

Acknowledged

Line 1480, Patchset 1: LoadObjectField<String>(data, RegExpData::kSourceOffset);
Jakob Linke . resolved

Same here.

Patrick Thier

Same reasoning as above, matches the spec text (https://tc39.es/ecma262/#sec-regexp.prototype.compile `3.b Let P be pattern.[[OriginalSource]].`)

Jakob Linke

Acknowledged

File src/objects/js-regexp.tq
Line 87, Patchset 1: return data.escaped_source;
Jakob Linke . unresolved

It's confusing that RegExpData.escaped_source is "renamed" to source on JSRegExp, especially since RegExpData has another var named `source`. I think it would help a lot to make the names in RegExpData very explicit.

Wdyt about eg `unmodified_source`, `escaped_source`?

Patrick Thier

This just matches the previous behaviour.
I would propose:
`JSRegExp.source` to match `RegExp.prototype.source`, `RegExpData.original_source` and `RegExpData.escaped_source` to distinguish them in `RegExpData`.
Since this is another pretty big change (I think) I would like to do it in a follow-up.

Jakob Linke

Right, it matches previous behavior but what I meant is that it's especially confusing now that JSRegExp and RegExpData both have `source`, but one is escaped and the other not.

Addressing in a followup is okay by me, although one (perhaps easy) alternative would be to swap the RegExpData fields in this commit:

  • source stores the escaped version and corresponds to JSRegExp.source.
  • original_source the unescaped version.

Perhaps that's just a matter of a few find/replace? Anyway, up to you.

File src/regexp/regexp.cc
Line 889, Patchset 1: Compile(isolate, &zone, &compile_data, flags, escaped_pattern,
Jakob Linke . unresolved

This now passes the escaped version which ends up flowing into https://source.chromium.org/chromium/chromium/src/+/main:v8/src/regexp/regexp.cc;l=1206;drc=10b6082ae57b34c1d623275ed00dae987fb2a590

Was this on purpose? Not a huge change, but potentially confusing.

Patrick Thier

Yes. That matches the previous behaviour (before the introduction of `RegExpData`).

Jakob Linke

Are you saying that the state before this patch (which passes `re_data->source()`) was wrong and we originally passed the escaped source here? Do you perhaps have a pointer to a cl that introduced the change?

(If so, it's a bit weird to use the escaped version for TooMuchRegExpCode since it penalizes any pattern that happens to need a lot of escaping.)

Line 1054, Patchset 1: DirectHandle<String> pattern(re_data->escaped_source(), isolate);
Jakob Linke . resolved

We now pass `source` when the callee uses the string for parsing, and `escaped_source` for logging etc. This is totally implicit though in the signatures, and not clear which is appropriate (especially since the parameters for some callees are still called just `source`).

Wdyt about passing re_data instead and using the appropriate string directly at the use site? Some callees may even want to use both.

Patrick Thier

Yeah at least the names should be explicit (going to change this in the follow-up where I am going to rename `RegExpData.source`.
Will also evaluate where it makes sense to just pass `RegExpData` instead.

Jakob Linke

Ack. Fwiw I *think* I prefer the latter just to make the uses less confusing.

Open in Gerrit

Related details

Attention is currently required from:
  • Patrick Thier
Submit Requirements:
    • requirement is not satisfiedCode-Owners
    • requirement satisfiedCode-Review
    • requirement is not satisfiedNo-Unresolved-Comments
    • requirement satisfiedReview-Enforcement
    Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
    Gerrit-MessageType: comment
    Gerrit-Project: v8/v8
    Gerrit-Branch: main
    Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
    Gerrit-Change-Number: 7654235
    Gerrit-PatchSet: 2
    Gerrit-Owner: Patrick Thier <pth...@chromium.org>
    Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
    Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
    Gerrit-CC: Hannes Payer <hpa...@chromium.org>
    Gerrit-Attention: Patrick Thier <pth...@chromium.org>
    Gerrit-Comment-Date: Wed, 11 Mar 2026 12:17:39 +0000
    Gerrit-HasComments: Yes
    Gerrit-Has-Labels: Yes
    Comment-In-Reply-To: Patrick Thier <pth...@chromium.org>
    Comment-In-Reply-To: Jakob Linke <jgr...@chromium.org>
    unsatisfied_requirement
    satisfied_requirement
    open
    diffy

    Patrick Thier (Gerrit)

    unread,
    Mar 11, 2026, 8:34:29 AMMar 11
    to Jakob Linke, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org
    Attention needed from Jakob Linke

    Patrick Thier added 2 comments

    File src/objects/js-regexp.tq
    Line 87, Patchset 1: return data.escaped_source;
    Jakob Linke . unresolved

    It's confusing that RegExpData.escaped_source is "renamed" to source on JSRegExp, especially since RegExpData has another var named `source`. I think it would help a lot to make the names in RegExpData very explicit.

    Wdyt about eg `unmodified_source`, `escaped_source`?

    Patrick Thier

    This just matches the previous behaviour.
    I would propose:
    `JSRegExp.source` to match `RegExp.prototype.source`, `RegExpData.original_source` and `RegExpData.escaped_source` to distinguish them in `RegExpData`.
    Since this is another pretty big change (I think) I would like to do it in a follow-up.

    Jakob Linke

    Right, it matches previous behavior but what I meant is that it's especially confusing now that JSRegExp and RegExpData both have `source`, but one is escaped and the other not.

    Addressing in a followup is okay by me, although one (perhaps easy) alternative would be to swap the RegExpData fields in this commit:

    • source stores the escaped version and corresponds to JSRegExp.source.
    • original_source the unescaped version.

    Perhaps that's just a matter of a few find/replace? Anyway, up to you.

    Patrick Thier

    Yeah I agree that this is a bit confusing. Follow-up is already prepared. I prefer the explicit version. Otherwise I think we have the same problem where `source()` suddenly is escaped where we didn't expect it.

    File src/regexp/regexp.cc
    Line 889, Patchset 1: Compile(isolate, &zone, &compile_data, flags, escaped_pattern,
    Jakob Linke . unresolved

    This now passes the escaped version which ends up flowing into https://source.chromium.org/chromium/chromium/src/+/main:v8/src/regexp/regexp.cc;l=1206;drc=10b6082ae57b34c1d623275ed00dae987fb2a590

    Was this on purpose? Not a huge change, but potentially confusing.

    Patrick Thier

    Yes. That matches the previous behaviour (before the introduction of `RegExpData`).

    Jakob Linke

    Are you saying that the state before this patch (which passes `re_data->source()`) was wrong and we originally passed the escaped source here? Do you perhaps have a pointer to a cl that introduced the change?

    (If so, it's a bit weird to use the escaped version for TooMuchRegExpCode since it penalizes any pattern that happens to need a lot of escaping.)

    Patrick Thier

    Previously (before the CL that accidently changed the behaviour; linked below) we just passed the escaped (`JSRegExp::source()`).
    This is the CL: https://crrev.com/c/5713168
    I don't think either of them is "wrong" since `TooMuchRegExpCode` is just a very rough heuristic (and diffs between `original_source()` and `escaped_source()` are probably very rare anyways).
    Anyways in the follow-up I already changed it again to `original_source()`, since that makes more sense.

    Open in Gerrit

    Related details

    Attention is currently required from:
    • Jakob Linke
    Submit Requirements:
    • requirement is not satisfiedCode-Owners
    • requirement satisfiedCode-Review
    • requirement is not satisfiedNo-Unresolved-Comments
    • requirement satisfiedReview-Enforcement
    Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
    Gerrit-MessageType: comment
    Gerrit-Project: v8/v8
    Gerrit-Branch: main
    Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
    Gerrit-Change-Number: 7654235
    Gerrit-PatchSet: 2
    Gerrit-Owner: Patrick Thier <pth...@chromium.org>
    Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
    Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
    Gerrit-CC: Hannes Payer <hpa...@chromium.org>
    Gerrit-Attention: Jakob Linke <jgr...@chromium.org>
    Gerrit-Comment-Date: Wed, 11 Mar 2026 12:34:24 +0000
    Gerrit-HasComments: Yes
    Gerrit-Has-Labels: No
    unsatisfied_requirement
    satisfied_requirement
    open
    diffy

    Patrick Thier (Gerrit)

    unread,
    Mar 11, 2026, 8:37:35 AMMar 11
    to Dominik Inführ, Jakob Linke, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org
    Attention needed from Dominik Inführ and Jakob Linke

    Patrick Thier added 1 comment

    Patchset-level comments
    Patrick Thier . resolved

    +Dominik PTAL as owner of `src/profiler/heap-snapshot-generator.cc`

    Open in Gerrit

    Related details

    Attention is currently required from:
    • Dominik Inführ
    • Jakob Linke
    Submit Requirements:
    • requirement is not satisfiedCode-Owners
    • requirement satisfiedCode-Review
    • requirement is not satisfiedNo-Unresolved-Comments
    • requirement satisfiedReview-Enforcement
    Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
    Gerrit-MessageType: comment
    Gerrit-Project: v8/v8
    Gerrit-Branch: main
    Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
    Gerrit-Change-Number: 7654235
    Gerrit-PatchSet: 2
    Gerrit-Owner: Patrick Thier <pth...@chromium.org>
    Gerrit-Reviewer: Dominik Inführ <dinf...@chromium.org>
    Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
    Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
    Gerrit-CC: Hannes Payer <hpa...@chromium.org>
    Gerrit-Attention: Jakob Linke <jgr...@chromium.org>
    Gerrit-Attention: Dominik Inführ <dinf...@chromium.org>
    Gerrit-Comment-Date: Wed, 11 Mar 2026 12:37:29 +0000
    Gerrit-HasComments: Yes
    Gerrit-Has-Labels: No
    unsatisfied_requirement
    satisfied_requirement
    open
    diffy

    Jakob Linke (Gerrit)

    unread,
    Mar 11, 2026, 8:48:35 AMMar 11
    to Patrick Thier, Dominik Inführ, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org
    Attention needed from Dominik Inführ and Patrick Thier

    Jakob Linke added 2 comments

    File src/objects/js-regexp.tq
    Line 87, Patchset 1: return data.escaped_source;
    Jakob Linke . resolved

    It's confusing that RegExpData.escaped_source is "renamed" to source on JSRegExp, especially since RegExpData has another var named `source`. I think it would help a lot to make the names in RegExpData very explicit.

    Wdyt about eg `unmodified_source`, `escaped_source`?

    Patrick Thier

    This just matches the previous behaviour.
    I would propose:
    `JSRegExp.source` to match `RegExp.prototype.source`, `RegExpData.original_source` and `RegExpData.escaped_source` to distinguish them in `RegExpData`.
    Since this is another pretty big change (I think) I would like to do it in a follow-up.

    Jakob Linke

    Right, it matches previous behavior but what I meant is that it's especially confusing now that JSRegExp and RegExpData both have `source`, but one is escaped and the other not.

    Addressing in a followup is okay by me, although one (perhaps easy) alternative would be to swap the RegExpData fields in this commit:

    • source stores the escaped version and corresponds to JSRegExp.source.
    • original_source the unescaped version.

    Perhaps that's just a matter of a few find/replace? Anyway, up to you.

    Patrick Thier

    Yeah I agree that this is a bit confusing. Follow-up is already prepared. I prefer the explicit version. Otherwise I think we have the same problem where `source()` suddenly is escaped where we didn't expect it.

    Jakob Linke

    Yeah, I meant as an incremental step :) sgtm.

    File src/regexp/regexp.cc
    Line 889, Patchset 1: Compile(isolate, &zone, &compile_data, flags, escaped_pattern,
    Jakob Linke . resolved

    This now passes the escaped version which ends up flowing into https://source.chromium.org/chromium/chromium/src/+/main:v8/src/regexp/regexp.cc;l=1206;drc=10b6082ae57b34c1d623275ed00dae987fb2a590

    Was this on purpose? Not a huge change, but potentially confusing.

    Patrick Thier

    Yes. That matches the previous behaviour (before the introduction of `RegExpData`).

    Jakob Linke

    Are you saying that the state before this patch (which passes `re_data->source()`) was wrong and we originally passed the escaped source here? Do you perhaps have a pointer to a cl that introduced the change?

    (If so, it's a bit weird to use the escaped version for TooMuchRegExpCode since it penalizes any pattern that happens to need a lot of escaping.)

    Patrick Thier

    Previously (before the CL that accidently changed the behaviour; linked below) we just passed the escaped (`JSRegExp::source()`).
    This is the CL: https://crrev.com/c/5713168
    I don't think either of them is "wrong" since `TooMuchRegExpCode` is just a very rough heuristic (and diffs between `original_source()` and `escaped_source()` are probably very rare anyways).
    Anyways in the follow-up I already changed it again to `original_source()`, since that makes more sense.

    Jakob Linke

    Agreed, ty.

    Open in Gerrit

    Related details

    Attention is currently required from:
    • Dominik Inführ
    • Patrick Thier
    Submit Requirements:
      • requirement is not satisfiedCode-Owners
      • requirement satisfiedCode-Review
      • requirement satisfiedReview-Enforcement
      Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
      Gerrit-MessageType: comment
      Gerrit-Project: v8/v8
      Gerrit-Branch: main
      Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
      Gerrit-Change-Number: 7654235
      Gerrit-PatchSet: 2
      Gerrit-Owner: Patrick Thier <pth...@chromium.org>
      Gerrit-Reviewer: Dominik Inführ <dinf...@chromium.org>
      Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
      Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
      Gerrit-CC: Hannes Payer <hpa...@chromium.org>
      Gerrit-Attention: Patrick Thier <pth...@chromium.org>
      Gerrit-Attention: Dominik Inführ <dinf...@chromium.org>
      Gerrit-Comment-Date: Wed, 11 Mar 2026 12:48:29 +0000
      Gerrit-HasComments: Yes
      Gerrit-Has-Labels: No
      unsatisfied_requirement
      satisfied_requirement
      open
      diffy

      Dominik Inführ (Gerrit)

      unread,
      Mar 11, 2026, 9:03:25 AMMar 11
      to Patrick Thier, Jakob Linke, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org
      Attention needed from Patrick Thier

      Dominik Inführ voted and added 1 comment

      Votes added by Dominik Inführ

      Code-Review+1

      1 comment

      Patchset-level comments
      Dominik Inführ . resolved

      Thanks, LGTM

      Open in Gerrit

      Related details

      Attention is currently required from:
      • Patrick Thier
      Submit Requirements:
      • requirement satisfiedCode-Owners
      • requirement satisfiedCode-Review
      • requirement satisfiedReview-Enforcement
      Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
      Gerrit-MessageType: comment
      Gerrit-Project: v8/v8
      Gerrit-Branch: main
      Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
      Gerrit-Change-Number: 7654235
      Gerrit-PatchSet: 2
      Gerrit-Owner: Patrick Thier <pth...@chromium.org>
      Gerrit-Reviewer: Dominik Inführ <dinf...@chromium.org>
      Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
      Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
      Gerrit-CC: Hannes Payer <hpa...@chromium.org>
      Gerrit-Attention: Patrick Thier <pth...@chromium.org>
      Gerrit-Comment-Date: Wed, 11 Mar 2026 13:03:21 +0000
      Gerrit-HasComments: Yes
      Gerrit-Has-Labels: Yes
      satisfied_requirement
      open
      diffy

      Patrick Thier (Gerrit)

      unread,
      Mar 11, 2026, 9:40:55 AMMar 11
      to Dominik Inführ, Jakob Linke, V8 LUCI CQ, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org

      Patrick Thier voted Commit-Queue+2

      Commit-Queue+2
      Open in Gerrit

      Related details

      Attention set is empty
      Submit Requirements:
      • requirement satisfiedCode-Owners
      • requirement satisfiedCode-Review
      • requirement satisfiedReview-Enforcement
      Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
      Gerrit-MessageType: comment
      Gerrit-Project: v8/v8
      Gerrit-Branch: main
      Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
      Gerrit-Change-Number: 7654235
      Gerrit-PatchSet: 2
      Gerrit-Owner: Patrick Thier <pth...@chromium.org>
      Gerrit-Reviewer: Dominik Inführ <dinf...@chromium.org>
      Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
      Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
      Gerrit-CC: Hannes Payer <hpa...@chromium.org>
      Gerrit-Comment-Date: Wed, 11 Mar 2026 13:40:51 +0000
      Gerrit-HasComments: No
      Gerrit-Has-Labels: Yes
      satisfied_requirement
      open
      diffy

      V8 LUCI CQ (Gerrit)

      unread,
      Mar 11, 2026, 9:42:42 AMMar 11
      to Patrick Thier, Dominik Inführ, Jakob Linke, Hannes Payer, devtools-...@chromium.org, dmercadi...@chromium.org, jgrube...@chromium.org, leszek...@chromium.org, mlippau...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com, verwaes...@chromium.org, victorgo...@chromium.org

      V8 LUCI CQ submitted the change

      Change information

      Commit message:
      [regexp] Move JSRegExp::source() to RegExpData::escaped_source()

      - Move JSRegExp::source() to RegExpData::escaped_source().
      - Remove RegExpBoilerplateDescription::source.
      - Pass RegExpData::escaped_source() instead of RegExpData::source()
      through the compile pipeline (used by loggers).
      Fixed: 490381618
      Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
      Reviewed-by: Jakob Linke <jgr...@chromium.org>
      Reviewed-by: Dominik Inführ <dinf...@chromium.org>
      Commit-Queue: Patrick Thier <pth...@chromium.org>
      Cr-Commit-Position: refs/heads/main@{#105732}
      Files:
      • M src/api/api.cc
      • M src/builtins/base.tq
      • M src/builtins/builtins-constructor-gen.cc
      • M src/builtins/builtins-regexp-gen.cc
      • M src/builtins/regexp-match.tq
      • M src/builtins/regexp.tq
      • M src/compiler/access-builder.cc
      • M src/compiler/access-builder.h
      • M src/compiler/heap-refs.cc
      • M src/compiler/heap-refs.h
      • M src/compiler/js-create-lowering.cc
      • M src/diagnostics/objects-debug.cc
      • M src/diagnostics/objects-printer.cc
      • M src/heap/factory-base.cc
      • M src/heap/factory-base.h
      • M src/heap/factory.cc
      • M src/heap/factory.h
      • M src/maglev/maglev-graph-builder.cc
      • M src/maglev/maglev-ir.h
      • M src/objects/js-objects.cc
      • M src/objects/js-regexp-inl.h
      • M src/objects/js-regexp.cc
      • M src/objects/js-regexp.h
      • M src/objects/js-regexp.tq
      • M src/objects/literal-objects-inl.h
      • M src/objects/literal-objects.cc
      • M src/objects/literal-objects.h
      • M src/objects/literal-objects.tq
      • M src/objects/objects-body-descriptors-inl.h
      • M src/objects/value-serializer.cc
      • M src/profiler/heap-snapshot-generator.cc
      • M src/regexp/experimental/experimental.cc
      • M src/regexp/experimental/experimental.h
      • M src/regexp/regexp.cc
      • M src/runtime/runtime-literals.cc
      • M test/unittests/regexp/regexp-unittest.cc
      • M tools/gen-postmortem-metadata.py
      Change size: L
      Delta: 37 files changed, 288 insertions(+), 317 deletions(-)
      Branch: refs/heads/main
      Submit Requirements:
      • requirement satisfiedCode-Review: +1 by Jakob Linke, +1 by Dominik Inführ
      Open in Gerrit
      Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
      Gerrit-MessageType: merged
      Gerrit-Project: v8/v8
      Gerrit-Branch: main
      Gerrit-Change-Id: Ida57bfe80de9da2d7b9b9ebf987d394f8011a1f9
      Gerrit-Change-Number: 7654235
      Gerrit-PatchSet: 3
      Gerrit-Owner: Patrick Thier <pth...@chromium.org>
      Gerrit-Reviewer: Dominik Inführ <dinf...@chromium.org>
      Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
      Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
      Gerrit-CC: Hannes Payer <hpa...@chromium.org>
      open
      diffy
      satisfied_requirement
      Reply all
      Reply to author
      Forward
      0 new messages