[regexp] A different approach to the not-at-start optimization [v8/v8 : main]

0 views
Skip to first unread message

chromeperf@appspot.gserviceaccount.com (Gerrit)

unread,
Dec 10, 2025, 1:57:58 AM (6 days ago) Dec 10
to Erik Corry, V8 LUCI CQ, jgrube...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com
Attention needed from Erik Corry

Message from chrom...@appspot.gserviceaccount.com

📍 Job mac-m4-mini-perf/jetstream2 complete.

See results at: https://pinpoint-dot-chromeperf.appspot.com/job/1041c387310000

Open in Gerrit

Related details

Attention is currently required from:
  • Erik Corry
Submit Requirements:
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: v8/v8
Gerrit-Branch: main
Gerrit-Change-Id: I29f1dadee3c8d696081d36b42f9090f9d4bb82ae
Gerrit-Change-Number: 7241453
Gerrit-PatchSet: 4
Gerrit-Owner: Erik Corry <erik...@chromium.org>
Gerrit-Reviewer: Erik Corry <erik...@chromium.org>
Gerrit-Attention: Erik Corry <erik...@chromium.org>
Gerrit-Comment-Date: Wed, 10 Dec 2025 06:57:55 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: No
unsatisfied_requirement
open
diffy

Erik Corry (Gerrit)

unread,
Dec 10, 2025, 3:47:20 AM (6 days ago) Dec 10
to chrom...@appspot.gserviceaccount.com, V8 LUCI CQ, jgrube...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com

Erik Corry added 1 comment

Patchset-level comments
File-level comment, Patchset 5 (Latest):
Erik Corry . resolved

The only one that gets slower is tagcloud-SP from Sunspider. The only regexp that is code generated differently is this regexp:

/(?:^|:|,)(?:\s*\[)+/g

The old code was using quickcheck, where the new one uses a bitmap to find the colon or comma.

Old:

label[00ea8408]: (Bind)
LoadCurrentCharacter(cp_offset=0, label[00000000] (1 chars) (eats at least 1));
CheckCharacterAfterAnd(c=0x0028((), mask=0x00e9, label[f1bd3498]);
label[f1bd34a4]: (Bind)
AdvanceCurrentPosition(by=1);
GoTo(label[00ea8408]);

New:

SkipUntilBitInTable(cp_offset=0, advance_by=1, on_match=label[20ec9540], on_no_match=label[20ec9540]
................................
............X.............X.....
................................
................................

Ironically this is because previously the Boyer-Moore analysis was failing and now it is succeeding.

Also interesting that this regexp got faster on my machine (Intel), but slower on tagcloud (Mac) so perhaps it's about the bit search instructions. On my machine it's not a win to disable Boyer-Moore for this regexp.

Open in Gerrit

Related details

Attention set is empty
Submit Requirements:
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: v8/v8
Gerrit-Branch: main
Gerrit-Change-Id: I29f1dadee3c8d696081d36b42f9090f9d4bb82ae
Gerrit-Change-Number: 7241453
Gerrit-PatchSet: 5
Gerrit-Comment-Date: Wed, 10 Dec 2025 08:47:15 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
unsatisfied_requirement
open
diffy

chromeperf@appspot.gserviceaccount.com (Gerrit)

unread,
Dec 10, 2025, 4:43:52 AM (6 days ago) Dec 10
to Erik Corry, V8 LUCI CQ, jgrube...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com
Attention needed from Erik Corry

Message from chrom...@appspot.gserviceaccount.com

📍 Job mac-m4-mini-perf/jetstream2 complete.

See results at: https://pinpoint-dot-chromeperf.appspot.com/job/11a25abd310000

Open in Gerrit

Related details

Attention is currently required from:
  • Erik Corry
Submit Requirements:
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: v8/v8
Gerrit-Branch: main
Gerrit-Change-Id: I29f1dadee3c8d696081d36b42f9090f9d4bb82ae
Gerrit-Change-Number: 7241453
Gerrit-PatchSet: 5
Gerrit-Owner: Erik Corry <erik...@chromium.org>
Gerrit-Reviewer: Erik Corry <erik...@chromium.org>
unsatisfied_requirement
open
diffy

chromeperf@appspot.gserviceaccount.com (Gerrit)

unread,
Dec 15, 2025, 9:51:34 AM (23 hours ago) Dec 15
to Erik Corry, Jakob Linke, V8 LUCI CQ, jgrube...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com
Attention needed from Erik Corry and Jakob Linke

Message from chrom...@appspot.gserviceaccount.com

📍 Job mac-m4-mini-perf/jetstream2 complete.

See results at: https://pinpoint-dot-chromeperf.appspot.com/job/16d1951cb10000

Open in Gerrit

Related details

Attention is currently required from:
  • Erik Corry
  • Jakob Linke
Submit Requirements:
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: v8/v8
Gerrit-Branch: main
Gerrit-Change-Id: I29f1dadee3c8d696081d36b42f9090f9d4bb82ae
Gerrit-Change-Number: 7241453
Gerrit-PatchSet: 6
Gerrit-Owner: Erik Corry <erik...@chromium.org>
Gerrit-Reviewer: Erik Corry <erik...@chromium.org>
Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
Gerrit-Attention: Jakob Linke <jgr...@chromium.org>
Gerrit-Attention: Erik Corry <erik...@chromium.org>
Gerrit-Comment-Date: Mon, 15 Dec 2025 14:51:29 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: No
unsatisfied_requirement
open
diffy

chromeperf@appspot.gserviceaccount.com (Gerrit)

unread,
Dec 15, 2025, 9:53:34 AM (23 hours ago) Dec 15
to Erik Corry, Jakob Linke, V8 LUCI CQ, jgrube...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com
Attention needed from Erik Corry and Jakob Linke

Message from chrom...@appspot.gserviceaccount.com

📍 Job mac-m4-mini-perf/jetstream2 complete.

See results at: https://pinpoint-dot-chromeperf.appspot.com/job/10b38617310000

Open in Gerrit

Related details

Attention is currently required from:
  • Erik Corry
  • Jakob Linke
Submit Requirements:
  • requirement is not satisfiedCode-Owners
  • requirement is not satisfiedCode-Review
  • requirement is not satisfiedReview-Enforcement
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
Gerrit-MessageType: comment
Gerrit-Project: v8/v8
Gerrit-Branch: main
Gerrit-Change-Id: I29f1dadee3c8d696081d36b42f9090f9d4bb82ae
Gerrit-Change-Number: 7241453
Gerrit-PatchSet: 6
Gerrit-Owner: Erik Corry <erik...@chromium.org>
Gerrit-Reviewer: Erik Corry <erik...@chromium.org>
Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
Gerrit-Attention: Jakob Linke <jgr...@chromium.org>
Gerrit-Attention: Erik Corry <erik...@chromium.org>
Gerrit-Comment-Date: Mon, 15 Dec 2025 14:53:29 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: No
unsatisfied_requirement
open
diffy

Jakob Linke (Gerrit)

unread,
8:19 AM (21 minutes ago) 8:19 AM
to Erik Corry, Patrick Thier, chrom...@appspot.gserviceaccount.com, V8 LUCI CQ, jgrube...@chromium.org, pthier...@chromium.org, v8-re...@googlegroups.com
Attention needed from Erik Corry and Patrick Thier

Jakob Linke added 17 comments

Commit Message
Line 7, Patchset 6 (Latest):[regexp] A different approach to the not-at-start optimization
Jakob Linke . unresolved

Very nice how we no longer pass not_at_start around.

It sounds like this is also expected to be a perf improvement. Do you know why we don't see that on pp, and why tagcloud regresses?

File src/regexp/regexp-compiler-tonode.cc
Line 35, Patchset 6 (Latest): if (on_success->IsBacktrack()) return on_success;
Jakob Linke . unresolved

Also here, as this is quite important please add a comment.

Line 1114, Patchset 6 (Latest): return zone->template New<EndNode>(EndNode::BACKTRACK, zone);
Jakob Linke . unresolved

Just curious, do you need the `template `? Below are calls that do fine without it.

File src/regexp/regexp-compiler.h
Line 609, Patchset 6 (Latest): bool not_at_start() const { return not_at_start_; }
Jakob Linke . unresolved

Optional: Since you're here, wdyt about flipping this to `at_start` to make conditions easier to read?

File src/regexp/regexp-compiler.cc
Line 638, Patchset 6 (Latest): assembler->GoTo(trace->backtrack());
Jakob Linke . unresolved

Please comment why this happens before everything else.

Isn't Bind needed to avoid dcheck failures?

Line 648, Patchset 6 (Latest): switch (action_) {
Jakob Linke . unresolved

Suggest: `DCHECK_EQ(action_, ACCEPT); ...`

Line 654, Patchset 6 (Latest): // This case is handled in a different virtual method.
Jakob Linke . unresolved

nit: this comment doesn't apply to BACKTRACK.

Line 1420, Patchset 6 (Parent): BoyerMooreLookahead* bm, bool not_at_start) {
Jakob Linke . resolved

Very nice!

Line 1485, Patchset 6 (Latest): if (action_ == BACKTRACK) return;
Jakob Linke . unresolved

CHECK_EQ(action_, BACKTRACK)

Line 1528, Patchset 6 (Latest):int RegExpNode::EatsAtLeast() { return eats_at_least_; }
Jakob Linke . unresolved

Why the change to int? Maybe uint8_t makes sense, also given the UINT8_MAX constants used elsewhere.

Line 2224, Patchset 6 (Latest): std::min(int{kMaxLookaheadForBoyerMoore}, EatsAtLeast());
Jakob Linke . unresolved

nit: static_cast, also below.

Line 2314, Patchset 6 (Latest): if (action_ == BACKTRACK) {
Jakob Linke . unresolved

nit: CHECK_EQ

Line 3659, Patchset 6 (Parent): // We are going to advance backward, so we may end up at the start.
Jakob Linke . unresolved

How does this new design handle the read_backwards case?

Line 3749, Patchset 6 (Latest): int eats_at_least = i == 0 ? UINT8_MAX : that->eats_at_least_info();
Jakob Linke . unresolved

nit: the underlying types aren't UINT8 anymore.

File src/regexp/regexp-nodes.h
Line 570, Patchset 6 (Latest): if (action == BACKTRACK) set_eats_at_least_info(UINT8_MAX);
Jakob Linke . unresolved

Also here. Wdyt of a named constant `kMaxEatsAtLeastValue`?

Line 253, Patchset 6 (Latest): virtual bool IsBacktrack() const { return false; }
Jakob Linke . resolved

I was going to suggest `bool Is##type##Node() { return As##type##Node() != nullptr; }`, but I see Backtrack is a special case of EndNode.

Line 228, Patchset 6 (Latest): int eats_at_least_info() const { return eats_at_least_; }
Jakob Linke . unresolved

nit: rename to remove the _info? Also the setter.

Open in Gerrit

Related details

Attention is currently required from:
  • Erik Corry
  • Patrick Thier
Submit Requirements:
    • requirement is not satisfiedCode-Owners
    • requirement is not satisfiedCode-Review
    • requirement is not satisfiedNo-Unresolved-Comments
    • requirement is not satisfiedReview-Enforcement
    Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. DiffyGerrit
    Gerrit-MessageType: comment
    Gerrit-Project: v8/v8
    Gerrit-Branch: main
    Gerrit-Change-Id: I29f1dadee3c8d696081d36b42f9090f9d4bb82ae
    Gerrit-Change-Number: 7241453
    Gerrit-PatchSet: 6
    Gerrit-Owner: Erik Corry <erik...@chromium.org>
    Gerrit-Reviewer: Erik Corry <erik...@chromium.org>
    Gerrit-Reviewer: Jakob Linke <jgr...@chromium.org>
    Gerrit-Reviewer: Patrick Thier <pth...@chromium.org>
    Gerrit-Attention: Patrick Thier <pth...@chromium.org>
    Gerrit-Attention: Erik Corry <erik...@chromium.org>
    Gerrit-Comment-Date: Tue, 16 Dec 2025 13:19:50 +0000
    Gerrit-HasComments: Yes
    Gerrit-Has-Labels: No
    unsatisfied_requirement
    open
    diffy
    Reply all
    Reply to author
    Forward
    0 new messages