Issue 6633 in v8: RegExp is much faster than indexOf

ahmadbam… via monorail

Jul 25, 2017, 4:45:08 AM
to v8-re...@googlegroups.com
Status: Untriaged
Owner: ----

New issue 6633 by ahmadbam...@gmail.com: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633

Version: 851e8057e652d134fb032e42caff8b60fd40d363
OS: OSX

Let's take the following test case:

```
/lab(?:o)rum/.exec(loremString)

/laborum/.exec(loremString)
```

(I am testing against a lorem ipsum string where "laborum" is the last word)

The first expression (non-simple) is almost 2x faster than its string-literal-only (simple) counterpart.
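For anyone who wants to reproduce the comparison outside jsperf, a minimal timing sketch could look like the following. The subject text, the `timeExec` helper, and the iteration count are assumptions made for illustration; they are not the original jsperf setup, and absolute timings will vary by engine version.

```javascript
// Hypothetical stand-in for the lorem ipsum subject; "laborum" is the last word.
const loremString =
  "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod " +
  "tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim " +
  "veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea " +
  "commodo consequat. Excepteur sint occaecat cupidatat non proident, sunt " +
  "in culpa qui officia deserunt mollit anim id est laborum";

// Run re.exec(subject) `iterations` times; report elapsed time and match index.
function timeExec(re, subject, iterations) {
  const start = Date.now();
  let last = null;
  for (let i = 0; i < iterations; i++) {
    last = re.exec(subject);
  }
  return { ms: Date.now() - start, index: last === null ? -1 : last.index };
}

const simple = timeExec(/laborum/, loremString, 200000);
const nonSimple = timeExec(/lab(?:o)rum/, loremString, 200000);
console.log("simple:", simple.ms, "ms; non-simple:", nonSimple.ms, "ms");
```

Both expressions must report the same match index; only the elapsed time is expected to differ.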

Reading the source code, V8 does the following:

1. The compiler checks whether the expression is simple and, if so, opts out of regexp compilation by calling `AtomCompile`.

https://github.com/v8/v8/blob/46d815e7377e388dbcc81054801b35ac67f57b40/src/regexp/jsregexp.cc#L163

2. The `AtomCompile` does an `indexOf` search

https://github.com/v8/v8/blob/46d815e7377e388dbcc81054801b35ac67f57b40/src/regexp/jsregexp.cc#L206

3. The `indexOf` search uses `StringSearch`

https://github.com/v8/v8/blob/master/src/string-search.h#L59

4. the string search does a `BoyerMooreHorspoolSearch`
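
As a rough illustration of the last step, here is a textbook Boyer-Moore-Horspool search written in JavaScript. It is a sketch of the general technique only, not a port of V8's `string-search.h`; the function name `horspoolIndexOf` and the use of a `Map` for the shift table are choices made for this example.

```javascript
// Boyer-Moore-Horspool: index of the first occurrence of `pattern` in
// `subject` at or after `start`, or -1 if there is none.
function horspoolIndexOf(subject, pattern, start = 0) {
  const m = pattern.length;
  const n = subject.length;
  if (m === 0) return Math.min(start, n);

  // Bad-character table: for every pattern character except the last one,
  // the distance from its rightmost occurrence to the end of the pattern.
  const shift = new Map();
  for (let i = 0; i < m - 1; i++) {
    shift.set(pattern.charCodeAt(i), m - 1 - i);
  }

  let pos = start;
  while (pos <= n - m) {
    // Compare right to left.
    let j = m - 1;
    while (j >= 0 && subject.charCodeAt(pos + j) === pattern.charCodeAt(j)) {
      j--;
    }
    if (j < 0) return pos;
    // Skip ahead based on the subject character aligned with the pattern end.
    const last = subject.charCodeAt(pos + m - 1);
    pos += shift.has(last) ? shift.get(last) : m;
  }
  return -1;
}

console.log(horspoolIndexOf("lorem ipsum laborum", "laborum"));
```

The skip distance depends on how often the character under the pattern's last position is absent from the pattern, which is why the benefit varies so much with pattern and subject.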


As a result:
It seems that V8 opts out of regexp compilation for simple strings on the assumption that a plain atomic string search is faster; however, that is not the case in my tests.



Here is a link to the jsperf tests:
https://jsperf.com/simple-quantifier-optimization-2/1

Here is a link to the Stack Overflow question where the findings started:
https://stackoverflow.com/questions/45291700/javascript-regex-non-capture-faster-than-no-parenthesis-at-all




habl… via monorail

Jul 25, 2017, 5:12:23 AM
to v8-re...@googlegroups.com
Updates:
Components: Irregexp Compiler
Labels: Priority-2
Owner: jgr...@chromium.org
Status: Assigned

Comment #1 on issue 6633 by hab...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c1

That sounds like something we should check.

jgru… via monorail

Jul 25, 2017, 5:23:23 AM
to v8-re...@googlegroups.com

Comment #2 on issue 6633 by jgr...@google.com: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c2

Thanks for the detailed repro!

See also: https://crbug.com/v8/6462

We recently improved this by removing the runtime call for ATOM regexps. I'm not sure it makes sense to optimize every case as the switch between search algorithms will always be a trade-off.

bmeu… via monorail

Jul 25, 2017, 5:25:56 AM
to v8-re...@googlegroups.com
Updates:
Cc: ja...@chromium.org

Comment #3 on issue 6633 by bmeu...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c3

(No comment was entered for this change.)

jgru… via monorail

Jul 26, 2017, 9:37:05 AM
to v8-re...@googlegroups.com
Updates:
Status: WontFix

Comment #4 on issue 6633 by jgr...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c4

I had another look at this and there's nothing obvious going wrong. The string search heuristics (which first do a naive search and later switch to BoyerMooreHorspool) are just plain slower in this case than generated irregexp code.

The picture changes when picking a pattern that can be exploited better by BMH:

// The pattern is 'aaaaaaaaaabaaaaaaaaaabaaaaaaaaaab', which BMH can use to skip ahead.

ATOM: 1191.586
IRREGEXP: 2534.766

Again, IMHO there's no point in tweaking heuristics for this as we'll always get it wrong in some cases and pick the slower algorithm.

Microbenchmarks attached for reference.

Attachments:
bench-regexp-p-exec1-atom.js 839 bytes
bench-regexp-p-exec1-irregexp.js 800 bytes

jgru… via monorail

Jul 26, 2017, 10:33:31 AM
to v8-re...@googlegroups.com
Updates:
Cc: yan...@chromium.org
Labels: HW-All OS-All
Status: Assigned

Comment #5 on issue 6633 by jgr...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c5

Actually, on further investigation my 'atom' example from #4 is invalid (it also generates IRREGEXP code due to HasFewDifferentCharacters [0]).
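
To make the "few different characters" idea concrete, here is a rough JavaScript sketch of such a check. The lookahead of 8 characters, the minimum length of 2, and the factor of 3 are assumptions chosen for illustration; see [0] for the actual C++ code and constants.

```javascript
// Sketch of a low-alphabet check: a pattern prefix counts as having "few
// different characters" if it has at least three times as many characters
// as distinct characters. All thresholds here are assumed defaults.
function hasFewDifferentCharacters(pattern, lookahead = 8, tooShort = 2) {
  const length = Math.min(lookahead, pattern.length);
  if (length <= tooShort) return false;
  const seen = new Set();
  for (let i = 0; i < length; i++) {
    seen.add(pattern.charCodeAt(i) & 127); // bucket characters modulo 128
    if (seen.size * 3 > length) return false;
  }
  return true;
}

console.log(hasFewDifferentCharacters("aaaaaaaaaabaaaaaaaaaabaaaaaaaaaab"));
console.log(hasFewDifferentCharacters("laborum"));
```

Under these assumed thresholds the repetitive pattern from #4 is classified as low-alphabet, while "laborum" is not.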

I haven't been able to construct an example in which BoyerMooreHorspoolSearch (through ATOM regexps) is faster than generated IRREGEXP code.

The only significant speedup through ATOM regexps I've seen is with single-char patterns:

const atom = /a/;
const irregexp = /(?:a)/;

atom: 598.386
irregexp: 847.7489999999999

Perhaps we should reevaluate when to switch to ATOM vs. IRREGEXP kinds.

[0] https://cs.chromium.org/chromium/src/v8/src/regexp/jsregexp.cc?l=109&rcl=0e5abab4adf44b37b433e3cd361193b05d160b22

erikco… via monorail

Jul 26, 2017, 5:33:07 PM
to v8-re...@googlegroups.com

Comment #6 on issue 6633 by erik...@google.com: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c6

Getting rid of the Atom version sounds great. Thoughts:

* What about benchmark suites that use regexps?
* What is the memory use of Atom vs Irregexp? Some extensions are very heavy regexp users.
* What can be done to fix the single-char regexp case? :-)

jgru… via monorail

Jul 27, 2017, 3:30:37 AM
to v8-re...@googlegroups.com

Comment #7 on issue 6633 by jgr...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c7

I added this to my backlog. As a first step, we could change RegExpImpl::Compile to generate ATOM regexps only for the single-char case and check how benchmarks react.

Possibilities for the single-char case could be to call memchr either from IRREGEXP code or from RegExpExecInternal (which used to be RegExpExecStub).
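
At the JavaScript level, the single-char idea amounts to replacing the regexp machinery with one linear scan. The sketch below uses `String.prototype.indexOf` as a stand-in for `memchr`; `execSingleChar` is a hypothetical helper for illustration and only mimics a non-global, non-sticky `exec` without capture groups.

```javascript
// Sketch: exec for a one-character literal pattern, done as a single scan.
function execSingleChar(ch, subject) {
  const index = subject.indexOf(ch); // stands in for memchr over the raw characters
  if (index === -1) return null;
  // Build a result shaped like the array returned by RegExp.prototype.exec.
  const result = [ch];
  result.index = index;
  result.input = subject;
  return result;
}

console.log(execSingleChar("a", "banana").index, /a/.exec("banana").index);
```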

jgru… via monorail

Jul 28, 2017, 5:12:55 AM
to v8-re...@googlegroups.com

Comment #8 on issue 6633 by jgr...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c8

Irregexp code is fairly large for simple regexps:

d8> /(ab)/.exec(".")
Instruction size in bytes: 725

Atom regexps only store the pattern string (possibly twice), so we're looking at an overhead of several 100x.
But: Code is only generated once for each regexp literal in the source code, so it may just be an option to take this hit. Thoughts?

The Octane/RegExp score is unaffected by restricting Atom regexps to single-char patterns.

For reference, the disassembly of the above:

d8> /(ab)/.exec(".")
kind = REGEXP
name = IRREGEXP
compiler = unknown
Instructions (size = 725)
0x156930384060 0 e9d6000000 jmp 0x15693038413b <+0xdb>
0x156930384065 5 4883e904 REX.W subq rcx,0x4
0x156930384069 9 c70133010000 movl [rcx],0x133
0x15693038406f f 48a190889bd84d560000 REX.W movq rax,(0x564dd89b8890) ;; external reference (RegExpStack::limit_address())
0x156930384079 19 483bc8 REX.W cmpq rcx,rax
0x15693038407c 1c 0f8705000000 ja 0x156930384087 <+0x27>
0x156930384082 22 e848020000 call 0x1569303842cf <+0x26f>
0x156930384087 27 83ffff cmpl rdi,0xff
0x15693038408a 2a 0f8d27000000 jge 0x1569303840b7 <+0x57>
0x156930384090 30 0fb6543e01 movzxbl rdx,[rsi+rdi*1+0x1]
0x156930384095 35 48b8c9c792a3653c0000 REX.W movq rax,0x3c65a392c7c9 ;; object: 0x3c65a392c7c9 <ByteArray[128]>
0x15693038409f 3f 488bda REX.W movq rbx,rdx
0x1569303840a2 42 4883e37f REX.W andq rbx,0x7f
0x1569303840a6 46 807c180f00 cmpb [rax+rbx*1+0xf],0x0
0x1569303840ab 4b 0f8506000000 jnz 0x1569303840b7 <+0x57>
0x1569303840b1 51 4883c702 REX.W addq rdi,0x2
0x1569303840b5 55 ebd0 jmp 0x156930384087 <+0x27>
0x1569303840b7 57 83ffff cmpl rdi,0xff
0x1569303840ba 5a 0f8d9a010000 jge 0x15693038425a <+0x1fa>
0x1569303840c0 60 0fb7143e movzxwl rdx,[rsi+rdi*1]
0x1569303840c4 64 81fa61620000 cmpl rdx,0x6261
0x1569303840ca 6a 0f8406000000 jz 0x1569303840d6 <+0x76>
0x1569303840d0 70 4883c701 REX.W addq rdi,0x1
0x1569303840d4 74 ebb1 jmp 0x156930384087 <+0x27>
0x1569303840d6 76 4883e904 REX.W subq rcx,0x4
0x1569303840da 7a 8939 movl [rcx],rdi
0x1569303840dc 7c 48897db0 REX.W movq [rbp-0x50],rdi
0x1569303840e0 80 488d4702 REX.W leaq rax,[rdi+0x2]
0x1569303840e4 84 488945a8 REX.W movq [rbp-0x58],rax
0x1569303840e8 88 48897da0 REX.W movq [rbp-0x60],rdi
0x1569303840ec 8c 488d4702 REX.W leaq rax,[rdi+0x2]
0x1569303840f0 90 48894598 REX.W movq [rbp-0x68],rax
0x1569303840f4 94 4883c702 REX.W addq rdi,0x2
0x1569303840f8 98 4883e904 REX.W subq rcx,0x4
0x1569303840fc 9c c7011e010000 movl [rcx],0x11e
0x156930384102 a2 48a190889bd84d560000 REX.W movq rax,(0x564dd89b8890) ;; external reference (RegExpStack::limit_address())
0x15693038410c ac 483bc8 REX.W cmpq rcx,rax
0x15693038410f af 0f8705000000 ja 0x15693038411a <+0xba>
0x156930384115 b5 e8b5010000 call 0x1569303842cf <+0x26f>
0x15693038411a ba e9f1000000 jmp 0x156930384210 <+0x1b0>
0x15693038411f bf 488b45b8 REX.W movq rax,[rbp-0x48]
0x156930384123 c3 488945a0 REX.W movq [rbp-0x60],rax
0x156930384127 c7 48894598 REX.W movq [rbp-0x68],rax
0x15693038412b cb 486339 REX.W movsxlq rdi,[rcx]
0x15693038412e ce 4883c104 REX.W addq rcx,0x4
0x156930384132 d2 eb9c jmp 0x1569303840d0 <+0x70>
0x156930384134 d4 33c0 xorl rax,rax
0x156930384136 d6 e916010000 jmp 0x156930384251 <+0x1f1>
0x15693038413b db 55 push rbp
0x15693038413c dc 4889e5 REX.W movq rbp,rsp
0x15693038413f df 57 push rdi
0x156930384140 e0 56 push rsi
0x156930384141 e1 52 push rdx
0x156930384142 e2 51 push rcx
0x156930384143 e3 4150 push r8
0x156930384145 e5 4151 push r9
0x156930384147 e7 53 push rbx
0x156930384148 e8 6a00 push 0x0
0x15693038414a ea 6a00 push 0x0
0x15693038414c ec 4889e1 REX.W movq rcx,rsp
0x15693038414f ef 49baa8f197d84d560000 REX.W movq r10,0x564dd897f1a8 ;; external reference (StackGuard::address_of_jslimit())
0x156930384159 f9 492b0a REX.W subq rcx,[r10]
0x15693038415c fc 0f8616000000 jna 0x156930384178 <+0x118>
0x156930384162 102 4883f920 REX.W cmpq rcx,0x20
0x156930384166 106 0f8350000000 jnc 0x1569303841bc <+0x15c>
0x15693038416c 10c 48c7c0ffffffff REX.W movq rax,0xffffffff
0x156930384173 113 e9d9000000 jmp 0x156930384251 <+0x1f1>
0x156930384178 118 49b80140383069150000 REX.W movq r8,0x156930384001 ;; object: 0x156930384001 <Code REGEXP>
0x156930384182 122 4989e2 REX.W movq r10,rsp
0x156930384185 125 4883ec08 REX.W subq rsp,0x8
0x156930384189 129 4883e4f0 REX.W andq rsp,0xf0
0x15693038418d 12d 4c891424 REX.W movq [rsp],r10
0x156930384191 131 488bd5 REX.W movq rdx,rbp
0x156930384194 134 498bf0 REX.W movq rsi,r8
0x156930384197 137 488d7c24f8 REX.W leaq rdi,[rsp-0x8]
0x15693038419c 13c 48b8006f7c24207f0000 REX.W movq rax,0x7f20247c6f00 ;; external reference (RegExpMacroAssembler*::CheckStackGuardState())
0x1569303841a6 146 40f6c40f testb rsp,0xf
0x1569303841aa 14a 7401 jz 0x1569303841ad <+0x14d>
0x1569303841ac 14c cc int3l
0x1569303841ad 14d ffd0 call rax
0x1569303841af 14f 488b2424 REX.W movq rsp,[rsp]
0x1569303841b3 153 4885c0 REX.W testq rax,rax
0x1569303841b6 156 0f8595000000 jnz 0x156930384251 <+0x1f1>
0x1569303841bc 15c 4883ec20 REX.W subq rsp,0x20
0x1569303841c0 160 488b75e0 REX.W movq rsi,[rbp-0x20]
0x1569303841c4 164 488b7de8 REX.W movq rdi,[rbp-0x18]
0x1569303841c8 168 482bfe REX.W subq rdi,rsi
0x1569303841cb 16b 488b5df0 REX.W movq rbx,[rbp-0x10]
0x1569303841cf 16f 48f7db REX.W negq rbx
0x1569303841d2 172 488d441fff REX.W leaq rax,[rdi+rbx*1-0x1]
0x1569303841d7 177 488945b8 REX.W movq [rbp-0x48],rax
0x1569303841db 17b 49b80140383069150000 REX.W movq r8,0x156930384001 ;; object: 0x156930384001 <Code REGEXP>
0x1569303841e5 185 837df000 cmpl [rbp-0x10],0x0
0x1569303841e9 189 7507 jnz 0x1569303841f2 <+0x192>
0x1569303841eb 18b ba0a000000 movl rdx,0xa
0x1569303841f0 190 eb05 jmp 0x1569303841f7 <+0x197>
0x1569303841f2 192 0fb6543eff movzxbl rdx,[rsi+rdi*1-0x1]
0x1569303841f7 197 488945b0 REX.W movq [rbp-0x50],rax
0x1569303841fb 19b 488945a8 REX.W movq [rbp-0x58],rax
0x1569303841ff 19f 488945a0 REX.W movq [rbp-0x60],rax
0x156930384203 1a3 48894598 REX.W movq [rbp-0x68],rax
0x156930384207 1a7 488b4d10 REX.W movq rcx,[rbp+0x10]
0x15693038420b 1ab e955feffff jmp 0x156930384065 <+0x5>
0x156930384210 1b0 488b55f0 REX.W movq rdx,[rbp-0x10]
0x156930384214 1b4 488b5dd8 REX.W movq rbx,[rbp-0x28]
0x156930384218 1b8 488b4de0 REX.W movq rcx,[rbp-0x20]
0x15693038421c 1bc 482b4de8 REX.W subq rcx,[rbp-0x18]
0x156930384220 1c0 4803ca REX.W addq rcx,rdx
0x156930384223 1c3 488b45b0 REX.W movq rax,[rbp-0x50]
0x156930384227 1c7 4803c1 REX.W addq rax,rcx
0x15693038422a 1ca 8903 movl [rbx],rax
0x15693038422c 1cc 488b45a8 REX.W movq rax,[rbp-0x58]
0x156930384230 1d0 4803c1 REX.W addq rax,rcx
0x156930384233 1d3 894304 movl [rbx+0x4],rax
0x156930384236 1d6 488b45a0 REX.W movq rax,[rbp-0x60]
0x15693038423a 1da 4803c1 REX.W addq rax,rcx
0x15693038423d 1dd 894308 movl [rbx+0x8],rax
0x156930384240 1e0 488b4598 REX.W movq rax,[rbp-0x68]
0x156930384244 1e4 4803c1 REX.W addq rax,rcx
0x156930384247 1e7 89430c movl [rbx+0xc],rax
0x15693038424a 1ea 48c7c001000000 REX.W movq rax,0x1
0x156930384251 1f1 488b5dc8 REX.W movq rbx,[rbp-0x38]
0x156930384255 1f5 488be5 REX.W movq rsp,rbp
0x156930384258 1f8 5d pop rbp
0x156930384259 1f9 c3 retl
0x15693038425a 1fa 48a1a8f197d84d560000 REX.W movq rax,(0x564dd897f1a8) ;; external reference (StackGuard::address_of_jslimit())
0x156930384264 204 483be0 REX.W cmpq rsp,rax
0x156930384267 207 0f8705000000 ja 0x156930384272 <+0x212>
0x15693038426d 20d e80c000000 call 0x15693038427e <+0x21e>
0x156930384272 212 486319 REX.W movsxlq rbx,[rcx]
0x156930384275 215 4883c104 REX.W addq rcx,0x4
0x156930384279 219 4903d8 REX.W addq rbx,r8
0x15693038427c 21c ffe3 jmp rbx
0x15693038427e 21e 4c290424 REX.W subq [rsp],r8
0x156930384282 222 51 push rcx
0x156930384283 223 57 push rdi
0x156930384284 224 4989e2 REX.W movq r10,rsp
0x156930384287 227 4883ec08 REX.W subq rsp,0x8
0x15693038428b 22b 4883e4f0 REX.W andq rsp,0xf0
0x15693038428f 22f 4c891424 REX.W movq [rsp],r10
0x156930384293 233 488bd5 REX.W movq rdx,rbp
0x156930384296 236 498bf0 REX.W movq rsi,r8
0x156930384299 239 488d7c24f8 REX.W leaq rdi,[rsp-0x8]
0x15693038429e 23e 48b8006f7c24207f0000 REX.W movq rax,0x7f20247c6f00 ;; external reference (RegExpMacroAssembler*::CheckStackGuardState())
0x1569303842a8 248 40f6c40f testb rsp,0xf
0x1569303842ac 24c 7401 jz 0x1569303842af <+0x24f>
0x1569303842ae 24e cc int3l
0x1569303842af 24f ffd0 call rax
0x1569303842b1 251 488b2424 REX.W movq rsp,[rsp]
0x1569303842b5 255 4885c0 REX.W testq rax,rax
0x1569303842b8 258 7597 jnz 0x156930384251 <+0x1f1>
0x1569303842ba 25a 49b80140383069150000 REX.W movq r8,0x156930384001 ;; object: 0x156930384001 <Code REGEXP>
0x1569303842c4 264 5f pop rdi
0x1569303842c5 265 59 pop rcx
0x1569303842c6 266 488b75e0 REX.W movq rsi,[rbp-0x20]
0x1569303842ca 26a 4c010424 REX.W addq [rsp],r8
0x1569303842ce 26e c3 retl
0x1569303842cf 26f 4c290424 REX.W subq [rsp],r8
0x1569303842d3 273 56 push rsi
0x1569303842d4 274 57 push rdi
0x1569303842d5 275 4989e2 REX.W movq r10,rsp
0x1569303842d8 278 4883ec08 REX.W subq rsp,0x8
0x1569303842dc 27c 4883e4f0 REX.W andq rsp,0xf0
0x1569303842e0 280 4c891424 REX.W movq [rsp],r10
0x1569303842e4 284 488bf9 REX.W movq rdi,rcx
0x1569303842e7 287 488d7510 REX.W leaq rsi,[rbp+0x10]
0x1569303842eb 28b 48bac0d897d84d560000 REX.W movq rdx,0x564dd897d8c0 ;; external reference (isolate)
0x1569303842f5 295 48b820aa4f24207f0000 REX.W movq rax,0x7f20244faa20 ;; external reference (NativeRegExpMacroAssembler::GrowStack())
0x1569303842ff 29f 40f6c40f testb rsp,0xf
0x156930384303 2a3 7401 jz 0x156930384306 <+0x2a6>
0x156930384305 2a5 cc int3l
0x156930384306 2a6 ffd0 call rax
0x156930384308 2a8 488b2424 REX.W movq rsp,[rsp]
0x15693038430c 2ac 4885c0 REX.W testq rax,rax
0x15693038430f 2af 0f8414000000 jz 0x156930384329 <+0x2c9>
0x156930384315 2b5 488bc8 REX.W movq rcx,rax
0x156930384318 2b8 49b80140383069150000 REX.W movq r8,0x156930384001 ;; object: 0x156930384001 <Code REGEXP>
0x156930384322 2c2 5f pop rdi
0x156930384323 2c3 5e pop rsi
0x156930384324 2c4 4c010424 REX.W addq [rsp],r8
0x156930384328 2c8 c3 retl
0x156930384329 2c9 48c7c0ffffffff REX.W movq rax,0xffffffff
0x156930384330 2d0 e91cffffff jmp 0x156930384251 <+0x1f1>

yang… via monorail

Jul 28, 2017, 5:40:12 AM
to v8-re...@googlegroups.com

Comment #9 on issue 6633 by yan...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c9

Part of the code of a regexp is boilerplate, for example to deal with repeating in a global regexp. Maybe we can factor that out into builtins? Not sure about priority, though.

ahmadbam… via monorail

Jul 28, 2017, 6:52:57 AM
to v8-re...@googlegroups.com

Comment #10 on issue 6633 by ahmadbam...@gmail.com: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c10

In my opinion, the byte size should not be much of a concern in this case:
1. Users are aware that they are using regex; if they care that much about the difference, they can choose otherwise.
2. It's generated once per expression over the lifetime of the app.

jgru… via monorail

Jul 28, 2017, 6:57:36 AM
to v8-re...@googlegroups.com

Comment #11 on issue 6633 by jgr...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c11

I have a CL here (http://crrev.com/c/589435) that limits ATOM regexps to single-char patterns if you'd like to try this out locally.

bugdro… via monorail

Aug 2, 2017, 8:26:16 AM
to v8-re...@googlegroups.com

Comment #12 on issue 6633 by bugd...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c12

The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/062bb7d487b934d9e73df5ade31e887fe6d0f83c

commit 062bb7d487b934d9e73df5ade31e887fe6d0f83c
Author: jgruber <jgr...@chromium.org>
Date: Wed Aug 02 12:25:23 2017

[regexp] Limit ATOM regexps to single-character patterns

There's an inherent trade-off when deciding between ATOM and IRREGEXP
regexps: IRREGEXP is faster at runtime for all but trivial single-character
patterns, while ATOM regexps have a lower memory overhead.

This CL is intended to help investigate impact on benchmarks and real-world
code - if something tanks, it's easy to revert, otherwise it can be a first
step towards a possible removal of ATOM regexps.

Bug: v8:6633
Change-Id: Ia41d8eb28d33952735562d3d4127202746a6ac4e
Reviewed-on: https://chromium-review.googlesource.com/589435
Reviewed-by: Yang Guo <yan...@chromium.org>
Commit-Queue: Jakob Gruber <jgr...@chromium.org>
Cr-Commit-Position: refs/heads/master@{#47081}
[modify] https://crrev.com/062bb7d487b934d9e73df5ade31e887fe6d0f83c/src/regexp/jsregexp.cc

bugdro… via monorail

Aug 3, 2017, 2:34:10 AM
to v8-re...@googlegroups.com

Comment #13 on issue 6633 by bugd...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c13


The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/5a8a7ed83cab58761683966127cd0e6925727925

commit 5a8a7ed83cab58761683966127cd0e6925727925
Author: Jakob Gruber <jgr...@chromium.org>
Date: Thu Aug 03 06:33:45 2017

Revert "[regexp] Limit ATOM regexps to single-character patterns"

This reverts commit 062bb7d487b934d9e73df5ade31e887fe6d0f83c.

Reason for revert: <INSERT REASONING HERE>

Original change's description:

> [regexp] Limit ATOM regexps to single-character patterns
>
> There's an inherent trade-off when deciding between ATOM and IRREGEXP
> regexps: IRREGEXP is faster at runtime for all but trivial single-character
> patterns, while ATOM regexps have a lower memory overhead.
>
> This CL is intended to help investigate impact on benchmarks and real-world
> code - if something tanks, it's easy to revert, otherwise it can be a first
> step towards a possible removal of ATOM regexps.
>
> Bug: v8:6633
> Change-Id: Ia41d8eb28d33952735562d3d4127202746a6ac4e
> Reviewed-on: https://chromium-review.googlesource.com/589435
> Reviewed-by: Yang Guo <yan...@chromium.org>
> Commit-Queue: Jakob Gruber <jgr...@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#47081}

TBR=yan...@chromium.org,jgr...@chromium.org

Change-Id: I8655bc4055af5d593f507e16918b434ff45f5379
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: v8:6633
Reviewed-on: https://chromium-review.googlesource.com/599547
Reviewed-by: Jakob Gruber <jgr...@chromium.org>
Commit-Queue: Jakob Gruber <jgr...@chromium.org>
Cr-Commit-Position: refs/heads/master@{#47106}
[modify] https://crrev.com/5a8a7ed83cab58761683966127cd0e6925727925/src/regexp/jsregexp.cc

jgru… via monorail

Aug 3, 2017, 2:37:04 AM
to v8-re...@googlegroups.com

Comment #14 on issue 6633 by jgr...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c14

The CL in #12 tanked Octane2.1/RegExp and JetStream/regexp-2010 scores by around 6% each. It may be worth investigating which regexps benefit from being an ATOM in those benchmarks.

bugdro… via monorail

Aug 4, 2017, 8:19:24 AM
to v8-re...@googlegroups.com

Comment #15 on issue 6633 by bugd...@chromium.org: RegExp is much faster than indexOf
https://bugs.chromium.org/p/v8/issues/detail?id=6633#c15


The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/1081720532abdbfd6e767e9f39a0eb80e0164682

commit 1081720532abdbfd6e767e9f39a0eb80e0164682
Author: jgruber <jgr...@chromium.org>
Date: Fri Aug 04 12:18:47 2017

[regexp] Limit ATOM regexps to patterns length <= 2

This is a modified reland of 062bb7d487b934d9e73df5ade31e887fe6d0f83c

There's an inherent trade-off when deciding between ATOM and IRREGEXP
regexps: IRREGEXP is faster at runtime for all but trivial short
patterns, while ATOM regexps have a lower memory overhead.

This CL is intended to help investigate impact on benchmarks and real-world
code - if something tanks, it's easy to revert, otherwise it can be a first
step towards a possible removal of ATOM regexps.

Bug: v8:6633
Change-Id: I8d946a7cbb398d4987b47ecba24c9faa88788d0d
Reviewed-on: https://chromium-review.googlesource.com/599910
Reviewed-by: Yang Guo <yan...@chromium.org>
Commit-Queue: Jakob Gruber <jgr...@chromium.org>
Cr-Commit-Position: refs/heads/master@{#47164}
[modify] https://crrev.com/1081720532abdbfd6e767e9f39a0eb80e0164682/src/regexp/jsregexp.cc
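
The effect of the length limit discussed in this thread can be sketched as a small classification function. This is a hypothetical JavaScript model, not the code in `jsregexp.cc`: `chooseKind`, the syntax-character test, and the flag handling are assumptions, and the `HasFewDifferentCharacters` check mentioned in #5 is left out for brevity.

```javascript
// Characters that make a pattern non-literal (assumed, simplified test).
const SYNTAX_CHARS = /[\\^$.*+?()[\]{}|\/]/;

// Decide which kind of regexp a pattern would become under a given
// maximum ATOM pattern length.
function chooseKind(source, flags = "", maxAtomLength = 2) {
  const isPlainLiteral = source.length > 0 && !SYNTAX_CHARS.test(source);
  const caseSensitive = !flags.includes("i");
  const notSticky = !flags.includes("y");
  if (isPlainLiteral && caseSensitive && notSticky &&
      source.length <= maxAtomLength) {
    return "ATOM";
  }
  return "IRREGEXP";
}

console.log(chooseKind("laborum", "", Infinity)); // no length limit
console.log(chooseKind("laborum", "", 1));        // single-char limit (#12)
console.log(chooseKind("ab", "", 2));             // length <= 2 limit (#15)
```

Passing `Infinity` models the behaviour before the change, `1` the reverted CL from #12, and `2` the reland in #15.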