Optimise string comparisons ignoring case sensitivity

Zeeshan Abid

Oct 8, 2025, 3:06:47 PM (11 days ago) Oct 8
to v8-dev
Hello, I noticed that `toLowerCase() === toLowerCase()` is the common practice for comparing strings while ignoring case. Would it be possible to add a function like

`String.compare(str1, str2, IgnoreLowerCaseOption);`

This should, in theory, run faster than doing `toLowerCase() === toLowerCase()`.

I already tried using `String.prototype.localeCompare`, and it is slower than doing `toLowerCase() === toLowerCase()`.
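
For reference, the two approaches being compared might look like the sketch below. The helper names are illustrative, and the `"en"` locale and the `sensitivity: "accent"` option are assumptions, chosen so that `localeCompare` ignores case while still distinguishing accented letters.

```javascript
// Common practice: lowercase both strings, then compare with strict equality.
// This allocates two temporary strings per comparison.
function equalsViaToLowerCase(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

// Alternative: locale-aware comparison. With sensitivity "accent",
// "a" and "A" compare as equal, while "a" and "á" do not.
function equalsViaLocaleCompare(a, b) {
  return a.localeCompare(b, "en", { sensitivity: "accent" }) === 0;
}

console.log(equalsViaToLowerCase("Hello", "hELLO"));   // true
console.log(equalsViaLocaleCompare("Hello", "hELLO")); // true
```

The two helpers are not interchangeable in general, since `localeCompare` applies collation rules rather than plain case conversion.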

Thanks!

Ben Noordhuis

Oct 9, 2025, 12:51:32 AM (10 days ago) Oct 9
to v8-...@googlegroups.com
The answer to your question is both yes and no.

Yes, in that it would be possible to implement that method in V8.

No, in that V8 won't implement it until it's part of the ECMA
specification. Getting changes to the spec accepted means going
through TC39. I'd rather visit my local dentist; it's less painful and
over quicker.

String.prototype.localeCompare computes equivalence, which is much
more complex than the case conversion that
String.prototype.toLowerCase performs. For example, localeCompare
considers "s\u0307\u0323" and "\u1E69" equivalent, whereas toLowerCase
does not.
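
A small sketch of that example follows. The variable names are illustrative. The `normalize("NFC")` check is an addition that is not part of the message above; it is included only to show that the two strings are canonically equivalent, which is the kind of equivalence a collator takes into account and case conversion does not.

```javascript
// "s" followed by COMBINING DOT ABOVE and COMBINING DOT BELOW.
const decomposed = "s\u0307\u0323";
// The single code point LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE.
const precomposed = "\u1E69";

// Both strings are already lowercase, so toLowerCase leaves them unchanged
// and the code units still differ.
const sameAfterLowerCase =
  decomposed.toLowerCase() === precomposed.toLowerCase();

// After canonical normalization, both strings become the same sequence.
const sameAfterNormalize =
  decomposed.normalize("NFC") === precomposed.normalize("NFC");

// localeCompare returns 0 when it considers the two strings equivalent.
const localeCompareResult = decomposed.localeCompare(precomposed);

console.log({ sameAfterLowerCase, sameAfterNormalize, localeCompareResult });
```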

Zeeshan Abid

Oct 9, 2025, 3:22:33 AM (10 days ago) Oct 9
to v8-dev
Ben, thanks so much for taking the time to give me that explanation. I have no experience with going through TC39, so I guess I will give it a try until they tell me to go away.

Darius Mercadier

Oct 9, 2025, 4:08:50 AM (10 days ago) Oct 9
to v8-...@googlegroups.com
Hi,

Another option would be to pattern-match `str1.toLowerCase() == str2.toLowerCase()` in V8's optimizing compilers and call a special builtin that doesn't bother creating the 2 toLowerCase strings but does the comparison on the original strings with special rules regarding case (or even generates the full comparison in Maglev/Turbofan, but calling a builtin should already be a good improvement). I've created a feature request for this on our bug tracker: https://crbug.com/450288145. I'm not sure that anyone at V8 will have time in the near future to work on this, so feel free to give it a shot if you want.

This would be a bit more fragile than adding IgnoreLowerCaseOption to the spec and would not affect unoptimized code, but it doesn't require going through TC39 and it would benefit existing code that wouldn't be using this new feature. 
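
To make the idea concrete, here is a rough sketch in plain JavaScript of a comparison that works on the original strings without creating lowercased copies. It is not V8's builtin and only illustrates the approach: the function name `equalsIgnoreCaseNoAlloc` is hypothetical, the fast path covers ASCII code units only, and anything else falls back to the ordinary `toLowerCase()` comparison so that the result stays the same.

```javascript
// Hypothetical helper: intended to return the same result as
// a.toLowerCase() === b.toLowerCase(), without allocating for ASCII input.
function equalsIgnoreCaseNoAlloc(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    let x = a.charCodeAt(i);
    let y = b.charCodeAt(i);
    // Non-ASCII code units can have multi-character or context-dependent
    // lowercase mappings, so hand the whole comparison to the slow path.
    if ((x | y) >= 0x80) {
      return a.toLowerCase() === b.toLowerCase();
    }
    if (x === y) continue;
    // Fold 'A'..'Z' (0x41..0x5A) to 'a'..'z' by setting bit 0x20.
    if (x >= 0x41 && x <= 0x5a) x |= 0x20;
    if (y >= 0x41 && y <= 0x5a) y |= 0x20;
    if (x !== y) return false;
  }
  // The shared prefix was pure ASCII and matched. Any leftover characters in
  // the longer string make the lowercased strings differ in length.
  return a.length === b.length;
}

console.log(equalsIgnoreCaseNoAlloc("Content-Type", "content-type")); // true
console.log(equalsIgnoreCaseNoAlloc("abc", "abd"));                   // false
```

The length check is deliberately left until after the loop: lowercasing can change a string's length for some non-ASCII characters, so two strings of different lengths can still be equal once lowercased.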

Cheers,
Darius

Zeeshan Abid

Oct 9, 2025, 4:12:23 AM (10 days ago) Oct 9
to v8-dev
That's awesome, and sure, I would love to give it a try!
Thanks for doing that.

Zeeshan Abid

Oct 9, 2025, 6:12:04 AM (10 days ago) Oct 9
to v8-dev
I am attaching the TC39 feature request here in case anyone wants to follow it: https://es.discourse.group/t/faster-string-comparisons/2444

Zeeshan Abid

Oct 9, 2025, 4:59:14 PM (10 days ago) Oct 9
to v8-dev
I wanted your thoughts, because you probably know more than I do: according to people at TC39, we should be changing localeCompare so that it is faster.
It should check whether the strings being compared are ASCII or Unicode, and then go through a fast or a slow path.

Ben Noordhuis

Oct 10, 2025, 5:55:51 AM (9 days ago) Oct 10
to v8-...@googlegroups.com
On Thu, Oct 9, 2025 at 10:59 PM Zeeshan Abid <zeeshan....@gmail.com> wrote:
>
> I wanted your thoughts, because you probably know more than I do: according to people at TC39, we should be changing localeCompare so that it is faster.
> It should check whether the strings being compared are ASCII or Unicode, and then go through a fast or a slow path.

V8 has that fast path in LocaleCompareFastPath but it iterates over
the strings char-at-a-time, whereas it could be using word-at-a-time
or SIMD at least some of the time*, so there's maybe still room for
improvement.

* Char-at-a-time iteration makes it easy for LocaleCompareFastPath to
handle mixed one-byte and two-byte strings. For word-at-a-time, it
would need four versions of the inner loop, two where the left and
right strides are of different sizes. Not impossible, but somewhat
awkward and inefficient.
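
As a rough illustration of the word-at-a-time idea only, the sketch below compares two byte buffers four bytes per step and finishes the tail one byte at a time. It is not V8's LocaleCompareFastPath: it tests plain equality rather than collation order, it assumes both inputs are one-byte data held in `Uint8Array`s, and the function name is hypothetical. Mixed one-byte and two-byte inputs would need loops with different strides on each side, which is the awkward part described above.

```javascript
// Hypothetical helper: byte-wise equality, compared in 32-bit words.
function equalBytesWordAtATime(left, right) {
  if (left.length !== right.length) return false;
  const n = left.length;
  // DataView reads work at any byte offset, so alignment is not a concern.
  const lv = new DataView(left.buffer, left.byteOffset, n);
  const rv = new DataView(right.buffer, right.byteOffset, n);
  let i = 0;
  // Main loop: one comparison covers four bytes.
  for (; i + 4 <= n; i += 4) {
    if (lv.getUint32(i) !== rv.getUint32(i)) return false;
  }
  // Tail loop: the remaining zero to three bytes, one at a time.
  for (; i < n; i++) {
    if (left[i] !== right[i]) return false;
  }
  return true;
}

// Encode a string of one-byte characters into a Uint8Array.
const toBytes = (s) => Uint8Array.from(s, (c) => c.charCodeAt(0));

console.log(equalBytesWordAtATime(toBytes("localeCompare"), toBytes("localeCompare"))); // true
console.log(equalBytesWordAtATime(toBytes("localeCompare"), toBytes("localeCompArE"))); // false
```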