CRC32 for short atomic strings


Anton Bikineev

Apr 8, 2025, 5:29:00 AM
to platform-architecture-dev
Hi all,

I wrote a short doc on how we can speed up insertion into the AtomicStringTable,
which is particularly hot in the VanillaJS Speedometer3 stories. The idea is:
 - use crc32 to hash short strings (one-word and two-word), which is faster
   than rapidhash at those lengths (see the doc),
 - store short strings in separate tables and cache them inlined in the backing,
   to avoid indirection on comparison.

On Linux I see a 33% speedup for AtomicStringTable::Add() when running the S3
TodoMVC-JavaScript.* stories. On M1 the S3 total score improvement is 0.3%.
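
To make the two bullets above concrete, here is a minimal standalone sketch. It is not the code from the doc or from Blink. Everything in it is an assumption made for illustration: the names (ShortKey, PackShort, HashShort, ShortStringTable, AddResult) are hypothetical, "one-word and two-word" is taken to mean up to 8 and up to 16 bytes, the CRC is a portable bitwise CRC-32C where a real implementation would presumably use hardware CRC instructions, and the table is fixed-size with linear probing and no growth.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>
#include <vector>

// Portable bitwise CRC-32C step over one 64-bit word, low byte first.
// (Assumption: stands in for a hardware CRC instruction.)
inline uint32_t Crc32cU64(uint32_t crc, uint64_t word) {
  for (int i = 0; i < 8; ++i) {
    crc ^= static_cast<uint8_t>(word >> (8 * i));
    for (int b = 0; b < 8; ++b)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return crc;
}

// Hypothetical inline key: the characters live in the slot itself, so a
// comparison is a few integer compares with no pointer to follow.
struct ShortKey {
  uint64_t lo = 0;
  uint64_t hi = 0;
  uint8_t length = 0;  // 0 marks an empty slot in this sketch.
  bool operator==(const ShortKey& o) const {
    return lo == o.lo && hi == o.hi && length == o.length;
  }
};

// Packs 1..16 bytes into two zero-padded words; anything else is "not short".
inline std::optional<ShortKey> PackShort(std::string_view s) {
  if (s.empty() || s.size() > 16) return std::nullopt;
  unsigned char buf[16] = {};
  std::memcpy(buf, s.data(), s.size());
  ShortKey k;
  std::memcpy(&k.lo, buf, 8);
  std::memcpy(&k.hi, buf + 8, 8);
  k.length = static_cast<uint8_t>(s.size());
  return k;
}

// One CRC step for one-word strings, two for two-word strings. The length
// seeds the CRC so zero-padded prefixes hash differently.
inline uint32_t HashShort(const ShortKey& k) {
  const uint32_t crc = Crc32cU64(k.length, k.lo);
  return k.length > 8 ? Crc32cU64(crc, k.hi) : crc;
}

enum class AddResult { kInserted, kAlreadyPresent, kNotShort, kFull };

// Fixed-capacity open-addressed table; capacity must be a power of two.
class ShortStringTable {
 public:
  explicit ShortStringTable(std::size_t capacity_pow2)
      : slots_(capacity_pow2) {}

  AddResult Add(std::string_view s) {
    const std::optional<ShortKey> key = PackShort(s);
    if (!key) return AddResult::kNotShort;  // Caller falls back to the main table.
    const std::size_t mask = slots_.size() - 1;
    std::size_t i = HashShort(*key) & mask;
    for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
      ShortKey& slot = slots_[i];
      if (slot.length == 0) {
        slot = *key;
        ++size_;
        return AddResult::kInserted;
      }
      if (slot == *key) return AddResult::kAlreadyPresent;
      i = (i + 1) & mask;  // Linear probing.
    }
    return AddResult::kFull;
  }

  std::size_t size() const { return size_; }

 private:
  std::vector<ShortKey> slots_;
  std::size_t size_ = 0;
};
```

In this sketch, Add("data-id") on an empty table returns kInserted and a second Add("data-id") returns kAlreadyPresent. Strings longer than 16 bytes return kNotShort, which is where a caller would fall back to the regular rapidhash-based table.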

Kentaro Hara

Apr 8, 2025, 11:40:03 AM
to Anton Bikineev, platform-architecture-dev
LGTM. This is an internal optimization scoped to the hashing algorithm, and I have no concerns.


--
Kentaro Hara, Tokyo

Dave Tapuska

Apr 8, 2025, 11:48:09 AM
to Kentaro Hara, Anton Bikineev, platform-architecture-dev
Do we know if this is limited to a set of strings? (I did produce a histogram of strings before)

When I last looked at speedometer2, we were inserting and removing data-id from the atomic string table frequently... See https://chromium-review.googlesource.com/c/chromium/src/+/5784168 which I never landed.

This change alone had a 0.3% improvement on speedometer2 at the time. I understand this is speedometer3 though, so it might be different.

dave.


Ian Kilpatrick

Apr 8, 2025, 1:14:23 PM
to Dave Tapuska, Kentaro Hara, Anton Bikineev, platform-architecture-dev, Steinar H. Gunderson

Anton Bikineev

Apr 9, 2025, 4:44:10 AM
to Ian Kilpatrick, Dave Tapuska, Kentaro Hara, platform-architecture-dev, Steinar H. Gunderson
> Do we know if this is limited to a set of strings? (I did produce a histogram of strings before)

The short strings on S3 are pretty much random; I just pasted them here.

Anton Bikineev

Apr 9, 2025, 5:15:07 AM
to Ian Kilpatrick, Dave Tapuska, Kentaro Hara, platform-architecture-dev, Steinar H. Gunderson
> When I last looked at speedometer2, we were inserting and removing data-id from the atomic string table frequently... See https://chromium-review.googlesource.com/c/chromium/src/+/5784168 which I never landed.
> This change alone had a 0.3% improvement on speedometer2 at the time. I understand this is speedometer3 though, so it might be different.

The idea of crc32 with extra tables with inlined storage is orthogonal in that it won't speed up string creation/destruction. It's meant to make the lookups in the table faster. So, I think your CL may still matter for S3.

Michael Lippautz

Apr 9, 2025, 6:58:21 AM
to Anton Bikineev, Ian Kilpatrick, Dave Tapuska, Kentaro Hara, platform-architecture-dev, Steinar H. Gunderson
LGTM. While this adds some complexity, it's well encapsulated and doesn't affect anything outside of the AtomicStringTable.
