Intent to Prototype: Document Local Dictionary API

Chromestatus

Jul 18, 2025, 6:08:19 AM

to blin...@chromium.org, ji...@igalia.com

Contact emails

ji...@igalia.com

Explainer

https://github.com/Igalia/explainers/tree/main/dictionary-api

Specification

None

Design docs

https://github.com/Igalia/explainers/tree/main/dictionary-api#-proposal

Summary

The proposed APIs enable users to modify the document local dictionary in the browser. Users can add, remove, and check words in the document local dictionary. This feature ensures the browser does not mark words in the document local dictionary as spelling errors.
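
As a rough illustration of the three operations named in the summary (add, remove, check), here is a minimal in-memory sketch. The class name `LocalDictionary`, its method names, and the helper `shouldMarkAsMisspelled` are hypothetical and are not taken from the explainer; exact, case-sensitive matching is also an assumption.

```typescript
// Hypothetical sketch of a document-local dictionary; not the proposed API surface.
class LocalDictionary {
  // Words live only in memory, for the lifetime of this object.
  private readonly words = new Set<string>();

  // Add a word so a spell checker consulting this dictionary would accept it.
  add(word: string): void {
    this.words.add(word);
  }

  // Remove a previously added word; returns false if it was not present.
  remove(word: string): boolean {
    return this.words.delete(word);
  }

  // Check whether a word is currently in the dictionary (exact, case-sensitive match).
  has(word: string): boolean {
    return this.words.has(word);
  }
}

// A word is marked only if neither the base dictionary nor the local one knows it.
function shouldMarkAsMisspelled(
  word: string,
  knownToBaseDictionary: boolean,
  local: LocalDictionary
): boolean {
  return !knownToBaseDictionary && !local.has(word);
}
```

In this model, removing a word simply makes it eligible to be marked again; nothing is persisted beyond the object's lifetime.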

Blink component

Blink>DOM

Motivation

Some words need to be added to the document local dictionary so that the browser does not mark them as spelling errors, and those words need to be removable once they are no longer necessary. Existing features such as the element.spellcheck attribute and the ::spelling-error CSS pseudo-element only deal with words that are already in the dictionary. A new API is therefore needed to manipulate the document local dictionary.

Initial public proposal

None

TAG review

None

TAG review status

Pending

Risks

Interoperability and Compatibility

None

Gecko: No signal

WebKit: No signal

Web developers: No signals

Other signals:

WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?

None

Debuggability

None

Is this feature fully tested by web-platform-tests?

Yes

third_party/blink/web_tests/wpt_internal/dom/local-dictionary/*. There is a WIP patch that includes the tests.

Flag name on about://flags

None

Finch feature name

None

Non-finch justification

None

Requires code in //chrome?

False

Tracking bug

https://issues.chromium.org/issues/428005649

Estimated milestones

No milestones specified

Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/6185007701557248?gate=4503614776934400

This intent message was generated by Chrome Platform Status.

Daniel Vogelheim

Jul 22, 2025, 7:37:28 AM

to ji...@igalia.com, sche...@chromium.org, blin...@chromium.org, Chromestatus

Hello,

This intent came up in security review, and I'm mostly confused:

- The explainer mostly seems to assume that these are stored in-memory, per-document. But it also talks about the absence of cross-origin requests, only to then add info about CORS, which only makes sense for cross-origin requests.

- There are multiple references to loading data, but there is no explanation about what kind of network requests are being made when or where.

- The explainer suggests "Persistently store data" as an optimization for having to re-load large dictionaries. Again, no information about which requests are being optimized away.

- In "Data Storage" it is pointed out that CustomDictionaryEngine exists per renderer process. While renderer processes mostly don't have cross-origin data, they sometimes do. And they may hold multiple documents. This seems inconsistent with information being stored per-document.
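
The gap between per-document storage and a per-renderer-process engine can be made concrete with a small model. This is only a sketch under assumptions: `DocumentLike`, `wordsFor`, `addWord`, and `hasWord` are hypothetical names, and nothing here reflects how CustomDictionaryEngine is actually implemented. It shows word sets keyed by document, so that two documents sharing one process do not see each other's words.

```typescript
// Stand-in for a document; in a browser this would be the Document object.
interface DocumentLike {
  readonly url: string;
}

// One word set per document, even when several documents share a process.
// A WeakMap lets a document's words be collected together with the document.
const wordsByDocument = new WeakMap<DocumentLike, Set<string>>();

// Return the word set for a document, creating it on first use.
function wordsFor(doc: DocumentLike): Set<string> {
  let words = wordsByDocument.get(doc);
  if (words === undefined) {
    words = new Set<string>();
    wordsByDocument.set(doc, words);
  }
  return words;
}

function addWord(doc: DocumentLike, word: string): void {
  wordsFor(doc).add(word);
}

// Lookup never creates state and never consults another document's words.
function hasWord(doc: DocumentLike, word: string): boolean {
  const words = wordsByDocument.get(doc);
  return words !== undefined && words.has(word);
}
```

A single process-wide set would instead answer `hasWord` identically for every document in the process, which is the inconsistency described above.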

Non-security feedback:

- Since this is a web-exposed API, I'd have expected some attempt at checking with other browser engines on support.

- I do not understand the "High-level Architecture". It seems to feature a stack of methods that feeds into yes/no decisions, which feed into a storage thing. I have no idea what this is meant to convey.

- Blink>DOM might not be the right component for this.

Could you please update the documentation to be more clear about where data is stored, and about which network requests are being made?

--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/687a1d04.170a0220.2dad83.0168.GAE%40google.com.

Rick Byers

Jul 22, 2025, 10:55:05 AM

to Daniel Vogelheim, ji...@igalia.com, sche...@chromium.org, blin...@chromium.org, Chromestatus

FWIW I was also a little confused reading the explainer, but I think I understand the overall design and I think it's a good one: these dictionaries are transient and document-local, simply a mechanism to let pages selectively suppress spell check violations on their own page.

Presumably the discussion of network fetches in the explainer is just about the app fetching from its server (not fetches in the browser), and all the discussions of "persistent" storage are under the "future work" section, so it's fine to me that there's no detail here (it's out of scope because it's hard). I'm not sure whether it would make sense to extend this design into persistent storage or not, but I'm also not sure it matters (as the explainer says, it's simply an optimization - a problem that may or may not exist in practice, so not worth worrying about today).

Ensuring the data is reliably per-document is definitely a key implementation concern, so I agree with you there, Daniel. And yes, we'll eventually want signals from other browser vendors, but our process has that step only after prototyping is complete (often we learn a lot about the design from prototyping), so it's premature to ask for it now at the I2P phase.

Cheers,

Rick

Stephen Chenney

Jul 22, 2025, 11:11:59 AM

to Rick Byers, Daniel Vogelheim, ji...@igalia.com, blin...@chromium.org, Chromestatus

Thanks for the early feedback, and sorry for the lack of clarity in the explainer. We're working on improving the explainer to address the issues raised here and the issues raised on GitHub.

We're also considering an entirely different approach whereby a site provides a "spelling server" URL in the HTML header. That would operate more like the existing "send it to Google" spell checking options. We're super early in designing such a thing, but if anyone has early feedback on that approach we would be interested.

Cheers,

Stephen.

Rick Byers

Jul 22, 2025, 12:03:27 PM

to Stephen Chenney, Daniel Vogelheim, ji...@igalia.com, blin...@chromium.org, Chromestatus

Spelling server seems a lot harder to get right to me, obviously more to worry about regarding privacy etc. Can you share anything more about the motivating use cases here? Like how large do these custom dictionaries tend to be? I'd guess that for even dictionaries up to 1MB compressed it's probably faster and simpler to just have the client download the whole thing. RTT latency is generally a bigger performance problem these days than raw throughput. But if it's important to solve scenarios with really large dictionaries then maybe it's worth exploring?
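
The RTT-versus-throughput point can be put into a back-of-the-envelope model. This is a sketch only: the `Link` type and the function names are hypothetical, the one-round-trip-per-query cost for a spelling server is an assumption, and any numbers plugged in are illustrative rather than measured.

```typescript
// All inputs are assumed values for illustration, not measurements.
interface Link {
  rttSeconds: number;     // round-trip time
  bytesPerSecond: number; // sustained throughput
}

// Time to fetch the whole dictionary once: one round trip plus transfer time.
function bulkDownloadSeconds(link: Link, dictionaryBytes: number): number {
  return link.rttSeconds + dictionaryBytes / link.bytesPerSecond;
}

// Time for sequential spelling-server queries, each costing one round trip
// (payload transfer time is ignored because queries are tiny).
function serverQueriesSeconds(link: Link, queryCount: number): number {
  return link.rttSeconds * queryCount;
}

// Smallest number of sequential queries whose total time is at least
// the time of a single bulk download.
function breakEvenQueries(link: Link, dictionaryBytes: number): number {
  return Math.ceil(bulkDownloadSeconds(link, dictionaryBytes) / link.rttSeconds);
}
```

For instance, with an assumed 125 ms RTT and 1 MB/s of throughput, a 1 MB dictionary takes 1.125 s to download, the same as nine sequential round trips to a server.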

Stephen Chenney

Jul 22, 2025, 3:18:29 PM

to Rick Byers, Daniel Vogelheim, ji...@igalia.com, blin...@chromium.org, Chromestatus

Regarding motivation, our client has financial data, such as stock symbols and company names. There are similar use cases for medical data, fan fiction, or anything else with words that might not appear in hunspell's dictionaries. It's conceivable that the Google internal spelling APIs have these words but clients may be very reluctant to send their strings to Google.

The proposal in this intent is relatively straightforward to implement, and its privacy and security are relatively simple to assess. But for developers there will probably be significant load time costs around it, to fetch the site's dictionary and process it to add the words. We have some ideas around that in future work but nothing concrete. I think we'll have to address it before we ship.

An HTTP header approach would make the ergonomics easier (assuming the infrastructure for setting up a spelling server is reasonably standard) and fits better into the existing code, but it would not work offline. Maybe the approaches are complementary and we do both.

I'll try to get some idea of the size of typical dictionaries in this space. It is important to know.

Cheers,

Stephen.

Rick Byers

Jul 22, 2025, 5:02:41 PM

to Stephen Chenney, Daniel Vogelheim, ji...@igalia.com, blin...@chromium.org, Chromestatus

On Tue, Jul 22, 2025 at 3:18 PM Stephen Chenney <sche...@chromium.org> wrote:

Regarding motivation, our client has financial data, such as stock symbols and company names. There are similar use cases for medical data, fan fiction, or anything else with words that might not appear in hunspell's dictionaries. It's conceivable that the Google internal spelling APIs have these words but clients may be very reluctant to send their strings to Google.

The proposal in this intent is relatively straightforward to implement and privacy and security is relatively simple to assess. But for developers there will probably be significant load time costs around it, to fetch the site's dictionary and process it to add the words.

I'd love to see some figures on this. Maybe a bulk add API would be enough? As a quick example, I picked a random website (bloomberg.com) and found it downloaded 3.4MB compressed, including a number of individual scripts, images and JSON blobs which were around 100kB compressed each. In contrast, the entire american-english dictionary on my Linux machine compresses down to 270kB. So as long as we're talking about something that's less than 10% of the size of the whole American English dictionary, my hunch is that the transfer cost will be insignificant and lost in the noise. But still, an HTTP approach to at least enable caching would be a good idea with little downside. I could imagine, for example, a <link rel=dictionary> tag or something that would be even simpler than this JS API approach?
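
To put the figures above in proportion: 10% of 270kB is 27kB, which is under 1% of the 3.4MB page. A bulk add over a downloaded word list could look roughly like the sketch below. The helper names `parseWordList` and `addAll` are hypothetical, and the one-word-per-line format is an assumption, not something the explainer specifies.

```typescript
// Hypothetical bulk-add helpers; the one-word-per-line format is an assumption.

// Split a downloaded word list into words, ignoring blank lines and padding.
function parseWordList(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Add every word in one call and return how many were not already present.
function addAll(dictionary: Set<string>, words: string[]): number {
  const before = dictionary.size;
  for (const word of words) {
    dictionary.add(word);
  }
  return dictionary.size - before;
}
```

In a page, the text would presumably come from a `fetch()` of a cacheable resource, which is where ordinary HTTP caching would help on repeat visits.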

Anyway this is just random thoughts to try to nudge away from premature optimization, not API owner input or anything :-).

Ziran Sun

Sep 26, 2025, 9:41:17 AM

to blink-dev, Rick Byers, Daniel Vogelheim, ji...@igalia.com, blin...@chromium.org, Chromestatus, Stephen Chenney

Hi Rick, Daniel,

I'm looking at the case of a non-persistent, document-local dictionary that stores the word list in memory. Could you elaborate a bit on why Blink>DOM might not be the right component for this? And what issues do you foresee in ensuring the data is reliably per-document?

Thank you!

Ziran

Ziran Sun

Nov 21, 2025, 4:44:34 AM

to blink-dev, Ziran Sun, Rick Byers, Daniel Vogelheim, ji...@igalia.com, blin...@chromium.org, Chromestatus, Stephen Chenney

Hi,

Thanks very much for the review comments and discussions. They were very helpful!

We have updated the explainer and added a brief design note to address the per-document question.

Link for the explainer - https://github.com/Igalia/explainers/tree/main/document-local-dictionary

Link for the design doc - https://docs.google.com/document/d/1ND1a1Z4i6kXMHqMwEyRkHSj5VVTWgX5Ya0aNLgVQYGw/edit?tab=t.0#heading=h.kmfizh6cwyy4

@Rick, @Daniel, please let us have your thoughts about this.

Any further comments would be much appreciated!

Thanks,

Ziran

Sangwhan Moon

Nov 21, 2025, 10:44:38 PM

to Ziran Sun, Rick Byers, Daniel Vogelheim, ji...@igalia.com, blin...@chromium.org, Chromestatus, Stephen Chenney, Ziran Sun, Evan Liu

Hello,

It occurred to me that document-local terminology likely would have to be conveyed to the browser in two different places with this proposal and Web Speech's Contextual Biasing - https://github.com/WebAudio/web-speech-api/blob/main/explainers/contextual-biasing.md

The end result is different (this proposal being for suppression, Contextual Biasing for boosting) but I would imagine the input likely overlaps heavily (e.g. given your Pokemon use case) - it seems like an architectural consideration for interop between these two mechanisms would be beneficial for developer ergonomics.

Sangwhan

On Nov 21, 2025, at 1:43, Ziran Sun <zs...@igalia.com> wrote:
