Intent to Ship: Web Speech API: On-Device Recognition Quality

Chromestatus

to blin...@chromium.org, ev...@google.com
Contact emails
ev...@google.com

Explainer
https://github.com/WebAudio/web-speech-api/blob/main/explainers/quality-levels.md

Specification
https://webaudio.github.io/web-speech-api

Summary
Extends the SpeechRecognition interface by adding a quality property to SpeechRecognitionOptions. This lets developers specify the semantic capability they require from on-device recognition (requested via processLocally: true). The proposed quality enum supports three levels ('command', 'dictation', and 'conversation') that map to increasing task complexity and hardware requirements. Developers can then determine whether the local device can handle high-stakes use cases (like meeting transcription) or whether they should fall back to cloud services, which addresses the current "black box" problem of unknown on-device model capabilities.
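
A minimal TypeScript sketch of how an application might use the proposed property, for illustration only. The quality member and its three values come from the summary above. The RecognitionOptions interface, the Availability values, and the helper names buildLocalOptions and chooseBackend are hypothetical stand-ins defined here so the example is self-contained; they are not taken from the specification text.

```typescript
// Quality levels named in the summary, from least to most demanding.
type SpeechQuality = "command" | "dictation" | "conversation";

// Hypothetical local mirror of SpeechRecognitionOptions with the proposed
// `quality` member added. The real dictionary is defined by the spec.
interface RecognitionOptions {
  langs: string[];
  processLocally: boolean;
  quality?: SpeechQuality;
}

// Assumed set of availability results for an on-device model.
type Availability = "unavailable" | "downloadable" | "downloading" | "available";

// Build options that request on-device recognition at a minimum quality level.
function buildLocalOptions(langs: string[], quality: SpeechQuality): RecognitionOptions {
  return { langs, processLocally: true, quality };
}

// Decide where to run recognition: stay local only if a model that meets the
// requested quality is ready right now, otherwise fall back to the cloud.
function chooseBackend(status: Availability): "local" | "cloud" {
  return status === "available" ? "local" : "cloud";
}

// Example: a meeting-transcription app asks for conversation-grade quality.
const meetingOptions = buildLocalOptions(["en-US"], "conversation");
// In a browser, the status would come from an availability query made with
// these options (assumed here); a fixed value keeps the sketch runnable.
const backend = chooseBackend("downloadable");
console.log(meetingOptions.quality, backend);
```

Treating anything other than "available" as a reason to use the cloud is one possible policy. An application could instead trigger a model download and retry, at the cost of a delay before the first result.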

Blink component
Blink>Speech

Web Feature ID
speech-recognition

Motivation
The introduction of processLocally: true was a significant step for privacy and latency, but it treats all on-device models as functionally equivalent. In reality, on-device capabilities are highly fragmented: a lightweight model optimized for simple voice commands (e.g., "turn on the lights") is often insufficient for high-stakes use cases like video conferencing transcription or accessibility captioning, which require handling continuous speech, multiple speakers, and background noise. Because developers have no way to verify the semantic capability of the local model, they must either trust the device blindly or default to cloud-based recognition to guarantee a minimum user experience. This pushes developers to bypass on-device recognition for demanding use cases, which negates the privacy and bandwidth benefits of the API. Applications need a mechanism to declare the minimum capability they require (e.g., conversation-grade accuracy) so that they can use local processing with confidence.

Initial public proposal
https://github.com/WebAudio/web-speech-api/issues/182

TAG review
No information provided

TAG review status
Issues addressed

Goals for experimentation
None

Risks


Interoperability and Compatibility
No information provided

Gecko: Positive (https://github.com/mozilla/standards-positions/issues/1375) Neutral to positive. Firefox is launching on-device Web Speech using a single LLM, so it won't make use of the proposed quality hint, but Mozilla has no strong objections to adding it.

WebKit: No signal (https://github.com/WebKit/standards-positions/issues/634) N/A. Apple doesn't have anyone actively working on the Web Speech API at the moment.

Web developers: No signals

Other signals:

WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?

No information provided


Debuggability
No information provided

Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, ChromeOS, Android, and Android WebView)?
No

Is this feature fully tested by web-platform-tests?
No


Flag name on about://flags
No information provided

Finch feature name
OnDeviceWebSpeechQuality

Rollout plan
Will ship enabled for all users

Requires code in //chrome?
True

Tracking bug
https://g-issues.chromium.org/issues/476168420

Estimated milestones
Shipping on desktop 150


Anticipated spec changes

Open questions about a feature may be a source of future web compat or interop issues. Please list open issues (e.g. links to known GitHub issues in the project for the feature specification) whose resolution may introduce web compat/interop risk (e.g., changes to the naming or structure of the API in a non-backward-compatible way).

No information provided

Link to entry on the Chrome Platform Status
https://chromestatus.com/feature/5136859632107520?gate=6594055733641216

Links to previous Intent discussions
Intent to Prototype: https://groups.google.com/a/chromium.org/d/msgid/blink-dev/69694ed0.050a0220.f8796.0337.GAE%40google.com


This intent message was generated by Chrome Platform Status.