Contact emails
ev...@google.com
Explainer
https://github.com/WebAudio/web-speech-api/blob/main/explainers/quality-levels.md
Specification
https://webaudio.github.io/web-speech-api
Summary
Extends the SpeechRecognition interface by adding a quality property to SpeechRecognitionOptions. This allows developers to specify the semantic capability required for on-device recognition (via processLocally: true).
The proposed quality enum supports three levels ('command', 'dictation', and 'conversation') that map to increasing task complexity and hardware requirements. This lets developers determine whether the local device can handle high-stakes use cases (like meeting transcription) or whether they should fall back to cloud services, addressing the current "black box" nature of on-device model capabilities.
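As a rough illustration of the fallback decision described above, the TypeScript sketch below assumes that quality is passed alongside langs and processLocally to an availability check (for example, the SpeechRecognition.available() static method), and that the check resolves to one of the status strings listed in the code. The names pickBackend, chooseBackend, QualityOptions, and LocalAvailabilityCheck are hypothetical and exist only for this example; the explainer and specification remain the authority on the actual API shape.

```typescript
// Quality levels named in the summary, from least to most demanding.
type RecognitionQuality = "command" | "dictation" | "conversation";

// Assumed availability values for on-device recognition.
type LocalAvailability = "unavailable" | "downloadable" | "downloading" | "available";

// Assumed option shape: existing members plus the proposed `quality`.
interface QualityOptions {
  langs: string[];
  processLocally: boolean;
  quality?: RecognitionQuality;
}

type RecognitionBackend = "local" | "cloud";

// Hypothetical injection point so the logic can run outside a browser.
type LocalAvailabilityCheck = (options: QualityOptions) => Promise<LocalAvailability>;

// Use the local model only when it is ready now at the requested quality.
function pickBackend(status: LocalAvailability): RecognitionBackend {
  return status === "available" ? "local" : "cloud";
}

async function chooseBackend(
  check: LocalAvailabilityCheck,
  langs: string[],
  quality: RecognitionQuality,
): Promise<RecognitionBackend> {
  try {
    return pickBackend(await check({ langs, processLocally: true, quality }));
  } catch {
    // Treat a failed check as "no suitable local model".
    return "cloud";
  }
}
```

In a page, the check argument could be a thin wrapper such as (options) => SpeechRecognition.available(options), assuming the browser accepts the quality member there. Treating 'downloadable' and 'downloading' as a cloud fallback is a simplification for this sketch; an application could instead wait for the model to install.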
Blink component
Blink>Speech
Web Feature ID
speech-recognition
Motivation
While the introduction of processLocally: true was a significant step for privacy and latency, it currently treats all on-device models as functionally equivalent. In reality, on-device capabilities are highly fragmented: a lightweight model optimized for simple voice commands (e.g., "turn on the lights") is often insufficient for high-stakes use cases like video conferencing transcription or accessibility captioning, which require handling continuous speech, multiple speakers, and background noise.
Because developers currently have no way to verify the semantic capability of the local model, they must either blindly trust the device or default to cloud-based recognition to guarantee a minimum user experience. This lack of transparency forces developers to bypass on-device capabilities for high-end use cases, effectively negating the privacy and bandwidth benefits of the API. There is a critical need for a mechanism that lets applications define their required "floor" of utility (e.g., conversation-grade accuracy) so that they can use local processing with confidence.
Initial public proposal
https://github.com/WebAudio/web-speech-api/issues/182
TAG review
No information provided
TAG review status
Issues addressed
Goals for experimentation
None
Risks
Interoperability and Compatibility
No information provided
Gecko: Positive (https://github.com/mozilla/standards-positions/issues/1375) Neutral/positive. Firefox is launching on-device Web Speech using a single LLM, so it won't make use of the proposed quality hint, but Mozilla has no strong objections to adding it.
WebKit: No signal (https://github.com/WebKit/standards-positions/issues/634) N/A. Apple doesn't have anyone actively working on the Web Speech API at the moment.
Web developers: No signals
Other signals:
WebView application risks
Does this intent deprecate or change behavior of existing APIs,
such that it has potentially high risk for Android WebView-based
applications?
No information provided
Debuggability
No information provided
Will this feature be supported on all six Blink platforms
(Windows, Mac, Linux, ChromeOS, Android, and Android WebView)?
No
No
Flag name on about://flags
No information provided
Finch feature name
OnDeviceWebSpeechQuality
Rollout plan
Will ship enabled for all users
Requires code in //chrome?
True
Tracking bug
https://g-issues.chromium.org/issues/476168420
Estimated milestones
Anticipated spec changes
Open questions about a feature may be a source of future web compat or
interop issues. Please list open issues (e.g. links to known github
issues in the project for the feature specification) whose resolution
may introduce web compat/interop risk (e.g., changes to the naming or
structure of the API in a non-backward-compatible way).
No information provided
Link to entry on the Chrome Platform Status
https://chromestatus.com/feature/5136859632107520?gate=6594055733641216
Links to previous Intent discussions
Intent to Prototype:
https://groups.google.com/a/chromium.org/d/msgid/blink-dev/69694ed0.050a0220.f8796.0337.GAE%40google.com