Intent to Implement: Speech Recognition with WebRTC


bur...@chromium.org

Sep 22, 2014, 3:13:04 AM
to blin...@chromium.org, Shijing Xian, Tomas Gunnarsson, Glen Shires

Contact emails

bur...@chromium.org, xi...@chromium.org


Spec

http://goo.gl/9Ot3PC Spec is still in a draft stage.

CLs: Blink <http://crrev.com/448163003> / Renderer <http://crrev.com/499233003> / Browser <http://crrev.com/549373003>


Summary

Expose a new speech recognition member, webkitSpeechRecognition.audioTrack, which can bind to a track obtained via getUserMedia.

This track will be processed in the renderer (e.g. echo cancellation) and enables different use cases for speech recognition, such as conference calls or video chat with transcriptions.
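
A minimal sketch of how a page might use the proposed member, assuming audioTrack is a plain assignable property (the draft spec may define it differently). The interfaces and the helper name startRecognitionFromStream are hypothetical; they only mirror the parts of MediaStream and webkitSpeechRecognition that the sketch touches, so it can also run outside a browser:

```typescript
// Hypothetical structural types: just enough of MediaStreamTrack,
// MediaStream and webkitSpeechRecognition for this sketch.
interface AudioTrackLike {
  kind: string;
}

interface StreamLike {
  getAudioTracks(): AudioTrackLike[];
}

interface TrackRecognizer {
  // Proposed member from this intent; assumed here to be assignable.
  audioTrack?: AudioTrackLike;
  continuous: boolean;
  start(): void;
}

// Binds the first audio track of `stream` to `recognizer` and starts it.
// Returns the bound track; throws if the stream carries no audio.
function startRecognitionFromStream(
  recognizer: TrackRecognizer,
  stream: StreamLike
): AudioTrackLike {
  const tracks = stream.getAudioTracks();
  if (tracks.length === 0) {
    throw new Error("stream has no audio track");
  }
  const track = tracks[0];
  recognizer.audioTrack = track;
  // Long-running sessions (calls, chats) want continuous results.
  recognizer.continuous = true;
  recognizer.start();
  return track;
}
```

In a page, the recognizer would be a webkitSpeechRecognition instance and the stream would come from getUserMedia with audio enabled.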


Motivation

The current implementation limits the use cases for web speech recognition (e.g. to voice search and short dictation) because recognition is done in the browser process.

As WebRTC and real-time communication in general become more popular, this would give web developers a powerful new feature and set the stage for new web services.


Compatibility Risk

Chrome only for now. Since the described changes to the API don't affect the current behaviour of web speech recognition, the feature poses little compatibility risk.

We are also targeting other browsers which support WebRTC. We're working on getting this into the W3C spec for WebSpeech as well.


Ongoing technical constraints

Android and ChromeOS platforms should be addressed as well.


Will this feature be supported on all five Blink platforms (Windows, Mac, Linux, Chrome OS and Android)?

Current development targets Win, Mac and Linux. Chrome OS and Android are also planned.


OWP launch tracking bug?

http://crbug.com/408940


Entry in Chromium Dashboard

http://www.chromestatus.com/features/5043404337577984


Requesting approval to ship?

No.


Jochen Eisinger

Sep 22, 2014, 4:12:54 AM
to bur...@chromium.org, blin...@chromium.org, Shijing Xian, Tomas Gunnarsson, Glen Shires
implementing lgtm

Elliott Sprehn

Sep 22, 2014, 4:15:42 AM
to bur...@chromium.org, blin...@chromium.org, Shijing Xian, Tomas Gunnarsson, Glen Shires


On Monday, September 22, 2014, <bur...@chromium.org> wrote:
> Summary
>
> Expose a new member to speech recognition called webkitSpeechRecognition.audioTrack


What's the plan for getting SpeechRecognition unprefixed? I'm not sure we should be adding more features to this until we've removed the webkit prefix. 

- E

Shijing Xian

Sep 22, 2014, 7:27:34 AM
to Elliott Sprehn, bur...@chromium.org, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
Hi Elliott, this is an intern project that was planned to be 4 months long. It is now in its last month, and we are close to landing the implementations in both Chrome and Blink. Removing the webkit prefix from the WebSpeech API sounds like a great idea, but it probably should not block Kristijan.

Tommi leads the speech team in Stockholm; he can probably provide some plans for addressing the webkit prefix.

Wdyt?



Jochen Eisinger

Sep 22, 2014, 9:39:53 AM
to Shijing Xian, Elliott Sprehn, bur...@chromium.org, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
I think it's fine to implement this, but as I mentioned on the CL, we shouldn't ship this without first coming to an agreement about what to do with the prefix.

(Note that there's a difference between intent to ship and intent to implement. You asked for the latter.)

best
-jochen

Kristijan Burnik

Sep 22, 2014, 10:10:17 AM
to Jochen Eisinger, Shijing Xian, Elliott Sprehn, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
Is there a currently open issue that would block us from removing the prefix after this ships? (Planning to add an intent-to-ship soon)

Jochen: What about the CL? http://crrev.com/448163003
Can we land it?


--

Regards,
Kristijan Burnik,
SWE Intern

PhistucK

Sep 22, 2014, 3:55:00 PM
to Kristijan Burnik, Jochen Eisinger, Shijing Xian, Elliott Sprehn, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
It looks like the changelist adds the feature behind the MediaStream flag (enabled by default, I think?), so you should add another flag for it, off by default (=experimental) first, or else, you would be shipping the feature and not only implementing it.


PhistucK

To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.

Kristijan Burnik

Sep 23, 2014, 7:43:42 AM
to PhistucK, Jochen Eisinger, Shijing Xian, Elliott Sprehn, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
Updated the CL to use a new experimental RuntimeEnabled flag. http://crrev.com/448163003

Jochen Eisinger

Sep 23, 2014, 7:47:52 AM
to Kristijan Burnik, PhistucK, Shijing Xian, Elliott Sprehn, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
If you intend to ship this feature, you should send an intent to ship, it's not something I can decide alone. I expect that shipping a new feature on a prefixed API will at least spark some discussion about whether that's the right thing to do at this point. It will be helpful if you can include arguments in the intent mail about why you think this won't make it more difficult to remove the prefix, or otherwise increase the compatibility risks.

best
-jochen


Kristijan Burnik

Sep 23, 2014, 7:53:10 AM
to Jochen Eisinger, PhistucK, Shijing Xian, Elliott Sprehn, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
Can this land without the intent-to-ship? I mean, it's currently still an experimental feature.

smaug

Sep 23, 2014, 7:55:33 AM
to bur...@chromium.org, blin...@chromium.org, Shijing Xian, Tomas Gunnarsson, Glen Shires
On 09/22/2014 10:13 AM, bur...@chromium.org wrote:
> Contact emails
>
> bur...@chromium.org, xi...@chromium.org
>
>
> Spec
>
> http://goo.gl/9Ot3PC Spec is still in a draft stage.
>
> CLs: Blink <http://crrev.com/448163003> / Renderer <http://crrev.com/499233003> / Browser <http://crrev.com/549373003>
>
>
> Summary
>
> Expose a new member to speech recognition called webkitSpeechRecognition.audioTrack which can bind to a track obtained via getUserMedia.


I'm not aware of this being proposed for the draft specification.
In Gecko, where the API isn't yet exposed to the web by default, SpeechRecognition.start() takes an optional MediaStream param.
A spec bug was filed for that: https://www.w3.org/Bugs/Public/show_bug.cgi?id=26336



-Olli


Jochen Eisinger

Sep 23, 2014, 8:42:08 AM
to Kristijan Burnik, PhistucK, Shijing Xian, Elliott Sprehn, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
On Tue Sep 23 2014 at 1:53:06 PM Kristijan Burnik <bur...@google.com> wrote:
> Can this land without the intent-to-ship? I mean, it's currently still an experimental feature.

sure, as long as it's behind a flag, I think it's ok to implement this.

best
-jochen

Kristijan Burnik

Sep 23, 2014, 8:46:20 AM
to smaug, blin...@chromium.org, Shijing Xian, Tomas Gunnarsson, Glen Shires
Yes, Glen, Shijing and I know about that spec "bug". It is somewhat similar to these changes but has an entirely different intent.


Kristijan Burnik

Sep 23, 2014, 8:47:14 AM
to Jochen Eisinger, PhistucK, Shijing Xian, Elliott Sprehn, blin...@chromium.org, Tomas Gunnarsson, Glen Shires
Does the CL get your stamp then? :-)

smaug

Sep 23, 2014, 8:59:09 AM
to Kristijan Burnik, blin...@chromium.org, Shijing Xian, Tomas Gunnarsson, Glen Shires
On 09/23/2014 03:46 PM, 'Kristijan Burnik' via blink-dev wrote:
> Yes, Glen, Shijing and I know about that spec "bug". It is somewhat similar to these changes but an entirely different intent.

I see, it is the opposite, getting a track out from SpeechRecognition, not passing a track/stream to it.
I misinterpreted the proposal.

But please propose the change to the spec before exposing it to the web.


-Olli


Kristijan Burnik

Sep 23, 2014, 9:04:30 AM
to smaug, blin...@chromium.org, Shijing Xian, Tomas Gunnarsson, Glen Shires, Harald Alvestrand
Currently, this is intended for Chromium, but I do agree that the spec should eventually include it.


PhistucK

May 11, 2016, 2:25:15 PM
to Kristijan Burnik, smaug, blin...@chromium.org, Shijing Xian, Tomas Gunnarsson, Glen Shires, Harald Alvestrand
What happened with this one?
I see that the launch issue is Archived. Is the functionality still there, or was it removed? Is there a plan to ship this (or remove this)?

Also, belated, but how is the speechRecognition.start() proposal the opposite of this proposal?


PhistucK

