Speech recognition problems while playing music? This may help

Nekora

Jun 27, 2015, 3:29:19 PM

to voice...@googlegroups.com

Greetings. I ran into an issue where playing music would confuse the speech recognition engine in VoiceAttack, and I found the following fix. But first, here is how to tell whether you have the same issue.

(Just to clarify: this is an issue with the Microsoft speech recognition software, NOT VoiceAttack. VoiceAttack uses the MS speech engine, so this is not something Gary can fix.)

Okay, this should be done in a quiet(ish) room. First, load VoiceAttack and say something. There should be a volume bar at the bottom of the VoiceAttack window showing your mic levels (as processed by the MS speech engine).

Second, load a music program and set it to play a sound on repeat. Mute your speakers, then set the volume in your music player to max. (Insane, right? Well...)

At this point the computer is playing the sound but you should NOT be able to hear it. Check your sound mixer to make sure it still shows sound going through the speakers device channel. The level meter should be grey and show as muted, but it should still be moving.

Now go back to VoiceAttack and say something. If the bar keeps moving a little after you stop talking, then you have the issue I encountered. This is how I fixed it.

Open regedit and add the key "HKEY_CURRENT_USER\Software\Microsoft\Speech\AudioInput\AudioFeatures".

You probably already have a key "HKEY_CURRENT_USER\Software\Microsoft\Speech\AudioInput" with a subkey "Tokenenum". Just right-click AudioInput in the key tree, select New > Key, and rename the new key to "AudioFeatures".

Now, inside the AudioFeatures key, add a DWORD and name it "AcousticEchoCancellation". A new DWORD defaults to the value 0x00000000 (0), which is what you want, so leave it as is.
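
If you would rather script the regedit steps above, here is a minimal sketch using Python's standard winreg module. It only writes the key and value named in this post. The function name disable_speech_aec is my own, not part of any tool, and I am assuming a per-user write under HKEY_CURRENT_USER needs no elevation on your machine.

```python
import sys

# Key and value names taken from the post; the path is relative to HKEY_CURRENT_USER.
KEY_PATH = r"Software\Microsoft\Speech\AudioInput\AudioFeatures"
VALUE_NAME = "AcousticEchoCancellation"


def disable_speech_aec(value: int = 0) -> int:
    """Create the AudioFeatures key if needed, write the DWORD, return what was stored.

    Hypothetical helper for illustration; it mirrors the manual regedit steps.
    """
    if sys.platform != "win32":
        raise OSError("The Windows registry is only available on Windows")

    import winreg  # standard library, Windows only

    access = winreg.KEY_READ | winreg.KEY_WRITE
    # CreateKeyEx opens the key if it already exists, otherwise creates it.
    with winreg.CreateKeyEx(winreg.HKEY_CURRENT_USER, KEY_PATH, 0, access) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, value)
        stored, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    return stored


if __name__ == "__main__":
    try:
        print("Stored value:", disable_speech_aec())
    except OSError as exc:
        print("Skipped:", exc)
```

To undo the change, delete the "AcousticEchoCancellation" value (or the whole "AudioFeatures" key) in regedit.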

Now close regedit, close VoiceAttack and reopen it, and run the sound test again. Hopefully the bar no longer moves after you stop talking. If so, you had the same problem and should find the speech recognition WAY more accurate now.

For those who want to know what is happening: the MS speech engine is doing something called Acoustic Echo Cancellation, where it mixes your computer's output sound channel with the sound from your mic to try to eliminate feedback. Most modern sound cards and headsets already do this automatically, some at the hardware level and some at the software level. The problem is that, since this is being done by my headset and then again by the speech engine, it is actually ADDING noise to the mic sound. To my knowledge there is no option to disable this in Windows (depending on your headset/mic, you may be able to disable it on the device side instead), although the MS engine DOES check for the setting (the registry key we set up).
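
To show why cancelling twice hurts, here is a toy sketch in Python. It is NOT how the MS engine or any headset really works (real echo cancellers use adaptive filters); I am assuming cancellation is a plain subtraction of the speaker signal, and every name and number in it is made up for illustration.

```python
def cancel_echo(mic, speaker_reference):
    """Toy echo canceller: subtract the speaker signal from the mic signal."""
    return [m - s for m, s in zip(mic, speaker_reference)]


def worst_error(signal, clean):
    """Largest sample-by-sample difference from the clean voice."""
    return max(abs(a - b) for a, b in zip(signal, clean))


# Made-up sample values, chosen so the arithmetic is exact.
voice = [0.0, 0.5, 0.0, -0.5]      # what you actually said
music = [0.25, -0.25, 0.25, -0.25]  # what the output channel is playing

# The mic picks up your voice plus the music leaking in.
mic_raw = [v + m for v, m in zip(voice, music)]

# First pass (the headset): the music is removed, leaving the voice.
after_headset = cancel_echo(mic_raw, music)

# Second pass (the speech engine): the music is subtracted AGAIN,
# so an inverted copy of it is now mixed into the voice.
after_engine = cancel_echo(after_headset, music)

if __name__ == "__main__":
    print("error after headset:", worst_error(after_headset, voice))
    print("error after engine: ", worst_error(after_engine, voice))
```

In this toy model the second pass leaves the music in the signal even while you are silent, which is what the moving bar in the test above would look like.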

Also, as a side note, this greatly improved the accuracy of Windows speech recognition when dictating.
