single utterance with long silence before speech

Kevin Dempsey

Nov 1, 2018, 8:21:17 AM
to Google Cloud Developers
I am using the v1 streaming API to recognize answers to questions. I start a StreamingRecognize call and then play the question. I stop playing the question when I get the first intermediate result, since that means the user has barged in. Because the answers are simple words, I have been using single_utterance mode.

The problem occurs when the question is quite long, around 8 seconds, and the user waits until the question has finished before answering. In these circumstances, I get no results even though the user has definitely spoken. If I don't use single_utterance mode, I get results regardless of how long the question is.

It seems that the long period of silence before the answer is the problem, as shorter questions work OK.

Is this a reasonable use case for the single utterance mode?

George (Cloud Platform Support)

Nov 1, 2018, 4:58:26 PM
to Google Cloud Developers
Hello Kevin, 

Single-utterance mode should time out after a while; otherwise it would behave as if the singleUtterance flag had been set to false. In other words, what you noticed is to be expected. If you want to capture just one word, a possible solution is to leave single-utterance mode off and set a maximum time limit yourself. Once the speech has been captured and transcribed, you can keep only the first word, or process the resulting text in whatever way you need. You can read more on the StreamingRecognitionConfig documentation page.
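
As a rough illustration of the "maximum time limit, then keep the first word" idea, here is a minimal Python sketch. It does not call the Speech API: `Result` is a hypothetical stand-in for one streaming result (a transcript plus an is-final flag), and `first_word_within` and its `clock` parameter are names made up for this example. In a real client you would map the streaming responses onto something like this and also stop sending audio once the limit is hit.

```python
import time
from typing import Callable, Iterable, NamedTuple, Optional


class Result(NamedTuple):
    """Hypothetical stand-in for one streaming recognition result."""
    transcript: str
    is_final: bool


def first_word_within(
    results: Iterable[Result],
    max_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return the first word of the first final result, or None.

    None means either the time limit passed or the stream ended
    without a non-empty final result.
    """
    deadline = clock() + max_seconds
    for result in results:
        # The limit is only checked when a result arrives; a blocking
        # stream would need a separate timer to be interrupted.
        if clock() > deadline:
            return None
        if result.is_final and result.transcript.strip():
            return result.transcript.split()[0]
    return None
```

For example, `first_word_within([Result("ye", False), Result("yes please", True)], 10.0)` skips the non-final result and returns the first word of the final one. The time limit here is an application-side choice, so it can be set longer than the longest question you expect to play.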