Google Speech to Text latency issues


Harry Stuart

Sep 13, 2018, 8:12:54 PM
to Google Cloud Developers

I am developing a real-time voice application and streaming audio to the Google Speech API. However, I am getting response times of around 2 seconds, which is still too long a delay. Is it possible to get the delay below 500 ms? How big an impact do voice models and expected keyword parameters actually have on the transcription? What is the average latency for streaming transcription?

Thank you, Harry 

George (Cloud Platform Support)

Sep 13, 2018, 9:44:02 PM
to Google Cloud Developers
Hello Harry, 

The response times depend on several factors, most of them directly related to the quality of the initial recording. To bring response times down from 2 seconds in your case, we encourage you to follow the recommendations on the "Best Practices" documentation page more closely.

If the above information does not fully address your situation, you are most welcome to come back with more detail. How did you create your audio file initially? How did you record the sound, and with which encoding? Did you accurately describe the audio data sent with your request to the Speech-to-Text API? Ensuring that the RecognitionConfig for your request specifies the correct sampleRateHertz, encoding, and languageCode will result in the most accurate transcription and billing for your request. A sample file would help us reproduce your issue on our side.
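
As a rough illustration of what "accurately describing the audio" can look like, here is a minimal sketch that reads the header of a WAV file and derives the three RecognitionConfig fields mentioned above from it, rather than hard-coding them. It uses only the Python standard library. The helper name `recognition_config_from_wav` and the default `en-US` language code are assumptions made for this example, not part of the Speech-to-Text API, and the sketch only handles 16-bit PCM WAV input (which corresponds to the LINEAR16 encoding).

```python
import wave


def recognition_config_from_wav(wav_file, language_code="en-US"):
    """Build a RecognitionConfig-style dict from a WAV file header.

    Hypothetical helper for illustration: `wav_file` is a path or a
    binary file-like object. Only 16-bit PCM WAV is handled, which
    maps to the LINEAR16 encoding.
    """
    with wave.open(wav_file, "rb") as wav:
        sample_width = wav.getsampwidth()  # bytes per sample
        sample_rate = wav.getframerate()   # samples per second

    if sample_width != 2:
        raise ValueError(
            f"expected 16-bit PCM (2 bytes per sample), got {sample_width}"
        )

    # Field names follow the JSON form of RecognitionConfig.
    return {
        "encoding": "LINEAR16",
        "sampleRateHertz": sample_rate,
        "languageCode": language_code,  # assumed default; set to your language
    }
```

Calling `recognition_config_from_wav("recording.wav")` on your own file (the file name here is a placeholder) and comparing the result with the config you actually send is a quick way to spot a mismatched sample rate or encoding.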