Proper configuration of Google Speech client for mp3 file


Dejan Stojanovic

Dec 29, 2018, 12:17:04 PM
to Google Cloud Developers
I am trying to get a transcript of a recorded phone call with Google Speech. The call is recorded in mp3 format and stored in a bucket.
To call the Google Speech service I used the C# library provided by Google on NuGet.

var longOperation = await speech.LongRunningRecognizeAsync(
    new RecognitionConfig()
    {
        Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
        Model = "phone_call",
        SampleRateHertz = 44100,
        LanguageCode = "en-US",
        EnableAutomaticPunctuation = false,
        Metadata = new RecognitionMetadata()
        {
            InteractionType = RecognitionMetadata.Types.InteractionType.PhoneCall
        }
    },
    RecognitionAudio.FromStorageUri("gs://my-bucket/my-file.mp3"));
var completedResponse = await longOperation.PollUntilCompletedAsync();
var response = completedResponse.Result;


The problem is that I keep getting a null response for the mp3 file, while everything works fine when I convert the mp3 to flac. Flac files are a lot larger than mp3, so I would rather not use them, and I would also have to convert each mp3 file to flac before calling Google Speech.

I bet it is something in the config I pass to the client, but so far I have not managed to figure out what it is.
Any clue what I am doing wrong in the code above?

Ali T (Cloud Platform Support)

Jan 6, 2019, 11:09:59 AM
to Google Cloud Developers
Your config appears to be fine. The issue you are facing appears to be related to the encoding. The Speech API only supports the encodings listed in the documentation.

As per the best practices, if you don’t want to convert the audio to FLAC because of bandwidth, you should convert it to one of the AMR_WB, OGG_OPUS or SPEEX_WITH_HEADER_BYTE codecs. One tool to look into for easy codec conversion is sox.
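
As a rough sketch of what the request might look like after such a conversion: the example below assumes the recording has already been re-encoded as Opus in an Ogg container at 16000 Hz and uploaded under the hypothetical name gs://my-bucket/my-file.ogg. The object name and the sample rate are assumptions for illustration; the Encoding and SampleRateHertz values have to match whatever the converted file actually contains.

```csharp
using System;
using System.Threading.Tasks;
using Google.Cloud.Speech.V1;

public static class OggOpusTranscription
{
    public static async Task Main()
    {
        // Uses Application Default Credentials from the environment.
        SpeechClient speech = await SpeechClient.CreateAsync();

        var config = new RecognitionConfig
        {
            // Must describe the real encoding of the stored file,
            // not the format it was originally recorded in.
            Encoding = RecognitionConfig.Types.AudioEncoding.OggOpus,
            // Assumption: the file was re-encoded at 16000 Hz.
            SampleRateHertz = 16000,
            Model = "phone_call",
            LanguageCode = "en-US",
        };

        // Hypothetical object name for the converted recording.
        var audio = RecognitionAudio.FromStorageUri("gs://my-bucket/my-file.ogg");

        var operation = await speech.LongRunningRecognizeAsync(config, audio);
        var completed = await operation.PollUntilCompletedAsync();

        // Print the top alternative of each recognized segment.
        foreach (var result in completed.Result.Results)
        {
            if (result.Alternatives.Count > 0)
            {
                Console.WriteLine(result.Alternatives[0].Transcript);
            }
        }
    }
}
```

Compared with the original request, the differences are the Encoding value, the sample rate, and the storage URI; the rest of the call stays as it was.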