How to Transcribe Audio to Text in Dialogflow ES

Fox C

Mar 23, 2022, 5:28:12 PM

to Dialogflow Essentials Edition users

I have a chatbot in Dialogflow that is connected to WhatsApp through Landbot. At the moment, when a user sends me an audio message, Landbot passes the audio to Dialogflow as-is (in .ogv format), but I need to transcribe it so that Dialogflow can understand it. For example, this is the intent:

{
  "id": "ffdc721f-8e94-4a9e-9ec8-a45b9b7f21be-37284719",
  "fulfillmentText": "🙁 Pronunciaste mal. \n La palabra era: Dog",
  "language_code": "en",
  "queryText": "https://media.eu-1.smooch.io/apps/5d2370ef6667cd00102fb9c2/conversations/31f2d6d5440d03fde4066b35/hJ9ob-c2Ogho4NTDLEuyFMg_/5zKf3-SgGMuS3RxcBUw5B6dj.oga",
  "webhookPayload": {},
  "intentDetectionConfidence": 0.3,
  "action": "",
  "webhookSource": "",
  "parameters": {
    "pronunciacion": "https://media.eu-1.smooch.io/apps/5d2370ef6667cd00102fb9c2/conversations/31f2d6d5440d03fde4066b35/hJ9ob-c2Ogho4NTDLEuyFMg_/5zKf3-SgGMuS3RxcBUw5B6dj.oga",
    "palabra": "Dog"
  },
  "fulfillmentMessages": [
    {
      "text": {
        "text": [
          "🙁 Pronunciaste mal. \n La palabra era: Dog"
        ]
      }
    }
  ],
  "diagnosticInfo": {
    "webhook_latency_ms": "1871.0"
  },
  "webhookStatus": {
    "webhookStatus": {
      "message": "Webhook execution successful"
    },
    "webhookUsed": true
  },
  "intent": {
    "isFallback": false,
    "displayName": "Pronunciar",
    "id": "4dd12af2-94a6-486b-a1cd-daa2c65d6671"
  }
}

The values of "pronunciacion" and "palabra" must be the same, so I need to convert the audio to text first, but I don't know how to do that. Could you help me?

Fox C

Mar 23, 2022, 5:28:16 PM

to Dialogflow Essentials Edition users

I read that I can do this with Speech-to-Text, but I don't know how to integrate it with Dialogflow.

Muhammad Sarder

Mar 24, 2022, 5:56:02 PM

to Dialogflow Essentials Edition users

Hey Fox,

Dialogflow uses the GCP Speech API in the backend, through its own Google-owned project. When you send it audio, Dialogflow processes it with the Speech API, converts it to text, and then matches an intent [0].

Currently, audio files in .ogv format do not appear to be supported by the Speech-to-Text API [1].
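Since the file extension alone does not say which codec is inside an Ogg container, it may be worth checking the downloaded bytes before deciding whether the file can be used. The following is only a rough sketch: the function name `sniff_ogg_codec` is made up for illustration, and it only inspects the first Ogg page, looking for the standard `OpusHead`, Vorbis, and Theora identification headers.

```python
def sniff_ogg_codec(data: bytes) -> str:
    """Guess the codec of an Ogg file from its first page.

    Returns 'opus', 'vorbis', 'theora', 'unknown-ogg', or 'not-ogg'.
    """
    # Every Ogg page starts with the capture pattern "OggS" and a 27-byte header.
    if len(data) < 27 or data[:4] != b"OggS":
        return "not-ogg"

    # Byte 26 holds the number of entries in the segment table that follows
    # the fixed header; the page payload starts right after that table.
    n_segments = data[26]
    payload = data[27 + n_segments:]

    # The first packet of each codec begins with a fixed identification marker.
    if payload.startswith(b"OpusHead"):
        return "opus"
    if payload.startswith(b"\x01vorbis"):
        return "vorbis"
    if payload.startswith(b"\x80theora"):
        return "theora"
    return "unknown-ogg"
```

If this reports `opus`, the file would match the OGG_OPUS encoding listed in [1]; any other result suggests the audio needs to be converted first.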

Perhaps you can contact Landbot support about getting the audio in a different file format?
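If you do end up with audio in a supported encoding, you can send the audio itself to Dialogflow instead of the URL, as described in [0]. Here is a minimal sketch of how the body of a REST `detectIntent` call could be built. It assumes the audio is OGG Opus at 16000 Hz (both are assumptions you would need to verify for your files), and the helper names `build_detect_intent_request` and `detect_intent_url` are hypothetical. Downloading the file from Landbot and authenticating the request are not shown.

```python
import base64


def build_detect_intent_request(audio_bytes: bytes,
                                language_code: str = "en",
                                sample_rate_hertz: int = 16000) -> dict:
    """Build the JSON body for a Dialogflow ES v2 detectIntent call with audio."""
    return {
        "queryInput": {
            "audioConfig": {
                # Assumption: the audio is Opus in an Ogg container.
                "audioEncoding": "AUDIO_ENCODING_OGG_OPUS",
                # Assumption: 16 kHz; must match the actual recording.
                "sampleRateHertz": sample_rate_hertz,
                "languageCode": language_code,
            }
        },
        # The REST API expects the raw audio bytes as a base64 string.
        "inputAudio": base64.b64encode(audio_bytes).decode("ascii"),
    }


def detect_intent_url(project_id: str, session_id: str) -> str:
    """Return the v2 REST endpoint for a given project and session."""
    return (
        "https://dialogflow.googleapis.com/v2/"
        f"projects/{project_id}/agent/sessions/{session_id}:detectIntent"
    )
```

You would POST the body to that URL with an OAuth access token. In the response, `queryResult.queryText` should then contain the transcribed speech rather than the file URL, so the "pronunciacion" parameter can be compared with "palabra".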

[0] https://cloud.google.com/dialogflow/es/docs/how/detect-intent-audio

[1] https://cloud.google.com/speech-to-text/docs/encoding
