Google Speech Adaptation Boost and Class Tokens


Arsen Chaloyan

Apr 11, 2020, 5:53:26 PM
to UniMRCP
Purpose

This post is intended to provide additional clarifications regarding the use and limitations of speech adaptation boost and class tokens with the following UniMRCP server plugins:
  • GSR 1.17.0
  • GDF 1.15.0
Speech Adaptation Boost

To start off, even in the latest Google APIs, speech adaptation boost is available only for Dialogflow v2. For Speech-to-Text, this feature is available only in v1p1beta1, not in v1. That is why the approach discussed in this post is currently applicable to GDF only. The same approach will become applicable to GSR once Google makes this feature available in v1 and we have the Google APIs upgraded.

Boost values can be set in a speech context defined in the configuration file by using the attribute name weight or boost, for example, as follows. Note that the two attribute names can be used interchangeably.

<speech-context id="custom" enable="true">
     <phrase weight="15">fair</phrase>
     <phrase weight="2">fare</phrase>
</speech-context>

The same can be specified via SRGS XML by using the attribute name weight, for example, as follows.

<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" version="1.0" mode="voice" root="custom">
  <rule id="custom">
    <one-of>
      <item weight="15">fair</item>
      <item weight="2">fare</item>
    </one-of>
  </rule>
</grammar>
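As a hedged illustration (the request ID, channel identifier, and byte counts below are placeholders, not taken from a real session), such an SRGS grammar is typically delivered inline in the body of a RECOGNIZE request with the content type application/srgs+xml.

MRCP/2.0 336 RECOGNIZE 1
Channel-Identifier: abcd1234@speechrecog
Content-Id: custom-boost
Content-Type: application/srgs+xml
Content-Length: 249

<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" version="1.0" mode="voice" root="custom">
  <rule id="custom">
    <one-of>
      <item weight="15">fair</item>
      <item weight="2">fare</item>
    </one-of>
  </rule>
</grammar>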

Refer to recommendations from Google for best practices on setting boost values.


Class Tokens

Class tokens can be used with both GSR and GDF. The following is a sample speech context defined in the configuration file which makes use of the class token $TIME.

<speech-context id="time" language="en-US" enable="false">
   <phrase>$TIME</phrase>
</speech-context>

The specified speech context can be referenced via a built-in grammar as follows.

builtin:speech/time
or
builtin:grammar/time
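For instance (a sketch; the channel identifier, request ID, and message lengths are placeholders), the built-in grammar reference can be passed in the body of a RECOGNIZE request as a URI list.

MRCP/2.0 183 RECOGNIZE 1
Channel-Identifier: abcd1234@speechrecog
Content-Type: text/uri-list
Content-Length: 19

builtin:speech/time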

The same can be specified via SRGS XML as follows.

<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" version="1.0" mode="voice" root="custom">
  <meta name="scope" content="hint"/>
  <rule id="custom">
    <one-of>
       <item>$TIME</item>
    </one-of>
  </rule>
</grammar>

Refer to Google's documentation for the list of available class tokens.


Questions and suggestions are welcome.

--
Arsen Chaloyan
Author of UniMRCP
http://www.unimrcp.org

Arsen Chaloyan

Apr 26, 2020, 2:50:57 PM
to UniMRCP
If you are interested in using speech adaptation boost with a custom GSR plugin built against v1p1beta1 API, then follow the instructions below.

First, install the stock GSR package to make sure all the latest packages/dependencies are in place.

yum install unimrcp-gsr

Then remove the stock GSR package itself; its dependencies remain installed.

rpm -e unimrcp-gsr

Upgrade Google APIs and install the custom GSR package.


The use of boost is explained earlier in this thread. The custom GSR package also supports alternate languages, another feature available in the v1p1beta1 API.


The alternate languages can be specified globally in umsgsr.xml, for example, as follows.

   <streaming-recognition
      language="en-US"
      alternate-languages="es-ES, de-DE"
      .../>

Andres Ortiz

Jul 15, 2020, 10:11:22 AM
to UniMRCP
Hi Arsen,
I am testing the GSR alternate-languages feature. It looks great from the Google transcription perspective, but I don't get any indication in the result of what the actual language is. We could use that information to tell the downstream engine to change the language and to switch the TTS voice and language. For instance:

MRCP/2.0 471 RECOGNITION-COMPLETE 2 COMPLETE
Channel-Identifier: 1ac9da9949f842c7@speechrecog
Completion-Cause: 000 success
Content-Type: application/x-nlsml
Content-Length: 286

<?xml version="1.0"?>
<result>
  <interpretation grammar="session:593bd5b2c38b168d7a4e800b72a7597b-general" confidence="0.94">
    <instance>j&apos;ai perdu ma carte de crédit</instance>
    <language>fr-FR</language>
    <input mode="speech">j&apos;ai perdu ma carte de crédit</input>
  </interpretation>
</result>

Thanks,

Andres

Arsen Chaloyan

Jul 20, 2020, 10:06:10 PM
to UniMRCP
Right, the language would need to be returned with the results. The question is in which format. We should conform to NLSML, which has a certain XML schema. The right way would be to extend the format of the instance element. I'll take this issue into consideration going forward. Thanks for the note.
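One NLSML-compatible sketch (an assumption on my part, not an implemented format) would be to carry the detected language on the instance element via the standard xml:lang attribute rather than a new child element, for example:

<?xml version="1.0"?>
<result>
  <interpretation grammar="session:593bd5b2c38b168d7a4e800b72a7597b-general" confidence="0.94">
    <instance xml:lang="fr-FR">j&apos;ai perdu ma carte de crédit</instance>
    <input mode="speech">j&apos;ai perdu ma carte de crédit</input>
  </interpretation>
</result>

Since xml:lang is defined by the XML specification itself, existing NLSML consumers would typically tolerate or ignore it.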
