Using Various Speech Recognition and Speech Synthesis Plugins

Arsen Chaloyan

Apr 18, 2020, 10:15:51 PM

to UniMRCP

Purpose

This post is intended to clarify the use of various speech recognition and/or speech synthesis plugins with a single instance of UniMRCP server.

Problem

A common requirement is to load several speech recognition and/or speech synthesis plugins into the same instance of UniMRCP server and to use them interchangeably at run-time.

The question is how the MRCP client (IVR platform) can reference a particular plugin or engine, given that the protocol allows specifying only the resource name, such as speechrecognizer or speechsynthesizer, but not an engine name or any other parameter that could identify the engine.

Solution

We will look at a use case with the Google and Azure SR and SS plugins as an example. The same approach can be used with any other plugins.

First, since we are interested in speech recognizer and speech synthesizer resources only, let's disable the remaining MRCP resources in unimrcpserver.xml. This is an optional step.

<resource-factory>
  <resource id="speechsynth" enable="true"/>
  <resource id="speechrecog" enable="true"/>
  <resource id="recorder" enable="false"/>
  <resource id="speakverify" enable="false"/>
</resource-factory>

Next comes the loading of plugins, which should look as follows in our case.

<plugin-factory>
  <engine id="Google-SR-1" name="umsgsr" enable="true"/>
  <engine id="Google-SS-1" name="umsgss" enable="true"/>
  <engine id="Azure-SR-1" name="umsazuresr" enable="true"/>
  <engine id="Azure-SS-1" name="umsazuress" enable="true"/>
</plugin-factory>

There are two approaches to referencing the engines from the MRCP client, outlined below.

1. Specifying Engine per MRCP Session via Feature Tags

Availability: >= UniMRCP Server 1.6.0

Feature tags can be used to pass engine identifiers via the SIP Accept-Contact header per RFC 3841.

For example, the following Accept-Contact header, set in the SIP INVITE, selects the Google SR and Azure SS plugins for the MRCP session being established.

Accept-Contact: *;speechrecog.engine="Google-SR-1";speechsynth.engine="Azure-SS-1"

Here, the feature tag speechrecog.engine specifies the identifier of a speech recognizer engine defined in the plugin factory above. Similarly, the feature tag speechsynth.engine specifies the identifier of a speech synthesizer engine.

This approach can be used only if the MRCP client (IVR platform) is able to set feature tags in the SIP INVITE. Platforms utilizing the UniMRCP client library may already support this feature or can at least be easily extended for this purpose.
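
For platforms built on the UniMRCP client library, the feature tags may be configurable without code changes. The fragment below is only a sketch modeled on the sample unimrcpclient.xml shipped with recent UniMRCP releases. The feature-tags element and its placement under sip-settings are assumptions that should be verified against the client configuration of the version in use; the settings id and the port are placeholders.

```xml
<!-- Sketch of a client-side SIP settings entry in unimrcpclient.xml. -->
<!-- Assumption: the client version in use supports the feature-tags element. -->
<sip-settings id="UniMRCP-SIP-Settings">
  <!-- Placeholder port of the server's SIP agent. -->
  <server-port>8060</server-port>
  <!-- Engine identifiers must match the engine ids in the server's plugin factory. -->
  <feature-tags>speechrecog.engine="Google-SR-1";speechsynth.engine="Azure-SS-1"</feature-tags>
</sip-settings>
```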

2. Specifying Engines per UniMRCP Server Profile

Availability: >= UniMRCP Server 1.0.0

This approach is very common, has been available for many years, and can be used with all IVR platforms. The idea is to define multiple UniMRCP server profiles, with each profile referencing particular speech recognizer and speech synthesizer engines.

For example:

<mrcpv2-profile id="uni2-google">
  <sip-uas>SIP-Agent-Google-1</sip-uas>
  <mrcpv2-uas>MRCPv2-Agent-1</mrcpv2-uas>
  <media-engine>Media-Engine-1</media-engine>
  <rtp-factory>RTP-Factory-1</rtp-factory>
  <rtp-settings>RTP-Settings-1</rtp-settings>
  <resource-engine-map>
    <resource id="speechrecog" engine="Google-SR-1"/>
    <resource id="speechsynth" engine="Google-SS-1"/>
  </resource-engine-map>
</mrcpv2-profile>

<mrcpv2-profile id="uni2-2">
  <sip-uas>SIP-Agent-Azure-1</sip-uas>
  <mrcpv2-uas>MRCPv2-Agent-1</mrcpv2-uas>
  <media-engine>Media-Engine-1</media-engine>
  <rtp-factory>RTP-Factory-1</rtp-factory>
  <rtp-settings>RTP-Settings-1</rtp-settings>
  <resource-engine-map>
    <resource id="speechrecog" engine="Azure-SR-1"/>
    <resource id="speechsynth" engine="Azure-SS-1"/>
  </resource-engine-map>
</mrcpv2-profile>
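
Each profile above references its own SIP agent, so the agents SIP-Agent-Google-1 and SIP-Agent-Azure-1 also have to be defined among the server components. A minimal sketch follows, assuming the usual sip-uas layout of unimrcpserver.xml; the port numbers 8062 and 8064 are illustrative and not taken from the attached configuration. The MRCP client would then select a profile by sending its SIP INVITE to the corresponding port.

```xml
<components>
  <!-- Assumed ports; each SIP agent must listen on its own port. -->
  <sip-uas id="SIP-Agent-Google-1" type="SofiaSIP">
    <sip-port>8062</sip-port>
    <sip-transport>udp,tcp</sip-transport>
  </sip-uas>
  <sip-uas id="SIP-Agent-Azure-1" type="SofiaSIP">
    <sip-port>8064</sip-port>
    <sip-transport>udp,tcp</sip-transport>
  </sip-uas>
  <!-- Other components (MRCPv2 agent, media engine, RTP factory, plugin factory) are omitted here. -->
</components>
```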

A detailed reference configuration of unimrcpserver.xml is attached.

Note that if you are using UniMRCP server 1.6.0 or below, then the following entries

<resource id="speechrecog" engine="SR-1"/>
<resource id="speechsynth" engine="SS-1"/>

need to be replaced with

<param name="speechrecog" value="SR-1"/>
<param name="speechsynth" value="SS-1"/>

unimrcpserver.xml

Arsen Chaloyan

Jul 20, 2020, 10:03:23 PM

to UniMRCP

All the Google plugins depend on the gRPC and Protobuf libraries.

If you load plugin versions built against different versions of these dependencies, this may result in the error you encountered. However, using the latest versions, you should be able to load the GSS, GSR and GDF plugins into a single instance of UniMRCP server without any problems.

My guess is that the conflict in your scenario is introduced by the Yandex SR plugin. Please note that this plugin is also based on gRPC but is built against an older version of the library.

FYI, newer versions of gRPC/Protobuf, as well as of the Google and Yandex plugins, will be released later in July or, more likely, in August.

On Sun, Jul 12, 2020 at 10:35 AM Vasyl Garazd <vasyl....@gmail.com> wrote:

Dear Arsen,

If I try to create two profiles:
1) GSS + GSR
2) GSS + GDF

the MRCP server fails to start with the message:
unimrcpserver: [libprotobuf FATAL google/protobuf/extension_set.cc:109] Multiple extension registrations for type "google.protobuf.MethodOptions", field number 72295728

If I disable either the GSR or the GDF engine, it works.

It looks like both engines cannot be used on one host?

Current profiles are:
<profiles>
  <mrcpv2-profile id="uni2">
    <sip-uas>SIP-Agent-1</sip-uas>
    <mrcpv2-uas>MRCPv2-Agent-1</mrcpv2-uas>
    <media-engine>Media-Engine-1</media-engine>
    <rtp-factory>RTP-Factory-1</rtp-factory>
    <rtp-settings>RTP-Settings-1</rtp-settings>
  </mrcpv2-profile>

  <mrcpv2-profile id="uni2-google">
    <sip-uas>SIP-Agent-Google-1</sip-uas>
    <mrcpv2-uas>MRCPv2-Agent-1</mrcpv2-uas>
    <media-engine>Media-Engine-1</media-engine>
    <rtp-factory>RTP-Factory-1</rtp-factory>
    <rtp-settings>RTP-Settings-1</rtp-settings>
    <resource-engine-map>
      <resource id="speechrecog" engine="GSR-1"/>
      <resource id="speechsynth" engine="GSS-1"/>
    </resource-engine-map>
  </mrcpv2-profile>

  <mrcpv2-profile id="uni2-google-df">
    <sip-uas>SIP-Agent-GoogleDialogFlow-1</sip-uas>
    <mrcpv2-uas>MRCPv2-Agent-1</mrcpv2-uas>
    <media-engine>Media-Engine-1</media-engine>
    <rtp-factory>RTP-Factory-1</rtp-factory>
    <rtp-settings>RTP-Settings-1</rtp-settings>
    <resource-engine-map>
      <resource id="speechrecog" engine="GDF-1"/>
      <resource id="speechsynth" engine="GSS-1"/>
    </resource-engine-map>
  </mrcpv2-profile>

  <mrcpv2-profile id="uni2-yandex">
    <sip-uas>SIP-Agent-Yandex-1</sip-uas>
    <mrcpv2-uas>MRCPv2-Agent-1</mrcpv2-uas>
    <media-engine>Media-Engine-1</media-engine>
    <rtp-factory>RTP-Factory-1</rtp-factory>
    <rtp-settings>RTP-Settings-1</rtp-settings>
    <resource-engine-map>
      <resource id="speechrecog" engine="Yandex-SR-1"/>
      <resource id="speechsynth" engine="Yandex-SS-1"/>
    </resource-engine-map>
  </mrcpv2-profile>

  <mrcpv1-profile id="uni1">
    <rtsp-uas>RTSP-Agent-1</rtsp-uas>
    <media-engine>Media-Engine-1</media-engine>
    <rtp-factory>RTP-Factory-1</rtp-factory>
    <rtp-settings>RTP-Settings-1</rtp-settings>
  </mrcpv1-profile>

  <mrcpv1-profile id="uni1-google">
    <rtsp-uas>RTSP-Agent-Google-1</rtsp-uas>
    <media-engine>Media-Engine-1</media-engine>
    <rtp-factory>RTP-Factory-1</rtp-factory>
    <rtp-settings>RTP-Settings-1</rtp-settings>
    <resource-engine-map>
      <resource id="speechrecog" engine="GSR-1"/>
      <resource id="speechsynth" engine="GSS-1"/>
    </resource-engine-map>
  </mrcpv1-profile>

  <mrcpv1-profile id="uni1-google-df">
    <rtsp-uas>RTSP-Agent-GoogleDialogflow-1</rtsp-uas>
    <media-engine>Media-Engine-1</media-engine>
    <rtp-factory>RTP-Factory-1</rtp-factory>
    <rtp-settings>RTP-Settings-1</rtp-settings>
    <resource-engine-map>
      <resource id="speechrecog" engine="GDF-1"/>
      <resource id="speechsynth" engine="GSS-1"/>
    </resource-engine-map>
  </mrcpv1-profile>

  <mrcpv1-profile id="uni1-yandex">
    <rtsp-uas>RTSP-Agent-Yandex-1</rtsp-uas>
    <media-engine>Media-Engine-1</media-engine>
    <rtp-factory>RTP-Factory-1</rtp-factory>
    <rtp-settings>RTP-Settings-1</rtp-settings>
    <resource-engine-map>
      <resource id="speechrecog" engine="Yandex-SR-1"/>
      <resource id="speechsynth" engine="Yandex-SS-1"/>
    </resource-engine-map>
  </mrcpv1-profile>
</profiles>

On Sunday, April 19, 2020 at 5:15:51 UTC+3, Arsen Chaloyan wrote:
