Purpose
This post explains how to use multiple speech recognition and/or speech synthesis plugins with a single instance of UniMRCP server.
Problem
A common requirement is to load several speech recognition and/or speech synthesis plugins into the same instance of UniMRCP server and use them interchangeably at run-time.
The question is how the MRCP client (IVR platform) can reference a particular plugin or engine, given that the protocol allows specifying only the resource name, such as speechrecognizer or speechsynthesizer, but not an engine name or any other parameter that could identify the engine.
Solution
We will use the Google and Azure SR and SS plugins as an example. The same approach can be used with any other plugins.
First, since we are interested in speech recognizer and speech synthesizer resources only, let's disable the remaining MRCP resources in unimrcpserver.xml. This is an optional step.
<!-- Factory of MRCP resources -->
<resource-factory>
  <resource id="speechsynth" enable="true"/>
  <resource id="speechrecog" enable="true"/>
  <resource id="recorder" enable="false"/>
  <resource id="speakverify" enable="false"/>
</resource-factory>
Next comes the loading of plugins, which should look as follows in our case.
<plugin-factory>
  <engine id="Google-SR-1" name="umsgsr" enable="true"/>
  <engine id="Google-SS-1" name="umsgss" enable="true"/>
  <engine id="Azure-SR-1" name="umsazuresr" enable="true"/>
  <engine id="Azure-SS-1" name="umsazuress" enable="true"/>
</plugin-factory>
There are two approaches to referencing the engines from the MRCP client, outlined below.
1. Specifying Engine per MRCP Session via Feature Tags
- Availability: >= UniMRCP Server 1.6.0
Feature tags can be used to pass engine identifiers via the SIP Accept-Contact header per RFC 3841.
For example, the following Accept-Contact header, set in the SIP INVITE, selects the Google SR and Azure SS plugins for the MRCP session being established.
Accept-Contact: *;speechrecog.engine="Google-SR-1";speechsynth.engine="Azure-SS-1"
where the feature tag speechrecog.engine specifies the identifier of a speech recognizer engine defined in the plugin factory above. Similarly, the feature tag speechsynth.engine specifies the identifier of a speech synthesizer engine.
This approach works only if the IVR platform can set the Accept-Contact header. Platforms built on the UniMRCP client library may already support this feature, or can easily be extended to do so.
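For a client built on the UniMRCP client library, the feature tags can typically be set in the SIP settings of unimrcpclient.xml. The fragment below is a minimal sketch, assuming the client version in use supports the feature-tags element. The settings id and the server port are illustrative and should be checked against your own client configuration.
<!-- Sketch only: verify that your client version supports <feature-tags> -->
<sip-settings id="UniMRCP-SIP-Settings">
  <!-- Illustrative SIP port of the UniMRCP server -->
  <server-port>8060</server-port>
  <!-- Engine identifiers as defined in the server's plugin factory -->
  <feature-tags>speechrecog.engine="Google-SR-1";speechsynth.engine="Azure-SS-1"</feature-tags>
</sip-settings>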
2. Specifying Engines per UniMRCP Server Profile
- Availability: >= UniMRCP Server 1.0.0
This approach is very common, has been available for many years, and can be used with all IVR platforms. The idea is to define multiple UniMRCP server profiles, each referencing particular speech recognizer and speech synthesizer engines.
For example:
<!-- MRCPv2 default profile for Google SR/SS plugins -->
<mrcpv2-profile id="uni2-google">
  <sip-uas>SIP-Agent-Google-1</sip-uas>
  <mrcpv2-uas>MRCPv2-Agent-1</mrcpv2-uas>
  <media-engine>Media-Engine-1</media-engine>
  <rtp-factory>RTP-Factory-1</rtp-factory>
  <rtp-settings>RTP-Settings-1</rtp-settings>
  <resource-engine-map>
    <resource id="speechrecog" engine="Google-SR-1"/>
    <resource id="speechsynth" engine="Google-SS-1"/>
  </resource-engine-map>
</mrcpv2-profile>
<!-- MRCPv2 default profile for Azure SR/SS plugins -->
<mrcpv2-profile id="uni2-2">
  <sip-uas>SIP-Agent-Azure-1</sip-uas>
  <mrcpv2-uas>MRCPv2-Agent-1</mrcpv2-uas>
  <media-engine>Media-Engine-1</media-engine>
  <rtp-factory>RTP-Factory-1</rtp-factory>
  <rtp-settings>RTP-Settings-1</rtp-settings>
  <resource-engine-map>
    <resource id="speechrecog" engine="Azure-SR-1"/>
    <resource id="speechsynth" engine="Azure-SS-1"/>
  </resource-engine-map>
</mrcpv2-profile>
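The two profiles reference the SIP agents SIP-Agent-Google-1 and SIP-Agent-Azure-1, which must also be defined in unimrcpserver.xml. The fragment below is a minimal sketch, assuming each agent listens on its own SIP port, so that the MRCP client selects a profile by the port it sends the SIP INVITE to. The port numbers 8060 and 8062 are illustrative assumptions, and all other agent settings are omitted.
<!-- Sketch only: SIP agent bound to the Google profile (port is an assumption) -->
<sip-uas id="SIP-Agent-Google-1" type="SofiaSIP">
  <sip-port>8060</sip-port>
  <sip-transport>udp,tcp</sip-transport>
</sip-uas>
<!-- Sketch only: SIP agent bound to the Azure profile (port is an assumption) -->
<sip-uas id="SIP-Agent-Azure-1" type="SofiaSIP">
  <sip-port>8062</sip-port>
  <sip-transport>udp,tcp</sip-transport>
</sip-uas>
With such a setup, an IVR platform would be configured with two MRCP server entries that differ only in the SIP port, one per engine pair.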
A detailed reference configuration of unimrcpserver.xml is attached.
Note that if you are using UniMRCP server 1.6.0 or below, then the following entries