Using Lex V2 API


Arsen Chaloyan

Feb 26, 2021, 6:21:03 PM
to UniMRCP
Purpose
This post outlines the main features and use cases of the new Lex V2 API consumed by the Lex 2.x plugin to the UniMRCP server.

General Concepts
While the general concepts of the Lex V2 API are similar to the V1 API, the new API and the corresponding UniMRCP server plugin offer clear advantages including but not limited to:
  • Streaming conversation
  • Robust barge-in functionality
  • DTMF relay
  • Welcome messages
  • Dialog delegation
Everyone using V1 is encouraged to upgrade to V2. The V1 plugin will receive maintenance updates only, while the V2 plugin will continue to be developed and supported in the mainstream.

Supported IVR Platforms
The Lex V2 API can be utilized via the UniMRCP server by all the major IVR platforms provided by Genesys, Avaya and Cisco, as well as open-source projects such as Asterisk and FreeSWITCH. This post provides general guidance for the use of Lex V2 over the MRCP framework without outlining any platform-specific differences.

AWS Credentials
There have been no changes in the way AWS credentials are supposed to be configured to consume the Lex service. The methods listed below are consistently supported by all three AWS plugins (Lex, Polly, Transcribe) to the UniMRCP server.
  • IAM user
  • Default credentials provider chain
  • STS profile credentials provider
Specifying Lex Bot
Parameters of the default Lex bot can be specified in the configuration file umslex.xml. For example:
<streaming-recognition
      language="en_US"
      region="us-west-2"
      bot-name="1ONTGGK2U4"
      alias="XYZALIASID"
/>

The user application can reference a particular bot per MRCP session by specifying the bot parameters in the SET-PARAMS and/or RECOGNIZE requests via one of the three methods below.
  • Vendor-Specific-Parameters
  • Metadata in an SRGS XML grammar
  • Attributes passed to the built-in grammar
All the options are documented in the Usage Guide. This post demonstrates the use of the built-in grammar. For example:

builtin:speech/transcribe?bot-name=1ONTGGK2U4;alias=XYZALIASID;aws-region=us-west-2
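Such a grammar string can also be assembled programmatically. The helper below is a hypothetical sketch (the function and parameter names are mine, not part of the plugin); it simply joins the attribute name/value pairs with semicolons, matching the format shown above.

```python
def build_builtin_grammar(params=None):
    """Compose a builtin:speech/transcribe grammar URI from a dict of
    attribute name/value pairs, joined with semicolons.
    Hypothetical helper for illustration; not part of the plugin."""
    base = "builtin:speech/transcribe"
    if not params:
        return base
    query = ";".join(f"{name}={value}" for name, value in params.items())
    return f"{base}?{query}"

uri = build_builtin_grammar({
    "bot-name": "1ONTGGK2U4",
    "alias": "XYZALIASID",
    "aws-region": "us-west-2",
})
# uri == "builtin:speech/transcribe?bot-name=1ONTGGK2U4;alias=XYZALIASID;aws-region=us-west-2"
```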

Starting Conversation
The user application is supposed to start a new conversation by sending an initial RECOGNIZE request with specific parameters to trigger one of the actions below.
  • Eliciting Intent
The user application can start a new conversation by specifying a custom welcome message and letting the bot elicit the user intent. For example:

builtin:speech/transcribe?message=Welcome to travel agency;dialog-action=ElicitIntent
  • Triggering Intent
The user application can start a new conversation by triggering a particular intent. For example:

builtin:speech/transcribe?intent-name=BookCar;dialog-action=Delegate
  • Eliciting Slot
The user application can start a new conversation by eliciting a particular slot in an intent. For example:

builtin:speech/transcribe?intent-name=BookCar;slot-name=CarType;dialog-action=ElicitSlot

As a result, the initial RECOGNIZE request completes without user interaction as soon as the bot responds. The response contains the prompt to be played to the caller.
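Assuming the RECOGNITION-COMPLETE event carries the Lex V2 JSON response in the NLSML instance element, the prompt could be pulled out along the following lines. The field names follow the Lex V2 runtime response shape (a top-level "messages" list); the sample payload is illustrative, not captured from a live bot.

```python
import json

def extract_prompt(instance_json):
    """Collect the text of the messages the bot asks to play to the caller.
    Assumes the Lex V2 runtime response shape; illustrative only."""
    data = json.loads(instance_json)
    return " ".join(m.get("content", "") for m in data.get("messages", []))

# Illustrative payload resembling a Lex V2 response to ElicitIntent
sample = '{"messages": [{"content": "Welcome to travel agency", "contentType": "PlainText"}]}'
print(extract_prompt(sample))  # -> Welcome to travel agency
```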

Completing Conversation
The user application is supposed to place regular RECOGNIZE requests in a loop until the conversation completes.

builtin:speech/transcribe

As a result of each interaction, the NLSML instance element in the RECOGNITION-COMPLETE event holds a data structure in the JSON format returned by the bot. The Usage Guide contains a complete message exchange in Section 7 and a sequence diagram in Section 8.
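The overall flow can be sketched as a simple loop. In the snippet below, `recognize(grammar)` is a hypothetical stand-in for issuing an MRCP RECOGNIZE request and returning the JSON body of the NLSML instance element from RECOGNITION-COMPLETE; the field names follow the Lex V2 runtime response shape, and the canned responses are illustrative only.

```python
import json

def run_conversation(recognize):
    """Drive the RECOGNIZE loop until the bot closes the dialog.
    `recognize(grammar)` is a hypothetical MRCP client callback returning
    the JSON held by the NLSML instance element; illustrative only."""
    # Initial RECOGNIZE: let the bot elicit the intent with a welcome message
    result = json.loads(recognize(
        "builtin:speech/transcribe?message=Welcome to travel agency;dialog-action=ElicitIntent"))
    # Regular RECOGNIZE requests until the dialog action is Close
    while result.get("sessionState", {}).get("dialogAction", {}).get("type") != "Close":
        result = json.loads(recognize("builtin:speech/transcribe"))
    return result

# Stub standing in for a real MRCP client, returning canned responses
responses = iter([
    '{"sessionState": {"dialogAction": {"type": "ElicitSlot"}}}',
    '{"sessionState": {"dialogAction": {"type": "Close"}, "intent": {"state": "Fulfilled"}}}',
])
final = run_conversation(lambda grammar: next(responses))
print(final["sessionState"]["intent"]["state"])  # -> Fulfilled
```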

Thank you for using UniMRCP
--
Arsen Chaloyan
Author of UniMRCP
http://www.unimrcp.org