Getting a result from inline grammar in vxml using GSR

Tony Curoso

Nov 1, 2023, 3:03:17 PM
to UniMRCP
Hello, 
We are piloting a GSR integration for our IVR, which uses SSP as the VXML interpreter. Understanding that UniMRCP does not support HTTP grammars, I have moved our grammar inline into the VXML. It gets defined successfully in UniMRCP, but we get a no-match even when the transcribed speech from GSR matches one of the items in the grammar.
With LumenVox we use a <tag> to specify the result that gets returned, but that does not seem to make a difference here, so I removed it. Is there a standard way I should be formatting the GRXML? Should we be referencing a builtin context?

Example of grammar in vxml file

<grammar xml:lang="en-US" root="jurorstatus" mode="voice">
  <rule id="jurorstatus">
    <item>
      <one-of>
        <item>Juror Status</item>
        <item>My Status</item>
        <item>Status</item>
        <item>Check My Status</item>
        <item>Check Juror Status</item>
        <item>Check My Juror Status</item>
      </one-of>
    </item>
  </rule>
</grammar>

Log snippet from unimrcp showing grammar definition successful and no match
CSeq: 10000446
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 713

DEFINE-GRAMMAR 10000348 MRCP/1.0
Speech-Language: en-US
Content-Type: application/grammar+xml
Content-Id: com.intel.ssp.gram-1
Content-Length: 559

<?xml version='1.0'?><grammar xmlns='http://www.w3.org/2001/06/grammar' version='1.0' mode='voice' root='R_wildcard_34' tag-format='semantics/1.0' xml:lang='en-US' xml:base='http://dev-portalsdb.aticorp.com/Streamwrite/Apps/Sandbox/JurySrTestUM/IVR/English/SR/CallFlow_Start.vxml'><rule id="R_wildcard_34" scope="private"><tag>var out="";</tag><ruleref special="GARBAGE"></ruleref><item><one-of><item><ruleref uri="../SR/Grammars/Repeat.grxml"></ruleref><tag>out="Repeat";</tag></item></one-of></item><ruleref special="GARBAGE"></ruleref></rule></grammar>


2023-11-01 10:19:34:045737 [INFO]   Process DEFINE-GRAMMAR Request <ea5d5f287dff44db@speechrecog> [10000348]
2023-11-01 10:19:34:045758 [INFO]   State Transition RECOGNIZED -> IDLE <ea5d5f287dff44db@speechrecog>
2023-11-01 10:19:34:045867 [INFO]   Process DEFINE-GRAMMAR Response <ea5d5f287dff44db@speechrecog> [10000348]
2023-11-01 10:19:34:045902 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [179 bytes]
RTSP/1.0 200 OK
CSeq: 10000446
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 65

MRCP/1.0 10000348 200 COMPLETE
Completion-Cause: 000 success


2023-11-01 10:19:34:047670 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [344 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000447
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 175

DEFINE-GRAMMAR 10000349 MRCP/1.0
Speech-Language: en-US
Content-Type: text/uri-list
Content-Id: com.intel.ssp.gram-2
Content-Length: 32

builtin:dtmf/digits?length=1


2023-11-01 10:19:34:047700 [INFO]   Process DEFINE-GRAMMAR Request <ea5d5f287dff44db@speechrecog> [10000349]
2023-11-01 10:19:34:047740 [INFO]   Process DEFINE-GRAMMAR Response <ea5d5f287dff44db@speechrecog> [10000349]
2023-11-01 10:19:34:047759 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [179 bytes]
RTSP/1.0 200 OK
CSeq: 10000447
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 65

MRCP/1.0 10000349 200 COMPLETE
Completion-Cause: 000 success


2023-11-01 10:19:34:050287 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [1023 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000448
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 997

DEFINE-GRAMMAR 10000350 MRCP/1.0
Speech-Language: en-US
Content-Type: application/grammar+xml
Content-Id: com.intel.ssp.gram-3
Content-Length: 843

<?xml version='1.0'?><grammar xmlns='http://www.w3.org/2001/06/grammar' version='1.0' mode='voice' root='R_wildcard_35' tag-format='semantics/1.0' xml:lang='en-US' xml:base='http://dev-portalsdb.aticorp.com/Streamwrite/Apps/Sandbox/JurySrTestUM/IVR/English/SR/CallFlow_Start.vxml'><rule id="R_wildcard_35" scope="private"><tag>var out="";</tag><ruleref special="GARBAGE"></ruleref><item><one-of><item><ruleref uri="../SR/Grammars/emergencia.grxml"></ruleref><tag>out="emergencia";</tag></item><item><ruleref uri="../SR/Grammars/citacion.grxml"></ruleref><tag>out="citacion";</tag></item><item><ruleref uri="../SR/Grammars/repitir.grxml"></ruleref><tag>out="repitir";</tag></item><item><ruleref uri=".
2023-11-01 10:19:34:050299 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [143 bytes]
./SR/Grammars/Repeat.grxml"></ruleref><tag>out="Repeat";</tag></item></one-of></item><ruleref special="GARBAGE"></ruleref></rule></grammar>


2023-11-01 10:19:34:050352 [INFO]   Process DEFINE-GRAMMAR Request <ea5d5f287dff44db@speechrecog> [10000350]
2023-11-01 10:19:34:050422 [INFO]   Process DEFINE-GRAMMAR Response <ea5d5f287dff44db@speechrecog> [10000350]
2023-11-01 10:19:34:050444 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [179 bytes]
RTSP/1.0 200 OK
CSeq: 10000448
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 65

MRCP/1.0 10000350 200 COMPLETE
Completion-Cause: 000 success


2023-11-01 10:19:34:051897 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [344 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000449
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 175

DEFINE-GRAMMAR 10000351 MRCP/1.0
Speech-Language: en-US
Content-Type: text/uri-list
Content-Id: com.intel.ssp.gram-4
Content-Length: 32

builtin:dtmf/digits?length=1


2023-11-01 10:19:34:051927 [INFO]   Process DEFINE-GRAMMAR Request <ea5d5f287dff44db@speechrecog> [10000351]
2023-11-01 10:19:34:051962 [INFO]   Process DEFINE-GRAMMAR Response <ea5d5f287dff44db@speechrecog> [10000351]
2023-11-01 10:19:34:051982 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [179 bytes]
RTSP/1.0 200 OK
CSeq: 10000449
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 65

MRCP/1.0 10000351 200 COMPLETE
Completion-Cause: 000 success


2023-11-01 10:19:34:054291 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [824 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000450
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 655

DEFINE-GRAMMAR 10000352 MRCP/1.0
Speech-Language: en-US
Content-Type: application/grammar+xml
Content-Id: com.intel.ssp.gram-5
Content-Length: 501

<?xml version='1.0'?><grammar xmlns='http://www.w3.org/2001/06/grammar' version='1.0' mode='voice' root='jurorstatus' xml:lang='en-US' xml:base='http://dev-portalsdb.aticorp.com/Streamwrite/Apps/Sandbox/JurySrTestUM/IVR/English/SR/CallFlow_Start.vxml'><rule id="jurorstatus" scope="private"><item><one-of><item>Juror Status</item><item>My Status</item><item>Status</item><item>Check My Status</item><item>Check Juror Status</item><item>Check My Juror Status</item></one-of></item></rule></grammar>


2023-11-01 10:19:34:054318 [INFO]   Process DEFINE-GRAMMAR Request <ea5d5f287dff44db@speechrecog> [10000352]
2023-11-01 10:19:34:054369 [INFO]   Process DEFINE-GRAMMAR Response <ea5d5f287dff44db@speechrecog> [10000352]
2023-11-01 10:19:34:054387 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [179 bytes]
RTSP/1.0 200 OK
CSeq: 10000450
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 65

MRCP/1.0 10000352 200 COMPLETE
Completion-Cause: 000 success



2023-11-01 10:19:34:055817 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [344 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000451
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 175

DEFINE-GRAMMAR 10000353 MRCP/1.0
Speech-Language: en-US
Content-Type: text/uri-list
Content-Id: com.intel.ssp.gram-6
Content-Length: 32

builtin:dtmf/digits?length=1


2023-11-01 10:19:34:055839 [INFO]   Process DEFINE-GRAMMAR Request <ea5d5f287dff44db@speechrecog> [10000353]
2023-11-01 10:19:34:055868 [INFO]   Process DEFINE-GRAMMAR Response <ea5d5f287dff44db@speechrecog> [10000353]
2023-11-01 10:19:34:055887 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [179 bytes]
RTSP/1.0 200 OK
CSeq: 10000451
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 65

MRCP/1.0 10000353 200 COMPLETE
Completion-Cause: 000 success


2023-11-01 10:19:34:060687 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [533 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000452
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 364

RECOGNIZE 10000354 MRCP/1.0
N-Best-List-Length: 1
DTMF-Interdigit-Timeout: 10
Confidence-Threshold: 20
Sensitivity-Level: 50
Speed-Vs-Accuracy: 50
DTMF-Term-Char: #
No-Input-Timeout: 4000
DTMF-Term-Timeout: 999999
Recognizer-Start-Timers: false
Content-Type: text/uri-list
Content-Length: 58

session:com.intel.ssp.gram-6
session:com.intel.ssp.gram-5
2023-11-01 10:19:34:060736 [INFO]   Process RECOGNIZE Request <ea5d5f287dff44db@speechrecog> [10000354]
2023-11-01 10:19:34:060761 [INFO]   Init Speech Detector: frame-duration=10 ms, frame-size=160, max-frame-count=1550, output-frame-count=20, vad-mode=2, noinput-timeout=4000 ms, input-timeout=30000 ms, start-timeout=50 ms, complete-timeout=1000 ms, incomplete-timeout=15000 ms, leading-silence=300 ms, trailing-silence=300 ms, interim-results=1, start-of-input=external <ea5d5f287dff44db>
2023-11-01 10:19:34:060795 [INFO]   Init DTMF Detector: interdigit-timeout=10 ms, term-timeout=999999 ms, term-char=#, length=1, min-length=0, max-length=0 <ea5d5f287dff44db>
2023-11-01 10:19:34:060826 [INFO]   gRPC Streaming Recognize <ea5d5f287dff44db@gsr>
2023-11-01 10:19:34:061432 [INFO]   Process RECOGNIZE Response <ea5d5f287dff44db@speechrecog> [10000354]
2023-11-01 10:19:34:061441 [INFO]   State Transition IDLE -> RECOGNIZING <ea5d5f287dff44db@speechrecog>
2023-11-01 10:19:34:061458 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [151 bytes]
RTSP/1.0 200 OK
CSeq: 10000452
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 37

MRCP/1.0 10000354 200 IN-PROGRESS


2023-11-01 10:19:41:829613 [INFO]   Speech Detector State Transition NO-INPUT -> IN-PROGRESS [7770 ms] <ea5d5f287dff44db>
2023-11-01 10:19:41:829644 [INFO]   Start Input Timer [30000 ms] <ea5d5f287dff44db>
2023-11-01 10:19:41:829923 [INFO]   Send Config <ea5d5f287dff44db@gsr>
{"streamingConfig":{"config":{"encoding":"LINEAR16","sampleRateHertz":8000,"languageCode":"en-US","maxAlternatives":1,"enableSpokenPunctuation":false,"enableSpokenEmojis":false},"singleUtterance":true,"interimResults":true}}
2023-11-01 10:19:42:597836 [INFO]   Received Response: status [1] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status"}],"stability":0.01,"resultEndTime":"0.840s","languageCode":"en-us"}],"speechEventTime":"0s","requestId":"6755126732038329247"}
2023-11-01 10:19:42:597875 [INFO]   Result[0]: stability [0.01] final [0] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:42:597882 [INFO]   Alternative[0]: confidence [0.00] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:42:597886 [INFO]   Set Result Flag [1000 ms] <ea5d5f287dff44db>
2023-11-01 10:19:42:597894 [INFO]   Process START-OF-SPEECH Event <ea5d5f287dff44db@speechrecog> [10000354]
2023-11-01 10:19:42:597971 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [217 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000452
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 49

START-OF-SPEECH 10000354 IN-PROGRESS MRCP/1.0


2023-11-01 10:19:43:124599 [INFO]   Received Response: status [1] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status"}],"stability":0.9,"resultEndTime":"1.410s","languageCode":"en-us"}],"speechEventTime":"0s","requestId":"6755126732038329247"}
2023-11-01 10:19:43:124646 [INFO]   Result[0]: stability [0.90] final [0] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:43:124657 [INFO]   Alternative[0]: confidence [0.00] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:43:320870 [INFO]   Received Response: status [1] type [END_OF_SINGLE_UTTERANCE] result-count [0] <ea5d5f287dff44db@gsr>
{"speechEventType":"END_OF_SINGLE_UTTERANCE","speechEventTime":"1.370s","requestId":"6755126732038329247"}
2023-11-01 10:19:43:328642 [INFO]   Received Response: status [1] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status","confidence":0.987629056}],"isFinal":true,"resultEndTime":"1.560s","languageCode":"en-us"}],"totalBilledTime":"2s","speechEventTime":"0s","requestId":"6755126732038329247"}
2023-11-01 10:19:43:328660 [INFO]   Result[0]: stability [0.00] final [1] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:43:328664 [INFO]   Alternative[0]: confidence [0.99] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:43:329656 [INFO]   Input Complete [stopped] size=28800 bytes, dur=1850 ms <ea5d5f287dff44db@gsr>
2023-11-01 10:19:43:449094 [INFO]   Received Response: status [0] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status","confidence":0.987629056}],"isFinal":true,"resultEndTime":"1.560s","languageCode":"en-us"}],"totalBilledTime":"2s","speechEventTime":"0s","requestId":"6755126732038329247"}
2023-11-01 10:19:43:449122 [INFO]   Result[0]: stability [0.00] final [1] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:43:449126 [INFO]   Alternative[0]: confidence [0.99] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:43:449223 [INFO]   Process RECOGNITION-COMPLETE Event <ea5d5f287dff44db@speechrecog> [10000354]
2023-11-01 10:19:43:449228 [INFO]   State Transition RECOGNIZING -> RECOGNIZED <ea5d5f287dff44db@speechrecog>
2023-11-01 10:19:43:449286 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [508 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000452
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 339

RECOGNITION-COMPLETE 10000354 COMPLETE MRCP/1.0
Completion-Cause: 001 no-match
Content-Type: application/x-nlsml
Content-Length: 200

<?xml version="1.0"?>
<result>
  <interpretation grammar="session:com.intel.ssp.gram-5" confidence="98">
    <instance></instance>
    <input mode="speech">status</input>
  </interpretation>
</result>

2023-11-01 10:19:43:744890 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [533 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000453
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 364

RECOGNIZE 10000355 MRCP/1.0
N-Best-List-Length: 1
DTMF-Interdigit-Timeout: 10
Confidence-Threshold: 20
Sensitivity-Level: 50
Speed-Vs-Accuracy: 50
DTMF-Term-Char: #
No-Input-Timeout: 4000
DTMF-Term-Timeout: 999999
Recognizer-Start-Timers: false
Content-Type: text/uri-list
Content-Length: 58

session:com.intel.ssp.gram-6
session:com.intel.ssp.gram-5
2023-11-01 10:19:43:745004 [INFO]   Process RECOGNIZE Request <ea5d5f287dff44db@speechrecog> [10000355]
2023-11-01 10:19:43:745093 [INFO]   Init Speech Detector: frame-duration=10 ms, frame-size=160, max-frame-count=1550, output-frame-count=20, vad-mode=2, noinput-timeout=4000 ms, input-timeout=30000 ms, start-timeout=50 ms, complete-timeout=1000 ms, incomplete-timeout=15000 ms, leading-silence=300 ms, trailing-silence=300 ms, interim-results=1, start-of-input=external <ea5d5f287dff44db>
2023-11-01 10:19:43:745162 [INFO]   Init DTMF Detector: interdigit-timeout=10 ms, term-timeout=999999 ms, term-char=#, length=1, min-length=0, max-length=0 <ea5d5f287dff44db>
2023-11-01 10:19:43:745209 [INFO]   gRPC Streaming Recognize <ea5d5f287dff44db@gsr>
2023-11-01 10:19:43:745368 [INFO]   Process RECOGNIZE Response <ea5d5f287dff44db@speechrecog> [10000355]
2023-11-01 10:19:43:745381 [INFO]   State Transition RECOGNIZED -> RECOGNIZING <ea5d5f287dff44db@speechrecog>
2023-11-01 10:19:43:745449 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [151 bytes]
RTSP/1.0 200 OK
CSeq: 10000453
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 37

MRCP/1.0 10000355 200 IN-PROGRESS


2023-11-01 10:19:49:269736 [INFO]   Speech Detector State Transition NO-INPUT -> IN-PROGRESS [5530 ms] <ea5d5f287dff44db>
2023-11-01 10:19:49:269835 [INFO]   Start Input Timer [30000 ms] <ea5d5f287dff44db>
2023-11-01 10:19:49:270038 [INFO]   Send Config <ea5d5f287dff44db@gsr>
{"streamingConfig":{"config":{"encoding":"LINEAR16","sampleRateHertz":8000,"languageCode":"en-US","maxAlternatives":1,"enableSpokenPunctuation":false,"enableSpokenEmojis":false},"singleUtterance":true,"interimResults":true}}
2023-11-01 10:19:50:245832 [INFO]   Received Response: status [1] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status"}],"stability":0.01,"resultEndTime":"1.060s","languageCode":"en-us"}],"speechEventTime":"0s","requestId":"14985036298983289"}
2023-11-01 10:19:50:245879 [INFO]   Result[0]: stability [0.01] final [0] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:50:245891 [INFO]   Alternative[0]: confidence [0.00] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:50:245897 [INFO]   Set Result Flag [1000 ms] <ea5d5f287dff44db>
2023-11-01 10:19:50:245896 [INFO]   Process START-OF-SPEECH Event <ea5d5f287dff44db@speechrecog> [10000355]
2023-11-01 10:19:50:245969 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [217 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000453
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 49

START-OF-SPEECH 10000355 IN-PROGRESS MRCP/1.0


2023-11-01 10:19:50:766597 [INFO]   Received Response: status [1] type [END_OF_SINGLE_UTTERANCE] result-count [0] <ea5d5f287dff44db@gsr>
{"speechEventType":"END_OF_SINGLE_UTTERANCE","speechEventTime":"1.360s","requestId":"14985036298983289"}
2023-11-01 10:19:50:769711 [INFO]   Input Complete [stopped] size=28800 bytes, dur=1850 ms <ea5d5f287dff44db@gsr>
2023-11-01 10:19:50:774983 [INFO]   Received Response: status [1] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status","confidence":0.227394596}],"isFinal":true,"resultEndTime":"1.540s","languageCode":"en-us"}],"totalBilledTime":"2s","speechEventTime":"0s","requestId":"14985036298983289"}
2023-11-01 10:19:50:774997 [INFO]   Result[0]: stability [0.00] final [1] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:50:775000 [INFO]   Alternative[0]: confidence [0.23] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:50:844192 [INFO]   Received Response: status [0] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status","confidence":0.227394596}],"isFinal":true,"resultEndTime":"1.540s","languageCode":"en-us"}],"totalBilledTime":"2s","speechEventTime":"0s","requestId":"14985036298983289"}
2023-11-01 10:19:50:844229 [INFO]   Result[0]: stability [0.00] final [1] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:50:844233 [INFO]   Alternative[0]: confidence [0.23] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:50:844335 [INFO]   Process RECOGNITION-COMPLETE Event <ea5d5f287dff44db@speechrecog> [10000355]
2023-11-01 10:19:50:844355 [INFO]   State Transition RECOGNIZING -> RECOGNIZED <ea5d5f287dff44db@speechrecog>
2023-11-01 10:19:50:844394 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [508 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000453
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 339

RECOGNITION-COMPLETE 10000355 COMPLETE MRCP/1.0
Completion-Cause: 001 no-match
Content-Type: application/x-nlsml
Content-Length: 200

<?xml version="1.0"?>
<result>
  <interpretation grammar="session:com.intel.ssp.gram-5" confidence="22">
    <instance></instance>
    <input mode="speech">status</input>
  </interpretation>
</result>
2023-11-01 10:19:51:161103 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [533 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000454
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 364


RECOGNIZE 10000356 MRCP/1.0
N-Best-List-Length: 1
DTMF-Interdigit-Timeout: 10
Confidence-Threshold: 20
Sensitivity-Level: 50
Speed-Vs-Accuracy: 50
DTMF-Term-Char: #
No-Input-Timeout: 4000
DTMF-Term-Timeout: 999999
Recognizer-Start-Timers: false
Content-Type: text/uri-list
Content-Length: 58

session:com.intel.ssp.gram-6
session:com.intel.ssp.gram-5
2023-11-01 10:19:51:161229 [INFO]   Process RECOGNIZE Request <ea5d5f287dff44db@speechrecog> [10000356]
2023-11-01 10:19:51:161289 [INFO]   Init Speech Detector: frame-duration=10 ms, frame-size=160, max-frame-count=1550, output-frame-count=20, vad-mode=2, noinput-timeout=4000 ms, input-timeout=30000 ms, start-timeout=50 ms, complete-timeout=1000 ms, incomplete-timeout=15000 ms, leading-silence=300 ms, trailing-silence=300 ms, interim-results=1, start-of-input=external <ea5d5f287dff44db>
2023-11-01 10:19:51:161375 [INFO]   Init DTMF Detector: interdigit-timeout=10 ms, term-timeout=999999 ms, term-char=#, length=1, min-length=0, max-length=0 <ea5d5f287dff44db>
2023-11-01 10:19:51:161421 [INFO]   gRPC Streaming Recognize <ea5d5f287dff44db@gsr>
2023-11-01 10:19:51:161571 [INFO]   Process RECOGNIZE Response <ea5d5f287dff44db@speechrecog> [10000356]
2023-11-01 10:19:51:161576 [INFO]   State Transition RECOGNIZED -> RECOGNIZING <ea5d5f287dff44db@speechrecog>
2023-11-01 10:19:51:161596 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [151 bytes]
RTSP/1.0 200 OK
CSeq: 10000454
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 37

MRCP/1.0 10000356 200 IN-PROGRESS


2023-11-01 10:19:56:759620 [INFO]   Speech Detector State Transition NO-INPUT -> IN-PROGRESS [5600 ms] <ea5d5f287dff44db>
2023-11-01 10:19:56:759655 [INFO]   Start Input Timer [30000 ms] <ea5d5f287dff44db>
2023-11-01 10:19:56:759807 [INFO]   Send Config <ea5d5f287dff44db@gsr>
{"streamingConfig":{"config":{"encoding":"LINEAR16","sampleRateHertz":8000,"languageCode":"en-US","maxAlternatives":1,"enableSpokenPunctuation":false,"enableSpokenEmojis":false},"singleUtterance":true,"interimResults":true}}
2023-11-01 10:19:57:589473 [INFO]   Received Response: status [1] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status"}],"stability":0.01,"resultEndTime":"0.860s","languageCode":"en-us"}],"speechEventTime":"0s","requestId":"6681632984811973316"}
2023-11-01 10:19:57:589553 [INFO]   Result[0]: stability [0.01] final [0] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:57:589569 [INFO]   Alternative[0]: confidence [0.00] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:57:589578 [INFO]   Process START-OF-SPEECH Event <ea5d5f287dff44db@speechrecog> [10000356]
2023-11-01 10:19:57:589583 [INFO]   Set Result Flag [1000 ms] <ea5d5f287dff44db>
2023-11-01 10:19:57:589695 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [217 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000454
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 49

START-OF-SPEECH 10000356 IN-PROGRESS MRCP/1.0


2023-11-01 10:19:58:063360 [INFO]   Received Response: status [1] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status"}],"stability":0.9,"resultEndTime":"1.430s","languageCode":"en-us"}],"speechEventTime":"0s","requestId":"6681632984811973316"}
2023-11-01 10:19:58:063415 [INFO]   Result[0]: stability [0.90] final [0] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:58:063426 [INFO]   Alternative[0]: confidence [0.00] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:58:256405 [INFO]   Received Response: status [1] type [END_OF_SINGLE_UTTERANCE] result-count [0] <ea5d5f287dff44db@gsr>
{"speechEventType":"END_OF_SINGLE_UTTERANCE","speechEventTime":"1.360s","requestId":"6681632984811973316"}
2023-11-01 10:19:58:259778 [INFO]   Input Complete [stopped] size=28800 bytes, dur=1850 ms <ea5d5f287dff44db@gsr>
2023-11-01 10:19:58:262375 [INFO]   Received Response: status [1] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status","confidence":0.987629056}],"isFinal":true,"resultEndTime":"1.550s","languageCode":"en-us"}],"totalBilledTime":"2s","speechEventTime":"0s","requestId":"6681632984811973316"}
2023-11-01 10:19:58:262403 [INFO]   Result[0]: stability [0.00] final [1] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:58:262412 [INFO]   Alternative[0]: confidence [0.99] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:58:338071 [INFO]   Received Response: status [0] type [SPEECH_EVENT_UNSPECIFIED] result-count [1] <ea5d5f287dff44db@gsr>
{"results":[{"alternatives":[{"transcript":"status","confidence":0.987629056}],"isFinal":true,"resultEndTime":"1.550s","languageCode":"en-us"}],"totalBilledTime":"2s","speechEventTime":"0s","requestId":"6681632984811973316"}
2023-11-01 10:19:58:338117 [INFO]   Result[0]: stability [0.00] final [1] language [en-us] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:58:338127 [INFO]   Alternative[0]: confidence [0.99] transcript [status] <ea5d5f287dff44db@gsr>
2023-11-01 10:19:58:338269 [INFO]   Process RECOGNITION-COMPLETE Event <ea5d5f287dff44db@speechrecog> [10000356]
2023-11-01 10:19:58:338293 [INFO]   State Transition RECOGNIZING -> RECOGNIZED <ea5d5f287dff44db@speechrecog>
2023-11-01 10:19:58:338358 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [508 bytes]
ANNOUNCE rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000454
Session: ea5d5f287dff44db
Content-Type: application/mrcp
Content-Length: 339

RECOGNITION-COMPLETE 10000356 COMPLETE MRCP/1.0
Completion-Cause: 001 no-match
Content-Type: application/x-nlsml
Content-Length: 200

<?xml version="1.0"?>
<result>
  <interpretation grammar="session:com.intel.ssp.gram-5" confidence="98">
    <instance></instance>
    <input mode="speech">status</input>
  </interpretation>
</result>

2023-11-01 10:20:02:111160 [INFO]   Receive RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [116 bytes]
TEARDOWN rtsp://192.168.168.187:01554/media/speechrecognizer RTSP/1.0
CSeq: 10000455
Session: ea5d5f287dff44db


2023-11-01 10:20:02:111252 [INFO]   Deactivate Session 0x7f7d74004f28 <ea5d5f287dff44db>
2023-11-01 10:20:02:111290 [INFO]   Terminate Session 0x7f7d74004f28 <ea5d5f287dff44db>
2023-11-01 10:20:02:111332 [INFO]   Close <ea5d5f287dff44db@gsr>
2023-11-01 10:20:02:111371 [NOTICE] GSR Usage: 0/1/2
2023-11-01 10:20:02:111389 [NOTICE] Usage [umsgsr] min [0] cur [0] max [1]
2023-11-01 10:20:02:119607 [INFO]   Close RTP Receiver 192.168.168.187:5068 <- 192.168.168.247:53000 [r:1182 l:0 j:957 p:550 d:0 i:0]
2023-11-01 10:20:02:119631 [INFO]   Remove RTP Session 192.168.168.187:5068
2023-11-01 10:20:02:120347 [NOTICE] Remove Session <ea5d5f287dff44db>
2023-11-01 10:20:02:120378 [INFO]   Session Terminated 0x7f7d74004f28 <ea5d5f287dff44db>
2023-11-01 10:20:02:120389 [NOTICE] Destroy Session <ea5d5f287dff44db>
2023-11-01 10:20:02:120558 [INFO]   Send RTSP Data 192.168.168.187:1554 <-> 192.168.168.247:64038 [62 bytes]
RTSP/1.0 200 OK
CSeq: 10000455
Session: ea5d5f287dff44db


2023-11-01 10:20:02:120587 [INFO]   Remove RTSP Session <ea5d5f287dff44db>
2023-11-01 10:20:02:120590 [NOTICE] Destroy RTSP Session <ea5d5f287dff44db>
2023-11-01 10:20:02:121045 [INFO]   RTSP Peer Disconnected 192.168.168.187:1554 <-> 192.168.168.247:64038
2023-11-01 10:20:02:121049 [INFO]   Close RTSP Connection 192.168.168.187:1554 <-> 192.168.168.247:64038
2023-11-01 10:20:02:121093 [NOTICE] Destroy RTSP Connection 192.168.168.187:1554 <-> 192.168.168.247:64038
Thanks for any feedback you have. 
Tony Curoso
StreamWrite

Michael Levy

Nov 1, 2023, 3:33:17 PM
to uni...@googlegroups.com
You might want to share a snippet of your VXML.

I don't use the grammar processing of the UniMRCP plug-ins, so I may be off here. In older posts, Arsen has said he implements a subset of SRGS. I believe you have to work within the limits of that implementation. An example of his from one of those posts is shown below.

Your recognizer result is:

<?xml version="1.0"?>
<result>
  <interpretation grammar="session:com.intel.ssp.gram-5" confidence="98">
    <instance></instance>
    <input mode="speech">status</input>
  </interpretation>
</result>

Notice, the transcription (input) is correct ("status"), but there is no semantic result (instance is empty).

Like I said, I don't use this grammar processor, but I think you need to specify semantic tags in your grammar. Follow the example below and add <tag> elements to identify the semantic result for each match.
Or, in your VXML try accessing the name$.utterance shadow variable to get the raw transcription.





SRGS:
<grammar mode="voice" root="boolean" version="1.0" xml:lang="en-US" xmlns="http://www.w3.org/2001/06/grammar">
    <meta name="scope" content="strict"/>
    <rule id="boolean">
        <one-of>
            <item>yes<tag>true</tag></item>
            <item>sure<tag>true</tag></item>
            <item>correct<tag>true</tag></item>
            <item>no<tag>false</tag></item>
            <item>not sure<tag>false</tag></item>
            <item>incorrect<tag>false</tag></item>
        </one-of>
    </rule>
</grammar>
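
As for the shadow-variable suggestion above, here is a minimal VoiceXML sketch of how the raw transcription could be read. The field name "menu", the prompt text, and the tag values are made up for illustration and are not taken from the original application. Note that <filled> only runs once the field has been filled; on a no-match, the VoiceXML 2.0 spec says application.lastresult$ is still set, but its contents are platform dependent, so treat the <nomatch> part as something to verify on SSP.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="main">
    <!-- "menu" is a hypothetical field name; substitute your own -->
    <field name="menu">
      <prompt>What would you like to do?</prompt>

      <!-- Shortened inline grammar; the tag values are illustrative -->
      <grammar xml:lang="en-US" root="jurorstatus" mode="voice">
        <rule id="jurorstatus">
          <one-of>
            <item>juror status<tag>status</tag></item>
            <item>status<tag>status</tag></item>
          </one-of>
        </rule>
      </grammar>

      <filled>
        <!-- menu holds the semantic result; menu$.utterance holds the
             raw recognized words; menu$.confidence is in the range 0.0 to 1.0 -->
        <log>Result: <value expr="menu"/>,
             heard: <value expr="menu$.utterance"/>,
             confidence: <value expr="menu$.confidence"/></log>
      </filled>

      <nomatch>
        <!-- Set after any recognition, including a no-match, but the
             values stored here are platform dependent -->
        <log>No match, heard: <value expr="application.lastresult$.utterance"/></log>
        <reprompt/>
      </nomatch>
    </field>
  </form>
</vxml>
```

Logging both the field value and the utterance makes it easy to tell a recognition problem (wrong words) from a grammar problem (right words, empty semantic result), which is the situation in the log above.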





Michael Levy

Nov 1, 2023, 3:42:45 PM
to uni...@googlegroups.com
Also, review your VXML. You are defining six grammars and only using two. Do you have a VXML application page being pulled in that is defining the other grammars? It isn't bad to have them defined, but some of the grammars defined are not UniMRCP plug-in compatible. They use:
  1. special rules like GARBAGE
  2. the tag-format='semantics/1.0' attribute
  3. references to external grammar files like: <ruleref uri="../SR/Grammars/Repeat.grxml"/>
  4. ECMAScript SISR variables for output: <tag>out="repitir";</tag>
This is probably not what you intended.

Tony Curoso

Nov 2, 2023, 11:15:41 AM
to UniMRCP
Yeah, our VXML files are weird, and this one is halfway between our old SR solution and GSR. I'll clean it up before I submit any more. One thing I noticed is that the meta element is used in all the sample grammars I have seen. We have not used it in our previous grammars, so I am a little unclear on it. Is it necessary to use meta for SR to work with UniMRCP/GSR?
Thanks. 

Michael Levy

Nov 2, 2023, 11:26:15 AM
to uni...@googlegroups.com
Look at the examples in https://groups.google.com/g/unimrcp/c/13vVrW9cYTs/m/VRgSoHqDCgAJ
Some of the Unimrcp plug-ins use data in the <meta> tags to give hints to the speech recognizer or to control aspects of the recognition that certain platforms don't allow you to control through MRCP.

Tony Curoso

Nov 2, 2023, 2:25:43 PM
to UniMRCP
Thanks Michael. The link was very helpful. I was able to get the grammar working by using <meta name="scope" content="hint"/> and using a tag on each item. 
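
For anyone landing here later, a rough sketch of the shape that worked, based on the original grammar with the two changes described above. The tag value "status" is only illustrative; use whatever result your call flow expects.

```xml
<grammar xml:lang="en-US" root="jurorstatus" mode="voice">
  <!-- Scope value taken from the examples in the linked thread -->
  <meta name="scope" content="hint"/>
  <rule id="jurorstatus">
    <one-of>
      <!-- Each item carries a plain literal tag as its semantic result;
           "status" is an assumed value, not the one from the real app -->
      <item>Juror Status<tag>status</tag></item>
      <item>My Status<tag>status</tag></item>
      <item>Status<tag>status</tag></item>
      <item>Check My Status<tag>status</tag></item>
      <item>Check Juror Status<tag>status</tag></item>
      <item>Check My Juror Status<tag>status</tag></item>
    </one-of>
  </rule>
</grammar>
```

The tags are plain literals, as in Michael's boolean example, rather than ECMAScript assignments with tag-format='semantics/1.0'.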
Question while I have you: we build our vxml to use http grammars, which I understand are not supported.  Do you know if there are any plans to support them in the future?
Thanks. 

Michael Levy

Nov 2, 2023, 2:39:34 PM
to uni...@googlegroups.com
> we build our vxml to use http grammars, which I understand are not supported.  Do you know if there are any plans to support them in the future?

I don't think there are. My info comes from:
> Originally, there was no intention to support SRGS with new generation speech recognition engines, as transcription results are commonly passed to various NLP APIs for further processing. However, since the issue with SRGS support came up many times on different occasions, basic support has been in place for all the SR plugins for quite some time. They may not have reached parity yet.
> There are various use cases: some need SRGS to have an actual grammar enforced, the others pass phrases used as a hint for speech transcription, not to mention, that meta data in SRGS XML also allows to pass additional vendor-specific parameters, and this turns out to be the most commonly supported option.

I think the assumption is that people who are using modern speech recognizers are also using modern APIs for natural language and semantic processing. I don't think we should expect more SRGS features added to Unimrcp plug-ins.

Tony Curoso

Nov 6, 2023, 4:31:39 PM
to UniMRCP
Thanks for the insight, Michael. Much appreciated. 