freespeech speech and DTMF


Wilmar Pérez

Jun 18, 2020, 2:46:54 PM
to uni...@googlegroups.com
Hi all,

This post can be considered an extension of this one: MRCP does not stop prompt with DTMF.

I am running UniMRCP server 1.7.0 with the Azure SS (1.11.0) and SR (1.13.0) plugins. I am trying to use play_and_detect_speech so that the caller can either say their choice or press a key.

It works fine for speech but I have not been able to make it work for tone detection.

This is what I am sending:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 play_and_detect_speech(custom/SD_select_payment_type.wav detect:unimrcp:unimrcpv2-azure {start-input-timers = false, no-input-timeout = 3000,recognition-timeout = 10000, speech-language = es-MX}builtin:dtmf/digits,builtin:speech/transcribe)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I can see the channel getting ready:

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2020-06-18 14:12:52.607169 [INFO] mrcp_client_session.c:455 (ASR-55) Raise App Response ASR-55 <df2978773b54465b> [2] SUCCESS [0]
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:1900 (ASR-55) RECOGNIZER channel is ready, codec = LPCM, sample rate = 8000
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:1584 (ASR-55) CLOSED ==> READY
2020-06-18 14:12:52.607169 [DEBUG] apt_consumer_task.c:141 () Wait for Messages [MRCP Client]
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:1062 (ASR-55) channel is ready
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] switch_core_media_bug.c:970 Attaching BUG to sofia/internal/40...@nukaklabs.int:5060
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:1465 (ASR-55) param = start-input-timers, val = false
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:1465 (ASR-55) param = no-input-timeout, val = 20000
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:1465 (ASR-55) param = recognition-timeout, val = 40000
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:1465 (ASR-55) param = speech-language, val = es-MX
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:3201 (ASR-55) grammar = builtin:dtmf/digits,builtin:speech/transcribe, name =
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:3218 (ASR-55) Grammar is URI
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:3290 (ASR-55) grammar is text/uri-list
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:2361 (ASR-55) Loading grammar 9a81e766-e8a2-4c18-831c-1427e9e21f2a, data = builtin:dtmf/digits,builtin:speech/transcribe
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:2526 (ASR-55) Disabling all grammars
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:2485 (ASR-55) Enabling grammar 9a81e766-e8a2-4c18-831c-1427e9e21f2a
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:2848 (ASR-55) "recognition-timeout": "10000"
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:2848 (ASR-55) "start-input-timers": "false"
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:2848 (ASR-55) "speech-language": "es-MX"
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:2848 (ASR-55) "no-input-timeout": "3000"

  --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 
Then I can see it detects the DTMF but it ignores it:

    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------    
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:3610 (ASR-55) RECOGNIZE IN PROGRESS
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:1584 (ASR-55) READY ==> PROCESSING
2020-06-18 14:12:52.607169 [DEBUG] apt_consumer_task.c:141 () Wait for Messages [MRCP Client]
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] switch_ivr_play_say.c:1492 Codec Activated L16@8000hz 1 channels 20ms
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.627166 [DEBUG] switch_core_io.c:448 Setting BUG Codec PCMU:0
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:59.587179 [DEBUG] switch_rtp.c:7963 RTP RECV DTMF 1:2080
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:59.587179 [DEBUG] mod_unimrcp.c:3463 (ASR-55) Queued DTMF: 1
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:59.587179 [INFO] switch_channel.c:522 RECV DTMF 1:2080
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:59.587179 [DEBUG] switch_ivr_async.c:4537 (sofia/internal/4003@ @nukaklabs.int:5060) IGNORE NON-TERMINATOR DIGIT 1

      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------   


Can anyone please suggest what I can check or what I am missing in the command or configuration?

Thank you!

-- 
--------------------------------------------------------
Wilmar Pérez 

Arsen Chaloyan

Jun 20, 2020, 3:12:55 PM
to UniMRCP
Hi Wilmar,

The following does not seem to be the intended output

f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:3201 (ASR-55) grammar = builtin:dtmf/digits,builtin:speech/transcribe, name =
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:3218 (ASR-55) Grammar is URI
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:3290 (ASR-55) grammar is text/uri-list
f4b31782-f311-41bf-b9f7-3d915cd46218 2020-06-18 14:12:52.607169 [DEBUG] mod_unimrcp.c:2361 (ASR-55) Loading grammar 9a81e766-e8a2-4c18-831c-1427e9e21f2a, data = builtin:dtmf/digits,builtin:speech/transcribe

The problem must be in the grammar separator. Use '\n' instead of ',' as follows

play_and_detect_speech(custom/SD_select_payment_type.wav detect:unimrcp:unimrcpv2-azure {start-input-timers = false, no-input-timeout = 3000,recognition-timeout = 10000, speech-language = es-MX}builtin:dtmf/digits\nbuiltin:speech/transcribe)
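
As a rough sketch of the suggestion above, the argument string could be assembled in a script so that the grammars are always joined by a newline rather than a comma. The helper name buildDetectArgs and its parameters are hypothetical (they are not part of FreeSWITCH or mod_unimrcp), and the "name = value" layout simply mirrors the command quoted in this thread.

```javascript
// Hypothetical helper, not part of FreeSWITCH or mod_unimrcp.
// Builds the argument string for play_and_detect_speech, joining the
// grammar URIs with a real newline character instead of a comma.
function buildDetectArgs(promptFile, profile, params, grammars) {
  var paramList = Object.keys(params)
    .map(function (name) {
      return name + " = " + params[name];
    })
    .join(", ");
  return (
    promptFile +
    " detect:unimrcp:" + profile +
    " {" + paramList + "}" +
    grammars.join("\n")
  );
}

var args = buildDetectArgs(
  "custom/SD_select_payment_type.wav",
  "unimrcpv2-azure",
  { "start-input-timers": "false", "speech-language": "es-MX" },
  ["builtin:dtmf/digits", "builtin:speech/transcribe"]
);
```

The resulting string would then be passed to play_and_detect_speech from the dialplan or script; whether an escaped \n in a dialplan XML file is expanded the same way is not confirmed here.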


--
Arsen Chaloyan
Author of UniMRCP
http://www.unimrcp.org

Wilmar Pérez

Jun 21, 2020, 12:08:43 PM
to uni...@googlegroups.com
Hi Arsen,

Thanks very much for your reply.   I did as you suggested but I do not think that is the intended output either.  This is what I see:

e43ecdcc-4d7d-4654-b21e-9706f71f21cb EXECUTE [depth=0] sofia/internal/40...@customer.nukaklabs.int:5060 play_and_detect_speech(custom/SD_select_payment_type.wav detect:unimrcp:unimrcpv2-azure {start-input-timers = false, no-input-timeout = 20000,recognition-timeout = 40000, speech-language = es-MX}builtin:dtmf/digits
e43ecdcc-4d7d-4654-b21e-9706f71f21cb builtin:speech/transcribe)

So, the \n is being accepted, but it is rendered literally as a line break. I believe freeswitch treats the whole thing as a line of text, and \n is the special character for a new line. I can still see the same DTMF handling:

e43ecdcc-4d7d-4654-b21e-9706f71f21cb 2020-06-21 11:37:42.547174 [DEBUG] switch_rtp.c:7963 RTP RECV DTMF 1:2080
e43ecdcc-4d7d-4654-b21e-9706f71f21cb 2020-06-21 11:37:42.547174 [DEBUG] mod_unimrcp.c:3463 (ASR-87) Queued DTMF: 1
e43ecdcc-4d7d-4654-b21e-9706f71f21cb 2020-06-21 11:37:42.547174 [INFO] switch_channel.c:522 RECV DTMF 1:2080
e43ecdcc-4d7d-4654-b21e-9706f71f21cb 2020-06-21 11:37:42.567152 [DEBUG] switch_ivr_async.c:4537 (sofia/internal/40...@customer.nukaklabs.int:5060) IGNORE NON-TERMINATOR DIGIT 1


Do you have any pointers on where I can read a bit more about these formats? The freeswitch documentation is, as is often the case, next to nothing. I am attaching the logs in case you have time to have a look. Maybe you can see something there that I am missing.

Thanks again!

Best,

Wilmar.





--
--------------------------------------------------------
Wilmar Pérez 
freeswitch_1.log
unimrcpserver_1.log

Vahagn Kocharyan

Jun 24, 2020, 8:13:54 AM
to uni...@googlegroups.com
Hi Wilmar,
It works fine for me. The "IGNORE NON-TERMINATOR DIGIT 1" output means that you have no terminator digits specified (such as playback-terminator); it does not mean that your DTMF digits are ignored.
Thanks,
Vahagn

Wilmar Pérez

Jun 24, 2020, 12:01:07 PM
to uni...@googlegroups.com
Thank you very much Vahagn for testing it. I will have to check the rest of my code then for I do not see the DTMF digits anywhere. What I am doing is checking  detect_speech_result, as in:

speech = session:getVariable('detect_speech_result');

Do you get the results somehow else?   

Also, do you know how to define the playback-terminator by chance?

Thanks!


--
--------------------------------------------------------
Wilmar Pérez 

Vahagn Kocharyan

Jun 24, 2020, 1:16:02 PM
to uni...@googlegroups.com
I am using JavaScript, and I get the result the same way as in your example.
You must define playback_terminators if you want a digit to terminate playback.
You can do it like this:
session.setVariable("playback_terminators", "value");

Vahagn Kocharyan

Jun 24, 2020, 1:58:16 PM
to uni...@googlegroups.com
You can also get the digits like this:
// Callback invoked by collectInput for each input event.
function mycb(session, type, data, arg) {
        if (type == "dtmf") {
          // Log the pressed digit.
          console_log("ERR", data.digit);
        }
        return (true);
      }
session.collectInput(mycb, "dtmf", 30000);

Vahagn Kocharyan

Jun 24, 2020, 2:04:00 PM
to uni...@googlegroups.com
I am sorry for the mistake; here is the corrected version:
function mycb(session, type, data, arg) {
        if (type == "dtmf") {
          console_log("ERR", "line 9 " + data.digit);
        }
        return (true);
      }

session.collectInput(mycb, "dtmf", 30000);

Wilmar Pérez

Jun 24, 2020, 2:43:18 PM
to uni...@googlegroups.com
Hi Vahagn,

Thank you very much for the example. Sadly, collectInput is not available in Lua! However, this put me on the right path. The trick is to define every digit (plus * and #) as a terminator:

session:setVariable('playback_terminators','0123456789*#');

Then when I press a digit play_and_detect_speech gets interrupted and the contents of detect_speech_result  is:

DIGIT: 1  -->  When I press 1!
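
A minimal sketch of how a script might tell this terminator result apart from a speech result. The helper name parseTerminatorDigit is hypothetical, and the "DIGIT: <key>" shape is assumed from the single value quoted above; other FreeSWITCH versions may format it differently.

```javascript
// Hypothetical helper; the "DIGIT: <key>" shape is taken from the value
// quoted in this thread and is an assumption, not a documented format.
// Returns the pressed key, or null when the result is not a terminator digit.
function parseTerminatorDigit(result) {
  var match = /^DIGIT:\s*([0-9*#])\s*$/.exec(result || "");
  return match ? match[1] : null;
}

var pressed = parseTerminatorDigit("DIGIT: 1");
var spoken = parseTerminatorDigit("tarjeta de credito");
```

Here pressed holds the key and spoken is null, so the script can branch on DTMF versus speech.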

With this I can now properly implement the logic! In fact, it seems to me that passing the grammar as:

'builtin:speech/transcribe'

or

'builtin:dtmf/digits\nbuiltin:speech/transcribe'

have exactly the same result.

You would think this should all be implemented in play_and_detect_speech already! Well, I guess one could go ahead and play with the freeswitch code and recompile.

Thanks very much for your help!

Best,

Wilmar



--
--------------------------------------------------------
Wilmar Pérez 

Vahagn Kocharyan

Jun 24, 2020, 2:52:44 PM
to uni...@googlegroups.com

Arsen Chaloyan

Jun 29, 2020, 4:15:48 PM
to UniMRCP
Hi Wilmar,

Sorry for being late on this. I see that the issue seems to be clarified in the meantime. Not sure whether I could follow the conversation right, but there is one thing I've noticed in the logs which you provided in response to my previous post.

There is no "telephone-event" negotiated via SDP offer/answer. You would need to have "telephone-event" added to the codec list on both sides

in FS mod_unimrcp 

<param name="codecs" value="PCMU PCMA L16/96/8000 telephone-event/101/8000"/>

and UniMRCP server

<codecs own-preference="false">PCMU PCMA L16/96/8000 telephone-event/101/8000</codecs>

Afterwards, reference the two grammars with the \n delimiter as suggested.
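
As a quick sanity check on the two codec lists above, something like the following could confirm that a configured list actually names telephone-event. The function hasTelephoneEvent is a hypothetical helper for illustration only; it inspects the configured string and says nothing about what is finally negotiated via SDP.

```javascript
// Hypothetical helper: checks whether a space-separated codec list
// (as used in the "codecs" settings above) contains telephone-event.
function hasTelephoneEvent(codecList) {
  return codecList
    .trim()
    .split(/\s+/)
    .some(function (codec) {
      // Each entry looks like NAME or NAME/payload-type/sample-rate.
      return codec.split("/")[0].toLowerCase() === "telephone-event";
    });
}

var withEvent = hasTelephoneEvent("PCMU PCMA L16/96/8000 telephone-event/101/8000");
var withoutEvent = hasTelephoneEvent("PCMU PCMA L16/96/8000");
```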



Wilmar Pérez

Jun 29, 2020, 5:25:23 PM
to uni...@googlegroups.com
Hi Arsen,

Thank you very much for the clarification. It is indeed very useful. I modified the configuration as per your suggestion, and now I can see the dtmf and speech captures clearly differentiated:

++
<?xml version="1.0"?>
<result>
  <interpretation grammar="builtin:speech/spYesNo" confidence="0.72">
    <instance>1</instance>
    <input mode="speech"></input>
  </interpretation>
</result>
<?xml version="1.0"?>
<result>
  <interpretation grammar="builtin:dtmf/digits" confidence="1.00">
    <input mode="dtmf">9 1 1 1 2 9</input>
    <instance>911129</instance>
  </interpretation>
</result>
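
A rough sketch of how a script could pull the input mode and instance out of results shaped like the ones above. The function extractInterpretations is hypothetical, and it uses regular expressions rather than a real XML parser, so it only assumes the simple layout shown in this thread.

```javascript
// Hypothetical helper: extracts grammar, input mode and instance from
// each <interpretation> element. Regex-based, so it assumes the simple
// result layout quoted in this thread rather than arbitrary NLSML.
function extractInterpretations(resultText) {
  var interpretations = [];
  var blockPattern = /<interpretation\s+grammar="([^"]*)"[^>]*>([\s\S]*?)<\/interpretation>/g;
  var block;
  while ((block = blockPattern.exec(resultText)) !== null) {
    var body = block[2];
    var mode = /<input\s+mode="([^"]*)"/.exec(body);
    var instance = /<instance>([\s\S]*?)<\/instance>/.exec(body);
    interpretations.push({
      grammar: block[1],
      mode: mode ? mode[1] : null,
      instance: instance ? instance[1].trim() : null
    });
  }
  return interpretations;
}

var sample =
  '<?xml version="1.0"?> <result> <interpretation grammar="builtin:dtmf/digits" confidence="1.00">' +
  ' <input mode="dtmf">9 1 1 1 2 9</input> <instance>911129</instance> </interpretation> </result>';
var parsed = extractInterpretations(sample);
```

With the sample above, parsed[0].mode identifies the capture as dtmf and parsed[0].instance carries the collected digits.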











By the way, this also means that I only need to set playback_terminators  as intended instead of doing the uncomfortable thing I was doing before:

session:setVariable('playback_terminators','#*');

Thanks very much!

Wilmar




--
--------------------------------------------------------
Wilmar Pérez 

Wilmar Pérez

Jun 30, 2020, 12:23:55 PM
to uni...@googlegroups.com
Hi Arsen (or anyone else who wants to chime in),

I am almost done with this but I am missing something.  I am working with freeswitch/LUA.  This is what I am using for my audio capture function:

play_and_detect_speech(select_an_option.wav detect:unimrcp:unimrcpv2-azure {start-input-timers = false, no-input-timeout = 10000,recognition-timeout = 20000, speech-incomplete-timeout = 15000, speech-complete-timeout = 1000, speech-start-timeout = 50, speech-language = es-MX}builtin:dtmf/digits\nbuiltin:speech/transcribe);
caller_input = session:getVariable('detect_speech_result');

It is working well with dtmf and speech. There is only one more thing I would like to do: when using dtmf, I would like the caller to be able to press # to signal that the entry is complete and move on to the next prompt immediately. What happens now is that # does not really do anything. After the user presses #, it waits until the timeouts are reached (in fact, I have to strip the # from the result afterwards). If I add session:setVariable('playback_terminators','#') before play_and_detect_speech, whatever was captured is overridden by the playback terminator. In other words, the detect_speech_result variable does not come back with the digits entered by the customer but with "DIGITS: #" instead.
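
For reference, the cleanup step mentioned above (removing the # from the collected digits) might look roughly like this. The helper stripTermChar is hypothetical, and it assumes the # arrives as part of the digit string, optionally with spaces between digits.

```javascript
// Hypothetical helper: drops whitespace and cuts the digit string at the
// first occurrence of the terminating character, if there is one.
function stripTermChar(digits, termChar) {
  var cleaned = String(digits).replace(/\s+/g, "");
  var end = cleaned.indexOf(termChar);
  return end === -1 ? cleaned : cleaned.slice(0, end);
}

var entry = stripTermChar("9 1 1 1 2 9 #", "#");
```

This only tidies the result after the fact; it does not make the recognizer stop waiting when # is pressed.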

Do you have any tricks in your bag of tricks I can use?

I hope the above is clear.

Thanks!

Wilmar
--
--------------------------------------------------------
Wilmar Pérez 

Arsen Chaloyan

Jul 11, 2020, 2:12:16 PM
to UniMRCP
Hi Wilmar,

If I understand you correctly, you need to set the header DTMF-Term-Char to "#". This is not a trick but the regular procedure, which should work well with all the plugins. You may also set the default value in the plugin configuration:

   <speech-dtmf-input-detector

      dtmf-term-char="#"


sin...@gmail.com

Jul 16, 2020, 9:52:51 AM
to UniMRCP
Do you have an Asterisk example?
I don't know where to set "session:setVariable('playback_terminators','0123456789*#');" in Asterisk. Asterisk only has mrcp.conf, and I use AGI to manage UniMRCP. It works fine for voice detection, but I also have the DTMF problem.

Arsen Chaloyan

Jul 20, 2020, 10:08:16 PM
to UniMRCP
Use the option "dttc" with SynthAndRecog() and/or MRCPRecog() to specify the term digit.
