No audio on test transcription program

David Cunningham

Apr 25, 2024, 11:23:38 PM

to UniMRCP

Hello,

We have installed UniMRCP following the instructions at:

https://docs.unispeech.io/en/ums/asterisk/transcribe-polly

Our installation closely follows those instructions, except that we're using our own test server instead of an AWS VM. We have working audio on normal SIP calls through our test server. However, when we route a call to the sample speech transcription AGI application, agi_transcription.py, the call gets no audio at all.

Would anyone be able to advise us on where to look please? I will follow with some more details. Thank you in advance.

If we stop Asterisk and run the "umc" program to validate the installation, "run tsr1" appears to work:

MRCP/2.0 504 RECOGNITION-COMPLETE 1 COMPLETE
Channel-Identifier: 03c39ae4f30b4306@speechrecog
Completion-Cause: 000 success
Waveform-Uri: <http://localhost/utterances/umstranscribe-03c39ae4f30b4306-1.wav>;size=37760;duration=2360
Content-Type: application/x-nlsml
Content-Length: 212

However, "run bsr1" gives an error:

MRCP/2.0 289 SPEAK 1
Channel-Identifier: 898ae48c90c0487c@speechsynth
Content-Type: application/ssml+xml
Content-Length: 158

<?xml version="1.0"?>
<speak version="1.0" xml:lang="en-US" xmlns="http://www.w3.org/2001/10/synthesis">
<p>
<s>Welcome to Uni MRCP.</s>
</p>
</speak>
2024-04-26 05:43:32:553822 [INFO] Receive MRCPv2 Data 82.166.176.31:40778 <-> 82.166.176.31:1544 [110 bytes]
MRCP/2.0 110 1 401 COMPLETE
Channel-Identifier: 898ae48c90c0487c@speechsynth
Completion-Cause: 004 error
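
For anyone reading traces like the ones above, the MRCPv2 start line and the Completion-Cause header can be pulled apart mechanically. This is only a minimal sketch, not part of UniMRCP: the function names `parse_start_line` and `completion_cause` are made up for illustration, and it assumes the three start-line layouts that appear in this post (request: version, length, method, request-id; response: version, length, request-id, status-code, state; event: version, length, event name, request-id, state).

```python
def parse_start_line(line):
    """Classify an MRCPv2 start line as a request, response, or event."""
    parts = line.split()
    if len(parts) not in (4, 5) or not parts[0].startswith("MRCP/"):
        raise ValueError("not an MRCP start line: %r" % line)
    info = {"version": parts[0], "length": int(parts[1])}
    if parts[2].isdigit():
        # Response: request-id, status-code, request-state
        if len(parts) != 5:
            raise ValueError("incomplete response line: %r" % line)
        info.update(kind="response", request_id=int(parts[2]),
                    status_code=int(parts[3]), state=parts[4])
    elif len(parts) == 4:
        # Request: method-name, request-id
        info.update(kind="request", method=parts[2], request_id=int(parts[3]))
    else:
        # Event: event-name, request-id, request-state
        info.update(kind="event", event=parts[2],
                    request_id=int(parts[3]), state=parts[4])
    return info


def completion_cause(message):
    """Return (code, name) from a Completion-Cause header, or None if absent."""
    for line in message.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "completion-cause":
            code, _, text = value.strip().partition(" ")
            return int(code), text
    return None


# The failing reply from "run bsr1", copied from the trace above.
reply = (
    "MRCP/2.0 110 1 401 COMPLETE\r\n"
    "Channel-Identifier: 898ae48c90c0487c@speechsynth\r\n"
    "Completion-Cause: 004 error\r\n"
    "\r\n"
)
start = parse_start_line(reply.splitlines()[0])
print(start["kind"], start["status_code"], completion_cause(reply))
```

Read this way, the bsr1 reply is a response to request 1 with status code 401 and completion cause 004, whereas the tsr1 output is a RECOGNITION-COMPLETE event with cause 000.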

With Asterisk running we route a call to agi_transcription.py and get no audio. In the Asterisk log we see:

[Apr 26 05:49:21] DEBUG[14498][C-00000000] app_synthandrecog.c: (ASR-0) Add prompt: Welcome to speech transcription application. Please speak.
[Apr 26 05:49:21] NOTICE[14498][C-00000000] app_synthandrecog.c: (ASR-0) Recognizing, Start-Input-Timers: 0
[Apr 26 05:49:21] DEBUG[14498][C-00000000] speech_channel.c: (ASR-0) No-Input-Timeout: 10000
[Apr 26 05:49:21] DEBUG[14498][C-00000000] speech_channel.c: (ASR-0) Speech-Complete-Timeout: 1500
[Apr 26 05:49:21] DEBUG[14498][C-00000000] speech_channel.c: (ASR-0) Speech-Incomplete-Timeout: 15000
[Apr 26 05:49:21] DEBUG[14498][C-00000000] src/apt_task.c: Signal Message to [MRCP Client] [0x15488c004430;4;0]
[Apr 26 05:49:21] DEBUG[14498][C-00000000] audio_queue.c: (TTS-0) Audio queue created
[Apr 26 05:49:21] DEBUG[14498][C-00000000] speech_channel.c: Created speech channel: Name=TTS-0, Type=SYNTHESIZER, Codec=PCMU, Rate=8000 on SIP/enswitch-local-00000000
[Apr 26 05:49:21] NOTICE[14498][C-00000000] src/mrcp_application.c: Create MRCP Handle 0x15488c0299a8 [ums2]
[Apr 26 05:49:21] NOTICE[14498][C-00000000] src/mrcp_client_session.c: Create Channel TTS-0 <new>
[Apr 26 05:49:21] DEBUG[14498][C-00000000] src/apt_task.c: Signal Message to [MRCP Client] [0x15488c007da0;4;0]
[Apr 26 05:49:21] DEBUG[14498][C-00000000] speech_channel.c: (TTS-0) channel is ready
[Apr 26 05:49:21] DEBUG[14498][C-00000000] src/apt_task.c: Signal Message to [MRCP Client] [0x15488c002210;4;0]
[Apr 26 05:49:21] ERROR[14498][C-00000000] app_synthandrecog.c: (TTS-0) Unable to send SPEAK request
[Apr 26 05:49:21] NOTICE[14498][C-00000000] app_synthandrecog.c: SynthAndRecog() exiting status: ERROR on SIP/enswitch-local-00000000
[Apr 26 05:49:21] VERBOSE[14498][C-00000000] res_agi.c: <SIP/enswitch-local-00000000>AGI Tx >> 200 result=0
[Apr 26 05:49:21] VERBOSE[14498][C-00000000] res_agi.c: <SIP/enswitch-local-00000000>AGI Rx << GET VARIABLE "RECOG_STATUS"
[Apr 26 05:49:21] DEBUG[14498][C-00000000] pbx_variables.c: Result of 'RECOG_STATUS' is 'ERROR'

And in /opt/unimrcp/log/unimrcpserver_current.log we see:

2024-04-26 05:49:21:655100 [NOTICE] SIP Call State 0x7f1de80080a8 [ready]
2024-04-26 05:49:21:655105 [INFO] Receive SIP Event [nua_i_active] Status 200 Call active [SIP-Agent-1]
2024-04-26 05:49:21:655601 [NOTICE] Accepted TCP/MRCPv2 Connection 82.166.176.31:1544 <-> 82.166.176.31:38722
2024-04-26 05:49:21:661263 [INFO] Receive MRCPv2 Data 82.166.176.31:1544 <-> 82.166.176.31:38722 [178 bytes]
MRCP/2.0 178 SPEAK 1
Channel-Identifier: 575a6135130a41cd@speechsynth
Content-Type: text/plain
Content-Length: 58

Welcome to speech transcription application. Please speak.
2024-04-26 05:49:21:661297 [INFO] Assign Control Channel <575a6135130a41cd@speechsynth> to Connection 82.166.176.31:1544 <-> 82.166.176.31:38722 [0] -> [1]
2024-04-26 05:49:21:661338 [INFO] Process SPEAK Request <575a6135130a41cd@speechsynth> [1]
2024-04-26 05:49:21:661423 [WARN] Failed to Select Voice <575a6135130a41cd@polly>
2024-04-26 05:49:21:661454 [INFO] Process SPEAK Response <575a6135130a41cd@speechsynth> [1]
2024-04-26 05:49:21:661502 [INFO] Send MRCPv2 Data 82.166.176.31:1544 <-> 82.166.176.31:38722 [110 bytes]
MRCP/2.0 110 1 401 COMPLETE
Channel-Identifier: 575a6135130a41cd@speechsynth
Completion-Cause: 004 error
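
Since the interesting line in a log like this is easy to miss among the INFO entries, a small filter can help. The following is a rough sketch under stated assumptions: the line layout (timestamp, bracketed level, message) is taken from the excerpt above, the helper name `find_problems` is invented for illustration, and `ERROR` as a level name in the server log is a guess, since only NOTICE, INFO and WARN appear in this post.

```python
import re

# Matches lines such as:
#   2024-04-26 05:49:21:661423 [WARN] Failed to Select Voice <...>
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{6}) \[([A-Z]+)\] (.*)$"
)


def find_problems(lines, levels=("WARN", "ERROR")):
    """Yield (timestamp, level, message) for log lines at the given levels."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip())
        # Lines without a timestamp (raw MRCP message dumps) are skipped.
        if match and match.group(2) in levels:
            yield match.groups()


# A few lines copied from the excerpt above.
sample = [
    "2024-04-26 05:49:21:661338 [INFO] Process SPEAK Request <575a6135130a41cd@speechsynth> [1]",
    "2024-04-26 05:49:21:661423 [WARN] Failed to Select Voice <575a6135130a41cd@polly>",
    "MRCP/2.0 110 1 401 COMPLETE",
]
for timestamp, level, message in find_problems(sample):
    print(timestamp, level, message)
```

To run it against the real log, an open file object for /opt/unimrcp/log/unimrcpserver_current.log can be passed in place of `sample`, because the function only iterates over lines.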

Thanks again for any suggestions.

Ken Walker

May 9, 2024, 11:29:24 AM

to UniMRCP

Based on the log you posted, it looks like the voice selection is either missing or not available on your Polly instance:

2024-04-26 05:49:21:661423 [WARN] Failed to Select Voice <575a6135130a41cd@polly>

I would suggest double-checking your voice selection against what is available on the server.
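
At the protocol level, one standard way for a client to ask for a particular voice is the MRCP Voice-Name header on the SPEAK request. The sketch below is illustrative only: `build_speak` is a hypothetical helper, not part of UniMRCP or the Asterisk modules, and `Joanna` is just an example name; which voice names are valid depends on what the server and the Polly account actually offer. It also shows that the message-length field counts the whole message, including its own digits.

```python
def build_speak(channel_id, text, request_id=1, voice_name=None):
    """Build an MRCPv2 SPEAK request as bytes, with an optional Voice-Name."""
    body = text.encode("utf-8")
    headers = ["Channel-Identifier: %s" % channel_id]
    if voice_name:
        # Assumption: the server maps this header to one of its voices.
        headers.append("Voice-Name: %s" % voice_name)
    headers += ["Content-Type: text/plain", "Content-Length: %d" % len(body)]
    tail = ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + body

    # The message-length field includes its own digits, so recompute the
    # start line until the stated length matches the real total.
    length = 0
    while True:
        start = ("MRCP/2.0 %d SPEAK %d\r\n" % (length, request_id)).encode("ascii")
        total = len(start) + len(tail)
        if total == length:
            return start + tail
        length = total


prompt = "Welcome to speech transcription application. Please speak."
plain = build_speak("575a6135130a41cd@speechsynth", prompt)
named = build_speak("575a6135130a41cd@speechsynth", prompt, voice_name="Joanna")
print(plain.split(b"\r\n")[0].decode())
print(named.decode())
```

Without a voice name, the sketch reproduces the 178-byte SPEAK request seen in the server log earlier in the thread, which is a quick check that the framing is right.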

Ken

David Cunningham

May 9, 2024, 4:04:20 PM

to UniMRCP

Thank you, Ken. We found that it was indeed a problem with the voice selection, and fixing that solved the no-audio problem.
