TTS AMR-WB results in noise only streams


Gaurav Gangwar

Dec 2, 2022, 7:27:03 AM
to UniMRCP
Hi,

We recently updated to UniMRCP 1.8.0 together with our own plugin, which uses Azure for TTS (16-bit mono PCM).
When the codec negotiates to PCMA/PCMU, it works fine and gives stable results.

Now we want to support the AMR-WB codec.
We configured and built UniMRCP as suggested:
./configure --enable-amr-codec=yes
make && make install


We set the codec capability from LPCM to AMR-WB as below-
      mpf_codec_capabilities_add(&capabilities->codecs, MPF_SAMPLE_RATE_8000 | MPF_SAMPLE_RATE_16000, "AMR-WB");

For testing we use the MicroSIP softphone, which supports the AMR-WB codec.

Observations:
      We could hear only noise in the call at the MicroSIP end.
      We captured a pcap at the UniMRCP server and converted the RTP stream to an AMR file, which also plays back as noise.
      
Script to convert RTP streams to AMR-WB - Here
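
For readers without access to the linked script, here is a minimal sketch of the per-packet step such a conversion performs, assuming octet-aligned payloads (as negotiated by `octet-align=1` in the SDP below) with exactly one speech frame per RTP packet. The function and table names are hypothetical and are not taken from the linked script or from UniMRCP. The bit counts and the `#!AMR-WB\n` file magic follow RFC 4867.

```c
#include <stddef.h>
#include <string.h>

/* Magic bytes that start a single-channel AMR-WB storage file (RFC 4867). */
static const char amrwb_file_magic[] = "#!AMR-WB\n";

/* Speech bits per frame for AMR-WB frame types 0..8 (6.60 to 23.85 kbit/s). */
static const int amrwb_frame_bits[9] = {
    132, 177, 253, 285, 317, 365, 397, 461, 477
};

/* Convert one octet-aligned RTP payload holding a single speech frame into
 * one storage-format frame. Returns the number of bytes written to out,
 * or 0 if the payload is malformed or not covered by this sketch. */
static size_t amrwb_payload_to_storage(const unsigned char *payload, size_t len,
                                       unsigned char *out, size_t out_cap)
{
    unsigned char toc;
    int frame_type;
    size_t speech_bytes;

    if (len < 2)
        return 0;
    toc = payload[1];                 /* payload[0] is the CMR byte */
    if (toc & 0x80)
        return 0;                     /* F bit set: more frames follow (not handled) */
    frame_type = (toc >> 3) & 0x0F;
    if (frame_type > 8)
        return 0;                     /* SID, lost and no-data frames not handled */
    speech_bytes = (size_t)(amrwb_frame_bits[frame_type] + 7) / 8;
    if (len < 2 + speech_bytes || out_cap < 1 + speech_bytes)
        return 0;
    out[0] = toc & 0x7C;              /* keep FT and Q, clear F and padding bits */
    memcpy(out + 1, payload + 2, speech_bytes);
    return 1 + speech_bytes;
}
```

A converter would write `amrwb_file_magic` (9 bytes, without the terminating NUL) once and then append the output of this function for every RTP payload in sequence order. A payload whose length does not match any of the expected sizes is a hint that the stream does not actually contain AMR-WB frames.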


We would appreciate it if someone could check the logs and the pcap file and point out the mistake.

Below are the logs
2022-12-02 12:02:32:008038 [INFO]   Remote SDP 0x7f61a40071d8 <new>
v=0
o=FreeSWITCH 1392396467980041114 3526428249278286825 IN IP4 10.143.16.59
s=-
c=IN IP4 10.143.16.59
t=0 0
m=application 9 TCP/MRCPv2 1
a=setup:active
a=connection:existing
a=resource:speechsynth
a=cmid:1
m=audio 50010 RTP/AVP 102 8 0
a=rtpmap:102 AMR-WB/16000
a=fmtp:102 octet-align=1
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=recvonly
a=mid:1

2022-12-02 12:02:32:008131 [NOTICE] Add Session <3370b980723911ed>
2022-12-02 12:02:32:008152 [INFO]   Receive Offer 0x7f61a40071d8 <3370b980723911ed> [c:1 a:1 v:0]
2022-12-02 12:02:32:008159 [INFO]   Found MRCP Engine [MS-Synth-1] for Resource [speechsynth] 0x7f61a40071d8 <3370b980723911ed>
2022-12-02 12:02:32:008213 [INFO]   Add Pending Control Channel <3370b980723911ed@speechsynth> [1]
2022-12-02 12:02:32:009689 [INFO]   Enable RTP Session 10.143.16.3:5010
2022-12-02 12:02:32:009752 [INFO]   Open RTP Transmitter 10.143.16.3:5010 -> 10.143.16.59:50010
2022-12-02 12:02:32:009762 [INFO]   Media Path 0x7f61a40071d8 Source->[AMR-WB/16000/1]->Bridge->[AMR-WB/16000/1]->Sink
2022-12-02 12:02:32:009853 [INFO]   Send Answer 0x7f61a40071d8 <3370b980723911ed> [c:1 a:1 v:0] Status OK
2022-12-02 12:02:32:009886 [INFO]   Local SDP 0x7f61a40071d8 <3370b980723911ed>
v=0
o=UniMRCPServer 0 0 IN IP4 10.143.16.3
s=-
c=IN IP4 10.143.16.3
t=0 0
m=application 1544 TCP/MRCPv2 1
a=setup:passive
a=connection:existing
a=channel:3370b980723911ed@speechsynth
a=cmid:1
m=audio 5010 RTP/AVP 102
a=rtpmap:102 AMR-WB/16000
a=fmtp:102 octet-align=1
a=sendonly
a=mid:1

2022-12-02 12:02:32:010184 [INFO]   Receive SIP Event [nua_i_state] Status 200 OK [SIP-Agent-1]
2022-12-02 12:02:32:010199 [NOTICE] SIP Call State 0x7f61a40071d8 [completed]
2022-12-02 12:02:32:010713 [INFO]   Receive SIP Event [nua_i_ack] Status 200 OK [SIP-Agent-1]
2022-12-02 12:02:32:010730 [INFO]   Receive SIP Event [nua_i_state] Status 200 OK [SIP-Agent-1]
2022-12-02 12:02:32:010734 [NOTICE] SIP Call State 0x7f61a40071d8 [ready]
2022-12-02 12:02:32:010738 [INFO]   Receive SIP Event [nua_i_active] Status 200 Call active [SIP-Agent-1]
2022-12-02 12:02:32:017952 [INFO]   Receive MRCPv2 Data 10.143.16.3:1544 <-> 10.143.16.59:49242 [208 bytes]
MRCP/2.0 208 SPEAK 1
Channel-Identifier: 3370b980723911ed@speechsynth
Content-Type: text/plain
Voice-Name: SwaraNeural
Content-Length: 63

My name is govind testing A M R codec and hope it will work now
2022-12-02 12:02:32:018050 [INFO]   Assign Control Channel <3370b980723911ed@speechsynth> to Connection 10.143.16.3:1544 <-> 10.143.16.59:49242 [0] -> [2]
2022-12-02 12:02:32:018108 [INFO]   Process SPEAK Request <3370b980723911ed@speechsynth> [1]
2022-12-02 12:02:32:018317 [INFO]   Process SPEAK Response <3370b980723911ed@speechsynth> [1]
2022-12-02 12:02:32:018350 [NOTICE] State Transition IDLE -> SPEAKING <3370b980723911ed@speechsynth>
2022-12-02 12:02:32:018381 [INFO]   Send MRCPv2 Data 10.143.16.3:1544 <-> 10.143.16.59:49242 [83 bytes]
MRCP/2.0 83 1 200 IN-PROGRESS
Channel-Identifier: 3370b980723911ed@speechsynth


2022-12-02 12:02:37:020788 [INFO]   Process SPEAK-COMPLETE Event <3370b980723911ed@speechsynth> [1]
2022-12-02 12:02:37:020862 [NOTICE] State Transition SPEAKING -> IDLE <3370b980723911ed@speechsynth>
2022-12-02 12:02:37:020926 [INFO]   Send MRCPv2 Data 10.143.16.3:1544 <-> 10.143.16.59:49242 [122 bytes]
MRCP/2.0 122 SPEAK-COMPLETE 1 COMPLETE
Channel-Identifier: 3370b980723911ed@speechsynth
Completion-Cause: 000 normal


Regards,
Gaurav Gangwar

Gaurav Gangwar

Dec 8, 2022, 4:51:38 AM
to UniMRCP
Hi Arsen,

Please help me debug this issue.

Thanks
Gaurav Gangwar

Arsen Chaloyan

Dec 22, 2022, 8:32:03 PM
to uni...@googlegroups.com
Hi Gaurav,

You seem to be on the right track in general. The problem must be in the way you supply AMR-WB frames to the MPF callback received from MS. As opposed to other codecs, G.722 and AMR-WB are frame-based codecs and you should pass a single 20 ms frame per callback invocation.
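
As a rough illustration of what "a single 20 ms frame" means in bytes for AMR-WB, the sketch below computes the size of one encoded frame per mode. The helper names are hypothetical and not part of UniMRCP. The bit counts are the per-mode speech bits listed in RFC 4867. How MPF expects the frame to be laid out in the buffer handed to the callback is not stated in this thread, so the payload layout shown here (CMR byte, one ToC byte, speech data) only describes the octet-aligned RTP payload.

```c
#include <stddef.h>

/* Speech bits in one 20 ms AMR-WB frame for frame types 0..8
 * (6.60 to 23.85 kbit/s), as listed in RFC 4867. */
static const int amrwb_speech_bits[9] = {
    132, 177, 253, 285, 317, 365, 397, 461, 477
};

/* Bytes of speech data for one frame in octet-aligned mode, or -1 for a
 * frame type this sketch does not cover (SID, lost, no data). */
static int amrwb_speech_bytes(int frame_type)
{
    if (frame_type < 0 || frame_type > 8)
        return -1;
    return (amrwb_speech_bits[frame_type] + 7) / 8;
}

/* Size of an octet-aligned RTP payload carrying exactly one frame:
 * 1 byte CMR + 1 byte ToC entry + speech data. */
static int amrwb_single_frame_payload_bytes(int frame_type)
{
    int speech = amrwb_speech_bytes(frame_type);
    return speech < 0 ? -1 : 2 + speech;
}
```

For example, frame type 8 (23.85 kbit/s) gives 60 bytes of speech data and a 62-byte RTP payload, whereas 20 ms of raw 16 kHz, 16-bit mono PCM is 640 bytes. Comparing the RTP payload lengths in the pcap against these sizes is a quick way to see whether encoded frames or something else is being sent.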

To be on the safe side, it may be best to take one step at a time. I'd suggest reverting your change below, using LPCM within the plugin and letting MPF encode and pack the data in RTP packets. Once you have that working and have confirmed there are no interop problems, you can move back to your original approach by properly supplying AMR-WB frames to MPF.
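
For the LPCM route, a plugin that receives a long PCM buffer from the TTS service still has to hand it over one frame at a time. The following is a minimal sketch of that chunking, assuming 16-bit mono PCM and 20 ms frames. The type and function names (`pcm_source_t`, `pcm_source_read_frame`) are hypothetical and are not UniMRCP APIs; in a real plugin the read would happen inside the stream read callback.

```c
#include <stddef.h>
#include <string.h>

/* Bytes in one PCM frame of the given duration. */
static size_t pcm_frame_bytes(unsigned sample_rate, unsigned channels,
                              unsigned bytes_per_sample, unsigned duration_ms)
{
    return (size_t)sample_rate * duration_ms / 1000 * channels * bytes_per_sample;
}

/* Hypothetical holder for synthesized PCM audio and a read position. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t pos;
} pcm_source_t;

/* Copy the next frame into out. A trailing partial frame is padded with
 * silence (zeros). Returns 1 if audio was copied, 0 if the source is empty. */
static int pcm_source_read_frame(pcm_source_t *src, unsigned char *out,
                                 size_t frame_bytes)
{
    size_t left = src->size - src->pos;
    size_t n = left < frame_bytes ? left : frame_bytes;

    if (n == 0)
        return 0;
    memcpy(out, src->data + src->pos, n);
    memset(out + n, 0, frame_bytes - n);
    src->pos += n;
    return 1;
}
```

With 16 kHz, 16-bit mono audio, `pcm_frame_bytes(16000, 1, 2, 20)` is 640, so each callback invocation would consume 640 bytes of the synthesized buffer; at 8 kHz the same call yields 320.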

> We set the codec capability from LPCM to AMR-WB as below-
>      mpf_codec_capabilities_add(&capabilities->codecs, MPF_SAMPLE_RATE_8000 | MPF_SAMPLE_RATE_16000, "AMR-WB");

FYI, the interoperability has been verified with GVP in particular. I personally have not tried to use the microSIP softphone you referred to.




--
Arsen Chaloyan
Author of UniMRCP
http://www.unimrcp.org