TTS AMR-WB results in noise only streams

Gaurav Gangwar

Dec 2, 2022, 7:27:03 AM
to UniMRCP

We recently updated to UniMRCP version 1.8.0 along with a self-made plugin that uses Azure for TTS (16-bit mono PCM).
With the codec negotiated as PCMA/PCMU, it works fine and gives stable results.

Now we want to support the AMR-WB codec.
We configured and built UniMRCP as suggested:
./configure --enable-amr-codec=yes
make && make install

We changed the codec capability from LPCM to AMR-WB as follows:
      mpf_codec_capabilities_add(&capabilities->codecs, MPF_SAMPLE_RATE_8000 | MPF_SAMPLE_RATE_16000, "AMR-WB");

We use the MicroSIP softphone for testing, which supports the AMR-WB codec.

Observations:
      At the MicroSIP end, we can hear only noise on the call.
      We captured a pcap at the UniMRCP server and converted the RTP stream to an AMR file, which also plays back as noise.
Script to convert RTP streams to AMR-WB - Here
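
One way to sanity-check the captured RTP payloads before converting them is to verify that each one is structurally valid octet-aligned AMR-WB, as negotiated by `octet-align=1` in the SDP. The sketch below is not part of UniMRCP or of the conversion script; the function name `amrwb_oa_check` is hypothetical. It assumes the RFC 4867 octet-aligned layout without interleaving or CRC: one CMR byte, one or more table-of-contents bytes, then the speech frames. If raw PCM were being sent under the AMR-WB payload type, a check like this would be expected to fail on most packets.

```c
#include <stddef.h>
#include <stdint.h>

/* Speech-data size in bytes for each AMR-WB frame type (FT) in
 * octet-aligned mode (RFC 4867). -1 marks reserved frame types. */
static const int amrwb_frame_bytes[16] = {
    17, 23, 32, 36, 40, 46, 50, 58, 60, /* FT 0..8: 6.60 to 23.85 kbit/s */
    5,                                  /* FT 9: SID (comfort noise) */
    -1, -1, -1, -1,                     /* FT 10..13: reserved */
    0, 0                                /* FT 14: speech lost, FT 15: no data */
};

/* Hypothetical helper: returns the number of frames described by the
 * payload, or -1 if the payload is not consistent with octet-aligned
 * AMR-WB (assumes no interleaving and no CRC). */
int amrwb_oa_check(const uint8_t *payload, size_t len)
{
    size_t pos = 1;      /* byte 0 is CMR (high 4 bits) plus 4 reserved bits */
    size_t data_len = 0; /* total speech bytes announced by the ToC */
    int frames = 0;

    if (len < 2)
        return -1;

    for (;;) {
        uint8_t toc;
        int ft;

        if (pos >= len)
            return -1;                 /* ToC runs past the payload */
        toc = payload[pos++];
        ft = (toc >> 3) & 0x0F;        /* ToC layout: F | FT(4) | Q | 2 pad bits */
        if (amrwb_frame_bytes[ft] < 0)
            return -1;                 /* reserved frame type */
        data_len += (size_t)amrwb_frame_bytes[ft];
        frames++;
        if (!(toc & 0x80))             /* F bit clear: last ToC entry */
            break;
    }

    /* The announced speech bytes must account for the rest of the payload. */
    return (pos + data_len == len) ? frames : -1;
}
```

For a single 12.65 kbit/s frame (FT 2), a valid payload is 1 CMR byte + 1 ToC byte + 32 speech bytes = 34 bytes. A 640-byte payload, which is the size of 20 ms of 16 kHz 16-bit PCM, would not pass.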

Could someone please check the logs and pcap file and point out the mistake?

Below are the logs
2022-12-02 12:02:32:008038 [INFO]   Remote SDP 0x7f61a40071d8 <new>
o=FreeSWITCH 1392396467980041114 3526428249278286825 IN IP4
c=IN IP4
t=0 0
m=application 9 TCP/MRCPv2 1
m=audio 50010 RTP/AVP 102 8 0
a=rtpmap:102 AMR-WB/16000
a=fmtp:102 octet-align=1
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000

2022-12-02 12:02:32:008131 [NOTICE] Add Session <3370b980723911ed>
2022-12-02 12:02:32:008152 [INFO]   Receive Offer 0x7f61a40071d8 <3370b980723911ed> [c:1 a:1 v:0]
2022-12-02 12:02:32:008159 [INFO]   Found MRCP Engine [MS-Synth-1] for Resource [speechsynth] 0x7f61a40071d8 <3370b980723911ed>
2022-12-02 12:02:32:008213 [INFO]   Add Pending Control Channel <3370b980723911ed@speechsynth> [1]
2022-12-02 12:02:32:009689 [INFO]   Enable RTP Session
2022-12-02 12:02:32:009752 [INFO]   Open RTP Transmitter ->
2022-12-02 12:02:32:009762 [INFO]   Media Path 0x7f61a40071d8 Source->[AMR-WB/16000/1]->Bridge->[AMR-WB/16000/1]->Sink
2022-12-02 12:02:32:009853 [INFO]   Send Answer 0x7f61a40071d8 <3370b980723911ed> [c:1 a:1 v:0] Status OK
2022-12-02 12:02:32:009886 [INFO]   Local SDP 0x7f61a40071d8 <3370b980723911ed>
o=UniMRCPServer 0 0 IN IP4
c=IN IP4
t=0 0
m=application 1544 TCP/MRCPv2 1
m=audio 5010 RTP/AVP 102
a=rtpmap:102 AMR-WB/16000
a=fmtp:102 octet-align=1

2022-12-02 12:02:32:010184 [INFO]   Receive SIP Event [nua_i_state] Status 200 OK [SIP-Agent-1]
2022-12-02 12:02:32:010199 [NOTICE] SIP Call State 0x7f61a40071d8 [completed]
2022-12-02 12:02:32:010713 [INFO]   Receive SIP Event [nua_i_ack] Status 200 OK [SIP-Agent-1]
2022-12-02 12:02:32:010730 [INFO]   Receive SIP Event [nua_i_state] Status 200 OK [SIP-Agent-1]
2022-12-02 12:02:32:010734 [NOTICE] SIP Call State 0x7f61a40071d8 [ready]
2022-12-02 12:02:32:010738 [INFO]   Receive SIP Event [nua_i_active] Status 200 Call active [SIP-Agent-1]
2022-12-02 12:02:32:017952 [INFO]   Receive MRCPv2 Data <-> [208 bytes]
MRCP/2.0 208 SPEAK 1
Channel-Identifier: 3370b980723911ed@speechsynth
Content-Type: text/plain
Voice-Name: SwaraNeural
Content-Length: 63

My name is govind testing A M R codec and hope it will work now
2022-12-02 12:02:32:018050 [INFO]   Assign Control Channel <3370b980723911ed@speechsynth> to Connection <-> [0] -> [2]
2022-12-02 12:02:32:018108 [INFO]   Process SPEAK Request <3370b980723911ed@speechsynth> [1]
2022-12-02 12:02:32:018317 [INFO]   Process SPEAK Response <3370b980723911ed@speechsynth> [1]
2022-12-02 12:02:32:018350 [NOTICE] State Transition IDLE -> SPEAKING <3370b980723911ed@speechsynth>
2022-12-02 12:02:32:018381 [INFO]   Send MRCPv2 Data <-> [83 bytes]
MRCP/2.0 83 1 200 IN-PROGRESS
Channel-Identifier: 3370b980723911ed@speechsynth

2022-12-02 12:02:37:020788 [INFO]   Process SPEAK-COMPLETE Event <3370b980723911ed@speechsynth> [1]
2022-12-02 12:02:37:020862 [NOTICE] State Transition SPEAKING -> IDLE <3370b980723911ed@speechsynth>
2022-12-02 12:02:37:020926 [INFO]   Send MRCPv2 Data <-> [122 bytes]
Channel-Identifier: 3370b980723911ed@speechsynth
Completion-Cause: 000 normal

Gaurav Gangwar

Gaurav Gangwar

Dec 8, 2022, 4:51:38 AM
to UniMRCP
Hi Arsen,

Please help me debug this issue.

Gaurav Gangwar

Arsen Chaloyan

Dec 22, 2022, 8:32:03 PM
Hi Gaurav,

You seem to be on the right track in general. The problem must be in the way you supply AMR-WB frames to the MPF callback received from MS. As opposed to other codecs, G.722 and AMR-WB are frame-based codecs and you should pass a single 20 ms frame per callback invocation.

To be on the safe side, you may need to make one step at a time. I'd suggest reverting your change below, using LPCM within the plugin and letting MPF encode and pack the data in RTP packets. When you have that working and make sure there are no interop problems, you can move back to your original approach by properly supplying AMR-WB frames to MPF.
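
As an illustration of the one-frame-per-callback rule with LPCM, here is a minimal sketch that is independent of UniMRCP. The names `lpcm_frame_bytes`, `pcm_source_t` and `pcm_source_read_frame` are hypothetical, and the code assumes the synthesized audio is already buffered as 16-bit mono PCM. In a real plugin, the destination buffer and its size would come from the frame handed to the stream read callback rather than from the caller shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes in one frame of 16-bit mono linear PCM.
 * 16000 Hz * 20 ms = 320 samples = 640 bytes; 8000 Hz gives 320 bytes. */
size_t lpcm_frame_bytes(unsigned sample_rate_hz, unsigned frame_ms)
{
    return (size_t)sample_rate_hz * frame_ms / 1000 * sizeof(int16_t);
}

/* Hypothetical holder for audio already received from the TTS service. */
typedef struct {
    const uint8_t *data; /* buffered PCM bytes */
    size_t size;         /* total bytes available */
    size_t pos;          /* read offset of the next frame */
} pcm_source_t;

/* Fills exactly one frame per call. A short final chunk is padded with
 * silence (zeros). Returns 1 if audio was written, 0 if the source is
 * exhausted and nothing was written. */
int pcm_source_read_frame(pcm_source_t *src, uint8_t *out, size_t frame_size)
{
    size_t left = src->size - src->pos;
    size_t n = left < frame_size ? left : frame_size;

    if (n == 0)
        return 0;

    memcpy(out, src->data + src->pos, n);
    memset(out + n, 0, frame_size - n); /* pad the tail with silence */
    src->pos += n;
    return 1;
}
```

Calling `pcm_source_read_frame` once per callback invocation, with `frame_size` set to `lpcm_frame_bytes(16000, 20)`, hands over 640 bytes each time and never more than one frame.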

> We set the codec capability from LPCM to AMR-WB as below-
>      mpf_codec_capabilities_add(&capabilities->codecs, MPF_SAMPLE_RATE_8000 | MPF_SAMPLE_RATE_16000, "AMR-WB");

FYI, the interoperability has been verified with GVP in particular. I personally have not tried to use the microSIP softphone you referred to.

Arsen Chaloyan
Author of UniMRCP