RTP packet loss at the beginning of speech (?)

bayram

Sep 24, 2009, 7:42:20 AM
to UniMRCP
Hello Arsen,

I would like to ask you a question related to the SR part of UniMRCP. One
of my colleagues integrated our SR with UniMRCP and started running
some tests. He reported that sometimes the beginning of speech is lost,
and because of that the recognition is not done correctly. He also
said that this happens randomly, and he cannot reproduce the
problem at will, though it appears quite often. Have you ever seen
such a case? Any ideas? Or can you point me in the right direction? What
other information can I supply?

Thanks beforehand,

Best wishes,

Bayram

achaloyan

Sep 24, 2009, 11:57:10 AM
to uni...@googlegroups.com
Hello Bayram,

The only thing I suspect is a network burst at the beginning of the call/session. This is a common case, but it still depends on many things. Are your client and server on the same LAN or on the public net?
Try increasing the playout delay in the jitter buffer a little to absorb the bursts and see if it helps:
<param name="playout-delay" value="200"/>

I hope you can save the utterance in a file and make a network capture in parallel.
This should help identify the problem.
--
Arsen Chaloyan
The author of UniMRCP
http://www.unimrcp.org

bayram

Sep 25, 2009, 2:15:18 AM
to UniMRCP
Thanks Arsen,

I will let you know about the result.

Regards,
Bayram

bayram

Sep 29, 2009, 10:08:13 AM
to UniMRCP
Hello Arsen,

I passed your e-mail on to my friend; he tried the change and it fixed the problem.

Thanks again,

Bayram

Arsen Chaloyan

Sep 29, 2009, 12:36:19 PM
to uni...@googlegroups.com
Hello Bayram,

Thanks for getting back with the results.
Since it helps, your friend may want to know a few more details.

A 200 msec playout delay in the jitter buffer is acceptable, but it does introduce additional end-to-end delay, which should preferably be as low as possible. Ideally, the jitter buffer should be adaptive and react to network bursts on the fly. This would allow keeping a small initial playout delay and increasing it only when actually needed. However, so far I've seen nobody interested enough in this functionality.
In the meantime, your friend can configure two jitter buffers via profiles: for instance, one with a small playout delay for sessions coming from the LAN, and another with a larger, more reliable playout delay for sessions coming from the WAN.

Anthony Masse

Sep 30, 2009, 4:56:37 AM
to uni...@googlegroups.com
Hi Arsen,

I'm working with at least ten different ASR MRCP client integrations: we would have to configure a jitter buffer of between 100 and 1000 ms to support all platforms (to handle client stack bursts).

To avoid the end-to-end delay, we have modified our RTP reader stack (which doesn't support an adaptive jitter buffer).

For us, the best (and easiest) solution was to use the RTP sequence number (instead of the timestamp):
   - no delay when no packet is lost
   - just configure a timeout (> 500 ms) to detect lost packets (and other things ...).


I'm not saying this method must be adopted in the UniMRCP stack, but I believe the RTP receiver on the server side (ASR data) should process RTP packets ASAP with
    - an initial delay (4 or 5 packets),
    - and a maximum delay (to detect lost packets),

(i.e., an adaptive jitter buffer between 0 and the "maximum delay")
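
The sequence-number approach described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual RTP reader from this post and not UniMRCP code: the class name, the method names, and the 600 ms default are assumptions made for the example (the post only says the timeout is above 500 ms), and the suggested initial delay of 4 or 5 packets is left out.

```python
class SequenceOrderedReceiver:
    """Release RTP payloads in sequence-number order as soon as possible.

    Hypothetical sketch: in-order packets are delivered with no added
    delay; a gap is only waited on until loss_timeout_ms has elapsed.
    """

    def __init__(self, loss_timeout_ms=600):
        self.loss_timeout_ms = loss_timeout_ms
        self.expected_seq = None   # next sequence number to deliver
        self.pending = {}          # seq -> payload, received out of order
        self.gap_since_ms = None   # when we started waiting on a gap
        self.lost = []             # sequence numbers given up on

    def push(self, seq, payload, now_ms):
        """Feed one received packet; return payloads ready for the consumer."""
        if self.expected_seq is None:
            self.expected_seq = seq
        # 16-bit wraparound-aware check: ignore late or duplicate packets.
        if ((seq - self.expected_seq) & 0xFFFF) >= 0x8000:
            return []
        self.pending[seq] = payload
        return self._drain(now_ms)

    def poll(self, now_ms):
        """Call periodically so a missing packet cannot stall the stream."""
        return self._drain(now_ms)

    def _drain(self, now_ms):
        out = []
        while self.pending:
            if self.expected_seq in self.pending:
                out.append(self.pending.pop(self.expected_seq))
                self.expected_seq = (self.expected_seq + 1) & 0xFFFF
                self.gap_since_ms = None
                continue
            # The expected packet is missing: start or continue waiting.
            if self.gap_since_ms is None:
                self.gap_since_ms = now_ms
            if now_ms - self.gap_since_ms < self.loss_timeout_ms:
                break
            # Timeout expired: declare the packet lost and move on.
            self.lost.append(self.expected_seq)
            self.expected_seq = (self.expected_seq + 1) & 0xFFFF
            self.gap_since_ms = None
        return out


receiver = SequenceOrderedReceiver(loss_timeout_ms=500)
print(receiver.push(10, "a", now_ms=0))    # in order: delivered immediately
print(receiver.push(12, "c", now_ms=20))   # gap at 11: held back
print(receiver.push(11, "b", now_ms=40))   # gap filled: both released
```

As noted later in the thread, a scheme like this ignores timestamps, so it cannot reconstruct silence gaps in a discontinuous stream.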

Anthony



Arsen Chaloyan

Sep 30, 2009, 6:06:38 AM
to uni...@googlegroups.com
Hi Anthony,

Thanks for your thoughts. I clearly understand what you mean, and I admit the method you described should work efficiently enough for short continuous streams (typical utterances).
But thinking globally, and considering more than just the ASR case, this method isn't acceptable, as it doesn't allow exact reconstruction of streams; I mean the gaps in RTP streams (discontinuous transmission). Also, that method doesn't produce real-time data at the output, while RTP (Real-time Transport Protocol) is intended for real-time transmission.
One may ask why RTP is used for ASR at all, and whether it wouldn't be better to use HTTP instead. Please note that the method you described can be considered HTTP-style streaming, and this is the core difference between the two protocols (RTP and HTTP).
I'd answer that RTP was probably chosen because MRCP is intended to be used in VoIP environments, where you process real-time data all the time (SIP, H.323, PSTN, ...).
Also, suppose that one day you need to process video streams too (I know that's not tomorrow for ASR). How will you synchronize audio and video streams without timestamps?

> (ie : an adaptavice jitter between 0 and "maximum delay")
I like this a lot, and the current implementation in UniMRCP is about 8 hours of work away from it.

Hope my concerns are clear and acceptable too.

Anthony Masse

Oct 15, 2009, 2:53:22 AM
to uni...@googlegroups.com
Hi Arsen,

I have some issues with an MRCP client RTP stack.

Do you have a new jitter buffer implementation?
If so, I can wait and test with the new one.
Otherwise, I will send you some logs.

Anthony
 


Arsen Chaloyan

Oct 15, 2009, 3:34:07 AM
to uni...@googlegroups.com
Hi Anthony,

I'm busy with pre-release arrangements, so no new functionality will be added in the coming days.
However I'm curious about the issues you may have.

Anthony Masse

Oct 15, 2009, 8:03:00 AM
to uni...@googlegroups.com
The context:

The MRCP client RTP stack sends the first 10 packets in a quick burst and the others every 20 ms (mu-law encoding).

I added some apt_log calls to mpf_jitter_buffer.c to print more information, and modified the "Close RTP Receiver" log to print the discarded_packets and ignored_packets values.


1. With <param name="playout-delay" value="50"/> and <param name="max-playout-delay" value="200"/>

(see the unimrcpserver-0_case2.log file)

The first 200 milliseconds are dropped, and I have in the log:

2009-10-15 13:52:42:576028 3892 [INFO]   Close RTP Receiver 192.168.1.34:5000 <- 192.168.1.156:59022 [r:49 l:0 j:1054] [d:0 i:0]


2. With <param name="playout-delay" value="50"/> and <param name="max-playout-delay" value="1000"/>

(see the unimrcpserver-0_case1.log file)

I received the whole signal in my plug-in.

----------------------

I don't understand the "available_frame_count" value (which becomes negative in the first test).
Can you tell me the unit of the jitter value?



Attachments:
unimrcpserver-0_case2.log
unimrcpserver-0_case1.log

Arsen Chaloyan

Oct 15, 2009, 9:57:24 AM
to uni...@googlegroups.com
My comments are below.

On Thu, Oct 15, 2009 at 5:03 PM, Anthony Masse <amasse...@gmail.com> wrote:
> The context:
>
> The MRCP client RTP stack sends the first 10 packets in a quick burst and the others every 20 ms (mu-law encoding).

Well, as discussed before, this is a somewhat abnormal RTP stream, but I want to accept it and process it normally.

> I added some apt_log calls to mpf_jitter_buffer.c to print more information, and modified the "Close RTP Receiver" log to print the discarded_packets and ignored_packets values.

OK.

> 1. With <param name="playout-delay" value="50"/> and <param name="max-playout-delay" value="200"/>
>
> (see the unimrcpserver-0_case2.log file)
>
> The first 200 milliseconds are dropped, and I have in the log:

Assuming max-playout-delay is 200 ms, this is the intended behavior, as the cyclic buffer can hold at most 200 ms of data. This is a typical overflow.

> 2009-10-15 13:52:42:576028 3892 [INFO]   Close RTP Receiver 192.168.1.34:5000 <- 192.168.1.156:59022 [r:49 l:0 j:1054] [d:0 i:0]

I'd like to see some discarded packets too. Looking at the code, I see that write_prepare should return JB_DISCARD_TOO_EARLY in this case, but it doesn't. So it's mostly a matter of wrong stats, and I'll fix it. Either way, you'll have loss in the speech.

> 2. With <param name="playout-delay" value="50"/> and <param name="max-playout-delay" value="1000"/>
>
> (see the unimrcpserver-0_case1.log file)
>
> I received the whole signal in my plug-in.

That makes sense, as you have a jitter (cyclic) buffer that can hold up to 1 sec of data.

> ----------------------
>
> I don't understand the "available_frame_count" value (which becomes negative in the first test).

Well, the jitter buffer is a cyclic buffer. "available_frame_count" shows how many frames are still available (remaining) in the buffer for writing. If no frames are available, the buffer is full (overflow condition). This happens when the RTP transmitter on the remote side sends packets faster than real time. I can't say this is a typical case, because mostly we encounter the opposite condition, where the buffer is empty (underrun) due to network fluctuations.

> Can you tell me the unit of the jitter value?

See the section below:
http://tools.ietf.org/html/rfc3550#appendix-A.8

Jitter, as well as the other quantities involved in calculating playout delay, variation, etc., is based on timestamps.
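
For readers wondering about the unit, the interarrival jitter estimator from RFC 3550 Appendix A.8 can be sketched as below. This is a minimal Python illustration, not the UniMRCP code: the function names are made up for the example, the 8000 Hz clock rate is an assumption that matches mu-law audio, and timestamp wraparound is ignored. The result is expressed in RTP timestamp units; if the `j:1054` value in the log above is in those units at 8 kHz (an assumption based on this reply), it corresponds to roughly 132 ms.

```python
def interarrival_jitter(packets, clock_rate=8000):
    """Running jitter estimate in RTP timestamp units (RFC 3550, A.8).

    packets: iterable of (arrival_time_ms, rtp_timestamp) pairs.
    The first packet only initializes the transit time.
    """
    jitter = 0.0
    prev_transit = None
    for arrival_ms, rtp_ts in packets:
        # Express the arrival time in the same units as the RTP timestamp.
        arrival_ts = arrival_ms * clock_rate / 1000.0
        transit = arrival_ts - rtp_ts
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            # Exponential smoothing with gain 1/16, as in the RFC.
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


def timestamp_units_to_ms(units, clock_rate=8000):
    """Convert a value in timestamp units to milliseconds."""
    return units * 1000.0 / clock_rate


# Perfectly paced 20 ms packets (160 timestamp units each) give zero jitter.
print(interarrival_jitter([(0, 0), (20, 160), (40, 320)]))
# Converting the value from the log, assuming an 8 kHz clock.
print(timestamp_units_to_ms(1054))
```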

Anthony Masse

Oct 15, 2009, 10:06:11 AM
to uni...@googlegroups.com
OK, I understand.

Then the "available_frame_count" value should be 20 (instead of 15) the first time if max-playout-delay = 200 ms, or 100 (instead of 95) the first time if max-playout-delay = 1000 ms.

The issue is perhaps that JB_DISCARD_TOO_EARLY is not returned.

Anthony


Arsen Chaloyan

Oct 15, 2009, 11:12:47 AM
to uni...@googlegroups.com
On Thu, Oct 15, 2009 at 7:06 PM, Anthony Masse <amasse...@gmail.com> wrote:
> OK, I understand.
>
> Then the "available_frame_count" value should be 20 (instead of 15) the first time if max-playout-delay = 200 ms, or 100 (instead of 95) the first time if max-playout-delay = 1000 ms.

Yes, 200 ms = 20 x 10 ms, so we have 20 frames initially. However, don't forget the initial playout delay (50 ms = 5 x 10 ms). We have a static jitter buffer and should account for underrun as well. That said, you don't need any initial delay if RTP packets are transmitted faster than real time.
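
The frame arithmetic in this exchange can be worked through with a small model. This is a simplified Python sketch, not the logic in mpf_jitter_buffer.c: the function name is hypothetical, the burst is assumed to arrive all at once (no frames are read out in between), and what the real buffer does with the overflowing packets is not modeled. The 10 ms frame, the 20 ms packet, and the delay values are taken from the thread.

```python
FRAME_MS = 10  # frame duration used in the thread ("200ms = 20x10ms")


def simulate_burst(max_playout_delay_ms, playout_delay_ms,
                   burst_packets, packet_ms=20):
    """Return (accepted, overflowed, frames_left) for an instant burst."""
    capacity = max_playout_delay_ms // FRAME_MS
    # The initial playout delay already occupies part of the buffer.
    available = capacity - playout_delay_ms // FRAME_MS
    frames_per_packet = packet_ms // FRAME_MS
    accepted = overflowed = 0
    for _ in range(burst_packets):
        if available >= frames_per_packet:
            available -= frames_per_packet
            accepted += 1
        else:
            overflowed += 1  # no room left: overflow condition
    return accepted, overflowed, available


# 200 ms buffer, 50 ms initial delay: 15 frames free, 10-packet burst.
print(simulate_burst(200, 50, 10))    # (7, 3, 1)
# 1000 ms buffer, 50 ms initial delay: 95 frames free, burst fits.
print(simulate_burst(1000, 50, 10))   # (10, 0, 75)
```

Under these assumptions the 200 ms buffer cannot absorb a 200 ms burst once the 50 ms initial delay is accounted for, while the 1000 ms buffer absorbs it easily, which is consistent with the two test cases reported above.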

> The issue is perhaps that JB_DISCARD_TOO_EARLY is not returned.

Exactly. I don't know why "too early" was not returned; probably I just missed it. At least it looks trivial now.
You should get discarded packets with the following fix:
http://code.google.com/p/unimrcp/source/detail?r=1182

Please give a try,
Thanks.

Anthony Masse

Mar 15, 2010, 12:13:03 PM
to uni...@googlegroups.com
Hi Arsen,

Some time ago, we talked about "discarded" packets. Can you add the following information (discarded and ignored packets) to your "Close RTP" log?

Example: 2009-10-15 13:52:42:576028 3892 [INFO]   Close RTP Receiver 192.168.1.34:5000 <- 192.168.1.156:59022 [r:49 l:0 j:1054] [d:0 i:0]

I believe it would be useful information for debugging RTP issues (with an unknown MRCP client).

Thanks
Anthony

2009/10/15 Arsen Chaloyan <acha...@gmail.com>
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "UniMRCP" group.
To post to this group, send email to uni...@googlegroups.com
To unsubscribe from this group, send email to unimrcp+u...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/unimrcp?hl=en
-~----------~----~----~----~------~----~------~--~---


Arsen Chaloyan

Mar 15, 2010, 3:41:50 PM
to uni...@googlegroups.com
Hi Anthony,

Done in r1596.

Also, I've tried to provide the description for those who may be easily confused with short, non-descriptive tokens included in the trace.
http://code.google.com/p/unimrcp/source/detail?r=1596

