AEC does not work well with USB Camera Microphones on Windows PC. Loud echo is heard.


Ronny Sthoeger

Jan 1, 2013, 3:34:55 AM
to discuss...@googlegroups.com
We are using WebRTC as a whole package, configuring it through the external APIs.
We have tried setting the EcStatus to all options (Conference, Mobile, all options under Mobile), but the echo does not go away.

When I enabled the WebRTC AEC debug dumps, I saw that most of the time there is a delay of about 100-150 ms between the aec_near and aec_far audio, where the 'near' file is ahead of the 'far'. I don't know whether this points to a problem or not.

We are currently using r1869. I know it's not the latest one. Were there improvements in the AEC since then?

Mauritz Jameson

Jan 3, 2013, 8:27:11 AM
to discuss...@googlegroups.com
Can you post a capture of the aec_near and aec_far audio?

Mauritz Jameson

Jan 3, 2013, 8:27:34 AM
to discuss...@googlegroups.com
Post a capture of the aec near and far audio.


On Tuesday, January 1, 2013 3:34:55 AM UTC-5, Ronny Sthoeger wrote:

Andrew MacDonald

Jan 3, 2013, 9:44:22 PM
to discuss...@googlegroups.com
Ronny, could you file an issue for this? Please attach a dump enabled through the StartDebugRecording API:
https://code.google.com/p/webrtc/source/browse/trunk/webrtc/voice_eng...

Also, please post the OS and hardware details. Is this limited to a particular audio device or machine?



Ronny Sthoeger

Jan 6, 2013, 9:43:20 AM
to discuss...@googlegroups.com

This happens on various OSes. We've seen it on Windows 7 and XP, with a few different webcams: Logitech 5000 and 9000, MS LifeCam Cinema.

High CPU load "helps" produce the echo, but we've seen it at 20% and 40% CPU as well.


It is not 100% consistent. One device may be OK now but will generate echo tomorrow.

 

I am attaching the far and near files.
echo_files.rar

Mauritz Jameson

Jan 7, 2013, 8:49:06 AM
to discuss...@googlegroups.com
Based on the files you uploaded, there's clearly something wrong. The far-end signal (loudspeaker signal) is ahead of the echo in the near-end signal (microphone signal).

If this is what the AEC "sees" then it won't be able to cancel the echo.
aec.PNG

jequalizer

Jan 7, 2013, 9:08:39 PM
to discuss...@googlegroups.com
The delay between the far-end and near-end signals changes as a consequence of the sampling rate difference between the playback and recording clocks.

Is webrtc AEC supposed to handle this problem, and how?
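
As a rough illustration of how quickly such a clock mismatch moves the delay, here is a minimal Python sketch. The 0.1% mismatch and the helper name drift_ms are assumptions chosen for illustration, not values measured in this thread.

```python
def drift_ms(nominal_hz, actual_capture_hz, actual_render_hz, seconds):
    """Accumulated delay change in ms when capture and render clocks differ."""
    # Samples produced or consumed by each device over the interval.
    captured = actual_capture_hz * seconds
    rendered = actual_render_hz * seconds
    # Surplus (or deficit) of samples, expressed as time at the nominal rate.
    return (captured - rendered) / nominal_hz * 1000.0


# Hypothetical numbers: both devices nominally 16 kHz, capture clock 0.1% fast.
per_minute = drift_ms(16000, 16016, 16000, 60)
print(f"delay drift after one minute: {per_minute:.1f} ms")
```

Under these assumed numbers the far/near offset moves by tens of milliseconds per minute, so a fixed delay setting cannot stay correct for long unless something compensates for the drift.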

On Monday, January 7, 2013 9:49:06 PM UTC+8, Mauritz Jameson wrote:

Ronny Sthoeger

Jan 8, 2013, 6:50:49 AM
to discuss...@googlegroups.com

A couple of things:

1. I did notice that the far side is ahead of the near side. Obviously the loudspeaker signal was heard before it went into the microphone. However, since these files reflect the location of the pointer in the far_buf, I thought that the problem is with the start point that the AEC algorithm chooses. It even makes some sense to me that the far end will be slightly ahead of the near end, but not so much ahead.
Do you have any idea what could have gone wrong? Maybe it has to do with the USB driver? How we handle devices? Could it be some misconfiguration of WebRTC?

2. In our version drift compensation seems to be disabled (with #ifdef CLOCK_SKEW_COMP). I enabled it just to see what the effect would be, and it helped in some of the echo cases, but did not solve the problem altogether. I see that in later versions it is configurable through the external API. Do you know why it was disabled in the first place?

Thanks!

Mauritz Jameson

Jan 8, 2013, 8:05:27 AM
to discuss...@googlegroups.com
Yes, obviously the microphone can't capture a signal before it's played out. However, if your dump is done correctly, the AEC "sees" the echo before it shows up in the speaker data. That is the problem! Try and delay the microphone signal as a quick fix to your problem - just to see if it helps. If the echo shows up earlier than its source, then - by definition - it's considered near-end speech - and hence it won't be cancelled out. 
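
The quick fix of delaying the microphone signal can be tried offline on the dumped audio first. The following is a minimal Python sketch of that idea; the helper name delay_signal and the 150 ms figure are illustrative assumptions, not part of WebRTC.

```python
def delay_signal(samples, delay_samples):
    """Return samples delayed by delay_samples, zero-padded, same length."""
    if delay_samples <= 0:
        return list(samples)
    padded = [0] * delay_samples + list(samples)
    # Truncate so the output lines up sample-for-sample with the input.
    return padded[:len(samples)]


# Hypothetical test: push the mic signal 150 ms later at 16 kHz.
SAMPLE_RATE_HZ = 16000
extra_delay = 150 * SAMPLE_RATE_HZ // 1000  # 2400 samples
mic = [0] * 10 + [1] + [0] * 5000           # impulse at sample 10
delayed_mic = delay_signal(mic, extra_delay)
```

If the echo in the delayed near-end signal then trails the far-end signal, the AEC at least gets a causal problem to work on.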

Can you dump the data that you're feeding to the AEC and post that so we can compare that to the already posted data? If that data shows the same problem, the error is outside the AEC. If the data doesn't show the same problem, then there's an error in the AEC.

Mauritz Jameson

Jan 8, 2013, 8:10:15 AM
to discuss-webrtc
Are you resampling the speaker signal, but not the microphone signal?

Are you using an FIR filter for resampling?

If you are, you need to figure out the delay of your FIR filter and
insert a corresponding delay in the mic signal path.
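
For a symmetric (linear-phase) FIR filter the delay is easy to work out: it is (N - 1) / 2 samples for N taps. A minimal Python sketch, assuming a made-up 5-tap filter rather than any filter actually used in WebRTC:

```python
def linear_phase_fir_delay(num_taps):
    """Group delay in samples of a symmetric (linear-phase) FIR filter."""
    return (num_taps - 1) / 2


def fir_filter(taps, x):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * x[n - k]
        y.append(acc)
    return y


# Hypothetical 5-tap symmetric low-pass filter fed with a unit impulse.
taps = [0.1, 0.2, 0.4, 0.2, 0.1]
impulse = [1.0] + [0.0] * 9
response = fir_filter(taps, impulse)
# The peak of the response shows where the impulse "arrives" after filtering.
peak_index = response.index(max(response))
print(peak_index, linear_phase_fir_delay(len(taps)))
```

The same number of samples would then be inserted as a plain delay in the path that is not resampled, so both signals stay aligned.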
> >>>>https://code.google.com/p/webrtc/source/browse/trunk/webrtc/voice_eng...
>
> >>>> Also, please post the OS and hardware details. Is this limited to a
> >>>> particular audio device or machine?
>
> >>>> On Thu, Jan 3, 2013 at 5:27 AM, Mauritz Jameson <mjame...@gmail.com>wrote:
>
> >>>>> Post a capture of the aec near and far audio.
>
> >>>>> On Tuesday, January 1, 2013 3:34:55 AM UTC-5, Ronny Sthoeger wrote:
>
> >>>>>> We are using the webrtc as a whole package, configuring it with the
> >>>>>> external APIs.
> >>>>>> We have tried setting the EcStatus to all options (Conference,
> >>>>>> Mobile, all options under Mobile) - but the echo does not go away.
>
> >>>>>> When I enabled the webrtc aec debug dumps, I see that *most of the
> >>>>>> time *there is about 100-150 ms delay between the aec_near and

Ronny Sthoeger

Jan 8, 2013, 11:29:51 AM
to discuss...@googlegroups.com
Hi Mauritz,

Thanks for the suggestions above, I'll try working with them and let you know how it goes.

As for the resampling - 
We didn't add any filters to the webRTC code so any filter being used would be from within the webRTC framework.
When I defined CLOCK_SKEW_COMP I guess it did resample the speakers - was I also supposed to change something on the microphone end?

Thanks

Mauritz Jameson

Jan 8, 2013, 2:53:48 PM
to discuss-webrtc
This MATLAB script illustrates the problem:

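clc
close all
clear all
% Unit impulse played out through the loudspeaker
speakerSignal = [1 zeros(1,1000)];
% Echo path modelled as a pure 100-sample delay
echoPathDelaySamples = 100;
micSignal = filter([zeros(1,echoPathDelaySamples) 1],1,speakerSignal);
% Resample speaker signal (modelled here as a pure 300-sample filter delay)
resamplerCoeff = [zeros(1,300) 1];
speakerSignalResampled = filter(resamplerCoeff,1,speakerSignal);
% The mic impulse (sample 101) now precedes the speaker impulse (sample 301)
plot(micSignal)
hold on
plot(speakerSignalResampled,'r--')
ylim([-2 2])
grid on
title('Red: Speaker Signal, Blue: Mic Signal')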


If the speaker signal is resampled from FS to 16 kHz using a filter
implementation which introduces delay, and the mic signal is not
resampled (because it's already sampled at 16 kHz),
you will see what you have seen.

If that resampling is done outside of WebRTC, then it's your problem
to solve.

If the resampling is done inside WebRTC, then it's a WebRTC problem.

tao siping

Jan 8, 2013, 9:15:25 PM
to discuss...@googlegroups.com
I think the FIR filters in WebRTC should only introduce a delay of several samples, not tens of milliseconds. The resampling filter for CLOCK_SKEW_COMP is linear, with no extra delay.

The far/near signal is interesting:
at 4.7'', far syncs well with near
[Inline image 1]

then later, at 1'28'', near is ahead of far, and the delay seems to keep changing; the delays at 1'28'' and 1'30'' are different
[Inline image 4]

I suggest you check the system_delay or knownDelay in echo_cancellation.c, or check the parameters transmitted to SetVQEData function.

Siping

image.png
image.png

Ronny Sthoeger

Feb 7, 2013, 6:59:19 AM
to discuss...@googlegroups.com

Turns out that when the speakers use the default sampling rate of 44.1 kHz (CD quality), WebRTC doesn't handle it very well.
Other settings are still not perfect, but significantly better. Thanks for the help here.


By the way - I looked at the latest webRTC AEC code and I see that these lines were added in EstBufDelay (in addition to one other bug fix):

  // 3) Compensate for non-causality, if needed, by flushing one block.
  if (current_delay < PART_LEN) {
    current_delay += WebRtcAec_MoveFarReadPtr(aecpc->aec, 1) * PART_LEN;
  }

Since the near end in our dumps is ahead of the far end (non-causal), this could very well help.
I am thinking about integrating just the specific AEC fixes into the current version we're using.

Considering how local these changes are, it does not seem too risky to me (setting maintenance issues aside).
However, I am not that experienced with the webRTC package and taking only part of the new package... well - I'd be happy to hear other people's thoughts about it.

Thanks

tao siping

Feb 8, 2013, 8:03:31 PM
to discuss...@googlegroups.com
Hi Ronny,

What did you end up doing to get rid of the non-causal problem? I am very curious about that; could you please share more details?

Thanks,
Siping



Ronny Sthoeger

Feb 13, 2013, 3:38:32 AM
to discuss...@googlegroups.com

Hi Siping,

When I change the devices' configuration to use the same sampling rate, or at the very least not to use 44.1 kHz, it seems to solve the problem.
Enabling the drift compensation also helps (I also noticed that the drift compensation is disabled by an #ifdef; too bad I didn't see your other thread before I investigated it all myself). So we don't encounter the non-causal issue so much anymore. I am not sure, but I think that when we do, it has to do with high CPU load.

I am also hoping that the code that I quoted above would help in such situations (where it says: Compensate for non-causality, if needed, by flushing one block)

However - 
I am having problems implementing my conclusions about the sampling rate. I cannot ask users to change the device properties on their PC, and I can't find an API in webRTC that will allow me to control it.

In addition, I see that when the playback and recording devices are being initialized in audio_device_core_win, the code tries to use 48000 but IsFormatSupported fails (with S_FALSE).
I also don't like that the second choice after 48000 is 44100, or that no attempt is made to match the recording and playback sample rates
(in AudioDeviceWindowsCore::InitRecording and AudioDeviceWindowsCore::InitPlayout).
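
To make the wish concrete, here is a minimal Python sketch of a selection rule that matches the recording and playback rates and only falls back to 44100 as a last resort. The function pick_common_rate, its preference order, and the device capability lists are all hypothetical; this is not how audio_device_core_win behaves, only what matching rates could look like.

```python
def pick_common_rate(capture_rates, playout_rates,
                     preferred=(48000, 32000, 16000, 8000, 44100)):
    """Pick the first preferred rate that both devices support."""
    capture, playout = set(capture_rates), set(playout_rates)
    for rate in preferred:
        if rate in capture and rate in playout:
            return rate
    return None  # No shared rate: one side would have to be resampled.


# Hypothetical device capabilities, not queried from real hardware.
webcam_mic = [16000, 32000, 44100, 48000]
speakers = [44100, 32000]
chosen = pick_common_rate(webcam_mic, speakers)
print(chosen)
```

In a real implementation the capability lists would have to come from probing each device, for example with repeated IsFormatSupported calls.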

Did you find a way to avoid 44100 without manually setting the device?

Thanks,
Ronny

Andrew MacDonald

Feb 13, 2013, 1:02:14 PM
to discuss...@googlegroups.com
I'm finally going to implement proper 44.1 kHz resampling in webrtc. You can follow this issue:

Caragea Silviu

Feb 13, 2013, 3:17:23 PM
to discuss...@googlegroups.com
Andrew, is there any plan to also fix the following echo-related issues in WebRTC calls in the near future?

Kind regards,

Silviu

Andrew MacDonald

Feb 13, 2013, 4:01:51 PM
to discuss...@googlegroups.com
Yes, I'm planning to investigate those issues as well.

tao siping

Feb 16, 2013, 12:00:26 AM
to discuss...@googlegroups.com
Hi Ronny,

It looks like this non-causal problem is closely related to the 44.1 kHz sampling rate. However, WebRTC has code to handle clock drift by flushing the ring buffer, even if you did not enable drift compensation. So as I understand it, clock drift should not cause non-causality, or at least not very large non-causality (over 100 ms in the samples you posted).

How did you reproduce this problem? What are the devices you used and what's the computer model, OS? I'd like to figure out what's going on if I can reproduce it.

Thanks,
Siping

noa.gr...@gmail.com

Feb 17, 2013, 4:27:11 AM
to discuss...@googlegroups.com
Hi Andrew,
When are you planning to enter the fix? 
Thanks
Noa

tao siping

Feb 18, 2013, 9:36:14 PM
to discuss...@googlegroups.com
Hi Joe,

With its default configuration, the WebRTC AEC can only handle a delay of at most 80 ms. Here delay means the distance between the near signal and the far signal, and a negative delay implies non-causality, so of course the AEC doesn't work at all in your case. You can increase the LMS filter length to handle a bigger delay by modifying the native code, but I think it will introduce other issues, such as filter divergence.

Does anybody know why 80 ms was chosen as the default filter length?

I also noticed that Skype worked very well even when the delay was over 300 ms: echo was heard only for the first few seconds, and then it was gone.

Siping


On Mon, Feb 18, 2013 at 4:58 PM, Joe Bloggsian <joeblo...@gmail.com> wrote:
We are seeing (hearing!) significant problems with the WebRTC ecan too (in situations where Skype works fine). One case that is interesting, because I found a hack that fixes it, is when we use the microphone on a TANDBERG precision-HD USB webcam (48 kHz mono). With pretty much any PC and any playback device the ecan does not work (seemingly at all, ever) with this microphone. However, if I change the code and add 65 or more to "msInSndCardBuf" in the call to WebRtcAec_Process, then it works (the echo disappears). I'm using the latest code, and Windows / Core Audio. Using WEBRTC_AEC_DEBUG_DUMP, it looks like (with the unmodified code) the near signal is delayed about 120 ms compared to the far signal (so it does not look like a causality issue)? Clearly, manually hacking the delay in the code for each microphone is not really a workable solution, so I'm looking for something better. I have time to look at this some more here but was wondering if anyone had any pointers.

Joe Bloggsian

Feb 19, 2013, 4:51:48 AM
to discuss...@googlegroups.com
Hi Siping
 
Thanks for the very useful observations! I tried to increase the filter length; what I did was to increase the NR_PART #define in aec_core.h from 12. Do you think this is the correct approach?

In the same test as before, with the default NR_PART=12 the echo disappears only with an additional manual delay of 65 ms.
With NR_PART=24 the echo disappears if I add a manual delay of 20 ms or more.
With NR_PART=36 there is no echo even with 0 additional delay.

In this particular test I didn't notice any problems with convergence with the longer filter. I am wondering if the main reason for such a short filter is to reduce CPU load or to get faster convergence. Even on a slow old Intel Core laptop I don't see any CPU load issues, so I guess it's most likely the latter.

I wonder if it is possible to run a simple separate correlation between the local and far energies to determine the approximate delay dynamically, so that even in the short filter case it can be better centered? It seems this should be possible, as after all it is quite easy for a human to see what the delay is by looking at the near and far waveforms.
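
A minimal Python sketch of that energy-correlation idea, working on per-block energies and searching over a range of lags. The function names, the 64-sample block size, and the synthetic signals are assumptions for illustration; this is not the delay logic WebRTC uses, and a robust version would need normalization and smoothing over time.

```python
def block_energies(samples, block_len):
    """Sum of squares per non-overlapping block."""
    return [sum(s * s for s in samples[i:i + block_len])
            for i in range(0, len(samples) - block_len + 1, block_len)]


def estimate_delay_blocks(far_energy, near_energy, max_lag):
    """Lag (in blocks) of near behind far that maximizes the correlation.

    A negative result means near leads far (the non-causal case).
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, f in enumerate(far_energy):
            j = i + lag
            if 0 <= j < len(near_energy):
                score += f * near_energy[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic test: a far-end burst, echoed 3 blocks later at half amplitude.
BLOCK = 64  # assumed block size; 4 ms at 16 kHz
far = [0.0] * (BLOCK * 5) + [1.0, -1.0] * (BLOCK * 2) + [0.0] * (BLOCK * 20)
near = ([0.0] * (3 * BLOCK) + [0.5 * s for s in far])[:len(far)]
far_e = block_energies(far, BLOCK)
near_e = block_energies(near, BLOCK)
lag = estimate_delay_blocks(far_e, near_e, max_lag=10)
print("estimated delay:", lag, "blocks")
```

The estimated lag could then be used to pre-align the far-end buffer so that the echo falls inside the span of a short adaptive filter.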

Thanks!
Joe

tao siping

Feb 19, 2013, 8:51:53 PM
to discuss...@googlegroups.com
Hi Joe,

NR_PART=36 means the AEC can handle a delay of at most 36x4=144 ms, which is enough for your case, so the echo disappeared.
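
The arithmetic above can be written out as a small Python sketch. It assumes 64-sample partitions at a 16 kHz sample rate, which is what 4 ms per partition implies; both values and the helper name are assumptions here, since the thread does not state them.

```python
def aec_filter_span_ms(nr_part, part_len=64, sample_rate_hz=16000):
    """Time span covered by nr_part partitions of part_len samples each."""
    return nr_part * part_len * 1000 / sample_rate_hz


# The three NR_PART values tried in the test described earlier.
for nr_part in (12, 24, 36):
    print(nr_part, "partitions ->", aec_filter_span_ms(nr_part), "ms")
```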

You are right, a human can easily see what the delay is. I look forward to you working out a stable algorithm to estimate this delay.

Thanks,
Siping