AAudio callback and thread affinity


Flavio Antonioli

Jul 25, 2017, 12:19:19 PM7/25/17
to andro...@googlegroups.com
Glenn,

I watched the "Best practices" video and one thing I'm not clear on is the thread affinity issue. I've played around with Systrace and it looks like the AAudio callback is being scheduled on various cores. It does tend to stick to one, but as soon as the CPU gets busy it jumps to another core and that seems to be correlated to a later underrun. Since it's a callback, the system should take care of setting the affinity, not the app, am I right?

BTW, today I got the update to DP4 and now it appears that AAudio is running with 20 ms roundtrip latency on my Nexus 6P, same as with OpenSL ES; a nice improvement over the 40 ms I was getting with DP3.

Thanks,
Flavio.


The OC developer preview releases of AAudio are not yet optimized for 
latency.
See the Known Issues section at the end of
https://developer.android.com/ndk/guides/audio/aaudio/aaudio.html
 
We hope to improve latency and other performance metrics
in both the final OC release, and in succeeding releases.
 
As Phil Burk mentioned in his presentation at I/O 2017,
AAudio is intended to be both easier to use than OpenSL ES, and
a more flexible platform to allow us to make further performance 
improvements.
See "Best Practices for Android Audio (Google I/O '17) 
<https://www.youtube.com/watch?v=C0BPXZIvG-Q>" around 22:30.

Glenn Kasten

Jul 25, 2017, 12:23:06 PM7/25/17
to android-ndk
Flavio,
Glad to hear that DP4 is working better for you.
As your main question relates to Don and Phil's portions of the video,
it's probably best for them to reply to that part.
Glenn

Phil Burk

Jul 25, 2017, 12:38:59 PM7/25/17
to andro...@googlegroups.com
Hello Flavio,

Thanks for trying out AAudio.

We are not currently setting thread affinity in AAudio. We are considering that for a future release.

I couldn't tell whether you had tried setting core affinity in your app. If you are using a callback, the affinity needs to be set from within the callback itself.

>  but as soon as the cpu gets busy it jumps to another core and that seems to be correlated to a later underrun.

Yes, as you have seen, core migration can cause a delay. Eliminating core migration is an advantage of setting core affinity. But core affinity can sometimes be a problem: if the core you want is being used by a higher-priority task, you have to wait for that task to finish, and in that case it might be faster to migrate to a free core. So the decision to use core affinity is not clear-cut.

I'd be interested in the results of your testing.

> it appears that Aaudio is running with 20ms roundtrip latency

That's good to hear. AAudio is heading in the right direction.

Let me know if you have other feedback. Does the API seem complete? Anything missing or not well documented?

Phil Burk


--
You received this message because you are subscribed to the Google Groups "android-ndk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to android-ndk+unsubscribe@googlegroups.com.
To post to this group, send email to andro...@googlegroups.com.
Visit this group at https://groups.google.com/group/android-ndk.
To view this discussion on the web visit https://groups.google.com/d/msgid/android-ndk/CAMe8b%3DqWQxAuOvf3UGmAgUvpfaq7ERBu2KyVjqguUc_S1vjwQQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Alex Cohn

Jul 27, 2017, 6:46:28 AM7/27/17
to android-ndk
> as soon as the cpu gets busy it jumps to another core and that seems to be correlated to a later underrun

Maybe the correlation that you observe is the opposite: the CPU gets busy, AAudio tries to compensate for that by jumping to another core, but the CPU load is too heavy, and in the end all cores get busy, so you get the underrun.

BR,
Alex

Don Turner

Jul 27, 2017, 9:41:31 AM7/27/17
to android-ndk
Hi Flavio, 

I gave the section of the talk at I/O which covered thread affinity. The behaviour you're experiencing sounds fairly typical, although to verify I'd need to see the systrace. 

CPU core migrations can cause underruns, which is why affining to a particular core can prevent them. The problem is, if you affine to a core which has a lot of contention you may actually experience *more* underruns. The solution here is to use getExclusiveCores(), which will give you the IDs of the CPU cores reserved for the foreground app; these should have less contention. More info: https://developer.android.com/reference/android/os/Process.html#getExclusiveCores().


tl;dr: Set thread affinity in the first callback; subsequent callbacks use the same thread, so they will be affined to the same core.
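The two suggestions above (pin in the first callback, prefer an exclusive core) can be sketched roughly as follows. This is an illustration, not part of the AAudio API: `pinToCore` is a hypothetical helper name, and on a real device the exclusive-core id would be fetched from Java's `Process.getExclusiveCores()` and passed down via JNI (that plumbing is omitted here). The pinning itself is plain GNU/Linux `sched_setaffinity()`.

```cpp
#include <sched.h>  // sched_getcpu(), sched_setaffinity(), CPU_SET (GNU/Linux)

// Hypothetical helper: pin the calling thread to one core.
// In an AAudio app you would call this once, from inside the first
// data callback, because every subsequent callback runs on the same
// thread. Pass a core id from Process.getExclusiveCores() if the
// device provides one; otherwise pass sched_getcpu() to "attach" to
// whichever core the callback first woke up on.
// Returns the core id on success, or -1 on failure.
int pinToCore(int core) {
    if (core < 0) return -1;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    // pid 0 means "the calling thread", i.e. the audio callback thread.
    if (sched_setaffinity(0, sizeof(set), &set) != 0) return -1;
    return core;
}
```

A typical use would be a `static bool pinned` flag at the top of the data callback (assuming a single stream), calling `pinToCore(sched_getcpu())` the first time through and never again.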

Of course, another solution is to increase your buffer size so that the occasional core migration doesn't cause an underrun. 

> Since it's a callback, the system should take care of setting the affinity, not the app, am I right?

The system (aka the Android CPU governor) doesn't know that your app has real-time deadlines, and there is no mechanism in Android for your app to tell it. This is why it's left to the app to set the affinity itself. To be clear, setting thread affinity isn't an ideal long-term solution because it denies the CPU governor the flexibility to move threads around to distribute the workload evenly. But hey, we live in the real world, right? :)

Hope that's helpful, 

Don

Phil Burk

Jul 27, 2017, 12:05:43 PM7/27/17
to andro...@googlegroups.com
Hello Alex,

On Thu, Jul 27, 2017 at 3:46 AM, Alex Cohn <sasha...@gmail.com> wrote:
> as soon as the cpu gets busy it jumps to another core and that seems to be correlated to a later underrun
> Maybe the correlation that you observe is the opposite: the CPU gets busy, AAudio tries to compensate that by jumping to another core, but the CPU load is too heavy, and in the end all cores get busy, so you get the underrun.

AAudio does not currently do anything related to thread scheduling or deciding which core to run on.  That is all handled by the kernel.

Phil Burk

 

Alex Cohn

Jul 27, 2017, 12:45:31 PM7/27/17
to android-ndk
Sorry, I did not express myself clearly. I meant that the AAudio thread is rescheduled to a different core by the kernel. Does this sound correct now?

BR,
Alex

Phil Burk

Jul 27, 2017, 1:08:56 PM7/27/17
to andro...@googlegroups.com
> I meant that the AAudio thread is rescheduled to a different core by the kernel. Does this sound correct now?

Yep. That sounds right.

Thanks,
Phil





Flavio Antonioli

Jul 28, 2017, 11:33:39 AM7/28/17
to andro...@googlegroups.com
Hi Phil & Don,

I’ve experimented a bit with setting the thread affinity. On my Nexus 6P getExclusiveCores() returns an empty set, so I’m "attaching" to the core that the callback first gets called on. That appears to work in the Systrace traces, but I haven’t been able to measure an actual benefit. My app has a hard time sustaining the default setting of a 1x192-frame buffer at 48 kHz. It’s better with 2x192 (the burst size is 192).

Regarding the API and its docs, I think buffer size vs. burst size is a little confusing and different from the way I’m used to thinking about audio buffers, which is two buffers of a given size, with the audio driver alternately handing me one of the buffers to fill.
I’m not really sure what the AAudio burst size is, or what it means to set the buffer size to 1x the burst size (as the Echo sample does). I don’t think it means I’m communicating with the audio device through just one buffer (that would mean writing to the buffer while the device is using it), and it doesn’t mean that the buffer size is the size I get in the callback, because if I set the buffer size to 2x the burst size the callback still gets called with 1x burst frames. But when switching from 1x to 2x the chance of underruns decreases and latency increases, so something is happening behind the API.
Perhaps a technical note explaining in detail how the buffers work would help us understand what we’re dealing with. It could also help with the issue of recording vs. playback sync. Since I don’t know how many buffers (or other machinery) are really there, and no timestamp (at least that I know of) is associated with audio buffers, I don’t know how to place recorded frames relative to the frames that have been played. This forces my app and a few other recording apps to do ‘latency calibration’, which isn’t good for the user experience and, in my experience, isn’t always reliable.
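On the sync point: the final O release of AAudio does expose AAudioStream_getTimestamp(), which pairs a frame position with the CLOCK_MONOTONIC time at which that frame is (or will be) at the DAC, and that pairing can replace manual calibration. The arithmetic is simple enough to sketch; the function name below is mine and the numbers are illustrative, not from the API.

```cpp
#include <cstdint>

// Illustrative sketch: estimate output latency from a stream timestamp.
// A timestamp reports that frame `framePosition` reaches the DAC at
// `frameTimeNanos`. Comparing that against how many frames the app has
// written tells us how far ahead of the speaker we are.
double estimateOutputLatencyMs(int64_t framesWritten,   // app write pointer
                               int64_t framePosition,   // from the timestamp
                               int64_t frameTimeNanos,  // from the timestamp
                               int64_t nowNanos,        // CLOCK_MONOTONIC now
                               int32_t sampleRate) {
    // Frames still queued between the app and the speaker at the
    // moment the timestamp was captured:
    int64_t framesInFlight = framesWritten - framePosition;
    double queuedMs = 1000.0 * framesInFlight / sampleRate;
    // Subtract the time already elapsed since the timestamp was taken:
    double elapsedMs = (nowNanos - frameTimeNanos) / 1.0e6;
    return queuedMs - elapsedMs;
}
```

For example, 384 frames in flight at 48 kHz with a fresh timestamp gives 8 ms. The same pairing on an input stream would let recorded frames be lined up against played ones without a user-facing calibration step.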

Thanks for listening.

Cheers,
Flavio.

Phil Burk

Jul 28, 2017, 12:06:07 PM7/28/17
to andro...@googlegroups.com
Hello Flavio,

Thanks for the report on your experience with setting the core affinity.

> Regarding the API and its docs, I think the buffer size vs burst size is a little confusing and different from the way I am used to think about audio buffers, 
> which is two buffers of a given size, with the audio driver alternating in passing me either of the buffers to fill.

Unfortunately the term "buffer" is too vague. In most of the audio framework we use circular buffers to pass data between threads or processes. We can read and write into the buffers. Generally the amount we write is less than the full size of the buffer. If we can fit two writes into the buffer then, traditionally, we might say "the buffer is double buffered", which is inherently confusing. 

In the AudioTrack API we refer to the BufferSizeInFrames. But we also refer to the AudioManager.PROPERTY_OUTPUT_FRAMES_PER_BUFFER. That is the optimal size for writing into the buffer. Many people confuse these two uses of the word "buffer" so we wanted to avoid that confusion in AAudio.

In AAudio, "buffer" means the large circular buffer that is written by one thread and read by another.

A "burst" is the number of frames processed at one time by the side closest to the hardware. For an output stream, framesPerBurst is the number of frames read by the system mixer or DSP at one time. If the app writes the same number of frames that the mixer reads, then we can optimize the latency.

Traditionally people might say "the buffer is double buffered". In AAudio they would say that the buffer can contain two bursts.
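Put into numbers (taking Flavio's reported values, framesPerBurst = 192 at 48 kHz, as an illustration), the buffer-vs-burst relationship works out as below. In a real app the inputs would come from AAudioStream_getFramesPerBurst() and AAudioStream_getSampleRate(), and the capacity would be adjusted with AAudioStream_setBufferSizeInFrames(); the helper names here are mine.

```cpp
#include <cstdint>

// Duration of one burst, the unit the mixer/DSP consumes at a time.
double burstMs(int32_t framesPerBurst, int32_t sampleRate) {
    return 1000.0 * framesPerBurst / sampleRate;
}

// Worst-case time a frame can sit in the circular buffer before the
// side closest to the hardware reads it: the whole buffer's worth.
double maxBufferLatencyMs(int32_t numBursts, int32_t framesPerBurst,
                          int32_t sampleRate) {
    return numBursts * burstMs(framesPerBurst, sampleRate);
}
```

With 192-frame bursts at 48 kHz, each burst is 4 ms, so a "1x" buffer adds up to 4 ms of buffering latency and a "2x" buffer up to 8 ms; that matches Flavio's observation that switching from 1x to 2x raises latency while reducing underruns.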

We want to avoid confusion so please let us know what we can do to clarify this.

Thanks,
Phil Burk





