Nexus S and feature android.hardware.audio.low_latency


Glenn Kasten

Dec 16, 2010, 3:51:45 PM
to android-ndk
The Nexus S phone does not claim the
android.hardware.audio.low_latency feature, as seen by adb shell pm
list features. The upcoming Android Compatibility Definition Document
(CDD) will describe the requirements for this feature more precisely.

http://developer.android.com/reference/android/content/pm/PackageManager.html#FEATURE_AUDIO_LOW_LATENCY
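[Editor's illustration] On a device, an app can test for this feature with getPackageManager().hasSystemFeature(PackageManager.FEATURE_AUDIO_LOW_LATENCY). As a plain-Java sketch of the same check against `adb shell pm list features` output (FeatureCheck is an invented helper name, not an Android API):

```java
// Hypothetical helper: scan the output of `adb shell pm list features`
// (one "feature:<name>" per line) for the low-latency audio feature.
// On-device code would call PackageManager.hasSystemFeature() instead.
public class FeatureCheck {
    static final String LOW_LATENCY = "android.hardware.audio.low_latency";

    public static boolean hasLowLatency(String pmOutput) {
        for (String line : pmOutput.split("\n")) {
            if (line.trim().equals("feature:" + LOW_LATENCY)) {
                return true;
            }
        }
        return false;
    }
}
```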

Glenn Kasten

Dec 17, 2010, 7:50:43 PM
to android-ndk
The Android Compatibility Definition for 2.3 (gingerbread) is now
published at
http://source.android.com/compatibility/index.html
(click on Getting Started / Current CDD). Though the CDD is directed
primarily at device manufacturers, section 5.3 Audio Latency should
also be of interest to app developers using the Android native audio
APIs based on Khronos Group OpenSL ES.

On Dec 16, 12:51 pm, Glenn Kasten <gkas...@android.com> wrote:
> The Nexus S phone does not claim the
> android.hardware.audio.low_latency feature, as seen by adb shell pm
> list features. The upcoming Android Compatibility Definition Document
> (CDD) will describe the requirements for this feature more precisely.
>
> http://developer.android.com/reference/android/content/pm/PackageMana...

dario

Dec 18, 2010, 3:56:23 AM
to android-ndk
Thanks Glenn for the info.

Any chance that there would be a list of devices that would be likely
candidates for having the low_latency feature upon upgrading to 2.3?

-Dario

On Dec 17, 7:50 pm, Glenn Kasten <gkas...@android.com> wrote:
> The Android Compatibility Definition for 2.3 (gingerbread) is now
> published at http://source.android.com/compatibility/index.html

mic _

Dec 18, 2010, 10:09:31 AM
to andro...@googlegroups.com
The first batch of devices running Gingerbread is unlikely to support this feature, since the hardware for those devices was finalized a long time ago, and meeting these requirements on a platform that wasn't developed with them in mind can be quite hard, if not impossible.

Personally I think these requirements need to be more specific. What's the intended sound source / destination? Do you only need to meet these requirements for the internal mic(s) and the internal loudspeaker? What about wired accessories? Or A2DP? (that would probably be a hopeless task).

/Michael

--
You received this message because you are subscribed to the Google Groups "android-ndk" group.
To post to this group, send email to andro...@googlegroups.com.
To unsubscribe from this group, send email to android-ndk...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/android-ndk?hl=en.


Olivier Guilyardi

Dec 19, 2010, 4:49:33 AM
to andro...@googlegroups.com
Hi,

I looked at the CDD section 5.3, and it says:

"continuous output latency of 45 milliseconds or less"

45ms is not low latency.

Implementing a drum pad, or a synth that reacts to touch events, is not
realistic with more than 10ms of latency, unless you want users to laugh at your product.

Olivier

niko20

Dec 19, 2010, 6:23:55 PM
to android-ndk
Agreed. Low latency should be defined as 15ms or less, at most 20ms, but
45ms is definitely not low latency.

niko20

Dec 19, 2010, 6:27:39 PM
to android-ndk
Actually, the warm output latency in the CDD is 10ms, so I think it does
look reasonable already.

Olivier Guilyardi

Dec 20, 2010, 3:37:24 AM
to andro...@googlegroups.com
I don't think so:

------------
"warm output latency" is defined to be the interval between when an application
requests audio playback and when sound begins playing, when the
audio system has been recently used but is currently idle (that is, silent)

"continuous output latency" is defined to be the interval between when an
application issues a sample to be played and when the speaker physically
plays the corresponding sound, while the device is currently playing back audio
------------

IIUC, "continuous" latency is the delay between the moment audio data is passed
to the OpenSL API, and the moment it gets out of the speaker.

"warm" latency just seems to be some startup time. It seems to be about resuming
playback, in which case an internal buffer may already carry some data.
Therefore these 10ms could be the real hardware latency, while the extra
continuous latency may come from intermediary software layers, I suppose.

Unless I'm wrong, continuous latency is the main one for us app developers.

Olivier

Robert Green

Dec 20, 2010, 4:30:25 AM
to android-ndk
Low latency should be defined as an amount which allows for accurate
and believable synchronization with video. Most 2D apps can, in
theory, run at 60FPS, assuming some degree of efficiency of execution
on all parts. 60FPS = 0.016666s = about 16.7ms per frame. I believe 16ms
would be a more practical definition of "low latency" as it would
allow for accurate sync with video. 45ms seems like a made-up number,
chosen in the hope that it's low enough to entice OEMs to target.

Comparisons:

Windows Audio (NOT DirectSound) has a latency lower than 45ms on my PC.
DirectSound is often around 20ms for most AC97 hardware (so I've
read), but that is still too high for real-time audio manipulation.
Core Audio (Apple) is advertised as being able to reach 0.2 to 1.5ms!
ASIO drivers for audio production are usually below 11ms, typically
hitting 5ms.

These new devices have _plenty_ of horsepower to copy a small array of
bytes (2,822 for stereo 44.1k) over to a hardware buffer every 16ms
without interruption. Apple has been doing it since the first iPhone
without problem, and that is now only a fraction of the speed of the
current batch of hardware.
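[Editor's illustration] The arithmetic above can be checked with a quick sketch (class and method names are mine; 44.1kHz stereo 16-bit for 16ms works out to about 2,822 bytes, matching the figure in the post):

```java
// Small helpers to check the frame-time and byte-rate arithmetic.
public class AudioMath {
    // Duration of one video frame in milliseconds at a given frame rate.
    public static double framePeriodMs(double fps) {
        return 1000.0 / fps;
    }

    // Bytes of PCM needed to cover `periodMs` of audio.
    public static int bytesPerPeriod(int sampleRate, int channels,
                                     int bytesPerSample, double periodMs) {
        return (int) Math.round(sampleRate * channels * bytesPerSample * periodMs / 1000.0);
    }
}
```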

Also I must note that the current generation of phones is very, very
sad with how big the smallest continuous buffer is.

My Samsung Galaxy S (vibrant) is one of my highest latency devices,
despite being one of the fastest I test on. (sounds like 100ms)
My Nexus One is also very high. (sounds like 75ms)
My G1 actually has _less_ latency than both of those!
My Droid is decent. (sounds like 40ms)

When I run my own mixer, I've found that running 44.1kHz stereo gives
the lowest latency via AudioTrack for most devices. It's because the
smallest allowable buffer size is the same for stereo and mono, so
obviously stereo will have half the latency. Still, the results I get
are unsuitable for most real-time games and ensuring a 45ms buffer
would _barely_ get in the ballpark of what would look like a realistic
sound response time for what you're seeing on-screen. I can't imagine
it would work well enough for a good rhythm game or a drum synth.

Just my 2 cents.

Olivier Guilyardi

Dec 20, 2010, 6:00:48 AM
to andro...@googlegroups.com
On 12/20/2010 10:30 AM, Robert Green wrote:
> Low latency should be defined as an amount which allows for accurate
> and believable synchronization with video. Most 2D apps can, in
> theory, run at 60FPS, assuming some degree of efficiency of execution
> on all parts. 60FPS = 0.016666ms = 16ms per frame. I believe 16ms
> would be a more practical definition of "low latency" as it would
> allow for accurate sync with video. 45ms seems like a number made up,
> hoping that it's low enough to entice OEMs to target.

Video (visual) sync is not that crucial to me. For example, with a drum pad,
what's important is that the user doesn't notice any delay between the moment he
or she taps the screen and the moment the sound gets out of the speaker. 10ms is
a *maximum* for this.

I'm not a game developer, but I suppose that similar constraints apply.

16ms, that's about 700 frames at 44.1kHz. Again, that really can't be called low
latency to me.

> Comparisons:

[...]

> Apple has been doing it since the first iPhone
> without problem, and that is now only a fraction of the speed of the
> current batch of hardware.

From what I read, you can get 5ms on the iPhone. That's certainly why serious
audio software companies such as iZotope are targeting this platform. Check
their iDrum app to understand what can't be done on Android currently.

[...]


> When I run my own mixer, I've found that running 44.1khz stereo gives
> the lowest latency via AudioTrack for most devices. It's because the
> smallest allowable buffer size is the same for stereo and mono, so
> obviously stereo will be half the latency. Still, the results I get
> are unsuitable for most real-time games and ensuring a 45ms buffer
> would _barely_ get in the ballpark of what would look like a realistic
> sound response time for what you're seeing on-screen. I can't imagine
> it would work well enough for a good rhythm game or a drum synth.

I agree.

Is there an OS design problem?

--
Olivier

mic _

Dec 20, 2010, 9:53:25 AM
to andro...@googlegroups.com
I haven't looked at the OpenSL implementation, but one thing I'll point out is this: avoid starting and stopping audio outputs frequently. The AudioPolicyManager will assume that it needs to reroute the audio when an output starts, and each reroute can take a relatively long time (as for why it assumes this: that's partly because applications can implicitly make changes to the AudioHardware state without the policy manager's knowledge).

/Michael

niko20

Dec 20, 2010, 10:37:26 AM
to android-ndk
Unfortunately this is a problem I think we'll have to live with for a
while. There is no drum pad app where users don't say "too much
latency", and it's not in our control. Even if we find a device which
has low latency, the next one may not meet the same standard. It's just
the nature of the beast (OpenSL, open standards). To target the
largest audience for audio apps that require low latency, we'd probably
have to go with iPhone for quite a while yet.

-niko

On Dec 20, 8:53 am, mic _ <micol...@gmail.com> wrote:
> I haven't looked at the OpenSL implementation, but one thing I'll point out
> is this: avoid starting and stopping audio outputs frequently. The
> AudioPolicyManager will assume that it needs to reroute the audio when
> output starts, and each reroute can take a relatively long time (as for why
> it assumes this; that's partly because applications implicitly can make
> changes in the AudioHardware state without the policy manager's knowledge).
>
> /Michael
>

Olivier Guilyardi

Dec 20, 2010, 11:24:15 AM
to andro...@googlegroups.com
I quite agree. Considering both this CDD and device fragmentation, one must be
careful about which apps and features to develop when working on Android.

If you take the iPhone iDrum app which I mentioned, most of it can actually be
implemented on Android. The only feature which is problematic is the pad, which
lets you play over, or record, a drum sequence in real time.

And that's quite a central feature, but the app would be usable without this
pad. Now, whether it would be popular is another question... I suppose it would.

Olivier

Ross Bencina

Dec 21, 2010, 6:54:46 AM
to android-ndk
On Dec 20, 8:30 pm, Robert Green <rbgrn....@gmail.com> wrote:
> Low latency should be defined as an amount which allows for accurate
> and believable synchronization with video.

Robert, I disagree with this point but I think we agree in general.
Depending on which standards body you believe, acceptable A/V sync
latency is between 50ms and 140ms (the most recent standards that I
know of relate to Digital Broadcast TV). I believe frame rate is not
really relevant -- it's all about ear-eye perception and "lip sync"

As a long time professional audio developer I think it's an absolute
disgrace that Google are equating 45ms with "low latency." I strongly
disagree that the A/V sync requirement has anything to do with "low
latency" -- it is a requirement for sure, it should be a requirement
for _all_ Android devices, not those that claim "low latency". Heck,
every system I've programmed in the last 10 years could achieve 45ms
playback latency.

> Windows Audio (NOT DirectSound) has a latency lower than 45ms on my PC

WMME is legacy, even on Windows XP. The main latency is caused by the
"kernel mixer" presumably the equivalent of the AudioFlinger on
Android. Kernel mixing on Windows 7 is much lower latency.

> DirectSound often is around 20ms for most AC97 hardware (so I've
> read), but is still too high for real time audio manipulation.

Agreed. WASAPI on Windows Vista and Windows 7 achieves lower than 4ms.

> Core Audio (apple) is advertised at being able to be .2 to 1.5ms!

Yes. Well below 3ms on iOS too in my experience.

> ASIO drivers for audio production usually are below 11ms, typically
> hitting 5ms.

Agreed.


There are many applications that have more stringent latency
requirements than A/V synchronisation. Three use-cases I can think of
right now are:

1. Triggering sound from a user input. This could be in games where
the game reacts to gameplay, or a musical instrument where the user
"plays" the instrument and expects the instrument to respond
immediately.

2. Processing audio input into audio output "audio effects processors"
-- plugging your guitar in to your computer to use it as an effects
box for example.

3. Providing an input or output endpoint for an audio session where
the other end of the connection is somewhere else on the network
(Networked telepresence and musical performance come to mind).


From memory, the upper limit of the perceptual threshold for fused
actuation->result for music performance is around 8ms. That means any
audio effect device capturing audio from a microphone and playing it
out of a speaker needs to perform the whole round trip capture->process-
>playback in less than 8ms to be considered musically transparent (you
don't notice there's a computer involved). You will find these kind of
latencies on commercial electronic musical instruments where there is
a mechanical actuator and an analog audio output. Latencies for
digital audio effects units (where the input is an analog audio signal
and the output is also an analog signal) can (and should) be lower.
There was a paper published a few years ago that identified 25ms as
the upper threshold for round-trip audio transport for networked
musical performance (that's capture+network+playback latency).

"low latency" already has a well defined meaning in the audio software
world. On Linux for example, people have been working for years to get
single-direction latency below 2ms for audio applications (see for
example http://lowlatency.linuxaudio.org/). On MacOS and iOS these
kind of latencies are easily achievable. On Windows 7, < 4ms is also
easily achievable (on older versions of Windows it is achievable if
your code talks direct to the WDM layer).

In my experience, based on customer feedback, music and audio
processing software users now consider anything more than 5ms latency
to be "high latency".

To cite another data point: IEEE P802.1p QoS standard stipulates that
VOIP traffic be routed with less than 10ms latency. I don't personally
consider that low latency, but it gives you an idea of what the IEEE
think.

If Android was "just a phone OS" then I might be willing to let all
this slide, but the reality is that Android will be supporting a lot
more than Phones in the coming year and applications that meet or
surpass what is possible on the desktop should be enabled. And
besides, iOS can already deliver latencies below 2ms so the bar has
already been set for what "low latency" is.

If anyone is interested I can look up the references for this stuff.

And if anyone knows who to ping at Google to complain please let me
know.

Thanks!

Ross Bencina

Dec 21, 2010, 7:01:52 AM
to android-ndk
On Dec 21, 3:24 am, Olivier Guilyardi <l...@samalyse.com> wrote:
> I quite agree. Considering both this CDD and device fragmentation, one must be
> careful about which app and features to develop when working on Android.

Well the main problem is the CDD no? Google should fix it to something
realistic like 2ms or 5ms for each direction.

The latency is most likely not in the hardware anyway. It may be in
the drivers, and it could easily be in the intervening software stack
(remember, 10ms latency was possible on Windows 95, but then MS
inserted "kmixer" and added 40ms of latency until Windows Vista).
This is something Google could show leadership on, making sure the
Android audio path is low-latency (like Apple does).

I'd be interested if anyone has tried to bypass AudioFlinger and deal
direct with AudioHardwareInterface as a test to see where the latency
is.



Olivier Guilyardi

Dec 21, 2010, 4:31:19 PM
to andro...@googlegroups.com
On 12/21/2010 01:01 PM, Ross Bencina wrote:
> On Dec 21, 3:24 am, Olivier Guilyardi <l...@samalyse.com> wrote:
>> I quite agree. Considering both this CDD and device fragmentation, one must be
>> careful about which app and features to develop when working on Android.
>
> Well the main problem is the CDD no? Google should fix it to something
> realistic like 2ms or 5ms for each direction.

This CDD makes it hard to deal with device fragmentation.

It says that with 45ms or less, the device can be considered to feature
android.hardware.audio.low_latency.

Therefore, you may have some devices which support much lower latency, suitable
for advanced games and audio apps. But some devices may be around the 45ms limit
and still report themselves as featuring low latency.

In this situation, the corresponding <uses-feature> tag is unreliable.

I believe that things should be called by their name, and 45ms is not low
latency. Maybe it was in the 50's, but I can't tell, I wasn't born.

10ms sounds ok to me. You can't expect a professional zero-latency DAW anyway.

--
Olivier

Glenn Kasten

Dec 21, 2010, 7:44:53 PM
to android-ndk
I can't comment on unreleased devices.

The CDD is an evolving document with each Android release. The
requirements are adjusted as the platform and devices advance.

I appreciate everyone's feedback about the CDD audio latency section
in particular -- the numbers, definitions, test conditions,
performance of various devices, application use cases, etc.

To clarify a point in the CDD ... the difference between warm output
latency and continuous output latency represents the maximum time that
can be ascribed to user-mode buffers. The warm output latency represents
the maximum time that can be ascribed to device driver and hardware
buffers alone, that is, when the user-mode buffers are empty.

Audio latency is clearly a topic of great interest, and your inputs
are valuable.
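[Editor's illustration] Glenn's decomposition can be written out as a small model (class, method names, and the buffer figures below are mine, chosen so the CDD's 10ms warm / 45ms continuous limits imply up to roughly 35ms of user-mode buffering at 44.1kHz):

```java
// Sketch of the CDD decomposition: continuous output latency is warm
// latency (driver + hardware buffers) plus the time represented by
// user-mode buffers. Numbers are illustrative, not from any device.
public class LatencyModel {
    // Milliseconds of audio held in user-mode buffers.
    public static double userBufferMs(int userBufferFrames, int sampleRate) {
        return 1000.0 * userBufferFrames / sampleRate;
    }

    // Continuous latency = warm latency + user-mode buffer time.
    public static double continuousMs(double warmMs, int userBufferFrames,
                                      int sampleRate) {
        return warmMs + userBufferMs(userBufferFrames, sampleRate);
    }
}
```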

Ross Bencina

Dec 22, 2010, 5:05:20 AM
to android-ndk
On Dec 22, 8:31 am, Olivier Guilyardi <l...@samalyse.com> wrote:
> 10ms sounds ok to me. You can't expect a professional zero-latency DAW anyway.

To me, if the current CDD said low-latency=10ms I would think "yeah
that's ok, but it isn't really low latency."

I don't expect a professional DAW. But I do expect things to keep up
with standard desktop audio APIs (e.g. CoreAudio, WASAPI); these, and
iOS, have significantly lower latency than 10ms, and you can bet that
QNX has pretty good latency too.

Setting the requirement at 45ms is so far from real low latency that
there is room for all levels of the system to fail to deliver (from
Dalvik, to NDK/OpenSL, to AudioFlinger, libaudio.so and friends, the
kernel etc). That's way too many subsystems to blame when things go
wrong. The best thing would be to provide documented latency
requirements at each level of the system. Kernel latency for example
will be affected by other drivers, not just audio. Seems like there
are already goals for Dalvik GC pause times. At the moment it seems
hard to get info about internal buffering bottlenecks inside
AudioFlinger and OpenSL implementations without diving into the
source.

Ross

Ross Bencina

Dec 22, 2010, 5:17:22 AM
to android-ndk
On Dec 22, 11:44 am, Glenn Kasten <gkas...@android.com> wrote:
> To clarify a point in the CDD ... the difference between warm output
> latency and continuous output latency represents the maximum time that
> can be ascribed to user-mode buffers. The warm output latency represents
> the maximum time that can be ascribed to device driver and hardware
> buffers alone, that is when the user-mode buffers are empty.

Hi Glenn

Nice to hear from you here.

I'm not really sure I understand the buffering model that underlies
your statements above (or the CDD) -- is this documented somewhere? I'm
used to thinking of audio buffering as a ring buffer or double buffer
(some kind of shared buffer anyway), where the user is writing new data
into the buffer and the hardware is playing it back out of the other
end of the buffer (at least conceptually; perhaps there is a driver
thread copying it into a hardware buffer). Latency is usually
determined by how big the buffer needs to be to mask client and driver
scheduling jitter. I'm not sure where your concept of user-mode
buffers fits into this picture -- are they some immutable concept
within the AudioFlinger architecture?

Thanks :-)

Ross.
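[Editor's illustration] The shared-buffer arrangement Ross describes (app writes into one end, driver drains the other) can be sketched as a minimal single-producer/single-consumer ring buffer; indices and counts only, no real audio hardware:

```java
// Minimal ring buffer of PCM samples: write() is the app side,
// read() models the driver draining samples toward the hardware.
public class RingBuffer {
    private final short[] data;
    private int readPos = 0, writePos = 0, filled = 0;

    public RingBuffer(int capacity) {
        data = new short[capacity];
    }

    // Copy in as many samples as fit; returns the number accepted.
    public int write(short[] src) {
        int n = 0;
        while (n < src.length && filled < data.length) {
            data[writePos] = src[n++];
            writePos = (writePos + 1) % data.length;
            filled++;
        }
        return n;
    }

    // Drain up to dst.length samples; returns the number read.
    public int read(short[] dst) {
        int n = 0;
        while (n < dst.length && filled > 0) {
            dst[n++] = data[readPos];
            readPos = (readPos + 1) % data.length;
            filled--;
        }
        return n;
    }

    public int available() {
        return filled;
    }
}
```

The capacity of this buffer is exactly the user-mode latency being debated: bigger masks more scheduling jitter, smaller means lower latency.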

niko20

Dec 22, 2010, 3:49:19 PM
to android-ndk
The hardware should just allow reading straight from a user buffer.
I'll take care of keeping it full. That would be the lowest latency.

niko20

Dec 22, 2010, 3:54:18 PM
to android-ndk
I mean, I don't think it's rocket science. Just turn on the audio
hardware when a play command comes in. It then just starts reading
from a user-mode buffer and outputs straight through the D/A converter
to the speaker. If we have direct access to the last buffer in the
layers, we should be able to get low latency.

niko20

Dec 22, 2010, 3:57:19 PM
to android-ndk

Don't mean to spam, but as an example: at 44kHz, if I ask for the
minimum audio buffer size I get 16384 samples. That's about 371ms. That value
seems to me like an intermediate buffer somewhere in AudioFlinger or something.
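[Editor's illustration] A quick conversion helper makes that figure checkable (note that whether AudioTrack's minimum is counted in bytes, mono samples, or stereo frames changes the answer, which may explain differing numbers in this thread):

```java
// Convert a buffer length in samples to milliseconds of audio at a
// given sample rate: ms = 1000 * samples / rate.
public class BufferLatency {
    public static double samplesToMs(int samples, int sampleRate) {
        return 1000.0 * samples / sampleRate;
    }
}
```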

Olivier Guilyardi

Dec 22, 2010, 6:02:08 PM
to andro...@googlegroups.com
On 12/22/2010 11:05 AM, Ross Bencina wrote:
> On Dec 22, 8:31 am, Olivier Guilyardi <l...@samalyse.com> wrote:
>> 10ms sounds ok to me. You can't expect a professional zero-latency DAW anyway.
>
> To me, if the current CDD said low-latency=10ms I would think "yeah
> that's ok, but it isn't really low latency."

Yeah, that's what I mean. A compromise, but written "10ms or less". Maybe then
some capable manufacturers would provide true low latency. I don't know
what's under OpenSL, so I can't tell.

With the Java API, latency is apparently caused by extra buffers and IPC in what
stands between the app and the hardware. I'm not sure but I think that, as it
relates to telephony, sound is a critical part of the platform. Whereas you get
a rather direct access to the GPU, the audio subsystem seems over-encapsulated.

Plus, last time I checked, patches were not accepted for the audio stack
(critical again). So, commenting on this CDD is about the only thing that we
can do currently.

Apart maybe from releasing an Android variant where all audio is handled by
JACK. That would be something :)

--
Olivier

Ross Bencina

Dec 23, 2010, 12:18:13 AM
to android-ndk
On Dec 23, 10:02 am, Olivier Guilyardi <l...@samalyse.com> wrote:
> With the Java API, latency is apparently caused by extra buffers and IPC in what
> stands between the app and the hardware.

I'm not sure native API audio reduces the IPC burden, although I am
yet to see an architecture diagram that explains how OpenSL is
supposed to be related to AudioFlinger. The most recent diagram I have
seen is this one, which is presumably pre-gingerbread:
http://www.kandroid.org/android_pdk/audio_sub_system.html

I don't know if there is a more official Google link for that (Glenn?)

If someone can post a link to an example of how to do simple low-
latency audio i/o with OpenSL that would be super helpful. Just an
audio pass-through or even playing a sawtooth wave like this year's
Google I/O "Advanced audio techniques" video would be enough.


>I'm not sure but I think that, as it
> relates to telephony, sound is a critical part of the platform.

I think for this reason Google has left this detail to the platform
partners (so platform partners make telephony work well, and
everything we can do from Java or the NDK sucks). Compare this to iOS,
where media playback is a core device priority.

>Plus, last time I checked, patches were not accepted for the audio stack
>(critical again). So, commenting this CDD is quite the only thing that we can do
>currently.

Yes I read this too. Glenn can you please update us on the audio stack
roadmap and how we can get involved?



>Whereas you get
> a rather direct access to the GPU, the audio subsystem seems over-encapsulated.

Wow, what a can of worms that statement is. I agree, but I think it's
a mistake to think sound APIs should be like graphics APIs. Audio
hardware acceleration does not have the same status as graphics
acceleration in the marketplace -- otherwise we'd all be programming
in standardised audio DSP languages (the equivalent of shader languages)
and getting the hardware to compile the code on the fly and execute it
efficiently. That would be great, but that's not what we have. So, we
mainly write audio code for the CPU (if there are audio coprocessors,
they are specialised to certain tasks like codec acceleration). Based
on this, an audio API should look more like a direct frame buffer, a
way to get sound out of the device with low latency. That's what we're
asking for -- it's not an obscure request, it's what audio programmers
need to do what they do, and what they have already on other
platforms. Hardware codec acceleration can be handled by a library
that sits on the side, not something that must sit between the app and
the hardware.

I don't think it's so much a problem of over-encapsulation as it is a
poor choice of abstractions and layering. Audio programmers are used
to the audio equivalent of direct frame buffer access (or a good HAL
abstraction of it) as this is generally required for low latency. It
just doesn't make sense to put too much extra stuff in between the
hardware and the user app unless it is required for a particular
application. Things like OpenSL seem to me to be middle-ware -- good
for putting something together quickly, and they provide an
abstraction for HW vendors to support hardware acceleration of codecs
or effects. But like most middle-ware, it is not something you can
easily standardise for all use cases. Each middle-ware might be good
for its target, but the platform needs a low-latency audio interface
to hardware so people can bypass middle-ware or use different middle-
ware.

One thing I've noticed, is that when application programmers start
learning how to do real-time audio programming on normal (non hard-
real-time) OSes there is a steep learning curve because they don't
understand what's required for real time code (no locks, no memory
allocation, no blocking apis, etc). I've been there, I've seen this on
many mailing lists on many platforms (ALSA, JACK, CoreAudio,
PortAudio, etc)... everyone goes through that stage. That isn't going
to change any time soon, because real time audio requires either a
real-time OS or code that avoids non-real-time features of a non-real-
time host OS. Perhaps it's kind of scary for google to offer an API
that actually allows proper real-time audio because it breaks with the
simplicity of what normal (non-audio) developers understand (cf
AudioTrack::write()). People like the idea of a read()/write() stream
interface for audio because it's easy to understand and elegant. But
in reality, a real-time audio data pump needs to do 2 things well: (1)
move the data to/from the hardware efficiently, (2) notify the user in
a timely manner to pump data by avoiding scheduling jitter. Read()/
write() interfaces don't directly answer requirement (2), at least on
non-real-time platforms they don't. The API implementation needs fine
grained control of the thread scheduling of the user proc that does
audio evaluation. Read()/write() interfaces give control of that to the
client and limit the mechanisms the API implementation can use. Oh,
and another problem with read()/write() is it isn't well suited to
full-duplex.

I haven't seen another low-latency API that uses read()/write() (rumor
has it that the Irix one worked well, but they had time-constrained
threads, I think), and I've seen people argue for read()/write()
interfaces because they simplify concurrency-related coding, but then
switch to async callbacks once they understand the issues. For Java,
where there are GC pauses and you need to buffer things anyway, I
think the read()/write() interface makes sense (unless the VM can
service async interrupts). For C++/NDK it makes no sense, and right
now, as I understand it, the audio flinger uses read()/write() at the
HardwareAudioInterface boundary so I'm hoping we get some clarity on
where that is heading and how it relates to OpenSL and real low-
latency.

Of course, you can sidestep all this if you defined "low-latency" as
45ms ;)

Sorry for another long post but I think Android is important enough
for this not to get f**kd up yet again...

Cheers

Ross.




Ross Bencina

Dec 23, 2010, 12:22:42 AM
to android-ndk
On Dec 23, 7:54 am, niko20 <nikolatesl...@yahoo.com> wrote:
> I mean I don't think its rocketscience . Just turn on the audio
> hardware when a play command comes in. It then just starts reading
> from a user mode buffer and outputting straight to the d/a converter
> to the speaker. If we have access directly to the last buffer in the
> layers we should be able to get low latency

I think we need to be careful about what we're talking about here.
You're talking about exclusive direct access to the audio hardware.
That's a great way to get the lowest latency, but you lose services that
a phone needs to provide (like ringing when a call comes in).
Generally OSes these days have a mixer layer that sits between apps
and the hardware. This can still be done with low latency.

I'm not asking for direct hardware access. I'm just asking for a low
latency raw audio buffer api.

Ross Bencina

Dec 23, 2010, 12:24:23 AM
to android-ndk
On Dec 23, 7:49 am, niko20 <nikolatesl...@yahoo.com> wrote:
> The hardware should just allow reading straight from a user buffer.
> Ill take care of keeping it full. The would be the lowest latency.

You probably want the OS to take care of scheduling your feeder thread
too (i.e. an asynchronous callback interface for buffer fill/process),
otherwise you're going to have portability issues trying to manage the
scheduling yourself.

Robert Green

Dec 23, 2010, 4:29:52 AM
to android-ndk
So far my favorite style of API is one in which you implement a
callback and tell the OS to start pulling audio and how much to pull
each time (maybe 2048 bytes). Then the OS runs its own thread,
calling your "fillBuffer" method for chunks of 2048 bytes.

It's a super easy way to make for a reliable system in which you don't
depend on the 3rd party app to push any data, you just pull from it.

After that it's got the bytes from your app and any other system
services so after a quick additive mix/resample/interleave, it's
straight to /dev/pcm and we're done.
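[Editor's illustration] A minimal model of that pull-style API might look like this (AudioPump and AudioCallback are invented names; a real implementation would run the pull loop on a dedicated, high-priority audio thread rather than synchronously):

```java
// Pull-model sketch: the "OS" owns the loop and repeatedly asks the
// app's callback to fill fixed-size chunks, instead of the app pushing.
public class AudioPump {
    public interface AudioCallback {
        // Must fill the whole buffer; called from the audio thread.
        void fillBuffer(short[] buffer);
    }

    private final AudioCallback callback;
    private final int chunkSize;

    public AudioPump(AudioCallback callback, int chunkSize) {
        this.callback = callback;
        this.chunkSize = chunkSize;
    }

    // Pull `chunks` buffers from the app, as the audio thread would,
    // and return everything that would have gone to the mixer.
    public short[] pull(int chunks) {
        short[] out = new short[chunks * chunkSize];
        short[] buf = new short[chunkSize];
        for (int i = 0; i < chunks; i++) {
            callback.fillBuffer(buf);
            System.arraycopy(buf, 0, out, i * chunkSize, chunkSize);
        }
        return out;
    }
}
```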

Glenn Kasten

Dec 23, 2010, 10:45:04 PM
to android-ndk
The lowest level APIs in Android native audio are based on the "buffer
queue" concept from OpenSL ES, which itself was influenced by other
audio APIs. The documentation for Android native audio, in android-ndk-
r5/docs/opensles/index.html, discusses audio output and input via PCM
buffer queues. For an example of low-level PCM audio output, please
see the buffer queue parts of the sample code at android-ndk-r5/
samples/native-audio. In addition to the Android-specific
documentation and sample code, you'll also want to read about buffer
queues in the Khronos Group OpenSL ES 1.0.1 specification.
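For readers skimming the thread, the buffer-queue pattern in that sample condenses to roughly the following sketch. Engine, output-mix, and player creation/realization are omitted for brevity, and `render_audio` is a hypothetical app-supplied function (in real-time-safe code it would pull from a lock-free FIFO rather than compute in place):

```c
#include <SLES/OpenSLES.h>
#include <SLES/OpenSLES_Android.h>

#define NUM_BUFFERS 2
#define FRAMES_PER_BUFFER 512

static short pcm[NUM_BUFFERS][FRAMES_PER_BUFFER];
static int next_buf = 0;

/* Hypothetical app function that renders the next block of samples. */
extern void render_audio(short *out, int frames);

/* Runs on an internal system thread each time a buffer is consumed.
 * Refill the just-played buffer and re-enqueue it; do no blocking
 * work here. */
static void bq_callback(SLAndroidSimpleBufferQueueItf bq, void *context) {
    render_audio(pcm[next_buf], FRAMES_PER_BUFFER);
    (*bq)->Enqueue(bq, pcm[next_buf], sizeof(pcm[next_buf]));
    next_buf = (next_buf + 1) % NUM_BUFFERS;
}

/* Assumes the audio player object has already been created and
 * realized with an Android simple buffer queue data source. */
void start_playback(SLObjectItf player, SLPlayItf play) {
    SLAndroidSimpleBufferQueueItf bq;
    (*player)->GetInterface(player, SL_IID_ANDROIDSIMPLEBUFFERQUEUE, &bq);
    (*bq)->RegisterCallback(bq, bq_callback, NULL);
    render_audio(pcm[0], FRAMES_PER_BUFFER);
    (*bq)->Enqueue(bq, pcm[0], sizeof(pcm[0]));  /* prime the queue */
    next_buf = 1;
    (*play)->SetPlayState(play, SL_PLAYSTATE_PLAYING);
}
```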

As someone else noted, audio output via PCM buffer queues is routed
through the system mixer aka "AudioFlinger" prior to output to the
device driver layers, but it does not pass through a decode layer.

This group android-ndk is primarily for application developers using
the Android NDK. However, since there is a lot of interest here in
improving the platform, I'll make a few comments here on that
topic ...

Regarding "patches not accepted", I'm not aware of any previous
policy, so I can't comment on that.

However, to quote a well-worn phrase, the current state is "quality
patches welcomed :-)". Generally "quality" means good code that is
easy to review, well tested, won't break the platform portability or
compatibility, etc. I suggest that contributors read
http://source.android.com/source/index.html to become familiar with
building the platform and the patch process. Also see
http://source.android.com/community/ especially "Open Source Project
discussions". Then find a group (e.g. android-contrib), post your
proposal there and review it with other contributors before investing
a large amount of time. When it makes sense, submit a smallish patch
to start. There's no guarantee that a given patch will be accepted,
but if you have reviewed your proposal early, gotten positive
feedback, and then submit a series of small high quality patches, it
is more likely that they can be accepted.

I can't comment on the audio stack roadmap, but if you post a proposal
early on android-contrib, someone should alert you if the proposal is
in an area that is likely to conflict with other work.

I encourage contributors who want to make Android low-level audio
better to make improvements that use or extend the OpenSL ES based
APIs, and the parts of the stack below those APIs (i.e. AudioFlinger).
We're unlikely to accept a contribution that introduces yet another
native audio API.

Angus Lees

unread,
Dec 23, 2010, 11:14:32 PM12/23/10
to andro...@googlegroups.com
Glenn, any comment on reducing the CDD requirement for declaring a
device as "low latency" from the current 45ms to (strawman) 8ms?
Since we don't have any such devices on the market yet, now seems the
right time to address this particular aspect of the problem.

Otherwise we are designing in a need for a rather silly
"android.hardware.audio.actually_low_latency" so app authors can
differentiate genuinely low latency audio devices from these 45ms
ones.

How was the current 45ms threshold chosen?
Choosing the limit around some human perception threshold rather than
comparison to competing platforms seems the right thing to do - since
then the classes of features it enables will be constant over time,
regardless of hardware improvements.

- Gus

> --
> You received this message because you are subscribed to the Google Groups "android-ndk" group.
> To post to this group, send email to andro...@googlegroups.com.

> To unsubscribe from this group, send email to android-ndk...@googlegroups.com.

Ross Bencina

unread,
Dec 24, 2010, 6:11:19 AM12/24/10
to android-ndk
To follow up on my questions:

After some more digging I discovered that the OpenSL ES Buffer Queue
abstraction provides a callback notification interface.

An example of this is in the latest NDK at:
android-ndk-r5\samples\native-audio\jni\native-audio-jni.c
A lot of the questions I raised above are answered in the R5 NDK docs
at
android-ndk-r5/docs/opensles/index.html

There's an online copy of that page you can read at the link below
(google: that would be a useful doc to have online somewhere :-) :
http://mobilepearls.com/labs/native-android-api/opensles/index.html

As I think the CDD mentions too, it says:
>>>>
As OpenSL ES is a native C API, non-Dalvik application threads which
call OpenSL ES have no Dalvik-related overhead such as garbage
collection pauses. However, there is no additional performance benefit
to the use of OpenSL ES other than this. In particular, use of OpenSL
ES does not result in lower audio latency, higher scheduling priority,
etc. than what the platform generally provides. On the other hand, as
the Android platform and specific device implementations continue to
evolve, an OpenSL ES application can expect to benefit from any future
system performance improvements.
<<<<

So I guess the CDD latency guidelines Glenn posted earlier are to be
interpreted in the context of OpenSL buffer queues. Correct me if I'm
wrong (since I'm clutching at straws trying to understand what the CDD
latency numbers actually mean), but if the warm-start latency is 10ms,
and the continuous latency is 45ms, that implies to me that there is
an allowance for 35ms of scheduling jitter in the OpenSL buffer queue
notification callbacks. I don't know much about the Android kernel,
but on iOS, scheduling jitter for a real-time thread is around 200us,
so 35ms seems pretty high. Can someone who understands this better
explain what that 35ms covers please?

Glenn, can I suggest that the CDD reference concrete test cases in
both Java and the NDK that are supposed to exhibit the required
latency, that way we can understand exactly what is being guaranteed.
I am a little unclear whether the 45ms continuous figure includes a
margin for masking Dalvik GC pauses. It would make sense to have
separate requirements/constraints on the platform (Dalvik) and the C-
level and NDK capabilities.

In the end, a latency guarantee will require the vendor to make sure
that no driver will cause audio related native threads (in kernel or
userspace) to miss a deadline, that intermediate buffering can be
dimensioned to accommodate low latency, and that all inter-thread and
inter-process communication mechanisms are non-blocking (lock-free
FIFOS or similar mechanisms).

I am alarmed to see the following in the above cited document:
>>>>
But this point is asynchronous with respect to the application. Thus
you should use a mutex or other synchronization mechanism to control
access to any variables shared between the application and the
callback handler. In the example code, such as for buffer queues, we
have omitted this synchronization in the interest of simplicity.
However, proper mutual exclusion would be critical for any production
code.
<<<<

Employing mutexes will almost certainly cause priority inversion at
some stage and glitch audio rendering. The usual (and simplest) safe
technique for communicating with asynchronous audio callbacks is a
lock-free FIFO command queue. Try-locks may also be an option in some
cases. I've already mentioned the learning curve involved in writing
real-time audio software on non-real-time OSes.. clearly whoever wrote
that document hasn't traversed it.
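For what it's worth, the core of such a lock-free FIFO is small. A sketch in modern C11 (single producer, single consumer, power-of-two capacity; all names are illustrative):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FIFO_CAPACITY 1024   /* must be a power of two */

/* Single-producer single-consumer lock-free FIFO: the UI thread writes,
 * the audio callback reads, and neither side ever blocks or allocates. */
typedef struct {
    float buffer[FIFO_CAPACITY];
    atomic_size_t write_pos;  /* advanced only by the producer */
    atomic_size_t read_pos;   /* advanced only by the consumer */
} spsc_fifo;

void fifo_init(spsc_fifo *f) {
    atomic_init(&f->write_pos, 0);
    atomic_init(&f->read_pos, 0);
}

/* Producer side: returns the number of samples actually written. */
size_t fifo_write(spsc_fifo *f, const float *src, size_t n) {
    size_t w = atomic_load_explicit(&f->write_pos, memory_order_relaxed);
    size_t r = atomic_load_explicit(&f->read_pos, memory_order_acquire);
    size_t free_space = FIFO_CAPACITY - (w - r);
    if (n > free_space) n = free_space;
    for (size_t i = 0; i < n; i++)
        f->buffer[(w + i) & (FIFO_CAPACITY - 1)] = src[i];
    /* Release: publish the data before advancing the write index. */
    atomic_store_explicit(&f->write_pos, w + n, memory_order_release);
    return n;
}

/* Consumer side (audio callback): never blocks; returns samples read. */
size_t fifo_read(spsc_fifo *f, float *dst, size_t n) {
    size_t r = atomic_load_explicit(&f->read_pos, memory_order_relaxed);
    size_t w = atomic_load_explicit(&f->write_pos, memory_order_acquire);
    size_t avail = w - r;
    if (n > avail) n = avail;
    for (size_t i = 0; i < n; i++)
        dst[i] = f->buffer[(r + i) & (FIFO_CAPACITY - 1)];
    atomic_store_explicit(&f->read_pos, r + n, memory_order_release);
    return n;
}
```

The point is that the callback side does bounded work with no syscalls, so no mutex is ever taken on the audio thread and priority inversion cannot occur.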

Thanks

Ross.

Olivier Guilyardi

unread,
Dec 24, 2010, 6:53:31 AM12/24/10
to andro...@googlegroups.com
On 12/23/2010 06:18 AM, Ross Bencina wrote:
> On Dec 23, 10:02 am, Olivier Guilyardi <l...@samalyse.com> wrote:
>
>> I'm not sure but I think that, as it
>> relates to telephony, sound is a critical part of the platform.
>
> I think for this reason google have left this detail to the platform
> partners (so platform partners make telephony work well, and
> everything we can do from Java or NDK sucks). Compare this to iOS
> where media playback is a core device priority.

Agreed. And aside from that, Apple has been making critical professional audio
hardware and software for decades. They clearly have the skills.

And there are such skilled people amongst free audio software developers, but
for some reason Android doesn't benefit from it.

> I don't think it's so much a problem of over-encapsulation as it is a
> poor choice of abstractions and layering.

I agree.

> One thing I've noticed, is that when application programmers start
> learning how to do real-time audio programming on normal (non hard-
> real-time) OSes there is a steep learning curve because they don't
> understand what's required for real time code (no locks, no memory
> allocation, no blocking apis, etc). I've been there, I've seen this on
> many mailing lists on many platforms (ALSA, JACK, CoreAudio,
> PortAudio, etc)... everyone goes through that stage.

That's correct. I once submitted a JACK-related patch to the FFmpeg project. It
was accepted, but the FFmpeg devs, although really good at what they do, had
quite a lot of trouble understanding the requirements and semantics of realtime
audio, no memory allocations, lock-free ringbuffers, etc...

And what's confusing is that they are audio codecs experts, but that doesn't
make them application-level realtime audio devs.

And I'm afraid that such skills seem to be missing in the Android teams.

> Of course, you can sidestep all this if you defined "low-latency" as
> 45ms ;)

Reading Glenn's answer about the difference between "warm" and "continuous"
latency, it seems that 35ms of these 45ms come from software layers, flinger and
the like.

> Sorry for another long post but I think Android is important enough
> for this not to get f**kd up yet again...

I agree, it's really time to improve this poor audio situation, but the more
time passes, the more I think there are critical design flaws in the OS.

Also, I looked at the OpenSL API and I clearly don't understand why so much work
is being put into bells and whistles such as reverb, whereas the bare
minimum, reliable low-latency PCM input/output, is not provided.

I'm sorry if I'm a bit harsh, but I've been working with Android audio APIs for
over a year now, and I feel like telling the truth.

That said, happy holidays to everyone!

--
Olivier

Olivier Guilyardi

unread,
Dec 24, 2010, 7:00:53 AM12/24/10
to andro...@googlegroups.com
On 12/24/2010 12:11 PM, Ross Bencina wrote:

> I am alarmed to see the following in the above cited document:

So am I!

> But this point is asynchronous with respect to the application. Thus
> you should use a mutex or other synchronization mechanism to control
> access to any variables shared between the application and the
> callback handler. In the example code, such as for buffer queues, we
> have omitted this synchronization in the interest of simplicity.
> However, proper mutual exclusion would be critical for any production
> code.
> <<<<

Yeah, of course, locking a mutex in an audio process callback. This is a newbie
audio development mistake.

> Employing mutexes will almost certainly cause priority inversion at
> some stage and glitch audio rendering. The usual (and simplest) safe
> technique to communicate with asynchronous audio callbacks are lock-
> free fifo command queues. Try-locks are may also be an option in some
> cases. I've already mentioned the learning curve involved in writing
> real-time audio software on non-real time OSes.. clearly whoever wrote
> that document hasn't traversed it.

I completely agree. This situation is not professional, it's 100% amateurism.

--
Olivier

Olivier Guilyardi

unread,
Dec 24, 2010, 7:07:33 AM12/24/10
to andro...@googlegroups.com

Okay, I'm a bit harsh here, sorry. But really, all of this isn't serious.

--
Olivier

Olivier Guilyardi

unread,
Dec 24, 2010, 8:22:10 AM12/24/10
to andro...@googlegroups.com
On 12/24/2010 04:45 AM, Glenn Kasten wrote:

> Regarding "patches not accepted", I'm not aware of any previous
> policy, so I can't comment on that.

Here we go:
http://groups.google.com/group/android-ndk/msg/9f6e46b39f4f1fae

> However, to quote a well-worn phrase, the current state is "quality
> patches welcomed :-)". Generally "quality" means good code that is
> easy to review, well tested, won't break the platform portability or
> compatibility, etc.

Thanks for the update. That's interesting. I know we're not on android-contrib,
but is the audio stack git head up-to-date?

--
Olivier

Robert Green

unread,
Dec 25, 2010, 3:25:58 PM12/25/10
to android-ndk
Just to throw my 2 cents in here again.. I agree that bells and
whistles mean nothing without low latency audio. If there were only 1
priority of what to do with Android audio, it would be to guarantee
developers a low latency spot to dump PCM data. End of story!
Everything else can be developed on top of that, but without it, all
the effects in the world are mute because users will be experiencing
unsynchronized AV.

Also whoever posted that his audio was high latency because of Java
was incorrect IMO. It's not java's fault that the OEM put a 100ms min
buffer on the audio track for their device. My sound mixer's latency
is acceptable on a Droid and horribly bad on a Galaxy S and somewhere
in between on a Nexus One, showing that it's mostly the OEM's
implementations and decisions. In the current AudioTrack
implementation, all Java has to do is copy a byte array back to native
and that certainly doesn't take over 1ms, so I don't accept that as
the defining problem, though of course it would be better to keep it
native in the first place.

Anyway, point is. Latency matters more than anything IMO. With low
latency PCM input, I don't need an effect library, a mixer, a
soundfont player, or anything. I can use open source stuff, write my
own, or someone can add it later to the project. Without it, nothing
will sound right, no matter how hard we try! I'm not trying to do
anything particularly special with audio, either. I just want to
produce high quality games!

Robert Green

unread,
Dec 26, 2010, 6:11:13 AM12/26/10
to android-ndk
And when I said "mute," I meant moot. All the audio-talk was messing
with my vocab. :)

Olivier Guilyardi

unread,
Dec 26, 2010, 6:55:59 AM12/26/10
to andro...@googlegroups.com
Robert, I posted this link so that Glenn /learns/ about the "previous policy"
applied by his own team, regarding patches to the audio stack...

I didn't mean to focus on the Java vs native discussion. It's OT here IMO.

About bells and whistles in OpenSL, effect libraries, etc.. I do agree with you.
All of these application-level features are out of scope, when the current audio
API fails to provide reliable core I/O functionality.

I am sure that working with so many manufacturers is a big challenge, and in
this context all efforts should IMO focus on providing reliable audio
input/output, instead of bringing extra complexity and high-level features to
the audio stack.

Olivier

On 25/12/10 21:25, Robert Green wrote: