Contact emails
hong...@chromium.org, rt...@chromium.org
Spec
http://webaudio.github.io/web-audio-api/
http://webaudio.github.io/web-audio-api/#BaseAudioContext
http://webaudio.github.io/web-audio-api/#idl-def-AudioContextPlaybackCategory
Summary
Add an optional property bag argument to the AudioContext constructor to specify the playback category, allowing the developer to hint at the desired buffering/latency tradeoff.
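For illustration, a minimal sketch of the proposed usage; the member name playbackCategory and the 'playback' value are taken from the AudioContextPlaybackCategory enum linked above and may change as the spec evolves:

    // Hint that this context is for media playback, so the UA may choose
    // larger buffers (higher latency, lower power).
    var ctx = new AudioContext({ playbackCategory: 'playback' });

    // Omitting the property bag keeps today's behavior: lowest possible latency.
    var interactiveCtx = new AudioContext();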
Motivation
Currently, WebAudio will use the lowest latency possible for the audio device for the best interactive behavior. However, for some use-cases such as media playback, this causes unnecessary power and/or CPU utilization.
The playbackCategory is a hint from the developer that such low latency is not required, allowing the developer to trade off latency for power/CPU. Chrome will make the actual selection internally based on the category.
Also, currently WebAudio's low latency can interfere with WebRTC, causing glitches in some cases. The "balanced" category allows WebAudio to interoperate with WebRTC without introducing glitches.
Interoperability and Compatibility Risk
Compatibility risk is low because this is a backward-compatible change. Old applications will still get the lowest latency, as will new ones that do not specify a playback category.
Interoperability risk is moderate because the actual latency used is left up to the browser to determine. But this is true today, even without the playback category; the actual latency has never been specified.
Ongoing technical constraints
No technical constraints, but choosing the correct buffering for each category may need some tweaking for the various platforms.
Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?
Yes.
OWP launch tracking bug
Link to entry on the feature dashboard
https://www.chromestatus.com/features/5678699475107840
Requesting approval to ship?
Yes
> What will the 3 playbackCategory states actually map to, is it simply three different constants for the buffer length?
That is correct.
> How will "Balance audio output latency and stability/power consumption" workThis 'balanced' category is for the use case that does not require the lowest possible buffer size, but the reasonable audio latency for the realtime communication. (i.e. WebRTC) The actual latency can be varied between the browser vendors.> and what does "stability" refer to?I believe this is sort of a hand-wavy expression of 'no-glitch-in-the-audio-stream' so it means the larger buffer size, but I think we need to make some clarification on the spec.
On Fri, Dec 4, 2015 at 9:00 AM, Hongchan Choi <hong...@chromium.org> wrote:
>> What will the 3 playbackCategory states actually map to, is it simply three different constants for the buffer length?
> That is correct.

It's a bit more than that. It should cause the audio device (and hardware) to call back less often (but with a request for more data) so it can sleep longer when the category is not "interactive".

>> How will "Balance audio output latency and stability/power consumption" work?
> This 'balanced' category is for use cases that do not require the lowest possible buffer size, but a reasonable audio latency for real-time communication (i.e., WebRTC). The actual latency can vary between browser vendors.
>> and what does "stability" refer to?
> I believe this is sort of a hand-wavy expression of 'no-glitch-in-the-audio-stream', so it means a larger buffer size, but I think we need to make some clarification in the spec.

Yeah. Blame the editors for not catching that, because I don't really know what it is supposed to mean, but Hongchan is probably correct in his interpretation.
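To make "three different constants for the buffer length" concrete, here is a hypothetical sketch of what a UA might do internally; the function and parameter names and the multipliers are invented for illustration and are not Chrome's actual values:

    // Hypothetical UA-internal mapping; real buffer sizes are
    // platform-dependent and deliberately unspecified.
    function callbackBufferFrames(category, hardwareMinFrames) {
      switch (category) {
        case 'interactive':
          return hardwareMinFrames;       // lowest latency the device supports
        case 'balanced':
          return hardwareMinFrames * 4;   // big enough to coexist with WebRTC
        case 'playback':
          return hardwareMinFrames * 16;  // fewer device wakeups, lower power
        default:
          return hardwareMinFrames;
      }
    }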
On Fri, Dec 4, 2015 at 8:35 PM, 'Raymond Toy' via blink-dev <blin...@chromium.org> wrote:
> +cwilso, in case he's not reading blink-dev
>
> On Fri, Dec 4, 2015 at 11:12 AM, Philip Jägenstedt <phi...@opera.com> wrote:
>> So, you have just sent this Intent to Implement (and Ship), but is this really the API you would like to implement? It seems to me that an API that maps directly to how you will implement it would be better, i.e. an API where you say that it's OK to have x seconds of total delay, defaulting to zero, and you also have a way to see what the actual delay is going to be, as the UA may adjust it to fit in some min/max bounds. Picking three values of x to correspond to three vague labels with no way of telling what the resulting delay is doesn't sound like fun to work with.
>
> All good and valid comments.
>
> The actual discussion is here: https://github.com/WebAudio/web-audio-api/issues/348, especially comments starting at https://github.com/WebAudio/web-audio-api/issues/348#issuecomment-53919330
>
> In a nutshell, we originally proposed a float value to specify the actual buffer size, which the UA could adjust if needed to meet whatever internal requirements. This was frowned upon and people wanted something more descriptive and less precise.

Thanks! In that thread, Chris Wilson is making the argument I would. The only criticism that I find convincing is where Jer Noble (Apple) says "setting the buffer size to a high value could conceivably be counterproductive (to power consumption) on certain UAs." However, I think we can solve this:

If you say new AudioContext({ acceptableLatency: 10 }) or similar and 10 seconds would be detrimental for power consumption, then clamp it to the optimal latency for the platform, and have a new AudioContext.actualLatency or similar to report the clamped value. Would this match how you intend to implement the current API proposal under the hood?
As with imageSmoothingQuality, I'm quite skeptical of vague hints, and I think they're worth avoiding if at all possible. In this case it does seem possible, because internally "interactive" maps to the lowest reasonable latency and "playback" to the highest reasonable latency, so a UA could clamp to that same range with a more explicit API.

There are references to TPAC discussion in that thread, so if there's extra context I'm missing, please clue me in :)
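A sketch of the clamping behavior proposed above, using the hypothetical acceptableLatency/actualLatency names from the suggestion (neither is part of the current proposal):

    // Hypothetical API from the discussion above, not the shipped one.
    var ctx = new AudioContext({ acceptableLatency: 10 }); // seconds the page can tolerate
    // If 10 s of buffering would actually hurt power consumption, the UA
    // clamps to its per-platform optimum and reports the value it chose:
    console.log(ctx.actualLatency); // e.g. 0.35, not 10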
To be clear here, does "keep things as they are" mean we're going for the playback category approach?
Even so, the script is not equipped with enough information to make an informed decision. This API seems to suggest that "moar latency == moar efficiency", when that relationship is definitely not linear and may not even be true.
Here's what I bet would happen if this API were standardized: your average WebAudio-using page author would tune the latency value for his favorite browser and device, regardless of what effect that value has on the performance of other browsers and other devices.
Instead, with a more declarative API, each UA could pick a latency value that hits the local maximum for performance while meeting the general requirements for the selected "class" of playback.
Thank you, Raymond.

As I understand it, the audio going through the graph to the AudioDestination is processed somehow, and each node can introduce a delay as well. Does baseLatency include that time?
My impression is that this definition of baseLatency relies on the fact that WebAudio has some partial knowledge of what system rebuffering down the road looks like. As Hongchan said, the actual system can have multiple layers of rebuffering and other processing delays, and baseLatency represents only the rebuffering taking place at the edge of WebAudio and audio rendering logic. Is this understanding correct?
If the actual latency is (baseLatency + X), where X is unknown, how is it actually used to synchronize the visuals on the screen with the sounds?
So baseLatency is supposed to be the inherent delay (if any) between the time the graph has been rendered and the time it gets sent out to the browser, more or less.
> If the actual latency is (baseLatency + X), where X is unknown, how is it actually used to synchronize the visuals on the screen with the sounds?

This is where the output timestamp comes in. It produces an estimate of baseLatency + X and matches that with the performance.now() time. Knowing these two values, the user can adjust the timing of the visuals to the audio time. This is explained in great (and complicated) detail in WebAudio issue 12.
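A sketch of that synchronization, assuming the output timestamp API as it was later spec'd (AudioContext.getOutputTimestamp()); renderVisualsFor is a hypothetical page-side function:

    var ctx = new AudioContext();
    function draw() {
      // contextTime is the position in ctx.currentTime's timeline that is
      // being heard right now; performanceTime is the performance.now()
      // moment it corresponds to. Together they estimate baseLatency + X
      // without the page ever needing to know X.
      var ts = ctx.getOutputTimestamp();
      var audioTimeNow = ts.contextTime + (performance.now() - ts.performanceTime) / 1000;
      renderVisualsFor(audioTimeNow); // hypothetical drawing function
      requestAnimationFrame(draw);
    }
    requestAnimationFrame(draw);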
> So baseLatency is supposed to be the inherent delay (if any) between the time the graph has been rendered and the time it gets sent out to the browser, more or less.

How does our decision to move rebuffering out of the AudioDestination node into the browser rendering logic affect this statement? Will baseLatency always be 128, or what do we call baseLatency then?
>> If the actual latency is (baseLatency + X), where X is unknown, how is it actually used to synchronize the visuals on the screen with the sounds?

> This is where the output timestamp comes in. It produces an estimate of baseLatency + X and matches that with the performance.now() time. Knowing these two values, the user can adjust the timing of the visuals to the audio time. This is explained in great (and complicated) detail in WebAudio issue 12.

Thanks for the reference, it's a lot of stuff to process :)

Isn't an estimate of (baseLatency + X) the same as estimating just some Y, which does not require knowledge of baseLatency? Why would the user need to know baseLatency itself?
> Thanks for the reference, it's a lot of stuff to process :)

> Isn't an estimate of (baseLatency + X) the same as estimating just some Y, which does not require knowledge of baseLatency? Why would the user need to know baseLatency itself?

This is how the output timestamp is spec'd. It's just Y. But internally we would need to compute Y using baseLatency, I think.

If baseLatency is the interactive event delay, the user may want to know it.
We should probably ask Mozilla and Microsoft how they're interpreting baseLatency.
>> Thanks for the reference, it's a lot of stuff to process :)

>> Isn't an estimate of (baseLatency + X) the same as estimating just some Y, which does not require knowledge of baseLatency? Why would the user need to know baseLatency itself?

> This is how the output timestamp is spec'd. It's just Y. But internally we would need to compute Y using baseLatency, I think.

> If baseLatency is the interactive event delay, the user may want to know it.

If this value can't be used directly by the user, why would the user care about changes to it?

The output timestamp changes dynamically when baseLatency changes, right? Isn't that knowledge enough?
Olga
/Henrik
Time to revive this intent to implement and ship. (Should we continue on this thread or create a new one?)

The spec has been updated (a while ago) to include both a category enum and an explicit (double) value.
We (well, andrew.macpherson@soundtrap.com, really) have started to implement this.
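For reference, the updated spec expresses this as the latencyHint member of AudioContextOptions, which takes either a latency category or a double in seconds; a quick sketch:

    // Category form: the UA picks an appropriate latency for the use case.
    var playbackCtx = new AudioContext({ latencyHint: 'playback' });

    // Explicit form: a request in seconds, which the UA may clamp.
    var relaxedCtx = new AudioContext({ latencyHint: 0.25 });
    console.log(relaxedCtx.baseLatency); // one way to inspect what was chosen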
Just realized there's an issue with the spec on this. These are currently defined on BaseAudioContext, but they don't make sense for an OfflineAudioContext, so they should be defined only for an AudioContext.

Filed https://github.com/WebAudio/web-audio-api/issues/1097 for this.
On Wed, Nov 30, 2016 at 9:42 AM, Raymond Toy <rt...@google.com> wrote:
> Time to revive this intent to implement and ship. (Should we continue on this thread or create a new one?)
>
> The spec has been updated (a while ago) to include both a category enum and an explicit (double) value.
>
> We (well, andrew.m...@soundtrap.com, really) have started to implement this.