BYOAB (bring your own audio buffers) to the objc native api


chris hinkle

Jun 23, 2020, 6:32:31 PM
to discuss-webrtc
Hi all,

I'm looking to use the objc native api (I hope I'm using the correct terminology) to build an audio chat application targeting the iOS and macOS platforms (with future plans for Android, Windows, and the web as well). The WebRTC library (consumed as WebRTC.framework, built from source, for each platform) does an excellent job and provides a near "link and play" experience for API consumers to build a WebRTC application with minimal effort. However, the API hides all of the details of the audio pipeline. I've gathered from previous discussions in this group that the path forward for working with audio at lower levels is to customize the WebRTC library, specifically by implementing a custom "Audio Device Module."

I have a few questions: 

  1. Is my assumption correct about having to customize the library in order to read and write audio buffer data to and from the peer connection interface? Or did I miss some API somewhere? 
  2. If my assumption is correct, is there a recent example or tutorial which could help me?
  3. Could the API benefit from supporting this functionality natively, and would it be hard to do so? I'm happy to contribute, and would love some help. (One possible shape for such an API is sketched just below.)
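
To make question 3 concrete, here is one possible shape such an API could take. This is purely a hypothetical sketch: the protocol name, its methods, and the factory initializer below are invented for illustration and are not part of the current objc SDK. The idea is simply that the app vends and consumes PCM buffers, rather than the library owning the audio device.

import AVFoundation
// import WebRTC  // the real framework; the declarations below are NOT in it

// Hypothetical protocol: the app acts as the audio "device". WebRTC would
// pull captured audio from it and push decoded remote audio back to it.
protocol AppAudioDevice: AnyObject {
    // Called on a realtime thread; fill `buffer` with up to `frameCount`
    // frames of captured (or synthesized) audio.
    func provideCapturedAudio(into buffer: AVAudioPCMBuffer,
                              frameCount: AVAudioFrameCount)

    // Called on a realtime thread with decoded remote audio, ready to be
    // played, processed, or recorded by the app.
    func consumeRenderedAudio(_ buffer: AVAudioPCMBuffer)
}

// Hypothetical factory initializer accepting the custom device instead of
// constructing the built-in Audio Device Module internally:
//
//   let factory = RTCPeerConnectionFactory(encoderFactory: encoderFactory,
//                                          decoderFactory: decoderFactory,
//                                          audioDevice: myAppAudioDevice)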

It's useful to understand the use cases/justification for such functionality. Here are a few:

  1. Audio Session management (on iOS). The underlying implementation (audio_device_ios) relies on certain properties of AVAudioSession being set in concert with the initialization and setup of the various pieces of the audio device. Many API consumers probably have very simple requirements for AVAudioSession, but those looking to provide the most considered audio experience and to integrate with the latest platform capabilities will need to exploit all of the faculties of AVAudioSession, and keeping those details hidden may frustrate that effort. I am aware there is an interface for influencing the configuration of the AVAudioSession under the hood (see the RTCAudioSession sketch after this list), but allowing a way around it might relieve some of the pressure on the library maintainers to act as a liaison between this minority of API consumers and the underlying AVAudioSession configuration. Also, an API consumer might find it useful to influence SDP offer generation without affecting an AVAudioSession category change.
  2. Audio processing of captured and rendered audio: add a pitch shifter to the microphone before it's encoded and delivered, to liven up a boring conversation, or add some custom EQ to the rendered stream to correlate with sound effects emitted from the app.
  3. Alternative sources of audio: is captured audio always a microphone? A music app could send the output of a software synth or audio sampler. Sound effects could be added to a user's captured microphone stream.
  4. Recording of captured or received media: Podcasting is big these days and a popular technique is to record each end of a session (at full resolution, uncompressed) so that a conversation held remotely can be easily integrated into a broadcast-quality work. 
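
For use case 1, the objc SDK does ship an RTCAudioSession wrapper, and a minimal sketch of taking manual control of the session might look like the following. This assumes the RTCAudioSession API exposed by WebRTC.framework (useManualAudio, isAudioEnabled, lockForConfiguration/unlockForConfiguration); exact Swift spellings can vary between releases. It relieves some of the pressure, but everything still funnels through the wrapper rather than the app owning AVAudioSession outright.

import AVFoundation
import WebRTC

// Tell WebRTC not to configure or activate AVAudioSession on its own.
let session = RTCAudioSession.sharedInstance()
session.useManualAudio = true
session.isAudioEnabled = false

// Configure the session ourselves, under RTCAudioSession's lock so the
// library's view of the session state stays consistent.
session.lockForConfiguration()
do {
    try session.setCategory(AVAudioSession.Category.playAndRecord.rawValue,
                            with: [.allowBluetooth, .defaultToSpeaker])
    try session.setMode(AVAudioSession.Mode.voiceChat.rawValue)
    try session.setActive(true)
} catch {
    print("Audio session configuration failed: \(error)")
}
session.unlockForConfiguration()

// Once the call is set up, let WebRTC start exchanging audio.
session.isAudioEnabled = true
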
Audio support on macOS and iOS has a long history, but the current state of affairs is pretty great. AVAudioEngine is supported on both platforms, complete with the Voice Processing audio unit (AEC, AGC), and as of iOS 13 and macOS 10.15 it supports realtime audio I/O. All we need is a place to direct the bytes from one high-level API (e.g. AVAudioEngine) to another high-level API (e.g. WebRTC) via platform-agnostic, realtime-capable means (e.g. PCM data).
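
To illustrate that last point, here is a minimal AVAudioEngine sketch of what "directing the bytes" could look like on the app side, touching use cases 2 and 3 along the way. The sendToWebRTC and pullFromWebRTC functions are hypothetical stand-ins for whatever "bring your own buffers" entry point the library might expose; nothing like them exists in the objc API today.

import AVFoundation

// Hypothetical hand-off points to and from WebRTC; stubs standing in for an
// API that does not currently exist in the objc SDK.
func sendToWebRTC(_ buffer: AVAudioPCMBuffer) {
    // Hand captured PCM to the peer connection.
}
func pullFromWebRTC(into abl: UnsafeMutablePointer<AudioBufferList>,
                    frameCount: AVAudioFrameCount) {
    // Fill with decoded remote audio (left silent in this stub).
}

let engine = AVAudioEngine()

// Apple's voice-processing unit (AEC/AGC), available on the I/O nodes since
// iOS 13 / macOS 10.15.
try? engine.inputNode.setVoiceProcessingEnabled(true)

// Capture side: tap the microphone and hand PCM buffers to WebRTC. A
// processing node (AVAudioUnitTimePitch, AVAudioUnitEQ) or a software synth
// could be inserted ahead of the tap for use cases 2 and 3.
let input = engine.inputNode
input.installTap(onBus: 0, bufferSize: 960,
                 format: input.outputFormat(forBus: 0)) { buffer, _ in
    sendToWebRTC(buffer)
}

// Render side: a source node pulls decoded remote audio from WebRTC and
// plays it through the engine, where the app can EQ or record it.
let remoteSource = AVAudioSourceNode { _, _, frameCount, audioBufferList in
    pullFromWebRTC(into: audioBufferList, frameCount: frameCount)
    return noErr
}
engine.attach(remoteSource)
engine.connect(remoteSource, to: engine.mainMixerNode, format: nil)

try? engine.start()
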

Thanks! 

Byoungchan Lee

Jun 24, 2020, 2:09:01 AM
to discuss-webrtc
Implementing an ADM is a nontrivial task, so I hope interfaces mimicking WebAudio can be added. Then users of WebRTC could easily customize audio inputs and outputs.

Jacob Sologub

Jan 7, 2023, 10:01:32 AM
to discuss-webrtc
Hi Chris,

Did you ever figure out how to do this? I'm also looking for a way to feed my own audio buffers to a peer connection.

Thanks,
js

guest271314

Jan 7, 2023, 12:27:14 PM
to discuss-webrtc
  Alternative sources of audio: Is captured audio always a microphone? A music app could send the output of a software synth or audio sampler. Sound effects could be added to a user's captured microphone stream.
On Linux, it is possible to programmatically set the default capture device to something other than a microphone by remapping a device so it is recognized as a microphone. See https://github.com/edisionnano/Screenshare-with-audio-on-Discord-with-Linux and https://github.com/guest271314/captureSystemAudio#pulseaudio-module-remap-source. This is how I do that on Linux:

pactl load-module module-remap-source \
  master=@DEFAULT_MONITOR@ \
  source_name=speakers source_properties=device.description=Speakers \
&& pactl set-default-source speakers
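
If you need to revert it later, pactl unload-module module-remap-source (or unloading by the module index that load-module printed) should remove the remap again.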

I don't have experience with this on iOS and macOS; however, the requirement does appear to be achievable. See https://www.buildtoconnect.com/help/how-to-record-system-audio and https://github.com/WebAudio/web-audio-api/issues/2478#issuecomment-1052712946.
