High Fidelity Spatial Audio

Mette Florida

Jul 25, 2024, 12:41:52 AM
to Accord.NET Framework

Today, thousands of industry-leading developers and audio engineers use spatial audio to create powerful and immersive experiences. The best part is that the use cases for spatial audio are virtually limitless. From videoconferencing to virtual events, spatial audio has the potential to change the way people consume audio content and leverage real-time voice technologies forever.

High Fidelity is one of the few spatial audio providers that gives developers and audio engineers full control over the entire audio environment. Thousands of industry-leading app development teams rely on us to deliver the following features and benefits:

High Fidelity can process hundreds of sound sources in real time, and in less than 15 minutes you can launch a simple web application that uses immersive audio.

Our Spatial Audio API allows you to integrate real-time voice communication for hundreds of users without any additional bandwidth requirements. Additionally, you get full developer control to easily manipulate positioning, room attenuation, and other aspects of the audio environment. Are you ready to get started?

What is spatial audio? And why is it better? For starters, conventional VoIP audio is like using walkie-talkies. Only one person can talk at a time, and it sounds like everything is coming from a single source because it's mono.

Spatial audio recreates the way we hear sound in real life: each source is rendered so it appears to come from a defined location in space. This makes each sound clearer and easier for the listener to understand.
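The difference from mono can be sketched with a toy positioning model. The function below places a mono signal at an azimuth using only an interaural time difference (ITD) and a level difference (ILD); this is an illustrative approximation, not what any particular spatial audio engine does, and the head-radius and attenuation constants are assumptions.

```python
import numpy as np

def pan_binaural(mono, fs, azimuth_deg, head_radius=0.0875, c=343.0):
    """Place a mono signal at an azimuth using a crude ITD/ILD model.

    Toy illustration only: real spatial audio renderers use measured
    HRTFs rather than this two-parameter approximation.
    """
    az = np.deg2rad(azimuth_deg)
    # Woodworth-style interaural time difference (seconds)
    itd = (head_radius / c) * (abs(az) + np.sin(abs(az)))
    delay = int(round(itd * fs))
    # Simple level difference: attenuate the far ear by up to ~6 dB
    ild = 10 ** (-abs(np.sin(az)) * 6 / 20)
    near = mono
    far = np.concatenate([np.zeros(delay), mono])[: len(mono)] * ild
    # Negative azimuth = source to the left, so the left ear is "near"
    left, right = (near, far) if azimuth_deg < 0 else (far, near)
    return np.stack([left, right])
```

Even this crude model makes the source audibly lateralized, whereas a mono mix gives every source the same zero ITD and ILD.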

Accurate rendering of 3D spatial audio for interactive virtual auditory displays requires the use of personalized head-related transfer functions (HRTFs). We present a new approach to compute personalized HRTFs for any individual based on combining state-of-the-art image-based 3D modeling with an efficient numerical simulation pipeline. Our 3D modeling framework enables capture of the listener's head and torso using consumer-grade digital cameras to estimate a high-resolution non-parametric surface representation of the head, including the extended vicinity of the listener's ear. We leverage sparse structure-from-motion and dense surface reconstruction techniques to generate a 3D mesh. This mesh is used as input to a numerical sound propagation solver which uses acoustic reciprocity along with a Kirchhoff surface integral representation to efficiently compute the personalized HRTF of an individual. The overall computation takes tens of minutes on a multi-core desktop machine. We have used our approach to compute personalized HRTFs of a few individuals and present a preliminary evaluation. To the best of our knowledge, this is the first commodity technique that can be used to compute personalized HRTFs in a lab or home setting.
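Whatever pipeline produces the personalized HRTF, applying it at render time reduces to convolving the source signal with the left- and right-ear head-related impulse responses (HRIRs). A minimal sketch, using placeholder impulse responses rather than measured or simulated data:

```python
import numpy as np

def render_hrir(mono, hrir_left, hrir_right):
    """Binauralize a mono signal by convolving it with a left/right
    head-related impulse response (HRIR) pair.

    The HRIRs would come from a measurement or a simulation such as
    the pipeline described above; the rendering step is generic.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

# Placeholder HRIRs (illustrative, not real data): the far ear gets a
# slightly delayed, attenuated impulse.
hrir_l = np.zeros(64)
hrir_l[0] = 1.0
hrir_r = np.zeros(64)
hrir_r[8] = 0.6
```

In practice the HRIR pair is selected (or interpolated) per source direction and the convolution is run block-wise with an FFT, but the operation is the same.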

Alok Meshram, Ravish Mehra, Hongsheng Yang, Enrique Dunn, Jan-Michael Frahm, and Dinesh Manocha.
P-HRTF: Efficient Personalized HRTF Computation for High-Fidelity Spatial Sound

Merging Technologies, with its Anubis interface and Pyramix DAW, provides an infrastructure that opens the door to uncompromising high class audio production. In combination with Dear Reality's spatializer and virtual monitoring plugins, this also applies to the forward-looking field of spatial audio.

Merging Technologies is one of the world's foremost manufacturers of high-resolution digital audio recording systems. The company develops professional audio hardware and software products acclaimed by some of the finest audio engineers worldwide.

Pyramix is a digital audio workstation used for decades by professional studios and engineers for music production, mastering, TV, and film post-production. Pyramix has been pushing the boundaries since the beginning of digital audio.

With the addition of Merging to the Sennheiser Group, we are now able to supply a full solution for spatial audio and everyday audio productions, from microphones for recording, mic-pres, processing, and software, all the way to headphones and monitors.

The high-fidelity six-speaker sound system consists of two pairs of dual force-canceling woofers and two tweeters. Enjoy a robust and high-quality audio experience, including spatial audio support for videos and songs with Dolby Atmos.

The enjoyment of reproduced sound and music is a prime pleasure for many, and the high-fidelity reproduction of binaural audio is integral to many applications in augmented and virtual reality. This thesis introduces a framework for binaural headphone auralization of sound systems, together with an in-depth analysis and proposed solutions to address sources of coloration within the signal chain.

The framework includes a novel method for binaural auralization of microphone array impulse responses. Employing a hybrid parametric approach, it utilizes causal multichannel Wiener filtering to synthesize the directional response of the ear, as described by head-related transfer functions (HRTFs), using the microphone array and a model of its acoustic properties. A time-domain polynomial matrix framework is employed for filter computations and direct and reflected sound is treated separately. Results demonstrate a small perceptual difference to reference measured binaural room impulse responses.

Additionally, the thesis addresses the impact of binaural measurement uncertainty and proposes a new measurement technique for HRTFs and headphone transfer functions (HpTFs). The method is based on a cardioid microphone array for open ear canal measurements. Results indicate that the method significantly reduces measurement uncertainty compared to omnidirectional measurements in the ear canal.

Moreover, a phase pre-processing method for HRTFs is introduced that reduces spatial phase variability of the HRTF set at high frequencies while retaining correct interaural coherence for diffuse sound. It is demonstrated that the HRTF phase pre-processing greatly reduces spectral coloration in headphone simulation of amplitude panning on virtual speakers. The method also improves performance in binaural rendering of microphone array recordings.

Finally, the thesis presents a comprehensive model for addressing coloration at the ear-signal level inherent in amplitude panning on speaker arrays. The analysis focuses on pairwise panning on symmetrical speaker setups and monaural correction filters are proposed that are robust to head movements around the sweet spot. The proposed filters are found to mitigate the phantom source elevation effect in stereophonic panning and enhance the perceived spectral similarity between discrete and panned sound sources, with effectiveness contingent on the speaker setup geometry.

Binaural room auralization involves Binaural Room Impulse Responses (BRIRs). Dynamic binaural synthesis (i.e., head-tracked presentation) requires BRIRs for multiple head poses. Artificial heads can be used to measure BRIRs, but BRIR modeling from microphone array room impulse responses (RIRs) is becoming popular since personalized BRIRs can be obtained for any head pose with low extra effort. We present a novel framework for estimating a binaural signal from microphone array signals, using causal Wiener filtering and polynomial matrix formalism. The formulation places no explicit constraints on the geometry of the microphone array and enables directional weighting of the estimation error. A microphone noise model is used for regularization and to balance filter performance and noise gain. A complete procedure for BRIR modeling from microphone array RIRs is also presented, employing the proposed Wiener filtering framework. An application example illustrates the modeling procedure using a 19-channel spherical microphone array. Direct and reflected sound segments are modeled separately. The modeled BRIRs are compared to measured BRIRs and are shown to be waveform-accurate up to at least 1.5 kHz. At higher frequencies, correct statistical properties of diffuse sound field components are aimed for. A listening test indicates small perceptual differences to measured BRIRs. The presented method facilitates fast BRIR data set acquisition for use in dynamic binaural synthesis and is a viable alternative to Ambisonics-based binaural room auralization.

Accurate binaural rendering requires accurate reproduction of binaural signals at the eardrum, which in turn requires adequate binaural technology. We propose a method to measure head-related & headphone transfer functions with a two-microphone array in the ear canal. By implementing a cardioid directional pattern, the forward and reverse propagating sound pressure components are measured separately, thus avoiding the influence of standing waves in the ear canal on the measurements. The method is useful in filter design for individualized binaural rendering that, compared with the blocked-canal method, does not assume acoustically 'open' headphones to be used. The method also mitigates the excessive sensitivity to microphone position of regular open-canal measurements. Validation measurements are conducted using a natural scale replica ear and a MEMS microphone array.
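The cardioid pattern at the heart of this method can be illustrated with the general delay-and-subtract principle behind first-order differential arrays: delaying one capsule's signal by the acoustic travel time across the spacing and subtracting it nulls waves arriving from the rear. The sketch below evaluates that response for a plane wave; the spacing, frequency, and lack of equalization are simplifying assumptions, and the paper's in-canal implementation adds calibration not shown here.

```python
import numpy as np

def cardioid_response(theta, spacing=0.01, f=1000.0, c=343.0):
    """Magnitude response of a delay-and-subtract (first-order
    differential) cardioid built from two omni capsules.

    theta: plane-wave arrival angle in radians (0 = frontal,
    i.e. the forward-propagating direction).
    """
    k = 2 * np.pi * f / c
    tau = spacing / c                 # internal delay = travel time
    p1 = 1.0                          # wave at the front capsule
    p2 = np.exp(-1j * k * spacing * np.cos(theta))  # at the rear one
    # Delay the rear capsule and subtract: rear arrivals cancel
    out = p1 - p2 * np.exp(-1j * 2 * np.pi * f * tau)
    return np.abs(out)
```

This is how the forward- and reverse-propagating pressure components in the ear canal can be separated: the same two capsules, with the delay applied to the other channel, yield the opposite-facing cardioid.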

Pre-processing of HRTF phase has proved useful to improve binaural rendering of order-limited spherical harmonics (SH) signals. The adjustment is typically applied at high frequencies and reduces magnitude errors for directional sound field components. This article proposes a practical method for HRTF phase pre-processing using linear phase above a cutoff frequency, and a direction-dependent phase offset to maintain correct diffuse-field interaural coherence. Two applications are discussed: filter design for binaural rendering of microphone array or SH signals, and reduced coloration in virtual source panning on virtual speaker setups. Factors influencing the perceptual transparency of the phase modification are evaluated subjectively and objectively.
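The core operation can be sketched in a few lines: above the cutoff, keep each HRTF bin's magnitude but replace its phase with a common linear phase. This is illustrative only; the published method additionally applies the direction-dependent phase offset for diffuse-field interaural coherence, which is omitted here.

```python
import numpy as np

def linear_phase_above_cutoff(H, fs, fc, delay_samples):
    """Replace the phase of a one-sided frequency response H (rfft
    bins, DC to Nyquist) with linear phase above the cutoff fc,
    keeping the magnitude unchanged.
    """
    n_bins = len(H)
    freqs = np.linspace(0.0, fs / 2, n_bins)
    # Linear phase corresponding to a pure delay of delay_samples
    lin_phase = -2 * np.pi * freqs * delay_samples / fs
    H_out = H.astype(complex).copy()
    hi = freqs >= fc
    H_out[hi] = np.abs(H[hi]) * np.exp(1j * lin_phase[hi])
    return H_out
```

Because all directions share the same high-frequency phase slope after processing, HRTFs for different directions sum coherently, which is what reduces comb-filter coloration in virtual speaker panning.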

This paper presents a comprehensive model for ear-signal level coloration in stereo amplitude panning, enabling the calculation of monaural correction filters that equalize the average coloration over a small area around the sweet spot. The model takes into account the speaker setup geometry, listener Head-Related Transfer Functions (HRTFs), the employed pan law, the direct-to-reflected sound ratio, and the correlation between the speaker signals at the listening position. Coloration in diffuse sound reproduction is also investigated. The coloration model is validated using binaural room impulse response measurements, and the correction filters are found to effectively reduce the difference in composite ear power spectrum between a discrete and a virtual center source. A listening test on the perceived spectral difference between these two cases, with stereo setups in front of and behind the listener, indicates that the correction filter improves timbral similarity between a virtual and discrete center source for rear speaker panning. The test also indicates that remaining unmodeled coloration sources are large, especially for front panning. However, a second listening test finds that the correction filter improves the accuracy of perceived direction in front panning by mitigating the phantom image elevation effect.
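For reference, the amplitude panning under study works as follows: a standard constant-power (sin/cos) pan law distributes a mono signal between two speakers. The sketch below shows the pan law only; the paper's coloration model and correction filters would be applied on top of it and are not reproduced here.

```python
import numpy as np

def constant_power_pan(x, pan):
    """Pan a mono signal between two speakers with a constant-power
    (sin/cos) pan law. pan in [-1, 1]; -1 = left speaker only.
    """
    theta = (pan + 1) * np.pi / 4        # map [-1, 1] -> [0, pi/2]
    gl, gr = np.cos(theta), np.sin(theta)
    # Constant power: gl**2 + gr**2 == 1 at every pan position, so the
    # summed acoustic power is pan-independent (unlike the ear-signal
    # spectrum, whose coloration the paper models and corrects).
    return np.stack([gl * x, gr * x]), (gl, gr)
```

The coloration arises because the two speaker signals arrive at each ear through different HRTFs and partially correlate, which a purely gain-based law cannot account for; that is the gap the proposed monaural correction filters address.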
