[LAP24] Question about Task 2 threshold

Yoshiki Masuyama

May 31, 2024, 10:44:57 AM
to SONICOM LAP Challenge
Hi all,

I would like to clarify how the threshold for Task 2 is determined.

The description says that it is based on the performance of the baseline methods for upsampling from 5 or 80 observations. Fig. 1 indicates that HRTFs were upsampled to 828 positions, whereas the SONICOM spatial grid contains 793 positions. What accounts for this difference in the number of positions?

As a second question, did the baseline methods pass the threshold even in the sparsest case, i.e., with only 3 measurements?

Kind regards,
Yoshiki Masuyama

Aidan Hogg

Jun 4, 2024, 10:01:25 AM
to SONICOM LAP Challenge

Hi Yoshiki,


Sorry for the delay. 


First, regarding the discrepancy in the number of positions in the SONICOM HRTF dataset: it arises from the repetition of the top measurement at the 90-degree elevation. The measurement rig uses a loudspeaker arch, and the participant is rotated and measured every 5 degrees, so the 90-degree elevation position is measured repeatedly (36 times in total). Initially, the SONICOM HRTF dataset retained the 35 redundant measurements, and the original upsampling paper referred to those HRTFs, which had 828 transfer functions. The redundant positions have since been removed, leaving only the measurements from the 793 unique positions.
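As a rough sanity check on these counts, here is a minimal sketch. The grid in it is made up purely for illustration and is not the real SONICOM measurement grid; only the numbers 828, 36 and 793 come from the explanation above, and `deduplicate_top` is a hypothetical helper name.

```python
# Hypothetical illustration: collapse repeated measurements at the top of the sphere.
# Only the counts (828 raw, 36 repeats at 90 degrees elevation, 793 unique)
# come from the thread; the positions themselves are invented.

def deduplicate_top(positions, top_elevation=90.0):
    """Keep each position once, treating every azimuth at the top elevation as one point."""
    seen = set()
    unique = []
    for az, el in positions:
        # At the pole, azimuth is meaningless, so map all azimuths to 0.
        key = (0.0, el) if el == top_elevation else (az, el)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

# 792 invented off-pole positions, plus the pole measured 36 times.
off_pole = [(float(i % 72) * 5.0, float(i // 72)) for i in range(792)]
pole = [(float(k) * 10.0, 90.0) for k in range(36)]
raw = off_pole + pole

unique = deduplicate_top(raw)
print(len(raw), len(unique), len(raw) - len(unique))
```

The difference between the two lengths is the number of redundant pole measurements that were dropped.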


Second, regarding the thresholds: they were based on the observation that an HRTF selection method achieves an LSD of around 7 to 8 with no personalisation. Therefore, we believe the thresholds are quite generous, and if they are not met, the HRTFs will probably not be very realistic.


The challenge is meant to encourage methods that can deal with sparse measurements, such as ML techniques, rather than the baselines, which will fail in this scenario. So, the baselines' performance was not used to calculate the threshold values, and in fact, the baselines do not pass the thresholds.

For transparency, below is the performance of the two baselines in terms of their mean LSD:


Barycentric baseline:


100 positions: 3.67

19 positions: 4.85

5 positions: 7.69

3 positions: 8.37


SHT baseline (a vanilla approach with no pre-processing):


100 positions: 4.19

19 positions: 5.39

5 positions: 16.83

3 positions: 16.46


Note that both baselines fail to meet the LSD threshold of 7.4 at the sparsest levels (5 and 3 positions). This is not that surprising, as they can only perform a weighted sum of the existing points without any prior knowledge.
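The comparison can be reproduced with a few lines of Python. This is only a sketch using the mean LSD values quoted in this message; it assumes that an entry fails when its mean LSD is strictly above the threshold, and the names `mean_lsd` and `failing_levels` are illustrative rather than taken from the challenge evaluation code.

```python
# Mean LSD values as quoted in this message (number of positions -> mean LSD).
LSD_THRESHOLD = 7.4

mean_lsd = {
    "barycentric": {100: 3.67, 19: 4.85, 5: 7.69, 3: 8.37},
    "sht": {100: 4.19, 19: 5.39, 5: 16.83, 3: 16.46},
}

def failing_levels(results, threshold):
    """Return, per baseline, the sparsity levels whose mean LSD exceeds the threshold.

    Assumption: 'fail' means strictly greater than the threshold.
    """
    return {
        name: sorted(n for n, lsd in per_level.items() if lsd > threshold)
        for name, per_level in results.items()
    }

failures = failing_levels(mean_lsd, LSD_THRESHOLD)
for name, levels in failures.items():
    print(f"{name}: fails at {levels} positions")
```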


Please also note a couple of things: 

  1. Although there will be an overall winner, we will also mention the 'winner' for each sparsity level. 

  2. You are not limited to upsampling with a single algorithm. We are asking (only) for 12 SOFA files. So, you can use different methods depending on the sparsity level.

 

I hope this helps to clarify any confusion. Let me know if you have any more questions.


Cheers,

Aidan, along with the rest of the LAP team


---------------------------------------------------------

Dr Aidan Hogg

Lecturer at Queen Mary University of London

Honorary Research Associate at Imperial College London


Centre for Digital Music 

Electronic Engineering and Computer Science
Queen Mary University of London
327 Mile End Road, London
E1 4NS, U.K

Email: a.h...@qmul.ac.uk

Personal Website: aidanhogg.uk

QMUL Group Website: c4dm.eecs.qmul.ac.uk
Imperial Group Website: axdesign.co.uk
Imperial Website: imperial.ac.uk/people/a.hogg

Yoshiki Masuyama

Jun 4, 2024, 10:19:02 AM
to Aidan Hogg, SONICOM LAP Challenge
Hi Aidan,

Thank you so much for your clarification. I now understand the meaning of 828 in the plots.

Also, I'm happy to see the detailed LSD values of the baselines.
Could you also share similar numbers for ITD? The ITD threshold is also challenging, as 31.6 μs is less than two samples at a 48 kHz sampling rate.
I may be misunderstanding something. Please feel free to correct me.

Kind regards,
Yoshiki Masuyama

On Tue, Jun 4, 2024 at 23:01, 'Aidan Hogg' via SONICOM LAP Challenge <sonicom-la...@googlegroups.com> wrote:
--
The IEEE Signal Processing Society is sponsoring the Listener Acoustic Personalisation Challenge.

Aidan Hogg

Jun 5, 2024, 7:03:11 AM
to SONICOM LAP Challenge

Hi Yoshiki,


We agree with you that 31.6 μs is very strict (probably too strict), and the vanilla baseline approaches often fail to pass it. In light of this, we have decided to increase the ITD threshold to 62.5 μs.


This is mainly because it does seem unfair to disqualify entries only because the ITD is two samples out. Also, we will still be ranking the ITD performance.


The new threshold value of 62.5 μs (3 samples at 48 kHz) has been chosen because the maximum ITD is approximately 660 μs when a sound source is placed at 90° azimuth, and the normal human detection threshold for an ITD is around 10 μs. Thus, the new threshold is around 10% of the average maximum ITD value instead of around 5%.
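The arithmetic behind these figures can be checked with a small sketch. It assumes the 48 kHz sampling rate mentioned earlier in the thread and the approximate 660 μs maximum ITD quoted above; the function and constant names are illustrative only.

```python
FS_HZ = 48_000        # sampling rate mentioned earlier in the thread
MAX_ITD_US = 660.0    # approximate maximum ITD quoted above

def us_to_samples(itd_us, fs_hz=FS_HZ):
    """Convert a time difference in microseconds to a (fractional) number of samples."""
    return itd_us * fs_hz / 1e6

def fraction_of_max(itd_us, max_itd_us=MAX_ITD_US):
    """Express a threshold as a fraction of the maximum ITD."""
    return itd_us / max_itd_us

for threshold_us in (31.6, 62.5):
    samples = us_to_samples(threshold_us)
    percent = 100.0 * fraction_of_max(threshold_us)
    print(f"{threshold_us} us = {samples:.2f} samples, {percent:.1f}% of max ITD")
```

Under these assumptions the old threshold works out to roughly 1.5 samples and just under 5% of the maximum ITD, while the new one is 3 samples and roughly 9.5%.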


We will update the description document and evaluation code to reflect this new ITD threshold and will send out an announcement shortly. We will also update the plots (in the description document) to reflect the LAP challenge configuration.


In the meantime, as requested, please see the ITD results for the two baselines below:


Barycentric baseline:


100 positions: 

Mean ITD Error: 33.609

Mean ILD Error: 0.564

Mean LSD Error: 3.298 


19 positions: 

Mean ITD Error: 37.544

Mean ILD Error: 1.798

Mean LSD Error: 4.865


5 positions: 

Mean ITD Error: 36.757

Mean ILD Error: 4.683

Mean LSD Error: 7.699


3 positions: 

Mean ITD Error: 49.634

Mean ILD Error: 6.861

Mean LSD Error: 8.368


SHT baseline (a vanilla approach with no pre-processing):


100 positions: 

Mean ITD Error: 42.126

Mean ILD Error: 0.587

Mean LSD Error: 4.195


19 positions: 

Mean ITD Error: 42.455

Mean ILD Error: 1.685

Mean LSD Error: 5.388


5 positions: 

Mean ITD Error: 43.542

Mean ILD Error: 7.923

Mean LSD Error: 16.831


3 positions: 

Mean ITD Error: 51.818

Mean ILD Error: 9.140

Mean LSD Error: 16.469
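For reference, the ITD figures above can be compared against the old and new thresholds with a short sketch. It assumes the mean ITD errors are in microseconds (the unit is not stated in the list) and that an entry passes when its mean error is at or below the threshold; `mean_itd_error` and `count_passing` are illustrative names, not part of the evaluation code.

```python
# Mean ITD errors as listed above (number of positions -> mean ITD error).
# Assumption: values are in microseconds, matching the threshold units.
mean_itd_error = {
    "barycentric": {100: 33.609, 19: 37.544, 5: 36.757, 3: 49.634},
    "sht": {100: 42.126, 19: 42.455, 5: 43.542, 3: 51.818},
}

OLD_THRESHOLD_US = 31.6
NEW_THRESHOLD_US = 62.5

def count_passing(results, threshold_us):
    """Count (baseline, sparsity level) pairs whose mean ITD error is within the threshold."""
    return sum(
        1
        for per_level in results.values()
        for error in per_level.values()
        if error <= threshold_us
    )

total = sum(len(per_level) for per_level in mean_itd_error.values())
passing_old = count_passing(mean_itd_error, OLD_THRESHOLD_US)
passing_new = count_passing(mean_itd_error, NEW_THRESHOLD_US)
print(f"old threshold: {passing_old}/{total} pass")
print(f"new threshold: {passing_new}/{total} pass")
```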


Thank you so much for your help with this, and we apologise again for the last-minute adjustments. Given this is the first time we are running the LAP challenge, we are trying to be as flexible and open as possible. 


Thank you for being accommodating.

Cheers,

Aidan, along with the rest of the LAP team

