librosa requires waveform format in floating points


Big Penguin

Apr 1, 2022, 1:33:32 PM
to physionet-challenges

Dear Organizers,

I am using librosa to extract audio features, which are then used for predictions. The load_wav_file() function in helper_code.py uses a SciPy function to read WAV files, which returns a list of NumPy arrays with non-floating-point values. When I give these arrays to librosa as input, it raises the error "librosa.util.exceptions.ParameterError: Audio data must be floating-point". According to the rules, I am not allowed to edit helper_code.py. Is there anything we can do to resolve this issue?

Regards,
WaqarAhmad.

PhysioNet Challenge

Apr 1, 2022, 1:38:27 PM
to physionet-challenges
Dear WaqarAhmad,

WAV files can use different data types to encode the signal data. The WAV files in the Challenge data use 16-bit integers.

You can convert an array of 16-bit integers to an array of double precision floating point numbers using the command y = np.asarray(x, dtype=np.float64), where x is the signal data from the load_wav_file function or a similar function.
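As a minimal sketch of that conversion, the snippet below uses a small hand-made int16 array as a stand-in for a recording (the sample values are made up for illustration, not Challenge data). Note that the cast only changes the data type: the values are not rescaled and stay in the 16-bit integer range.

```python
import numpy as np

# Stand-in for one recording as returned by a WAV reader: 16-bit PCM samples.
x = np.array([0, 1, -1, 32767, -32768], dtype=np.int16)

# Cast to double precision; the sample values themselves are unchanged.
y = np.asarray(x, dtype=np.float64)

print(y.dtype)  # float64
print(y)
```

An array like y is floating-point, which is the property the librosa error message asks for.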

You do not need to change any of the commands in the helper_code.py file to make this change. These functions are provided for your convenience, and to help run everyone's code in a consistent way. If you need a similar function that works in a different way, or if you want to further process the outputs of some of these functions, then you can define your own functions in your code.

Best,
Matt
(On behalf of the Challenge team.)

https://PhysioNetChallenges.org/
https://PhysioNet.org/

Please post questions and comments in the forum. However, if your question reveals information about your entry, then please email challenge at physionet.org. We may post parts of our reply publicly if we feel that all Challengers should benefit from it. We will not answer emails about the Challenge to any other address. This email is maintained by a group. Please do not email us individually.

wenh06

Apr 2, 2022, 10:32:58 AM
to physionet-challenges
The `recordings` are read by `scipy.io.wavfile.read`, which can return the following data types:
=====================  ===========  ===========  =============
    WAV format            Min          Max       NumPy dtype
=====================  ===========  ===========  =============
32-bit floating-point  -1.0         +1.0         float32
32-bit PCM             -2147483648  +2147483647  int32
16-bit PCM             -32768       +32767       int16
8-bit PCM              0            255          uint8
=====================  ===========  ===========  =============
Note that 8-bit PCM is unsigned.

I use the following function to do the conversion (or normalization to [-1, 1], which is the usual practice for audio data):
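import numpy as np

def _to_dtype(data: np.ndarray, dtype: np.dtype = np.float32) -> np.ndarray:
    """Convert integer PCM samples to `dtype`, scaled by the integer range."""
    if data.dtype == dtype:
        return data
    if data.dtype in (np.int8, np.uint8, np.int16, np.int32, np.int64):
        # Divide by max + 1 (e.g. 32768 for int16) so signed data lands in [-1, 1).
        # Unsigned uint8 data lands in [0, 1) because no offset is subtracted.
        scale = np.iinfo(data.dtype).max + 1
        data = data.astype(dtype) / scale
    return data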
The result is correct at least for int16 data, i.e. identical to the data read using librosa or torchaudio:
[Attached image: dtype.png]
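Since 8-bit PCM is unsigned, dividing by max + 1 alone maps uint8 samples to [0, 1) rather than [-1, 1). The sketch below shows one way this could be handled, by subtracting the midpoint first. The name pcm_to_float is hypothetical and this is only an illustration under that assumption, not the function used above.

```python
import numpy as np

def pcm_to_float(data: np.ndarray, dtype=np.float32) -> np.ndarray:
    """Hypothetical helper: scale integer PCM samples to [-1, 1)."""
    if np.issubdtype(data.dtype, np.floating):
        # Already floating point: only adjust the precision if needed.
        return data.astype(dtype, copy=False)
    info = np.iinfo(data.dtype)
    if info.min == 0:
        # Unsigned PCM (e.g. uint8): the midpoint of the range is zero amplitude.
        half = float((info.max + 1) // 2)
        return ((data.astype(np.float64) - half) / half).astype(dtype)
    # Signed PCM: divide by 2**(bits - 1), e.g. 32768 for int16.
    return (data.astype(np.float64) / float(info.max + 1)).astype(dtype)

u8 = np.array([0, 128, 255], dtype=np.uint8)
i16 = np.array([-32768, 0, 32767], dtype=np.int16)
print(pcm_to_float(u8))
print(pcm_to_float(i16))
```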

PhysioNet Challenge

Apr 2, 2022, 10:40:18 AM
to physionet-challenges
Thanks, Wen Hao. This is a helpful way to convert between integer and floating point representations of the recordings. Yes, 16 bit integers can be represented exactly with either single or double precision floating point numbers, and dividing a floating point number by a power of two just flips a couple of bits in the binary representation of the number.
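A quick way to check the exactness claim is to run every int16 value through the conversion and back. This is only a sketch with illustrative variable names: it confirms that the cast to float32 loses nothing and that scaling by 2**15 can be undone exactly.

```python
import numpy as np

# Every possible int16 value, built via int32 to avoid range issues in arange.
ints = np.arange(-32768, 32768, dtype=np.int32).astype(np.int16)

# int16 -> float32 is exact because float32 has a 24-bit significand.
as_float = ints.astype(np.float32)
exact_cast = np.array_equal(as_float.astype(np.int16), ints)

# Dividing by 2**15 only changes the exponent, so multiplying back
# recovers the original values exactly.
scaled = as_float / np.float32(32768.0)
exact_scale = np.array_equal(scaled * np.float32(32768.0), as_float)

print(exact_cast, exact_scale)  # True True
```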

On a related note, we wanted to emphasize that teams should not exchange research ideas with each other until after the end of the Challenge (where they are welcome, and indeed encouraged, to continue to work on the data). Collaboration before the end of the Challenge will result in disqualification, but teams can always ask us beforehand if something is allowed or not. We enforce these rules because we want as many *independent* solutions as possible. This reduces the chances that, as a research community, we arrive at an erroneous group-think solution.

This advice about how to convert an integer into a floating point number is fine, but we want to be careful not to discuss data normalization and other data preprocessing steps before the end of the Challenge because they are fundamental parts of your approaches. Of course, questions and comments about the data, the scoring metric, the computational environment for running the entries, etc. are welcome and encouraged, and contribute to the quality of the Challenges and to our Challenge community.

Best,
Matt
(On behalf of the Challenge team.)

