Librosa sample values seem to differ from Scipy


Matthew Waller

Jul 24, 2018, 9:21:45 AM

to librosa

I'm trying to get the samples from a WAV file, and I noticed that the sample values differ depending on whether I use scipy or librosa.

import librosa
from scipy.io.wavfile import read as wavread
# from python_speech_features import mfcc

sampleFloats, fs = librosa.load('hi.wav', sr=48000)
print('{0:.15f}'.format(sampleFloats[-1]))

samplerate, x = wavread('hi.wav')  # x is a numpy array of integers representing the samples

# scale to -1.0 -- 1.0
if x.dtype == 'int16':
    nb_bits = 16  # -> 16-bit wav files
elif x.dtype == 'int32':
    nb_bits = 32  # -> 32-bit wav files
max_nb_bit = float(2 ** (nb_bits - 1))
samples = x / (max_nb_bit + 1.0)  # samples is a numpy array of floats representing the samples
print(samples[-1])

The print statements read:

0.001251220703125

0.001274064182641886

The sample rate for the file is 48000.

Why might they be different? Is librosa using a different normalization?

Brian McFee

Aug 2, 2018, 1:06:35 PM

to librosa

The key differences between librosa.load and a direct wave read are:

- float conversion
- mono downmixing (by default)
- sample rate conversion

If you have a mono file, loaded at the native sampling rate (you can force this with `sr=None` in load), then any differences should be down to the floating point conversion. If it's stereo, or at some other sampling rate, then those will also cause changes in the sample values.
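To isolate the float conversion, here is a minimal sketch of a like-for-like comparison. It assumes signed integer PCM and scales by 2**(N-1) for N-bit samples, which is one common convention and not necessarily what every decoder does. `pcm_to_float` and `compare_with_librosa` are hypothetical helper names, not part of librosa or scipy. Under this convention the first printed value, 0.001251220703125, is exactly 41/32768.

```python
import numpy as np


def pcm_to_float(x):
    """Scale signed integer PCM samples to float32 in [-1.0, 1.0).

    Assumption: full scale is 2**(N-1) for N-bit signed samples
    (32768 for int16, 2147483648 for int32).
    """
    if not np.issubdtype(x.dtype, np.signedinteger):
        raise TypeError("expected signed integer PCM, got %s" % x.dtype)
    full_scale = float(2 ** (8 * x.dtype.itemsize - 1))
    return (x / full_scale).astype(np.float32)


def compare_with_librosa(path):
    """Return (sample rates match, max abs difference) for one WAV file."""
    # Imported lazily so pcm_to_float can be used without these packages.
    import librosa
    from scipy.io import wavfile

    native_sr, x = wavfile.read(path)
    # sr=None keeps the native rate; mono=False disables downmixing.
    y, sr = librosa.load(path, sr=None, mono=False)
    # scipy returns (n_samples, n_channels); librosa returns (n_channels, n_samples).
    reference = pcm_to_float(x).T
    return sr == native_sr, float(np.max(np.abs(reference - y)))
```

Calling `compare_with_librosa('hi.wav')` should then report only whatever difference remains after resampling and downmixing are ruled out.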

The other thing that could happen is a difference in codec, since librosa uses audioread under the hood, which in turn multiplexes over different decoders (ffmpeg, gstreamer, scipy, mad, etc.). So that could also be a source of variation that's less easy to pin down.
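One way to narrow this down is to check what is actually stored in the file before comparing decoders. The sketch below uses Python's standard `wave` module and assumes a plain PCM WAV file; other encodings may not open with it. `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(path):
    """Return (channels, sample width in bytes, sample rate, frame count)."""
    with wave.open(path, "rb") as w:
        return (
            w.getnchannels(),
            w.getsampwidth(),
            w.getframerate(),
            w.getnframes(),
        )
```

If the stored sample width is larger than what a given decoder hands back, the decoded values can differ slightly even at the native rate with no downmixing.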
