Librosa sample values seem to differ from Scipy

Matthew Waller

Jul 24, 2018, 9:21:45 AM
to librosa
So I'm trying to get the samples from a wave file, and I noticed that the values differ depending on whether I use scipy or librosa.

    import librosa
    from scipy.io.wavfile import read as wavread
    # from python_speech_features import mfcc

    # librosa returns float samples, resampled to sr if needed
    sampleFloats, fs = librosa.load('hi.wav', sr=48000)
    print('{0:.15f}'.format(sampleFloats[-1]))

    samplerate, x = wavread('hi.wav')  # x is a numpy array of integers holding the samples

    # scale to -1.0 -- 1.0
    if x.dtype == 'int16':
        nb_bits = 16  # -> 16-bit wav files
    elif x.dtype == 'int32':
        nb_bits = 32  # -> 32-bit wav files
    max_nb_bit = float(2 ** (nb_bits - 1))
    samples = x / (max_nb_bit + 1.0)  # samples is a numpy array of floats holding the scaled samples

    print(samples[-1])

The print statements read:

    0.001251220703125
    0.001274064182641886

The sample rate for the file is 48000. 

Why might they be different? Is librosa using a different normalization?

Brian McFee

Aug 2, 2018, 1:06:35 PM
to librosa
The key differences between librosa.load and a direct wave read are:
  1. float conversion
  2. mono downmixing (by default)
  3. sample rate conversion
If you have a mono file, loaded at the native sampling rate (you can force this with `sr=None` in load), then any differences should be down to the floating point conversion.  If it's stereo, or at some other sampling rate, then those will also cause changes in the sample values.
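
A rough sketch of that check, in case it helps: it scales the scipy integers to floats and compares them with what `librosa.load` returns when resampling and downmixing are both switched off. The helper names `pcm_to_float` and `max_abs_difference` are made up for illustration, and the divisor `2 ** (nb_bits - 1)` is an assumed convention rather than something librosa is confirmed to use, so treat the result as a diagnostic, not a specification.

```python
import numpy as np


def pcm_to_float(x):
    """Scale signed-integer PCM samples to floats in [-1.0, 1.0)."""
    if not np.issubdtype(x.dtype, np.signedinteger):
        raise TypeError("expected signed integer PCM, got %s" % x.dtype)
    nb_bits = 8 * x.dtype.itemsize
    # Assumed convention: divide by 2**(nb_bits - 1), e.g. 32768 for int16.
    return x.astype(np.float64) / float(2 ** (nb_bits - 1))


def max_abs_difference(path):
    """Largest gap between librosa samples and scaled scipy samples.

    Hypothetical helper, not part of librosa or scipy.
    """
    # Imported here so pcm_to_float stays usable without these packages.
    import librosa
    from scipy.io import wavfile

    # sr=None keeps the native sampling rate; mono=False skips the downmix.
    y, sr = librosa.load(path, sr=None, mono=False)
    native_sr, x = wavfile.read(path)
    if sr != native_sr:
        raise ValueError("sampling rates differ: %s vs %s" % (sr, native_sr))
    # scipy returns (samples, channels); librosa returns (channels, samples).
    reference = pcm_to_float(x).T
    return float(np.max(np.abs(y - reference)))
```

Calling `max_abs_difference('hi.wav')` should return something close to zero if the float conversion is the only difference. A different divisor, such as the `max_nb_bit + 1.0` in the question, will by itself shift every scaled value slightly.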

The other thing that could happen is a difference in codec, since librosa uses audioread under the hood, which in turn multiplexes over different decoders (ffmpeg, gstreamer, scipy, mad, etc).  So that could also be a source of variation that's less easy to pin down.