What is the unit of audio sample in librosa?


mkj...@gmail.com

Jul 9, 2018, 8:25:00 AM
to librosa
These days, I'm using librosa. As a basic step to load audio files, one can use the function below.

    librosa.core.load()
  1. The audio file is then represented as an audio time series. I think each value of the time series is an audio amplitude, but what is the unit of these amplitudes?
  2. Also, what is the relationship among amplitude, power, dB, and energy?

Brian McFee

Jul 9, 2018, 12:48:19 PM
to librosa
On Monday, July 9, 2018 at 8:25:00 AM UTC-4,  wrote:
These days, I'm using librosa. As a basic step to load audio files, one can use the function below.

    librosa.core.load()
  1. Then an audio file is represented as audio time series. I think each value of the time series is an amplitude of audio. However, I wonder what the unit of the amplitudes is.

When reading integer-valued samples (e.g. from a .wav file), load uses `util.buf_to_float` to convert integer samples to fractions of the corresponding MAX_INT range. So if you have a 16-bit wav, a sample value of 8192 would map to 8192 / 2**15 = +0.25. The values coming out of load do not directly have a unit, though if your input file has some known unit of measurement X, then load's output will be in X / MAX_INT.
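
To make the scaling concrete, here is a minimal sketch of this kind of integer-to-float conversion using only the standard library. It is not librosa's code: the helper name `int16_bytes_to_float` is made up for illustration, and the divisor of 2**15 is an assumption about the scale factor (dividing by 2**15 - 1 instead changes the results only around the fifth decimal place).

```python
import struct


def int16_bytes_to_float(buf):
    """Decode little-endian 16-bit PCM bytes into unitless floats.

    Hypothetical helper for illustration; assumes a scale of 1 / 2**15.
    """
    n = len(buf) // 2
    samples = struct.unpack("<%dh" % n, buf[: 2 * n])
    scale = 1.0 / float(1 << 15)  # assumed scale factor for 16-bit input
    return [s * scale for s in samples]


# Four 16-bit samples, packed the way they would sit in a .wav data chunk.
raw = struct.pack("<4h", 0, 8192, -16384, 32767)
floats = int16_bytes_to_float(raw)
print(floats)  # 8192 maps to 0.25, -16384 maps to -0.5
```

The output is a fraction of full scale, so it only acquires a physical unit if the recording chain that produced the integers was calibrated.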
 
  2. Also, what is the relationship among amplitude, power, dB, and energy?

These terms are not always used consistently, but you can think of it as follows:
  • amplitude is the height of the waveform at any sample position [n]
  • power is the sum of squared amplitudes over a window [n:n+w]
  • energy is the square root of power. You can get this from feature.rmse()
  • dB is (roughly) 10 * log10 of power (compared to a reference point). You can get this from energy or power via amplitude_to_db() or power_to_db(). (Apologies for the inconsistent naming here.)
It gets a little tricky because we usually look at power/energy/dB in the spectral domain and confined to a specific frequency band, not the entire window.  But I hope that clarifies things for you!
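
As a rough illustration of the definitions above in the time domain, the following sketch computes each quantity for a tiny window using only the standard library. The function names, the `ref` and `amin` parameters, and their defaults are assumptions made for this example, not librosa's implementation. It uses 10 * log10 for power and 20 * log10 for its square root, which is why the two dB values agree.

```python
import math


def window_power(y, start, width):
    """Sum of squared amplitudes over y[start:start + width]."""
    return sum(v * v for v in y[start:start + width])


def power_to_db_sketch(power, ref=1.0, amin=1e-10):
    """10 * log10(power / ref), with a floor to avoid log of zero."""
    return 10.0 * math.log10(max(power, amin) / ref)


def amplitude_to_db_sketch(amp, ref=1.0, amin=1e-5):
    """20 * log10(|amp| / ref), for a quantity that is a square root of power."""
    return 20.0 * math.log10(max(abs(amp), amin) / ref)


y = [0.5, -0.5]                  # amplitudes at two sample positions
power = window_power(y, 0, 2)    # 0.25 + 0.25 = 0.5
energy = math.sqrt(power)        # "energy" in the sense used above

db_from_power = power_to_db_sketch(power)
db_from_energy = amplitude_to_db_sketch(energy)
print(power, energy, db_from_power, db_from_energy)
```

Both conversions land on about -3.01 dB relative to a reference of 1.0, so which function to call depends only on whether the input has already been squared.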