delta and delta-delta features

Richard Mushi

Mar 8, 2023, 2:27:41 AM

to librosa

Hello Brian

Sorry, I have a question about delta (differential) and delta-delta (acceleration) features.

Suppose someone applies librosa.feature.delta directly to a signal waveform. What will they obtain?

Regards

Richard

Vincent Lostanlen

Mar 8, 2023, 3:26:27 AM

to Richard Mushi, librosa

Hello Richard,

You will get the first and second derivatives of the waveform (with order=1 and order=2, respectively), as you’d expect.

Some knowledge about Z-transforms tells us that this is tantamount to EQ-ing the signal so as to boost higher frequencies: linear in frequency for delta and quadratic in frequency for delta-delta.

The reason why we have delta and delta-delta in librosa is that np.diff makes no attempt at handling boundary effects. So if x has length T, np.diff(x) has length T-1, and np.diff(x, 2) has length T-2. This becomes a problem for feature engineering because one typically wants to stack features (e.g., MFCC) with their first and second derivatives.
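A minimal sketch of the length problem, using a random matrix as a stand-in for an MFCC array (the shape of 13 coefficients by 100 frames is an arbitrary assumption for illustration):

```python
import numpy as np

T = 100  # number of frames (arbitrary, for illustration)
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((13, T))  # stand-in for a real MFCC matrix

# Finite differences along the time axis shrink the array.
d1 = np.diff(mfcc, n=1, axis=-1)  # T - 1 frames
d2 = np.diff(mfcc, n=2, axis=-1)  # T - 2 frames
print(mfcc.shape, d1.shape, d2.shape)

# Stacking along the feature axis fails because the frame counts differ.
try:
    np.vstack([mfcc, d1, d2])
    stacked_ok = True
except ValueError:
    stacked_ok = False
print("stacking worked:", stacked_ok)
```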

Meanwhile, with librosa, we guarantee that delta leaves the input length (T) unchanged. We do so by relying on a numerically stable implementation of n-th order differentiation, known as Savitzky-Golay filtering (see “width” keyword argument). This filter has the property of smoothing the signal around the current point before estimating the derivative.

https://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter

We also pad the input beyond its boundaries (see “mode” keyword argument) so as to compensate for the reduction in length.

Since 2018, our implementation has been a thin wrapper around scipy.signal.savgol_filter, preceded by some input checks.

https://librosa.org/doc/main/_modules/librosa/feature/utils.html#delta
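As a rough sketch of the idea, the snippet below calls scipy.signal.savgol_filter directly on a toy signal whose derivatives are known. The window length of 9, the "interp" boundary mode, and setting polyorder equal to the derivative order are assumptions chosen for this illustration; check the linked source for what librosa actually passes.

```python
import numpy as np
from scipy.signal import savgol_filter

T = 50
t = np.arange(T, dtype=float)
x = 0.5 * t**2 + 3.0 * t  # toy signal: x' = t + 3 and x'' = 1

# Smoothed first and second derivative estimates (assumed settings).
delta = savgol_filter(x, window_length=9, polyorder=1, deriv=1, mode="interp")
delta2 = savgol_filter(x, window_length=9, polyorder=2, deriv=2, mode="interp")

# Both outputs keep the input length, so stacking works.
stacked = np.vstack([x, delta, delta2])
print(stacked.shape)

# Away from the boundaries, the first-derivative estimate matches t + 3.
print(np.allclose(delta[4:-4], t[4:-4] + 3.0))
```

Unlike np.diff, every output here has T samples, which is the property that makes stacking a feature with its derivatives straightforward.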

The relevant piece of conversation between Brian and me is here:

https://github.com/librosa/librosa/pull/663#issuecomment-364429625

I hope this helps!

Vincent.

--
You received this message because you are subscribed to the Google Groups "librosa" group.
To unsubscribe from this group and stop receiving emails from it, send an email to librosa+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/librosa/f31bf23f-7a19-42ba-a533-d1b53f201fe0n%40googlegroups.com.

Richard Mushi

Mar 8, 2023, 3:59:07 AM

to librosa

Dear Vincent

Thank you for your valuable input.

I would deeply appreciate it if you could describe in more detail what you mean by "EQ-ing the signal so as to boost higher frequencies: linear in frequency for delta and quadratic in frequency for delta-delta".

With thanks

Richard

Vincent Lostanlen

Mar 8, 2023, 4:46:35 AM

to Richard Mushi, librosa

Hello Richard,

All of this is explained in detail on the “Differentiator” page of Wikipedia:

https://en.wikipedia.org/wiki/Differentiator

See the “Frequency response” subsection in particular.
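For a quick numerical feel, here is a small sketch that differentiates two sinusoids one octave apart with plain np.diff (not librosa.feature.delta) and compares the peak amplitudes. The sample rate and the two test frequencies are arbitrary assumptions; they are kept well below Nyquist, where the finite difference behaves like an ideal differentiator.

```python
import numpy as np

sr = 1000  # hypothetical sample rate in Hz
t = np.arange(sr) / sr  # one second of samples


def peak_after_diff(freq_hz, order):
    """Peak amplitude of a unit sinusoid after `order` finite differences."""
    x = np.sin(2 * np.pi * freq_hz * t)
    return np.max(np.abs(np.diff(x, n=order)))


low, high = 5.0, 10.0  # doubling the frequency

# First difference: gain grows roughly linearly with frequency.
ratio_delta = peak_after_diff(high, 1) / peak_after_diff(low, 1)

# Second difference: gain grows roughly quadratically with frequency.
ratio_delta_delta = peak_after_diff(high, 2) / peak_after_diff(low, 2)

print(ratio_delta, ratio_delta_delta)
```

Doubling the frequency roughly doubles the output of the first difference and roughly quadruples the output of the second difference, which is the "linear" versus "quadratic" boost mentioned earlier.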

Sincerely,

Vincent.

Richard Mushi

Nov 13, 2023, 5:05:50 PM

to Vincent Lostanlen, librosa

Dear Friends

Suppose that you have an audio signal with a sampling rate of 2000 Hz and you want to calculate chroma_cqt using librosa. What should the values of the following parameters be: hop_length, n_chroma, n_octaves, fmin, and bins_per_octave?

Also, please give me an idea of how you choose them.

Vincent Lostanlen

Nov 14, 2023, 1:25:28 AM

to Richard Mushi, librosa

Hello,

Are you sure it’s 2000 Hz? That sounds quite low for musical audio.

> hop_length, n_chroma, n_octaves, fmin, and bins_per_octave?

It is difficult to say without knowing your application, but some defaults are given in the docs, which assume a sample rate of 22050 Hz:

librosa.feature.chroma_cqt, librosa 0.10.1 documentation (librosa.org)

The hop_length in the docs is 23 milliseconds. Since you have a lower sample rate, consider reducing hop_length to maybe 32 samples (16 milliseconds) or 64 samples (32 milliseconds).

All the other parameters can stay the same. At least it’s worth trying that way and seeing what happens.
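The arithmetic behind those hop lengths can be sketched as follows. The default hop_length of 512 samples is an assumption consistent with the 23 ms figure above. The last part is a back-of-envelope check that is not from this thread: it assumes the documented defaults are fmin = C1 (about 32.70 Hz) and n_octaves = 7, and compares the nominal top of that range with the Nyquist frequency at 2000 Hz.

```python
import math


def hop_ms(hop_length, sr):
    """Hop duration in milliseconds for a hop of `hop_length` samples."""
    return 1000.0 * hop_length / sr


default_ms = hop_ms(512, 22050)  # assumed docs default: about 23 ms
short_ms = hop_ms(32, 2000)      # 16 ms
long_ms = hop_ms(64, 2000)       # 32 ms
print(round(default_ms, 1), short_ms, long_ms)

# Assumed defaults (not stated in this thread): fmin = C1, n_octaves = 7.
fmin = 32.70
n_octaves = 7
nyquist = 2000 / 2

# Nominal upper edge of the analysed range, ignoring filter bandwidth.
nominal_top = fmin * 2 ** n_octaves

# Largest whole number of octaves above fmin that stays under Nyquist.
max_octaves = math.floor(math.log2(nyquist / fmin))
print(nominal_top, nyquist, max_octaves)
```

If those assumed defaults are right, the nominal top of a 7-octave range lies well above 1000 Hz, so when trying this out it may be worth watching for an error about the frequency range and lowering n_octaves if one appears.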