Onset detection results in time greater than the duration of the file

pavl...@gmail.com

May 31, 2017, 12:34:49 PM
to madmom-users
I was wondering what happens here:

  proc = madmom.OnsetPeakPickingProcessor(threshold=17, pre_max=0.1,
      post_max=0.1, pre_avg=0.2, post_avg=0.2, smooth=0.1)
  sodf = madmom.SpectralOnsetProcessor(onset_method='superflux',
      filterbank=LogarithmicFilterbank, num_bands=24, log=np.log10,
      norm=True)(filename)

The above code is used to detect onset times in an audio file. The resulting list is:

[  4.00000000e-02   9.47000000e+00   2.02000000e+01   3.74300000e+01
   4.84300000e+01   5.58700000e+01   6.08700000e+01   7.07200000e+01
   7.88800000e+01   8.60600000e+01   9.73300000e+01   1.09080000e+02]

As you can see, the last element is 109.08 seconds, which is about 1:49. However, the audio file only lasts for 1:48.

At a later time I convert those times to frames and do pitch detection on them, and I get errors for this particular onset, as it is outside the audio duration.

Any ideas?

Thanks.

Sebastian Böck

May 31, 2017, 2:43:07 PM
to madmom-users, pavl...@gmail.com
I suspect that the signal is a bit longer than 1:48, closer to 1:49, but whatever software you use to report this length simply truncates it. An exact measure is the number of samples in the signal.
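
As a minimal sketch of that last point (not part of the original reply), the exact duration is the sample count divided by the sample rate. The example uses Python's standard wave module on a silent WAV built in memory; the 8000 Hz rate, the sample count and the helper names are assumptions chosen for illustration, and the truncating display is only a guess at how a player might round.

```python
import io
import wave


def exact_duration(wav_file):
    """Return the duration in seconds as num_samples / sample_rate."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def truncated_display(seconds):
    """Format seconds as m:ss by dropping the fractional part (assumed player behaviour)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


# Build a silent mono 16-bit WAV in memory so the sketch runs without a real file.
# 871200 samples at 8000 Hz are hypothetical values giving 108.9 seconds.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 871200)

buffer.seek(0)
duration = exact_duration(buffer)
print(duration, truncated_display(duration))
```

A signal that is really 108.9 seconds long would be shown as 1:48 by such a display, even though it is much closer to 1:49.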

When this signal is split into overlapping frames, it can happen that only very few samples are left and thus a new frame is created. FramedSignal stops as soon as every sample of the signal is covered by at least one frame. Please see the 'end' parameter description in the "Notes" section of the FramedSignal documentation.

The location of a frame is relative to its centre, thus it could happen that the reported centre is outside the signal. One could argue that this is not intuitive, but it is the expected behaviour. We could consider adding a 'truncate' option to the 'end' parameter which stops as soon as the centre is outside the signal, though.
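
Until such an option exists, a similar effect can be approximated on the caller's side by discarding onsets that fall at or beyond the end of the signal. This is only a sketch of one possible workaround, not a madmom feature; the function name and the sample count are hypothetical.

```python
def drop_onsets_outside_signal(onsets, num_samples, sample_rate):
    """Keep only onset times (in seconds) that fall inside the signal."""
    duration = num_samples / sample_rate
    return [t for t in onsets if t < duration]


# Hypothetical values: the last three onsets from the question and a signal
# of 4800000 samples at 44100 Hz (roughly 108.84 seconds).
onsets = [86.06, 97.33, 109.08]
kept = drop_onsets_outside_signal(onsets, num_samples=4800000, sample_rate=44100)
print(kept)
```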

One thing still surprises me a bit. Are you talking about madmom frames or how do you compute the frames for pitch estimation?

P.S. It is not really considered good programming style to import something under a name usually used for something different. In your case you refer to (or import) madmom.features.onsets simply as madmom.

pavl...@gmail.com

May 31, 2017, 2:55:14 PM
to madmom-users, pavl...@gmail.com
Makes sense. Thanks.

No, I am not talking about Madmom frames. I use Madmom for Onset Detection (and Chord Detection) and then this to convert to frames and then do pitch detection.

Sebastian Böck

May 31, 2017, 3:21:35 PM
to madmom-users, pavl...@gmail.com
That's rather complicated, since seconds can easily be converted to frame indices by simply multiplying them by the 'fps' used.
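
A minimal sketch of that conversion, assuming 100 frames per second (which matches the 0.01 s resolution of the onset times in the question) and a hypothetical helper name:

```python
def seconds_to_frames(times, fps=100):
    """Convert times in seconds to the nearest frame indices."""
    return [int(round(t * fps)) for t in times]


# A few of the onset times from the question, in seconds.
onsets = [0.04, 9.47, 20.2, 109.08]
frames = seconds_to_frames(onsets, fps=100)
print(frames)
```

Rounding rather than truncating avoids off-by-one indices when the product lands just below a whole number because of floating-point error.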

Have you considered using madmom's FramedSignal? There's basically no need to do everything multiple times. You can use a FramedSignal directly to be fed into SpectralOnsetDetection.

How do you estimate the pitches then?

P.S. You can expect pitch estimation to be included in madmom within the next few months.