Hmm, I still struggle a bit with using the ODF/peak-picking functions correctly in a realtime environment. There are several problems:
- realtime processing yields different results: feeding the complete signal at once gives different onsets than feeding it "streamed" (in chunks of hop_size)
- it seems to me SpectralOnsetProcessor does not save any internal state, so the features for overlapping frames are recalculated on every call, making it notably slower
- the closest match I got came from working out how to calculate the onset detection function chunk by chunk so that it matches the whole-signal result as closely as possible, which resulted in this code:
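import numpy as np
# LogFB is assumed to be an alias for madmom's logarithmic filterbank
from madmom.audio.filters import LogarithmicFilterbank as LogFB
from madmom.features.onsets import SpectralOnsetProcessor, OnsetPeakPickingProcessor

# `signal` is the audio array, `w.rate` its sample rate (48000 Hz for my file)
fs = 2048  # frame size in samples
hs = 512   # hop size in samples
odf = SpectralOnsetProcessor('superflux', filterbank=LogFB, num_bands=36, log=np.log10, sample_rate=w.rate, frame_size=fs, hop_size=hs)
peak_sf = OnsetPeakPickingProcessor(fps=w.rate/hs, threshold=0.75, pre_max=0, combine=0.15, online=True, pre_avg=0)

# reference: feed the whole signal at once
print(peak_sf(odf(signal)).round(3))

# streamed: feed overlapping chunks and keep one ODF value per chunk
peak_sf(odf(signal[0:fs]), reset=True)  # reset the peak picker state
offset = int(fs/hs) + 1  # 5 hops per chunk
index = offset - 2  # 3, the ODF frame taken from each chunk
onsets = []
for i in range(int(signal.shape[0]/hs) - offset):  # 3.24s
    f = odf(signal[i*hs:(i+offset)*hs])[index:index+1]
    onsets += peak_sf(f, reset=False).round(3).tolist()
print(onsets)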
Output:
[1.739, 2.816, 3.819, 4.875, 5.973, 7.029, 8.096, 9.227, 10.304, 11.552, 12.811, 14.752, 15.403]
[1.781, 2.859, 3.861, 4.917, 6.016, 7.072, 8.139, 9.269, 10.347, 11.595, 12.853, 14.795, 15.445]
As you can see, the "streamed" version (second list) is consistently ~42 ms later than the whole-signal version (first list).
My annotated onsets lie between the two, but much closer to the first one (maybe 5-10 ms later).
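A quick check of that offset, as a sketch only: assuming a sample rate of 48000 Hz (that value is an assumption here, as are the variable names), the mean difference between the two lists can be expressed in hops. It works out to roughly four hops, i.e. about one frame_size, which might hint at a constant frame-index shift rather than a different detection result. I have not verified where such a shift would come from.

```python
# Onset times copied from the two outputs above
batch = [1.739, 2.816, 3.819, 4.875, 5.973, 7.029, 8.096,
         9.227, 10.304, 11.552, 12.811, 14.752, 15.403]
streamed = [1.781, 2.859, 3.861, 4.917, 6.016, 7.072, 8.139,
            9.269, 10.347, 11.595, 12.853, 14.795, 15.445]

sample_rate = 48000  # assumption: sample rate of the file in Hz
hop_size = 512       # hs from the code above
frame_size = 2048    # fs from the code above

hop_seconds = hop_size / sample_rate

# Per-onset delay of the streamed result relative to the batch result
diffs = [s - b for b, s in zip(batch, streamed)]
mean_diff = sum(diffs) / len(diffs)

# Express the delay in units of hops and compare with one frame length
offset_in_hops = mean_diff / hop_seconds
frame_seconds = frame_size / sample_rate

print(f"mean delay: {mean_diff * 1000:.1f} ms")
print(f"delay in hops: {offset_in_hops:.2f}")
print(f"one frame: {frame_seconds * 1000:.1f} ms")
```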
So I guess this is O.K. for my use case, but I still wonder whether there is a better way of doing this, so that streamed processing matches the "non-streamed" result without the slowdown. It should be possible, right?
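To illustrate what I mean by keeping internal state, here is a minimal sketch in plain NumPy. It is not madmom code and not superflux: it is a basic spectral flux, and the names `spectral_flux_batch` and `StreamingSpectralFlux` are made up for this example. Frames are left-aligned without padding, so the framing conventions of any particular library are not reproduced. The streaming version buffers incoming samples and remembers the previous magnitude spectrum, so each frame is transformed only once and the result equals the batch result.

```python
import numpy as np


def spectral_flux_batch(signal, frame_size, hop_size):
    """Spectral flux over all full frames of the signal (first value is 0)."""
    window = np.hanning(frame_size)
    num_frames = 1 + (len(signal) - frame_size) // hop_size
    odf = np.zeros(num_frames)
    prev_mag = None
    for n in range(num_frames):
        frame = signal[n * hop_size:n * hop_size + frame_size]
        mag = np.abs(np.fft.rfft(frame * window))
        if prev_mag is not None:
            # sum of positive magnitude differences to the previous frame
            odf[n] = np.sum(np.maximum(mag - prev_mag, 0.0))
        prev_mag = mag
    return odf


class StreamingSpectralFlux:
    """Hypothetical stateful variant: feed chunks, get ODF values back."""

    def __init__(self, frame_size, hop_size):
        self.frame_size = frame_size
        self.hop_size = hop_size
        self.window = np.hanning(frame_size)
        self.buffer = np.zeros(0)  # samples not yet fully consumed
        self.prev_mag = None       # spectrum of the previous frame

    def process(self, chunk):
        """Append a chunk and return ODF values for all newly complete frames."""
        self.buffer = np.concatenate([self.buffer, chunk])
        values = []
        while len(self.buffer) >= self.frame_size:
            frame = self.buffer[:self.frame_size]
            mag = np.abs(np.fft.rfft(frame * self.window))
            if self.prev_mag is None:
                values.append(0.0)
            else:
                values.append(float(np.sum(np.maximum(mag - self.prev_mag, 0.0))))
            self.prev_mag = mag
            # advance by one hop, keep the overlapping samples
            self.buffer = self.buffer[self.hop_size:]
        return values


# Compare both variants on random test data
rng = np.random.default_rng(0)
test_signal = rng.standard_normal(48000)
frame_size, hop_size = 2048, 512

batch_odf = spectral_flux_batch(test_signal, frame_size, hop_size)

stream = StreamingSpectralFlux(frame_size, hop_size)
streamed_odf = []
for start in range(0, len(test_signal), hop_size):
    streamed_odf.extend(stream.process(test_signal[start:start + hop_size]))

print(len(batch_odf), len(streamed_odf))
print(np.allclose(batch_odf, streamed_odf))
```

In this sketch the first value only becomes available once frame_size samples have arrived, so a streaming caller has to account for that latency when converting frame indices to times.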
Cheers
Silvan Laube