Hi Brian,
Answers inline, but I think no. 4 points to the issue:
A few questions about your setup:
1) what are your dependency versions? (numpy, scipy, etc)
numpy==1.8.2
scikit-learn==0.14.1
scipy==0.15.1
2) do you have scikits.samplerate installed?
No, it would not load, so I get a message about falling back to scipy.signal.
3) what are your beat tracking parameters?
Not sure what this means; I have "stolen" the example from the tutorial, i.e.
import librosa

hop_length = 64
...
y, sr = librosa.load(filename)
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr, hop_length=hop_length)
4) are you running the entire loop in python, or spinning a new python instance for each track?
I was running the entire loop in Python, and I think this is where the issue is. I have now written a Python script which accepts the name of a track, does BPM detection on that track only, and then shuts down, plus a wrapper around it to feed the filenames in. That seems to work fine now, without spawning the myriad [kworker] processes I was seeing, and performance is good. To use a Java expression: if I keep the whole loop in Python, do I need to manually collect garbage between each run?
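For reference, the wrapper is roughly along these lines (the script name and glob pattern are placeholders, not my exact code):

import glob
import subprocess

# Run the single-track BPM script in a fresh Python process for each file,
# so all memory is released when that process exits.
for filename in sorted(glob.glob("/path/to/tracks/*.mp3")):
    subprocess.call(["python", "bpm_one_track.py", filename])

The in-process equivalent I was wondering about would be an explicit del y followed by gc.collect() at the end of each loop iteration.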
5) is there something special about track 244? Is it much longer than the others?
No, it's just average length. In the other runs the second track was the point where performance fell off, and that track was different every time.