RNN performance vs CNN


Björn Lindqvist

Apr 27, 2019, 1:45:51 PM
to madmom-users
Hi,

I have been following the tutorial here and implemented the same network:

https://github.com/slychief/ismir2018_tutorial/blob/master/Part_3b_RNN_Onset_Detection.ipynb

As suggested by the tutorial, I have trained and cross-validated eight
neural networks, each taking between 300 and 400 epochs to train. To my
surprise, the F1-score is 89.9. That is much higher than the 87.3
figure reported in "Enhanced peak picking for onset detection with
recurrent neural networks" and comparable to the 90.3 figure in
"Improved musical onset detection with Convolutional Neural Networks".

My question is whether this improvement is to be expected. Do you think
I have perhaps trained my network incorrectly?
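
For reference, my evaluation boils down to something like the following
minimal sketch; the ±25 ms tolerance window and the greedy matching are
my assumptions, not taken from the notebook:

import numpy as np

def onset_f1(detections, annotations, window=0.025):
    # Greedily match each annotated onset to the closest unused
    # detection within +/- `window` seconds.
    detections = np.sort(np.asarray(detections, dtype=float))
    annotations = np.sort(np.asarray(annotations, dtype=float))
    used = np.zeros(len(detections), dtype=bool)
    tp = 0
    for ann in annotations:
        idx = np.where(~used & (np.abs(detections - ann) <= window))[0]
        if len(idx):
            best = idx[np.argmin(np.abs(detections[idx] - ann))]
            used[best] = True
            tp += 1
    fp = len(detections) - tp
    fn = len(annotations) - tp
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0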


--
mvh/best regards Björn Lindqvist

Sebastian Böck

Apr 29, 2019, 5:40:21 AM
to madmom-users
Hi,

it's quite possible that you obtain better results than those given in the notebook. This might be due to a different initialisation of the weights. However, the results usually differ by less than 0.5%, so I'd be a bit suspicious.

Things to be checked:
- did you test the files with the 'correct' network? I.e. files must not overlap between the train/test splits (see the sketch below)
- did you use the same network topology/size?
- did you use the same optimisation algorithm?
- did you use another backend?
- which version of keras did you use?
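
For the first point, a minimal sanity check could look like this;
`train_files` and `test_files` are hypothetical lists of the audio file
names in one fold:

def check_no_overlap(train_files, test_files):
    # Files appearing in both splits would leak test data into training.
    overlap = set(train_files) & set(test_files)
    assert not overlap, 'files in both splits: %s' % sorted(overlap)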

If you can reproduce the better results, I'd be interested in what might have changed w.r.t. that notebook.

Björn Lindqvist

Apr 30, 2019, 1:14:09 PM
to Sebastian Böck, madmom-users
Hi!

I haven't used the code in the notebook verbatim, but have instead
adapted it to my own needs. In particular, I implemented some
functions to resume training in case I had to abort a training run. My
training regimen is almost exactly as given in the notebook, except that
I shuffle the data (only once, of course) before model training. I think
that is more interesting than using the pre-generated folds in
the onsets_ISMIR_2012/splits/ directory.
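
The resume logic is roughly the following sketch; the checkpoint path
is made up, and `build_model()`, `x_train` and `y_train` stand in for
the corresponding pieces of the notebook:

import os
from keras.models import load_model
from keras.callbacks import ModelCheckpoint

CHECKPOINT = 'rnn_fold0.h5'  # hypothetical checkpoint path

# Load the last checkpoint if a previous run was aborted,
# otherwise build the network from scratch.
if os.path.exists(CHECKPOINT):
    model = load_model(CHECKPOINT)
else:
    model = build_model()  # placeholder for the notebook's network

# Save the model after every epoch so an aborted run can be resumed.
save_cb = ModelCheckpoint(CHECKPOINT, save_best_only=False)
model.fit(x_train, y_train, epochs=400, callbacks=[save_cb])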

Right now I'm rerunning the experiments to see if my score was the
result of a fluke or of bugs.

By the way, I have noticed that both the recurrent and the convolutional
neural networks in madmom are quite different from those described in
your articles. How come? For instance, the CNN uses tanh activations on
the convolutional layers, but you got a higher score using relu.
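
In Keras terms the difference is just the activation argument; the
layer parameters below are illustrative, not the exact madmom topology:

from keras.layers import Conv2D

# tanh on the convolutional layers, as in the madmom model
conv_tanh = Conv2D(10, (7, 3), activation='tanh')
# relu, which the paper reports scoring higher
conv_relu = Conv2D(10, (7, 3), activation='relu')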



Sebastian Böck

Apr 30, 2019, 2:28:10 PM
to madmom-users
The models in madmom are not exactly those of the papers, but rather those submitted to MIREX. The changes, however, are stated in the code. In the case of the RNNs, these are mostly pre-processing changes; the CNN model is that of the 2013 paper, which used tanh activations.
