Robby Nevels
Dec 18, 2020, 1:09:17 PM
to Magenta Discuss, Ian Simon, Magenta Discuss, Curtis Hawthorne, Robby Nevels
I ran some initial experiments with this idea, and the results look promising! Attached is a graph of validation NLL loss and erroneous events per epoch during training. The errors were computed each epoch by sampling the model for 1000 sequential steps and summing how many times "note-on-while-note-already-on" and "note-off-before-note-on" occur.
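For reference, the error count is just a single pass over the sampled event sequence while tracking which pitches are active. This is a hypothetical sketch, assuming events are simple `("note_on", pitch)` / `("note_off", pitch)` tuples; the actual event encoding in my code differs:

```python
def count_errors(events):
    """Count note-on-while-note-already-on and note-off-before-note-on events.

    `events` is a hypothetical list of ("note_on"|"note_off", pitch) tuples.
    """
    active = set()        # pitches currently sounding
    on_while_on = 0       # note-on for a pitch that is already on
    off_before_on = 0     # note-off for a pitch that was never turned on
    for kind, pitch in events:
        if kind == "note_on":
            if pitch in active:
                on_while_on += 1
            else:
                active.add(pitch)
        elif kind == "note_off":
            if pitch in active:
                active.remove(pitch)
            else:
                off_before_on += 1
    return on_while_on, off_before_on
```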
Teal is just passed the one-hot event vector. Blue is passed the one-hot event concatenated with a 128-element vector indicating which notes are currently on. When conditioned with this vector, loss drops faster, and errors disappear earlier! I also preprocessed the conditioning vector in a way that makes it fast to retrieve and to apply data augmentation, so training didn't take any longer.
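In case the input construction is unclear, here's a minimal sketch of the conditioned input for the blue model. The vocabulary size `NUM_EVENT_CLASSES` and the function name are placeholders, not the actual values or names from my code:

```python
import numpy as np

NUM_EVENT_CLASSES = 388  # placeholder event vocabulary size

def make_conditioned_input(event_index, active_pitches):
    """Concatenate the one-hot event with a 128-element active-note vector."""
    one_hot = np.zeros(NUM_EVENT_CLASSES, dtype=np.float32)
    one_hot[event_index] = 1.0
    notes_on = np.zeros(128, dtype=np.float32)  # one slot per MIDI pitch
    notes_on[list(active_pitches)] = 1.0
    return np.concatenate([one_hot, notes_on])  # shape (NUM_EVENT_CLASSES + 128,)
```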
The models themselves aren't based on PerformanceRNN/LSTMs, so the results might be less dramatic when I apply this to them. But I noticed that even a large PerformanceRNN still makes these errors after training for a long time, so I expect to see some benefit anyway. There are a few other things I want to check and test, and then I'll write something up with more details. Let me know if there's anything you'd suggest including.
Thanks for the great idea, Ian!
Robby