Hi, this is Dasaem from Korea.
Expressive performance modeling, which takes a score as input and generates humanized MIDI, is exactly the topic I've been researching since my Ph.D.
My colleague and I manually collected about 200 pieces (in MusicXML, counted at the movement level) and 1,000 performances (in MIDI, from the Yamaha e-Competition). The notes were aligned with an automatic algorithm developed by Eita Nakamura. Of course, many performances include wrong notes, but we could still train the model to a reasonable level by masking those misaligned notes during loss calculation. This dataset was later forked by other researchers and published as the ASAP dataset.
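To make the masking idea concrete: the loss is computed only over notes that were successfully aligned between score and performance. Here is a minimal PyTorch sketch of that idea, not the actual virtuosoNet code; the names, shapes, and the use of MSE are purely illustrative:

```python
import torch
import torch.nn.functional as F

def masked_mse_loss(pred, target, aligned_mask):
    """Average loss over aligned notes only.

    pred, target:  (batch, num_notes, feat_dim) expressive parameters
    aligned_mask:  (batch, num_notes), 1.0 where the performed note was
                   matched to a score note, 0.0 for wrong/missing notes
    """
    per_note = F.mse_loss(pred, target, reduction="none").mean(dim=-1)
    per_note = per_note * aligned_mask          # zero out misaligned notes
    return per_note.sum() / aligned_mask.sum().clamp(min=1.0)
```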
We published this research at ICML and ISMIR 2019.
The model is available on GitHub (https://github.com/jdasam/virtuosoNet), but the documentation of the repository and the readability of the code are honestly horrible, and they do not reflect recent refactoring and updates.
After a quiet period, we are currently preparing our next publication on the same topic, so the repository will be updated once the new publication is ready.
Meanwhile, if you want to try it, I can send you the result rendered with the updated model if you send me a MusicXML file (preferably uncompressed). We used MusicXML as the input format because we wanted to include notations such as slurs, dynamics, and tempo markings. The model can also handle multi-track scores, even symphonies, but it will perform them as a piano piece anyway.
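If you want to check what notations your file actually contains before sending it, here is a small sketch using music21; this is just my suggestion for inspecting the file, not what the model itself uses to parse scores:

```python
from music21 import converter, dynamics, tempo, spanner

# Parse an uncompressed MusicXML file and list the expressive markings in it.
score = converter.parse("my_piece.musicxml")

for el in score.recurse():
    if isinstance(el, dynamics.Dynamic):
        print("dynamic:", el.value)
    elif isinstance(el, tempo.MetronomeMark):
        print("tempo marking:", el.number)
    elif isinstance(el, spanner.Slur):
        print("slur over", len(el.getSpannedElements()), "notes")
```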
You can find some demos on my homepage, but those demos were post-processed to improve the quality.
The demo I uploaded on Twitter is a raw output of the model.
-Dasaem