Humanizing MIDI files


Matt Montag

Feb 23, 2020, 9:35:19 PM
to Magenta Discuss
Hi folks, new here. Not sure if this has been done before. Appreciate any links if so!

Can the Performance RNN be used to impose human performance characteristics on straight-ahead sequenced MIDI files? (Especially if constrained to solo piano.)

Or would this require a completely different architecture?

Matt

Ian Simon

Feb 24, 2020, 12:42:54 PM
to Matt Montag, Magenta Discuss
Hi Matt, Performance RNN (or Music Transformer) as-is cannot humanize an existing score. One way to approach the problem would be to collect a dataset of MIDI file pairs -- one quantized and one humanized -- of the same piece, and train a model to map quantized -> humanized. This could work with essentially the same model architecture but different training data. There are also approaches with different model architectures that might work.
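For concreteness, here is a rough sketch (my own illustration, not an official Magenta recipe) of how the quantized half of such a pair could be derived from a performance MIDI with note_seq; the file names and the 16th-note grid are placeholder assumptions:

```python
# Sketch: derive a grid-quantized "score-like" MIDI from a human performance
# so the two can serve as a (quantized -> humanized) training pair.
# Assumes a single tempo; adjust the grid resolution as needed.
import copy

from note_seq import midi_io, sequences_lib

STEPS_PER_QUARTER = 4  # 16th-note grid (an assumption)

performance = midi_io.midi_file_to_note_sequence('performance.mid')
quantized = sequences_lib.quantize_note_sequence(performance, STEPS_PER_QUARTER)

qpm = quantized.tempos[0].qpm if quantized.tempos else 120.0
seconds_per_step = 60.0 / (qpm * STEPS_PER_QUARTER)

# Snap every note onto the grid and flatten dynamics to build the input side.
score_like = copy.deepcopy(quantized)
for note in score_like.notes:
    note.start_time = note.quantized_start_step * seconds_per_step
    note.end_time = max(note.quantized_end_step * seconds_per_step,
                        note.start_time + seconds_per_step)
    note.velocity = 80  # remove expressive dynamics from the input side

midi_io.sequence_proto_to_midi_file(score_like, 'quantized.mid')
```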

But, long story short, we do not currently have such a model available.

-Ian


Carlos Martorell

Feb 24, 2020, 12:46:01 PM
to Ian Simon, Matt Montag, Magenta Discuss
Hi Matt, 

I understand you want to solve the problem with a neural network, but do you know about this Max plugin from James Holden?

Salut!
Carlos

Jesse Engel

Feb 24, 2020, 1:05:27 PM
to Carlos Martorell, Ian Simon, Matt Montag, Magenta Discuss
Also, it's worth mentioning that this is exactly what the Drumify plugin does for drum loops.

Best,
Jesse

Richard JE Cooke

Jun 22, 2022, 8:15:42 AM
to Magenta Discuss, Ian Simon, Magenta Discuss, matt....@gmail.com
How many pieces would you need to do this?  It sounds like a holy grail for millions of composers.  Do you know of any other research into this?

I'm also wondering how exactly you'd match the notation to the recording. Recordings would have missing notes, extra notes, wrong notes, and huge changes in timing. Would the poor little AI get confused?

Adam Roberts

Jun 22, 2022, 9:44:44 AM
to Richard JE Cooke, Magenta Discuss, Ian Simon, matt....@gmail.com
GrooVAE does this for drum beats: https://magenta.tensorflow.org/groovae
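For anyone who would rather script it than use the web plugin, here is a rough sketch using MusicVAE's TrainedModel API, loosely following the GrooVAE colab; the config name and checkpoint path are assumptions from memory, so verify them against the GrooVAE page:

```python
# Rough sketch: humanize a rigidly quantized 2-bar drum loop with GrooVAE.
# 'groovae_2bar_humanize' and the checkpoint path are assumptions -- check
# the GrooVAE colab for the exact names before relying on this.
import note_seq
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel

config = configs.CONFIG_MAP['groovae_2bar_humanize']
model = TrainedModel(config, batch_size=1,
                     checkpoint_dir_or_path='groovae_2bar_humanize.tar')

drums = note_seq.midi_file_to_note_sequence('quantized_drum_loop.mid')

# Encode the rigid loop, then decode a version with human timing and velocity.
z, _, _ = model.encode([drums])
humanized = model.decode(z, length=32, temperature=1.0)[0]
note_seq.sequence_proto_to_midi_file(humanized, 'humanized_drum_loop.mid')
```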

Dasaem Jeong

Jun 22, 2022, 10:15:08 AM
to Magenta Discuss, Magenta Discuss, matt....@gmail.com

Hi, this is Dasaem from Korea. 
Expressive performance modeling, which takes a score as input and generates humanized MIDI, is exactly the topic I have been researching since my Ph.D.

My colleague and I manually collected about 200 pieces (in MusicXML, counted at the movement level) and 1,000 performances (in MIDI, from the Yamaha e-Competition). The notes were aligned with an automatic algorithm developed by Eita Nakamura. Of course, many performances contain wrong notes, but we could still train the model reasonably well by masking the misaligned notes during the loss calculation. This dataset was later forked by other researchers and published as the ASAP dataset.
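As a generic illustration of that masking trick (this is not the actual virtuosoNet code), misaligned notes can simply be zeroed out of the loss before averaging, e.g. in PyTorch:

```python
import torch

def masked_mse(pred, target, aligned_mask):
    """Loss over aligned notes only.

    pred, target: (batch, notes, features) expressive parameters.
    aligned_mask: (batch, notes), 1.0 where a score note was matched to a
    performance note by the alignment, 0.0 for missing/extra/wrong notes.
    """
    per_note = ((pred - target) ** 2).mean(dim=-1)          # (batch, notes)
    masked = per_note * aligned_mask                         # drop misaligned notes
    return masked.sum() / aligned_mask.sum().clamp(min=1)    # mean over aligned notes
```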

We’ve published those research in ICML and ISMIR 2019.
The model is uploaded on GitHub(https://github.com/jdasam/virtuosoNet), but the documentation of the repository and readability of the code is so horrible and do not reflect recent refactoring and updates.
After some blank period, we are currently preparing our next publication with the same topic, so the repository will be updated as the new publication is ready.

Meanwhile, if you want to try it, I can send you the result rendered with the updated model if you send me a MusicXML file (preferably uncompressed). We used MusicXML as the input because we wanted to include notation such as slurs, dynamics, and tempo markings. The model can also handle multi-track scores, even symphonies, but it will perform them as a piano piece anyway.
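If your notation software only exports compressed .mxl, one possible way to get the uncompressed MusicXML (an assumption on my part, using music21; it is not part of virtuosoNet) is:

```python
# Convert a compressed .mxl export into uncompressed MusicXML.
# File names are placeholders.
from music21 import converter

score = converter.parse('my_piece.mxl')           # music21 reads .mxl directly
score.write('musicxml', fp='my_piece.musicxml')   # write uncompressed XML
```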

You can find some demos on my homepage, but those demos were post-processed to improve the quality.

The demo I uploaded to Twitter is the raw output of the model.

-Dasaem
On Wednesday, June 22, 2022 at 9:15:42 PM UTC+9, richard....@gmail.com wrote:

Richard JE Cooke

Jun 23, 2022, 3:44:32 AM
to Magenta Discuss, jdasam...@gmail.com, Magenta Discuss, matt....@gmail.com
Thank you so much for posting this! I will look into the code and the dataset. And please reply here when you finish your Ph.D.; I'd love to read it.

Wow, it would be amazing if we could get to the point where you could just open a TensorFlow.js webpage, drop in your MusicXML/MIDI file, and get back a humanized version. Composing would be sooo fast.