Generating new music from custom MIDI or MusicXML dataset

Robinson McClellan

May 12, 2023, 12:54:49 PM
to Magenta Discuss
Hi all,

I'm interested in finding ways to train an AI model on a small set of MIDI or MusicXML files, to generate new music in the style of the samples. I was in touch with an engineer at MuseNet in 2019 and we started a project, but I haven't been able to reach them since. I'm impressed with the results from the Magenta Transformer Colab (heard here) but am having trouble getting the notebook to work. In general, it seems like there hasn't been as much activity on MuseNet or Magenta in the past couple of years (though I could be wrong).

Now, with AI developments moving so fast, I wonder whether those projects are being updated, or whether new tools and options are being developed. I'm curious about the different kinds of models (RNN, GAN, transformer), ideally something at GPT-4 levels of smarts, or similar.

Or, if a similar process for generating new music from a custom dataset is available but only for audio data (as opposed to MIDI or MusicXML), that could also work. I'm checking out Jukebox and a couple of others.

At this stage I would be interested in finding someone who could do most of the process for me, in conversation with me. I also hope to develop the skills to build a model myself.

I would love to connect with anyone interested in this type of project. 

Thank you!
Robin

Juan Carlos Piñeros

May 12, 2023, 1:33:06 PM
to Magenta Discuss, robinson....@gmail.com
Hi Robinson,

MuseNet is down; I understand it will be back at some point, but I'm not sure when. Google released a model, though in the audio domain; you can join the waitlist here: https://aitestkitchen.withgoogle.com/experiments/music-lm. I am not familiar with any new model for symbolic music.

You said you are familiar with Jukebox. There are a couple of other models that you can fine-tune for audio, lately many diffusion-based ones like Dance Diffusion, Audio Diffusion, and Riffusion. Those, plus Musik (GAN-based, I think?) and RAVE, are the ones I've seen people use most.
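
If you want a quick way to just try one of those diffusion models, here is a rough sketch (this assumes the Hugging Face diffusers library and its DanceDiffusionPipeline with the public harmonai/maestro-150k checkpoint; actually fine-tuning on your own audio is a separate, heavier step through Harmonai's training code):

    import scipy.io.wavfile
    from diffusers import DanceDiffusionPipeline

    # Load the pretrained Dance Diffusion checkpoint and sample
    # 4 seconds of unconditional audio from it.
    pipe = DanceDiffusionPipeline.from_pretrained("harmonai/maestro-150k")
    output = pipe(audio_length_in_s=4.0)

    audio = output.audios[0]  # numpy array, shape (channels, samples)
    scipy.io.wavfile.write("sample.wav", pipe.unet.config.sample_rate, audio.T)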

For symbolic music generation, I am not sure you can train a model from scratch with a small set of files, and I am also unsure about fine-tuning a larger model; I think you would need a large dataset, but someone can correct me. There are many approaches to training a model with such files. For instance, you may want to take a look at https://jeffreyjohnens.github.io/MMM/ and here is an open-source model of MMM: https://huggingface.co/ai-guru/lakhclean_mmmtrack_4bars_d-2048. Maybe you can see whether fine-tuning this bigger model on your small dataset gives you good results (I am afraid it won't :S). I hope Magenta creates something for this!
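
In case it helps, here is a minimal sketch of sampling from that Hugging Face checkpoint (I am assuming the checkpoint ships its own tokenizer and that PIECE_START is the start-of-piece token in its event vocabulary, which is how I have seen the MMM-style models used):

    from transformers import AutoTokenizer, AutoModelForCausalLM

    model_id = "ai-guru/lakhclean_mmmtrack_4bars_d-2048"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Prime the model with the piece-start token and sample a sequence of
    # bar/track/note event tokens, which then need decoding back to MIDI.
    input_ids = tokenizer.encode("PIECE_START", return_tensors="pt")
    output = model.generate(input_ids, max_length=512, do_sample=True, temperature=0.9)
    print(tokenizer.decode(output[0]))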

I think the notebooks from Magenta have not been working lately, but I remember they also had a model where you can use your own music as input and the model will continue it (maybe it is the same Magenta Transformer that you are trying to use).

Best,
Juan Carlos

Robinson McClellan

May 12, 2023, 2:27:42 PM
to Juan Carlos Piñeros, Magenta Discuss
Hi Juan Carlos,

Thank you very much for this - it's super helpful. I had joined the waitlist for AI Test Kitchen recently and just got the invite. I tried it out - definitely fun. I will look into Dance Diffusion, Audio Diffusion, Riffusion, and Musik, and I'll also explore the two MMM resources for symbolic music that you mentioned.

Regarding symbolic music generation: when I was briefly working with someone at MuseNet in 2019, the way I understood it was that they could create a token to represent my small dataset of MIDI files as a genre within their larger model, which they had trained on many thousands of MIDI files. My small dataset would influence the overall training process, and I could tell the model to create new music in the style of my small dataset. I never got the chance to hear the results with my tokens included, but the results of their approach, which they have posted on the MuseNet site, are pretty good. That was with GPT-2 (as I understand it). Now I'm imagining how powerful the results would be with a GPT-4-level model, or similar.
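
As I understand the idea, in code it would look something like this (a purely hypothetical sketch of the "genre token" concept, not MuseNet's actual code; the token name and model id below are placeholders I made up):

    from transformers import AutoTokenizer, AutoModelForCausalLM

    STYLE_TOKEN = "<my_style>"  # hypothetical control token for my dataset

    # "my-pretrained-music-lm" stands in for whatever large pretrained
    # symbolic-music language model you start from.
    tokenizer = AutoTokenizer.from_pretrained("my-pretrained-music-lm")
    model = AutoModelForCausalLM.from_pretrained("my-pretrained-music-lm")

    # Register the new token and grow the embedding table to make room.
    tokenizer.add_special_tokens({"additional_special_tokens": [STYLE_TOKEN]})
    model.resize_token_embeddings(len(tokenizer))

    # During fine-tuning, every sequence from the small dataset starts with
    # the style token, so the model learns to associate it with that style;
    # at generation time the same prefix requests music in that style.
    prompt = tokenizer.encode(STYLE_TOKEN, return_tensors="pt")
    sample = model.generate(prompt, max_length=256, do_sample=True)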

Just to add a funny anecdote: ChatGPT (with GPT-4) can generate original MusicXML files if you ask it to - but they are very basic. The first one it gave me was just one note. Then it gave me a few more notes when I asked it to. I asked for sad music and it gave me a minor scale. So it was pretty rudimentary, but I was impressed it could do that at all, given that it's not a music model.
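
For scale, a one-note MusicXML file really is tiny; here is a sketch (assuming the music21 library) that writes out roughly what ChatGPT's first attempt amounted to:

    from music21 import note, stream

    # Build a score containing a single C4 quarter note and
    # export it as MusicXML.
    score = stream.Stream()
    score.append(note.Note("C4", quarterLength=1.0))
    score.write("musicxml", fp="one_note.xml")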

I would welcome hearing about any updates to MuseNet and Magenta - both great projects.

Thank you again,
Robin
--
Robinson McClellan

Juan Carlos Piñeros

May 12, 2023, 4:33:21 PM
to Magenta Discuss, Robinson McClellan, Juan Carlos Piñeros
Hi Robin,

Thanks! What you describe sounds amazing; it reminds me of textual inversion, where you add a new concept to, for instance, Stable Diffusion...

Hopefully Magenta will have some updates for us soon :)

Juan Carlos

Robinson McClellan

May 15, 2023, 10:02:44 AM
to Magenta Discuss, pipa...@gmail.com, Robinson McClellan
Hi Juan Carlos,

Yes, I hope Magenta, or MuseNet, becomes more active again soon. If anyone here knows someone who works for or with OpenAI or Google and might know more about those plans, I would be interested to hear.

In the meantime, I will keep exploring to see what I can find. On the Music21 forum someone recommended this PerformanceRNN notebook on Kaggle: https://www.kaggle.com/code/robbynevels/performancernn. I am not sure I have the skills to adapt my own MIDI files as training data for it - but I will try and see how far I get.
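
In case it is useful to anyone else, here is a sketch of the first step I am attempting (this assumes the note_seq and tensorflow packages and follows the input format Magenta's own Performance RNN pipeline expects; the Kaggle notebook may want its data in a different shape):

    import glob
    import note_seq
    import tensorflow as tf

    # Convert a folder of MIDI files into the notesequences.tfrecord
    # format that Magenta's RNN training pipelines consume.
    with tf.io.TFRecordWriter("notesequences.tfrecord") as writer:
        for path in glob.glob("my_midis/*.mid"):  # my_midis is my folder
            try:
                ns = note_seq.midi_file_to_note_sequence(path)
            except Exception as e:
                print(f"skipping {path}: {e}")  # skip unreadable files
                continue
            ns.filename = path
            writer.write(ns.SerializeToString())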

I welcome other thoughts from all of you here on this forum. I'll check back in when/if I find or develop anything interesting.

Best,
Robin