Google has made an AI that can transform text prompts into music that lasts several minutes.
As The Verge reports, the AI model, similar to OpenAI’s image generator DALL-E, is called MusicLM and was revealed by Google in a research paper penned
by 13 researchers. The paper includes a plethora of samples made using
MusicLM, including five-minute clips of melodic techno, swing, and
jazz. The samples range in genre from meditation sounds and electronic
music to death metal and rap.
The AI was also able to generate music from a combination of melody and
text prompts. In one case, it generated an opera vocal to the melody of
“Bella Ciao” being hummed. In another example, MusicLM generated a song
from a “gym” prompt; the result had incoherent lyrics and a vocal and
melody with a distinctively Arab-pop sound.
The tool could also generate a fusion of reggaeton and electronic music
“with a spacey, otherworldly sound” that induces the experience of
being “lost in space,” as one detailed text prompt reads.
The researchers said their experiments with the AI show that “MusicLM
outperforms previous systems both in audio quality and adherence to the
text description. Moreover, we demonstrate that MusicLM can be
conditioned on both text and a melody in that it can transform whistled
and hummed melodies according to the style described in a text caption.”
Those hoping to try the music-generating AI tool for themselves will be
disappointed to hear that Google has “no plans to release models at
this point.” The researchers cite risks of “potential misappropriation
of creative content” as well as potential cultural appropriation or
misrepresentation.
However, the research paper says a public dataset of around 5,500
music-text pairs is being released, which Google says can help with the
training and evaluation of other music-based AIs.