When to use: basic, lookback, attention

Jeremy Ellis

Jul 29, 2016, 7:10:50 PM
to Magenta Discuss
Say I wanted to train Magenta to output a merge between two songs, say Mary Had a Little Lamb and London Bridge Is Falling Down (see attached). So I want something like: Mary Had a London Bridge!

I tend to use mostly defaults, so 50 layers, no primer, and about 1000 training loops. Which technique would be better for this situation: basic_rnn, lookback_rnn, or attention_rnn? Also, does anyone have input on better default settings? Are 50 layers or 1000 training loops too much for songs as simple as these?
london-bridges.mid
mary-had-a-little-lamb.mid

Dan Abolafia

Aug 17, 2016, 1:29:46 AM
to Magenta Discuss
In what way do you want to merge the songs? An interpolation between the notes?

I think an RNN language model (next-step prediction) is not the right technique for that (basic_rnn, lookback_rnn, and attention_rnn are all RNN language models). Models which are pretty good at generating interpolations are autoencoders and generative adversarial networks. However, doing that for sequences is pretty much at the cutting edge of machine learning and fairly experimental.

There is a recent paper where the model was able to interpolate between poems, which is pretty neat. https://arxiv.org/pdf/1511.06349.pdf 
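To make the autoencoder idea concrete: once a model has a trained encoder and decoder, interpolating between two songs is just a straight line in latent space. Here is a minimal sketch; the encode and decode functions are hypothetical stand-ins for a trained model, not anything that exists in Magenta today:

    import numpy as np

    def interpolate_melodies(encode, decode, melody_a, melody_b, steps=5):
        # encode: melody -> latent vector z (hypothetical trained encoder)
        # decode: latent vector z -> melody (hypothetical trained decoder)
        z_a = encode(melody_a)  # e.g. Mary Had a Little Lamb
        z_b = encode(melody_b)  # e.g. London Bridge
        melodies = []
        for t in np.linspace(0.0, 1.0, steps):
            z = (1.0 - t) * z_a + t * z_b  # convex combination of latents
            melodies.append(decode(z))
        return melodies

Each intermediate melody is the decoder's best guess at a song "between" the two inputs, which is exactly the "Mary Had a London Bridge" effect.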

Jeremy Ellis

Aug 17, 2016, 9:41:16 PM
to Magenta Discuss
Dan,

That looks interesting. The three types: sequence autoencoders, skip-thought, and paragraph vector.

Any chance of getting the authors (Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz & Samy Bengio of Google Brain) to have a look at Magenta?

Dan Abolafia

Aug 17, 2016, 10:46:43 PM
to Jeremy Ellis, Magenta Discuss
Well, I think most of the authors know of Magenta (Samy, whom I work for, has certainly looked at it). I can't make any promises about Magenta implementing these models. We are branching out into a lot of different areas in ML, and the research directions are inexhaustible. Community contributions are always welcome :)



Jeremy Ellis

Aug 18, 2016, 12:14:19 PM
to Magenta Discuss
Dan: I don't mind trying to do the coding; I just want to make sure it has some chance of success. I may have to try TensorFlow directly, as I have already converted melodies into normalized shapes at

http://rocksetta.com/rocksetta-music-shapes/

To refine what I want to do: I want to plug some number of melodies into Magenta, let's say nursery rhymes, and have Magenta generate its own version of a nursery rhyme, or a pop song, or a classical piece, etc. Magenta's basic_rnn is very close; the generated songs are just a bit too random, and then they tend to snap to a learned song too easily, especially after many training loops. The temperature flag does not seem to help with my issue.
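For context on what the temperature flag does: it rescales the model's output distribution before each note is sampled. A minimal sketch of the standard trick (not Magenta's exact implementation; the function name here is made up for illustration):

    import numpy as np

    def sample_with_temperature(logits, temperature=1.0, rng=np.random):
        # temperature < 1.0 sharpens the distribution (output snaps to
        # learned songs); temperature > 1.0 flattens it (more random).
        scaled = np.asarray(logits, dtype=np.float64) / temperature
        scaled -= scaled.max()  # subtract max for numerical stability
        probs = np.exp(scaled) / np.sum(np.exp(scaled))
        return rng.choice(len(probs), p=probs)

This may be why temperature alone doesn't help here: it only trades randomness against memorization along a single axis, rather than blending songs.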


If I use TensorFlow and melody shapes, I might be able to have TensorFlow generate a shape and then convert it back to a melody, but that seems like a lot of work, especially since Magenta is already working. Could you pass this issue on to Samy?


Tim Cooijmans

Aug 18, 2016, 1:59:50 PM
to Jeremy Ellis, Magenta Discuss
Jeremy, don't be afraid to take the Magenta code and make it do what you want it to. You seem to have a clear idea in your head; see if you can get it to work!

The basic_rnn is a very simple model, so having it "already working" doesn't mean that much. Implementing research ideas often involves starting from scratch. Certainly *my* work at Magenta over the last few months was done from scratch, and my code does not have much in common with the basic_rnn code.

Don't wait for us, go hack around with the code!

Eric Nichols

Aug 18, 2016, 2:11:40 PM
to Tim Cooijmans, Jeremy Ellis, Magenta Discuss
Jeremy,
  This is only slightly related to your original idea above, but it might be interesting: check out David Cope's "Mozart in Bali" for an example of gradually morphing between styles over the course of a single song.

Jeremy Ellis

Aug 19, 2016, 1:22:33 AM
to Magenta Discuss, keyfre...@gmail.com

Tim, I appreciate your support. I have been looking at the code but can't find a starting location. Can someone point me in the right direction (a GitHub file and line number)? I want to filter each MIDI note just before it is written to the output MIDI file, at a point where I still have access to TensorFlow variables.

Dan Abolafia

Aug 19, 2016, 2:59:15 PM
to Magenta Discuss, keyfre...@gmail.com
Jeremy, if you decide to create a model there are two things you would need to make:
  1. Code that creates your dataset (what actually gets fed into TensorFlow)
  2. The model itself
Generally the dataset code does not need to involve TensorFlow ops or variables.
EncoderPipeline is what converts the melodies (melodies_lib.MonophonicMelody) into TensorFlow inputs (SequenceExample protos). The get_pipeline method builds the MIDI to SequenceExample pipeline, and the run_from_flags function actually runs that pipeline over your MIDI files. For a melody model, you just need to implement your own EncoderPipeline class. There are many ways to get data into your model, and SequenceExample is one, but you are not constrained to outputting that format. You could simply output CSV or JSON files if you like.
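As a concrete illustration of the SequenceExample format, here is a minimal sketch of encoding a monophonic melody for next-step prediction. The vocabulary size and the one-hot input encoding are simplifying assumptions; Magenta's real EncoderPipeline does more:

    import tensorflow as tf

    NUM_CLASSES = 38  # assumed size of the melody event vocabulary

    def one_hot(event):
        vec = [0.0] * NUM_CLASSES
        vec[event] = 1.0
        return vec

    def melody_to_sequence_example(events):
        # Inputs are one-hot encodings of each event; labels are the
        # same events shifted one step ahead (next-step prediction).
        inputs = [
            tf.train.Feature(float_list=tf.train.FloatList(value=one_hot(e)))
            for e in events[:-1]
        ]
        labels = [
            tf.train.Feature(int64_list=tf.train.Int64List(value=[e]))
            for e in events[1:]
        ]
        feature_lists = tf.train.FeatureLists(feature_list={
            'inputs': tf.train.FeatureList(feature=inputs),
            'labels': tf.train.FeatureList(feature=labels),
        })
        return tf.train.SequenceExample(feature_lists=feature_lists)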

Building the model will be the hard part. If you have not already, I would highly recommend going through the TensorFlow tutorials at www.tensorflow.org, and maybe other good tutorials on the internet. Understanding how to get data into your model (with readers and batch queues, or feed dicts) is important. I would also encourage you to look at code for models that other people have built in TF.

Our basic RNN model is a good example of how to build an RNN model in TensorFlow. If you were going to build something like an autoencoder (specifically a variational autoencoder), take a look at implementations like this one: https://jmetzen.github.io/2015-11-27/vae.html

My last piece of advice would be to start simple and iterate. Trying to build the "Generating Sentences from a Continuous Space" model from scratch would be a challenge. Instead, start by taking the variational autoencoder code from jmetzen.github.io and training it on fixed-length melodies (pad short melodies and truncate long melodies). Then you will be able to try out the melody interpolation you want pretty quickly.
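The fixed-length preprocessing Dan describes is only a few lines. A sketch, where the padding value is an assumption (use whatever your encoding treats as "no event"):

    NO_EVENT = 0  # assumed "no event" value in the melody encoding

    def to_fixed_length(events, length):
        # Pad short melodies and truncate long ones so every training
        # example fed to the autoencoder has the same length.
        padded = events + [NO_EVENT] * max(0, length - len(events))
        return padded[:length]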

I hope that gives you a pretty good idea of how to start tinkering. Cheers!

Jeremy Ellis

Aug 20, 2016, 7:55:44 AM
to Magenta Discuss, keyfre...@gmail.com
Wow, great explanation Dan. I will probably hack around with the basic_rnn but that has given me tons to work on. Thank you.


Dan Abolafia

Aug 20, 2016, 12:54:58 PM
to Jeremy Ellis, Magenta Discuss
My pleasure, have fun!
