Transfer Learning for Music Transformer?


Kevin Jiang

Mar 22, 2020, 11:03:27 PM3/22/20
to Magenta Discuss
Hi all,

I'm an undergraduate who's just begun looking into transformers. I'm wondering if it would be possible to use transfer learning to train Music Transformer to generate performances in other genres, such as jazz. Since much of jazz relies on harmonies from classical music, many of which the model has presumably already learned from the existing dataset, I have a feeling this may be possible.

Does anyone have any suggestions for where I can start with this? Thanks in advance.

Ian Simon

Mar 23, 2020, 1:11:29 PM3/23/20
to Kevin Jiang, Magenta Discuss
Hi Kevin, I have tried taking a pretrained music transformer model and fine-tuning it on another dataset, and it seemed to work pretty well.

In general you'll want to follow the process described here but with some extra steps.  I'm not sure what your setup is, but you'll want to do something like this (assuming you have a bunch of MIDI files for training):

1) Convert your MIDI files to a TFRecord of NoteSequences as described here: https://github.com/tensorflow/magenta/tree/master/magenta/scripts

2) Add a new "problem" to score2perf.py, similar to this one but pointing at your TFRecord.

3) Run datagen for your new problem, as described here.  This encodes the NoteSequences into the sequences of event tokens used for training.

4) Download the pretrained checkpoint used by our colab notebook:

5) Train as described here, but use HPARAMS_SET=transformer_tpu, add num_hidden_layers=16 to your hparams, and add a --warm-start-from=/path/to/unconditional_model_16.ckpt flag to start training from the downloaded checkpoint.

I *think* that should work.  Sorry it's so involved :(
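For intuition about what the datagen step (3 above) produces: each NoteSequence gets flattened into a stream of performance event tokens (NOTE_ON, NOTE_OFF, TIME_SHIFT, and in the real encoder also VELOCITY bins) that the transformer treats as its vocabulary. Below is a toy, stdlib-only sketch of that idea; it is *not* Magenta's actual implementation (the real encoder lives in the score2perf / note_seq code), and the quantization constants are illustrative:

```python
# Toy sketch of performance-style event tokenization (illustrative only;
# the real encoder is Magenta's score2perf performance encoding).

TIME_STEP = 0.01       # 10 ms quantization, i.e. 100 steps per second
MAX_SHIFT_STEPS = 100  # one TIME_SHIFT token covers at most 1 second


def encode(notes):
  """Flatten (start_sec, end_sec, midi_pitch) notes into event tokens."""
  boundaries = []
  for start, end, pitch in notes:
    boundaries.append((start, 'NOTE_ON', pitch))
    boundaries.append((end, 'NOTE_OFF', pitch))
  boundaries.sort()  # chronological order; OFF before ON at equal times

  events = []
  t = 0.0
  for time, kind, pitch in boundaries:
    steps = round((time - t) / TIME_STEP)
    while steps > 0:  # emit time shifts to advance the clock
      shift = min(steps, MAX_SHIFT_STEPS)
      events.append(('TIME_SHIFT', shift))
      steps -= shift
    events.append((kind, pitch))
    t = time
  return events


# Two half-second notes, C4 then E4:
print(encode([(0.0, 0.5, 60), (0.5, 1.0, 64)]))
# -> [('NOTE_ON', 60), ('TIME_SHIFT', 50), ('NOTE_OFF', 60),
#     ('NOTE_ON', 64), ('TIME_SHIFT', 50), ('NOTE_OFF', 64)]
```

The point is just that after datagen, "music" is an ordinary token sequence, which is why a language-model checkpoint can be warm-started on a new genre.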

-Ian

--
Magenta project: magenta.tensorflow.org
To post to this group, send email to magenta...@tensorflow.org
To unsubscribe from this group, send email to magenta-discu...@tensorflow.org

Alejandro Ruiz

Mar 24, 2020, 11:19:48 AM3/24/20
to Magenta Discuss
I think they did something similar to what you are trying to do here: https://github.com/chrisdonahue/LakhNES

Xiaotao Luo

Aug 17, 2020, 5:25:15 AM8/17/20
to Magenta Discuss, Ian Simon, Magenta Discuss
Hi, Ian

I'm trying to fine-tune your released model on another dataset following the steps you mentioned above. At step 2, I added the code below to the "score2perf_maestro_language_uncropped_aug" problem, but the program terminated at the training stage with the error "assert not self.has_inputs".

```
  @property
  def absolute_timing(self):
    return True

  def score_encoders(self):
    return [
        ('melody', music_encoders.TextMelodyEncoderAbsolute(
            steps_per_second=10, min_pitch=MIN_PITCH, max_pitch=MAX_PITCH))
    ]
```

I think there must be some other steps I missed; could you please take a look at this?

Thank you.

Best,
Xiaotao

Xiaotao Luo

Aug 17, 2020, 6:36:14 AM8/17/20
to Magenta Discuss, Xiaotao Luo, Ian Simon, Magenta Discuss
I overrode the "has_inputs" property to return False, and now the training process is running as expected. But there may still be some other details missing.

Ian Simon

Aug 17, 2020, 11:13:02 AM8/17/20
to Xiaotao Luo, Magenta Discuss
Hi Xiaotao, the score2perf_maestro_language_uncropped_aug problem is just a piano language model.  It looks like you are trying to finetune it for melody -> piano performance, i.e. accompaniment.  If that's what you want to do, you could take the melody-conditioned checkpoint (replace "unconditional_model" in the above URLs with "melody_conditioned_model"), and start with the score2perf_maestro_absmel2perf_5s_to_30s_aug10x problem instead.

If in fact you want to train a language model, not a melody accompaniment model, then you should remove the lines you added.
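To make the failure mode concrete, here is a toy stand-in (not Magenta's actual implementation; the real logic lives in the Score2PerfProblem base class) showing how adding score encoders turns a targets-only language model into an input-to-target problem, which is what trips "assert not self.has_inputs":

```python
class ToyScore2PerfProblem:
  """Toy stand-in: score encoders imply conditioning inputs."""

  def score_encoders(self):
    return []  # no score encoders -> unconditional language model

  @property
  def has_inputs(self):
    # Melody-conditioned problems encode an input score alongside the
    # target performance; unconditional language models have targets only.
    return bool(self.score_encoders())


class ToyMelodyConditioned(ToyScore2PerfProblem):
  def score_encoders(self):
    return [('melody', 'hypothetical-encoder')]


print(ToyScore2PerfProblem().has_inputs)  # False: pure language model
print(ToyMelodyConditioned().has_inputs)  # True: a language-model training
                                          # path would hit the assertion
```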

-Ian

Xiaotao Luo

Aug 17, 2020, 11:11:42 PM8/17/20
to Magenta Discuss, Ian Simon, Magenta Discuss, Xiaotao Luo

Hi, Ian

I listened to the samples generated by the "unconditional_model", and they are amazing, so I want to finetune it on a specific genre of music. Following the instructions you gave earlier, some functions are left unimplemented, e.g. "performances_input_transform", "min_hop_size_seconds", etc., so I just copied them from the "score2perf_maestro_language_uncropped_aug" problem, which was wrong.

So I wonder if you could give us some insight into those unimplemented functions?

Best,
Xiaotao

Ian Simon

Aug 18, 2020, 10:36:19 AM8/18/20
to Xiaotao Luo, Magenta Discuss
Hi Xiaotao, performances_input_transform should return an Apache Beam "transform" that reads your NoteSequences TFRecord, which should live on GCS somewhere.  You can copy much of the existing implementation that points at the MAESTRO sequences.

The other fields have to do with cropping.  Probably the easiest thing to do is use the exact values as in the score2perf_maestro_language_uncropped_aug problem, which will apply a random crop at training time.

-Ian

Xiaotao Luo

Aug 18, 2020, 11:17:58 PM8/18/20
to Magenta Discuss, Ian Simon, Magenta Discuss, Xiaotao Luo
Thanks, Ian

I see. Basically, we need a problem similar to "score2perf_maestro_language_uncropped_aug" to finetune the released "unconditional_model_16" model, and we'll also have to override "score_encoders", "absolute_timing", and "has_inputs". Am I right?

For now, my t2t problem without augmentation looks like the code snippet below, but the training process gets stuck after "Saving checkpoints for 0 into /root/packages/magenta/models/absolute_melody2perf_problem/model.ckpt". Any idea what's going on?

```
@registry.register_problem('absolute_melody2perf_problem')
class AbsoluteMelody2PerfProblem(Score2PerfProblem):
  """Base class for musical (absolute-timed) melody-to-performance problems."""

  def performances_input_transform(self, tmp_dir):
    del tmp_dir
    return dict(
        (split_name, datagen_beam.ReadNoteSequencesFromTFRecord(tfrecord_path))
        for split_name, tfrecord_path in MAESTRO_TFRECORD_PATHS.items())

  @property
  def splits(self):
    return None

  @property
  def min_hop_size_seconds(self):
    return 0.0

  @property
  def max_hop_size_seconds(self):
    return 0.0

  @property
  def add_eos_symbol(self):
    return False

  @property
  def random_crop_in_train(self):
    return True

  @property
  def split_in_eval(self):
    return True

  @property
  def has_inputs(self):
    return False

  @property
  def absolute_timing(self):
    return True

  def score_encoders(self):
    return [
        ('melody', music_encoders.TextMelodyEncoderAbsolute(
            steps_per_second=10, min_pitch=MIN_PITCH, max_pitch=MAX_PITCH))]
```

Best,
Xiaotao

Xiaotao Luo

Aug 19, 2020, 3:19:32 AM8/19/20
to Magenta Discuss, Xiaotao Luo, Ian Simon, Magenta Discuss
I was able to finetune the conditional model yesterday but am now stuck on the unconditional model, which is strange. I looked through related discussions but couldn't find a solution.

Ian Simon

Aug 19, 2020, 12:23:02 PM8/19/20
to Xiaotao Luo, Magenta Discuss
Not sure what could be happening, but there may be compatibility issues between Tensor2Tensor and the latest TensorFlow. We're in the process of moving our Transformer setup off of Tensor2Tensor, but since you were able to successfully finetune the melody-conditioned model, I suspect it's just a minor configuration issue that's preventing the unconditional model from working.

I would start by just copying the "score2perf_maestro_language_uncropped_aug" problem exactly, but pointing at your data.
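A self-contained sketch of that "copy the problem, point it at your data" approach is below. Everything in it is hypothetical: the problem name and GCS paths are placeholders, and the stub classes stand in for the real imports in score2perf.py (Tensor2Tensor's registry, Magenta's Score2PerfProblem base class, and the datagen_beam module), so the actual field values and signatures may differ:

```python
class _Registry:
  """Stand-in for tensor2tensor's problem registry."""
  problems = {}

  def register_problem(self, name):
    def decorator(cls):
      self.problems[name] = cls
      return cls
    return decorator

registry = _Registry()


class Score2PerfProblem:
  """Stand-in for Magenta's Score2PerfProblem base class."""


class datagen_beam:
  """Stand-in namespace for Magenta's datagen_beam module."""

  @staticmethod
  def ReadNoteSequencesFromTFRecord(path):
    return ('read_note_sequences', path)  # placeholder for a Beam transform


MY_TFRECORD_PATHS = {  # hypothetical locations of your converted data
    'train': 'gs://my-bucket/my_genre_train.tfrecord',
    'eval': 'gs://my-bucket/my_genre_eval.tfrecord',
}


@registry.register_problem('score2perf_my_genre_language_uncropped_aug')
class Score2PerfMyGenreLanguageUncroppedAug(Score2PerfProblem):
  """Piano performance language model over a custom dataset."""

  def performances_input_transform(self, tmp_dir):
    del tmp_dir
    return {
        split_name: datagen_beam.ReadNoteSequencesFromTFRecord(tfrecord_path)
        for split_name, tfrecord_path in MY_TFRECORD_PATHS.items()
    }

  @property
  def splits(self):
    return None  # data is already split into train/eval files

  @property
  def min_hop_size_seconds(self):
    return 0.0

  @property
  def max_hop_size_seconds(self):
    return 0.0

  @property
  def add_eos_symbol(self):
    return False

  @property
  def random_crop_in_train(self):
    return True  # random crop at training time, as in the MAESTRO problem

  @property
  def split_in_eval(self):
    return True

  # Deliberately no score_encoders / absolute_timing / has_inputs overrides:
  # this stays a pure language model, so "assert not self.has_inputs" holds.
```

With something like this registered in score2perf.py, datagen and training would use the new problem name in place of score2perf_maestro_language_uncropped_aug.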

-Ian