Chamber music files in MAESTRO-v2.0.0


Taegyun Kwon

Oct 29, 2020, 1:34:00 AM
to magenta...@tensorflow.org
Dear Magenta Team,

While looking through the MAESTRO dataset v2,
I found some chamber pieces among the newly added files.

Since the dataset is widely recognized as a pure-piano dataset and is widely used for benchmarking,
I think it would be better to inform users or correct it.
I also found some research utilizing the v2 version, but I haven't found any comment about this issue.

I looked up the piece names and canonical titles of the whole dataset and found 6 chamber pieces (5 in the train split, 1 in the test split).
Fortunately, they were all performed in 2018, so v1 does not have this issue.
I suspect some chamber pieces were not filtered out of the 2018 performances for some reason.

Here is the list of chamber pieces I found:
Anton Arensky Piano Trio in D Minor, Op. 32
2018/MIDI-Unprocessed_Chamber2_MID--AUDIO_09_R3_2018_wav--3.wav
Anton Arensky Piano Trio in D Minor, Op. 32
2018/MIDI-Unprocessed_Chamber4_MID--AUDIO_11_R3_2018_wav--3.wav
Anton Arensky Piano Trio in D Minor, Op. 32
2018/MIDI-Unprocessed_Chamber6_MID--AUDIO_20_R3_2018_wav--3.wav
Ludwig van Beethoven Piano Trio in D Major, Op. 70 No. 1
2018/MIDI-Unprocessed_Chamber1_MID--AUDIO_07_R3_2018_wav--2.wav
Ludwig van Beethoven Piano Trio in D Major, Op. 70 No. 1
2018/MIDI-Unprocessed_Chamber3_MID--AUDIO_10_R3_2018_wav--3.wav
Felix Mendelssohn Piano Trio No. 1 in D Minor, Op. 49 (test)
2018/MIDI-Unprocessed_Chamber5_MID--AUDIO_18_R3_2018_wav--2.wav
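
For anyone who wants to repeat this check, here is a minimal sketch of the kind of lookup described above. It assumes the metadata CSV has canonical_composer, canonical_title, split, and audio_filename columns. The keyword list and the function name find_chamber_rows are illustrative choices, not part of any Magenta tooling, and the second sample row is a made-up placeholder.

```python
import csv
import io

# Hypothetical excerpt in the layout of the MAESTRO metadata CSV.
# The column names are an assumption; adjust them to match your copy.
# The second data row is a made-up placeholder for a solo piano recording.
SAMPLE_CSV = '''canonical_composer,canonical_title,split,year,audio_filename
Anton Arensky,"Piano Trio in D Minor, Op. 32",train,2018,2018/MIDI-Unprocessed_Chamber2_MID--AUDIO_09_R3_2018_wav--3.wav
Example Composer,Example Solo Piece,train,2018,2018/example_solo_recording.wav
'''


def find_chamber_rows(rows, title_keywords=("trio", "quartet", "quintet")):
    """Return rows whose filename or title suggests a chamber piece."""
    hits = []
    for row in rows:
        # The 2018 chamber recordings listed above all have "Chamber" in the filename.
        in_name = "chamber" in row["audio_filename"].lower()
        # Title keywords are a rough heuristic and may need manual review.
        in_title = any(k in row["canonical_title"].lower() for k in title_keywords)
        if in_name or in_title:
            hits.append(row)
    return hits


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in find_chamber_rows(rows):
    print(row["split"], row["canonical_title"], row["audio_filename"])
```

To run it on the real metadata, replace the io.StringIO(SAMPLE_CSV) argument with an open file handle for the CSV. Any hits from the title keywords should still be checked by hand.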

I also have another question.
When I compared v1 and v2 of the dataset,
I found that only one piece is missing from v2:
2014/MIDI-UNPROCESSED_06-08_R1_2014_MID--AUDIO_08_R1_2014_wav--3.midi
With this piece included, I think v1 and v2 would be compatible.
Is there a reason it was left out?
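
The v1/v2 comparison can be sketched as a set difference over the file lists of the two versions. This is only an illustration: the helper name missing_from and the file names below are hypothetical placeholders, and it assumes both versions use the same relative paths for the same recordings.

```python
def missing_from(old_files, new_files):
    """Return the entries of old_files that do not appear in new_files, sorted."""
    return sorted(set(old_files) - set(new_files))


# Hypothetical placeholder listings; in practice, read the filename
# column from each version's metadata file instead.
v1_files = ["2014/a.midi", "2014/b.midi", "2017/c.midi"]
v2_files = ["2014/a.midi", "2017/c.midi", "2018/d.midi"]

print(missing_from(v1_files, v2_files))
```

Swapping the arguments lists the files that are new in the later version instead.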

Thank you for your efforts!

Best, Taegyun

Curtis "Fjord" Hawthorne

Oct 29, 2020, 12:29:42 PM
to Taegyun Kwon, Chris Donahue, Magenta Discuss
Hi Taegyun,

Sorry about that! +Chris Donahue and I also found these tracks recently and the list we came up with matches what you found. I plan on releasing a v3 of the dataset soon that just excludes those tracks.

The addition of chamber music was new in the 2018 round of the competition, which is why these weren't present in v1 of the dataset. I thought we'd caught all instances of non-piano instruments before the v2 release, but clearly we missed a few!

I don't remember exactly why that piece you mentioned in v1 was omitted from v2, but we did some deduplication steps, so that may be the reason it's left out.

Thanks for the report!

-Fjord


Curtis "Fjord" Hawthorne

Nov 30, 2020, 6:43:17 PM
to Taegyun Kwon, Chris Donahue, Magenta Discuss
Thanks again for the report! I just finished releasing MAESTRO v3, which removes these files:


Let me know if you find any other issues!

-Fjord