Reconstructed Audio Of TRAXLZU12903D05F94 From segments_pitches and segments_loudness

James Suruda

Nov 5, 2012, 1:18:26 PM
to millionso...@googlegroups.com
All-

The pitch data in the Million Song Dataset maps pitches from multiple octaves and voices down to a chroma vector of 12 pitch classes. If we re-render the audio from this chroma vector, is the original melody/harmony of the song still recognizable?

I used only the pitch and loudness data from the h5 file of track TRAXLZU12903D05F94 to reconstruct the audio. Playing every non-zero chroma feature resulted in an atonal mess, so I played back only those tones with a value greater than 0.98. For reference, I overlaid the original audio file after the first 13 seconds.
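For anyone who wants to try something similar, here is a rough sketch of the thresholded playback idea. It is not the exact code behind the audio in this post. It assumes the per-segment start times, 12-element chroma vectors, and loudness values in dB have already been read out of the h5 file into plain Python lists (the reading step is left out), and it makes up its own choices for everything the chroma vector does not specify: the sample rate, sine tones as the timbre, and a single fixed octave starting at middle C. The names `render`, `pitch_class_hz`, and `db_to_gain` are hypothetical helpers, not part of any dataset library.

```python
import math

SAMPLE_RATE = 22050   # assumption: output sample rate in Hz
C4_HZ = 261.63        # assumption: every pitch class is played in the octave starting at middle C


def pitch_class_hz(pc):
    """Frequency of pitch class pc (0 = C ... 11 = B) in the assumed octave."""
    return C4_HZ * 2.0 ** (pc / 12.0)


def db_to_gain(db):
    """Convert a loudness value in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def render(starts, pitches, loudness_db, duration, threshold=0.98):
    """Render sine tones for every chroma bin above `threshold`.

    starts      -- segment start times in seconds
    pitches     -- one 12-element chroma vector per segment
    loudness_db -- one loudness value in dB per segment
    duration    -- total track length in seconds
    Returns a list of float samples.
    """
    n = int(round(duration * SAMPLE_RATE))
    out = [0.0] * n
    # Each segment lasts until the next one starts; the last one runs to the end.
    ends = list(starts[1:]) + [duration]
    for t0, t1, chroma, db in zip(starts, ends, pitches, loudness_db):
        active = [pc for pc, value in enumerate(chroma) if value > threshold]
        if not active:
            continue  # nothing above the threshold: leave this segment silent
        # Split the segment's gain across the active tones so the mix stays bounded.
        gain = db_to_gain(db) / len(active)
        i0 = int(t0 * SAMPLE_RATE)
        i1 = min(int(t1 * SAMPLE_RATE), n)
        for i in range(i0, i1):
            t = (i - i0) / SAMPLE_RATE
            out[i] = gain * sum(
                math.sin(2.0 * math.pi * pitch_class_hz(pc) * t) for pc in active
            )
    return out
```

The returned samples are plain floats; writing them to a WAV file (for example with the standard `wave` module) would still need scaling and clipping to 16-bit integers. Because the octave is fixed, any melody that moves across octaves gets folded into one, which is exactly the limitation of the chroma representation.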

Audio here:

The melody is vaguely recognizable, yes?

It would have been very useful if Echo Nest Analyze had provided us with octave information, for instance five octaves of 12 tones, instead of a single bucket of octaves. But, so it goes.

Thanks,

Jim Suruda
Computer Science
Southern Illinois University
615.438.1277
