Big Data for Music Analysis and Musicology

Tillman Weyde

Oct 24, 2014, 10:18:36 AM

to ismir2014-unco...@ismir.net

The size of the datasets available in MIR is growing, especially in commercial use (e.g. Spotify / The Echo Nest, iTunes) but also increasingly in openly available collections of audio or feature sets (e.g. at The Internet Archive or the Million Song Dataset). Beyond recommending tracks and generating playlists, the question arises of what we can learn from large datasets about music itself and its relation to culture, society and the world in general. Recent projects in the UK and elsewhere [1], [2] are creating new large datasets that provide, and sometimes integrate, different representations. In addition, computational methods for audio analysis tasks such as automatic transcription and chord extraction have progressed to a degree that makes an integrated musical analysis from audio feasible, at least on a statistical basis. Large datasets and a statistical approach enable us to ask and answer new questions about music that differ from traditional work- or composer-centred musicological analysis. Ethnomusicology is already embracing this approach, but on a relatively small scale. Given that large datasets are increasingly being made available, it is worth discussing which questions are worth asking and how we can use MIR technology to answer them.

[1] Digital Music Lab - Analysing Big Music Data (DML). URL: http://dml.city.ac.uk/

[2] Single Interface for Musical Score Searching and Analysis (SIMSSA). URL: http://simssa.music.mcgill.ca/

bobl...@gmail.com

Oct 28, 2014, 11:39:22 AM

to ismir2014-unco...@ismir.net

Interesting, Tillman! I am very interested in "Which questions are worth asking?" and its variant, "Which questions are being asked?" I think these questions are essential to specify before considering concrete datasets that were designed for other purposes.

And now with kiki/bouba there is a limitless dataset. :) (Could even that amount of data still not be enough for some algorithm to reverse-engineer the algorithms that produced it? Bit of a convoluted question, I admit.)

amelie....@gmail.com

Oct 30, 2014, 3:00:11 AM

to ismir2014-unco...@ismir.net

I'd be up for such a session too.

I'm happy to provide an industry perspective on how we can deal with Big Data in MIR.

Tillman Weyde

Oct 30, 2014, 5:01:08 AM

to ismir2014-unco...@ismir.net

Hi Amelie,

That’s great! Looking forward to hearing about your views and experience.

Best,

Tillman
