Obtaining the Dataset

aakash budhiraja

Jul 24, 2013, 11:50:00 AM

to millionso...@googlegroups.com

Hi!

I am working independently on this dataset as part of my summer project. Is there a way I could obtain the data, perhaps on a hard disk? I do not have the resources at hand to download the entire 280 GB dataset from the internet.
The dataset available at Infochimps can be downloaded in smaller chunks. More precisely, though, I am working on the MSD Challenge on Kaggle, and since the Kaggle set of song IDs is a subset of the entire dataset, I would like to work with the entire dataset. Also, it is not clear which part of the MSD the Kaggle dataset belongs to.
Or would it be sufficient to work with the subset?

Thank You

Aakash Budhiraja
Junior Undergraduate
IIT Delhi
+91- 9990003686

Arjannikov, Tom

Jul 24, 2013, 9:45:23 PM

to millionso...@googlegroups.com

Hi Aakash Budhiraja,

I believe http://www.kaggle.com/c/msdchallenge is the challenge you’re referring to. If so, then the challenge is to use existing user information about some of the songs in MSD to predict which other songs in MSD a given user would listen to. In fact, they give you half of the listening data for many users, and you’re to predict the other half for the same users. So yes, you would use the entire MSD and try to find a subset of songs from it based on another subset given to you. And no, it would not be sufficient to work with a subset. Furthermore, Kaggle gives you song IDs only, while MSD contains the pairing of song IDs with the corresponding song metadata, which could be helpful for making more accurate predictions.
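
To make the pairing of song IDs with metadata concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the sample rows, the (user, song ID, play count) layout, the metadata fields, and the helper names `attach_metadata` and `artists_by_user` are hypothetical and do not come from the Kaggle or MSD files themselves.

```python
from collections import Counter

# Hypothetical listening rows: (user, song_id, play_count).
# The real challenge files may be laid out differently.
triplets = [
    ("user_a", "SONG1", 3),
    ("user_a", "SONG2", 1),
    ("user_b", "SONG2", 5),
]

# Hypothetical metadata lookup keyed by song ID, standing in for
# whatever metadata you extract from the full dataset.
metadata = {
    "SONG1": {"artist": "Artist X", "title": "Title One"},
    "SONG2": {"artist": "Artist Y", "title": "Title Two"},
}


def attach_metadata(triplets, metadata):
    """Pair each listening row with its metadata; unknown IDs get None fields."""
    joined = []
    for user, song_id, plays in triplets:
        info = metadata.get(song_id, {})
        joined.append({
            "user": user,
            "song_id": song_id,
            "plays": plays,
            "artist": info.get("artist"),
            "title": info.get("title"),
        })
    return joined


def artists_by_user(joined):
    """Sum play counts per artist for each user, skipping rows without metadata."""
    per_user = {}
    for row in joined:
        if row["artist"] is not None:
            per_user.setdefault(row["user"], Counter())[row["artist"]] += row["plays"]
    return per_user


joined = attach_metadata(triplets, metadata)
per_user = artists_by_user(joined)
```

With song IDs alone you can only count co-occurrences; once the rows carry an artist or title, features such as a user's most-played artists become available to a predictor.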

I hope this helps.

- Tom

--
You received this message because you are subscribed to the Google
Groups "Million Song Dataset" group.
To post to this group, send email to millionso...@googlegroups.com
To unsubscribe from this group, send email to
millionsongdata...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/millionsongdataset?hl=en?hl=en
Million Song Dataset main webpage:
http://labrosa.ee.columbia.edu/millionsong/
