Music Recommendation Dataset (last.fm)


Li Ming

May 26, 2010, 1:57:21 AM
to Resys
From: Oscar Celma <oce...@gmail.com>
Date: Wed, May 26, 2010 at 12:46
Subject: [MUSIC-IR] Music Recommendation Dataset (last.fm) - Full
listening history for ~1K users


Here's another Music Recommendation Dataset, named "Last.fm Dataset - 1K".

This dataset contains <user-id, timestamp, artist-mbid, artist-name,
song-mbid, song-title> tuples collected from the Last.fm API, using the
user.getRecentTracks() method.
The dataset contains the full listening history (until May 5th, 2009)
for nearly 1,000 users.

For more information about the two available datasets see:
http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/index.html

I hope these datasets are useful for people doing research on music
recommendation, similarity, playlist generation, etc.

Cheers, Oscar

--->8----------------------------------

Last.fm Dataset - 1K users

========
README
========

Version 1.0, May 2010

. What is this?

This dataset contains <user, timestamp, artist, song> tuples
collected from the Last.fm API,
using the user.getRecentTracks() method.

This dataset covers the full listening history (until May 5th, 2009)
of nearly 1,000 users.

. Files:

userid-timestamp-artid-artname-traid-traname.tsv
(MD5: 64747b21563e3d2aa95751e0ddc46b68)
userid-profile.tsv
(MD5: c53608b6b445db201098c1489ea497df)
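The checksums above can be used to confirm that a download (or a mirrored copy) is intact. The following is a minimal sketch, assuming the two .tsv files have already been extracted locally; the helper names md5_of_file and verify are hypothetical and not part of the dataset.

```python
import hashlib

# Checksums as listed in the README above.
EXPECTED_MD5 = {
    "userid-timestamp-artid-artname-traid-traname.tsv": "64747b21563e3d2aa95751e0ddc46b68",
    "userid-profile.tsv": "c53608b6b445db201098c1489ea497df",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the large listening-history file
        # never has to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's MD5 matches the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected_hex.lower()
```

For example, verify("userid-profile.tsv", EXPECTED_MD5["userid-profile.tsv"]) should return True for an unmodified copy of the profile file.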

. Data Statistics:

File userid-timestamp-artid-artname-traid-traname.tsv

Total Lines: 19,150,868
Unique Users: 992
Artists with MBID: 107,528
Artists without MBID: 69,420

. Data Format:

The data is formatted one entry per line as follows (tab separated, "\t"):

userid-timestamp-artid-artname-traid-traname.tsv:
userid \t timestamp \t musicbrainz-artist-id \t artist-name \t musicbrainz-track-id \t track-name

userid-profile.tsv:
userid \t gender ('m'|'f'|empty) \t age (int|empty) \t country (str|empty) \t signup (date|empty)
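As an illustration of the listening-history format described above, here is a minimal parsing sketch. It assumes that each line has exactly six tab-separated fields, that missing MusicBrainz IDs appear as empty strings, and that timestamps follow the ISO-like form shown in the README example. The names ListenEvent and parse_event are hypothetical.

```python
from collections import namedtuple
from datetime import datetime, timezone

# One row of userid-timestamp-artid-artname-traid-traname.tsv.
ListenEvent = namedtuple(
    "ListenEvent",
    ["userid", "timestamp", "artist_mbid", "artist_name", "track_mbid", "track_name"],
)


def parse_event(line):
    """Split one tab-separated line into a ListenEvent.

    Empty MBID fields become None (an assumption about how missing IDs
    are stored); the timestamp becomes a timezone-aware UTC datetime.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    userid, ts, art_id, art_name, tra_id, tra_name = fields
    when = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return ListenEvent(userid, when, art_id or None, art_name, tra_id or None, tra_name)
```

Since the file has about 19 million lines, it is probably best to iterate over the open file line by line and call parse_event on each line rather than loading everything at once. The file encoding is not stated in the README; UTF-8 would be a reasonable first guess.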

. Example:

userid-timestamp-artid-artname-traid-traname.tsv:
user_000639 \t 2009-04-08T01:57:47Z \t MBID \t The Dogs D'Amour \t MBID \t Fall in Love Again?
user_000639 \t 2009-04-08T01:53:56Z \t MBID \t The Dogs D'Amour \t MBID \t Wait Until I'm Dead
...

userid-profile.tsv:
user_000639 \t m \t Mexico \t Apr 27, 2005
...
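The profile example above shows four values while the format lists five fields; presumably the age is empty for this user. The sketch below assumes that empty fields are still delimited by tabs (so every line has five columns) and that signup dates use the "Apr 27, 2005" style with English month abbreviations. The function name parse_profile is hypothetical.

```python
from datetime import datetime


def parse_profile(line):
    """Parse one line of userid-profile.tsv into a dict.

    Empty gender, age, country, or signup fields are returned as None.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    userid, gender, age, country, signup = fields
    return {
        "userid": userid,
        "gender": gender or None,
        "age": int(age) if age else None,
        "country": country or None,
        # %b matches abbreviated month names in the current locale;
        # this assumes an English (or C) locale.
        "signup": datetime.strptime(signup, "%b %d, %Y").date() if signup else None,
    }
```

If the real file turns out to omit empty columns entirely, the field-count check will raise a ValueError, which makes the mismatch visible instead of silently shifting columns.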

. License:

The data contained in lastfm-dataset-1K.tar.gz is distributed with
permission of Last.fm.
The data is made available for non-commercial use.
Those interested in using the data or web services in a commercial
context should contact:

partners [at] last [dot] fm

For more information, see the Last.fm terms of service.

. Acknowledgements:

Thanks to Last.fm for providing access to this data via their
web services.
Special thanks to Norman Casagrande.

Sucirst Yie

May 26, 2010, 12:40:47 PM
to re...@googlegroups.com
Thanks for sharing!
 
The download speed seems rather slow. Could someone mirror it inside China?

2010/5/26 Li Ming <limin...@gmail.com>



--
Sucirst

Loeb

May 26, 2010, 10:16:55 PM
to re...@googlegroups.com
Is there somewhere to host it? I can help download it and upload it!

2010/5/27 Sucirst Yie <suc...@gmail.com>

feng zhou

May 26, 2010, 10:48:24 PM
to re...@googlegroups.com
If it isn't too large, we have space within the education network where it can be hosted: http://ustor.hust.edu.cn/raise/newupload/ (50 MB upper limit).

simple

May 26, 2010, 11:01:15 PM
to Resys
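Our group does have a server on the education network that could offer the download, but this dataset is downloading very slowly, at only a few KB/s.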

On May 27, 10:16 AM, Loeb <loeb.c...@gmail.com> wrote:
> Is there somewhere to host it? I can help download it and upload it!
>
> 2010/5/27 Sucirst Yie <suci...@gmail.com>
>
> > Thanks for sharing!
> >
> > The download speed seems rather slow. Could someone mirror it inside China?
> >
> > 2010/5/26 Li Ming <liming....@gmail.com>
> >
> > Sucirst

Li Ming

May 27, 2010, 3:00:04 AM
to Resys
I'm downloading it now. The speed is fairly good, so it should finish soon.
If I share it on Google Docs, how would the download speed be for you?

Loeb

May 27, 2010, 3:54:20 AM
to re...@googlegroups.com
I've finished downloading it. It's a bit over 200 MB compressed. Where should I put it?

Li Ming

May 27, 2010, 3:54:38 AM
to Resys

Li Ming

May 27, 2010, 4:04:30 AM
to Resys
Last.fm Dataset - 1K users (user full listening history)
Uploaded to:
https://docs.google.com/leaf?id=0B6ALX4SM5RB-ODFjNjY2MGMtZmFjYS00MmZhLWFhYmEtYWU4MjRkNjdjNGI0&hl=zh_CN
Neither of my two files is compressed.
Everyone, please help by providing download mirrors.

On May 27, 4:54 pm, Loeb <loeb.c...@gmail.com> wrote:
> I've finished downloading it. It's a bit over 200 MB compressed. Where should I put it?
>

Li Ming

May 27, 2010, 3:58:35 AM
to Resys
Last.fm Dataset - 1K users (user full listening history)
I've put it here:
https://docs.google.com/leaf?id=0B5XdMfiGgRXNMTcyNmY1YzAtMzRmYS00MDJhLWJmZTItMDMyOTVlY2I5YTU2&hl=zh_CN

On May 27, 3:54 AM, Loeb <loeb.c...@gmail.com> wrote:
> I've finished downloading it. It's a bit over 200 MB compressed. Where should I put it?
>
