del.icio.us crawl?


Aaron Swartz

Jul 22, 2008, 10:53:10 AM
to get-t...@googlegroups.com
Does anyone have any kind of decent-sized crawl/dump of del.icio.us
data? It seems like it'd be fun to run some collaborative filtering
algorithms on it.

Philip (flip) Kromer

Aug 22, 2008, 6:58:18 PM
to get-t...@googlegroups.com
Did anyone ping you back on this? I have the start of one that I'm using to mine for dataset links. I also have a million-ish nodes of the Twitter friend graph.

Lemme know if anyone's interested in analysing either.

*==*

Besides the dataset links, once we've got infochimps.org to where I can really attack the provenance/trust systems, I'm interested in using these graphs to derive approximate reputation from synthetic identities. Say Alice and Bob, two users on your site, have proven to be non-asshats. Both follow each other, and both follow someone named Carol on twitter.com, metafilter.com and del.icio.us.

It's reasonable to believe that all those Carols are the same person; and if yoursite.com/Alice decides to follow yoursite.com/Carol, it's reasonable to provisionally trust this Carol, and even to extend diluted trust forward from how you regard twitter.com/carol, metafilter.com/carol and del.icio.us/carol.
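
A rough sketch of that idea in Python, not a real trust metric. Everything here is an assumption made for illustration: the toy follow graphs, the function names `external_backers` and `provisional_trust`, the rule that the same handle on another site is the same person, and the 0.5 dilution factor. The score is the share of already-trusted users who follow the handle on each external site, averaged over those sites and then diluted.

```python
# Hypothetical toy data: FOLLOWS[site][user] = set of handles that user follows there.
FOLLOWS = {
    "yoursite.com": {"alice": {"bob"}, "bob": {"alice"}},
    "twitter.com": {"alice": {"bob", "carol"}, "bob": {"alice", "carol"}},
    "metafilter.com": {"alice": {"carol"}, "bob": {"carol"}},
    "del.icio.us": {"alice": {"carol"}, "bob": {"carol"}},
}

# Users already vetted on the home site, with a trust weight in [0, 1].
TRUSTED = {"alice": 1.0, "bob": 1.0}


def external_backers(handle, follows, trusted, home_site):
    """Return {site: trusted users who follow `handle` there}, skipping the home site.

    Assumption: an identical handle on another site is the same person.
    """
    backers = {}
    for site, graph in follows.items():
        if site == home_site:
            continue
        found = {user for user in trusted if handle in graph.get(user, set())}
        if found:
            backers[site] = found
    return backers


def provisional_trust(handle, follows, trusted, home_site, dilution=0.5):
    """Diluted trust for `handle`, between 0.0 and `dilution`.

    Per external site: trust-weighted share of vetted users following `handle`.
    The per-site shares are averaged, then multiplied by the (assumed) dilution.
    """
    external_sites = [site for site in follows if site != home_site]
    total_weight = sum(trusted.values())
    if not external_sites or total_weight == 0:
        return 0.0
    backers = external_backers(handle, follows, trusted, home_site)
    per_site = [
        sum(trusted[user] for user in backers.get(site, ())) / total_weight
        for site in external_sites
    ]
    return dilution * sum(per_site) / len(external_sites)


if __name__ == "__main__":
    for handle in ("carol", "mallory"):
        print(handle, provisional_trust(handle, FOLLOWS, TRUSTED, "yoursite.com"))
```

With this toy data, Carol is followed by both vetted users on all three external sites, so she gets the full diluted score, while a handle nobody follows gets zero. A real system would need stronger evidence than a matching handle before merging identities.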

Anyway, modulo privacy concerns and in concert with other trust metrics, it might be interesting to set up a community clearinghouse to connect and claim these synthesized identities.

flip 

Aaron Swartz

Aug 23, 2008, 8:20:02 AM
to get-t...@googlegroups.com
> Did anyone ping you back on this?

Nope.

Rufus Pollock

Aug 26, 2008, 7:56:00 AM
to get-t...@googlegroups.com
On 22/08/08 23:58, Philip (flip) Kromer wrote:
> On 7/22/08, Aaron Swartz <m...@aaronsw.com> wrote:
>>
>> Does anyone have any kind of decent-sized crawl/dump of del.icio.us
>> data? It seems like it'd be fun to run some collaborative filtering
>> algorithms on it.
>>
>
> Did anyone ping you back on this? I have a start of one I'm using to mine
> for datasets links. I also have a million-ish nodes of the twitter friend
> graph.

I'd be interested in the twitter graph. I'd also be particularly
interested in dynamic data on networks (i.e. data over time that allows
one to look at network evolution).

~rufus

[snip]

Joseph Turian

Aug 26, 2008, 12:15:58 PM
to get-t...@googlegroups.com
Rufus,

> I'd be interested in the twitter graph. I'd also be particularly
> interested in dynamic data on networks (i.e. data over time that allows
> one to look at network evolution).

You may also be interested in http://hashtags.org

I know the fellow who runs this site. Let me know if you'd like me to
talk to him about a data dump.

Joseph

--
Academic: http://www-etud.iro.umontreal.ca/~turian/
Business: http://www.metaoptimize.com/
