social graph methods with a bit more info

softprops

Mar 28, 2009, 10:16:46 PM
to Twitter Development Talk
It would be nice if the http://twitter.com/[friends|followers]/ids.format
URIs could return a bit more useful info, like the screen_name. I'm
new to using the API, but my understanding is that to do this I would
have to make 1 call to the graph method and then iterate over each
id and call http://twitter.com/users/show/id.format. The overhead of
doing that grows linearly with the number of friends/followers a
user has (one extra request per id), which is a lot when all I want
is the screen_name. The show method packs quite a bit of data for each user.
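
A minimal sketch of the call pattern described above, to make the request count concrete. The URL shapes come from this message; `fetch_json` and `StubTwitter` are hypothetical stand-ins (no real HTTP is performed), and the field name `screen_name` in the show response is assumed from the discussion.

```python
from typing import Any, Callable, Dict

BASE = "http://twitter.com"


def ids_url(kind: str, screen_name: str, fmt: str = "json") -> str:
    """Graph method URL; kind is 'friends' or 'followers'."""
    return f"{BASE}/{kind}/ids/{screen_name}.{fmt}"


def show_url(user_id: int, fmt: str = "json") -> str:
    """users/show URL for a single numeric id."""
    return f"{BASE}/users/show/{user_id}.{fmt}"


def screen_names(kind: str, screen_name: str,
                 fetch_json: Callable[[str], Any]) -> Dict[int, str]:
    """One graph call, then one users/show call per returned id."""
    ids = fetch_json(ids_url(kind, screen_name))
    return {uid: fetch_json(show_url(uid))["screen_name"] for uid in ids}


class StubTwitter:
    """Hypothetical in-memory stand-in for the API that counts requests."""

    def __init__(self, users: Dict[int, str]) -> None:
        self.users = users
        self.calls = 0

    def __call__(self, url: str) -> Any:
        self.calls += 1
        if "/ids/" in url:
            return sorted(self.users)
        # Pull the numeric id out of ".../users/show/<id>.<format>".
        user_id = int(url.rsplit("/", 1)[1].split(".")[0])
        return {"id": user_id, "screen_name": self.users[user_id]}
```

With N ids in the graph response this makes N + 1 requests, and each show response carries a full user record even though only one field is read.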

What is the best way to get a list of a given user's
friends' names while pulling the least amount of data?

A suggestion might be
http://twitter.com/[friends|followers]/ids/{screen_name}.format?incl=screen_name,other_attribute,some_other_attribute
where by default the response returns just the ids and, when requested
via a comma-delimited param, adds attributes from a limited, predefined
list of available attributes, or returns an error status if that list
contains attributes that are not available.
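
A rough sketch of how the suggested `incl` parameter could be validated on the server side. Nothing here is an existing Twitter API: the `ALLOWED_ATTRIBUTES` set, the function name, and the choice of 400 as the "error status" are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical whitelist of attributes the endpoint is willing to add.
ALLOWED_ATTRIBUTES = frozenset({"screen_name"})


def parse_incl(raw: Optional[str]) -> Tuple[int, List[str]]:
    """Return (status, attributes).

    status 200 with an empty list means "ids only" (the default).
    status 400 (an assumed error code) lists the rejected attributes.
    """
    if not raw:
        return 200, []
    requested = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = [name for name in requested if name not in ALLOWED_ATTRIBUTES]
    if unknown:
        return 400, unknown
    return 200, requested
```

Rejecting the whole request when any attribute is unknown keeps the response shape predictable, which matches the "error status" behaviour proposed above.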

I realize this would break the semantics of the method's name, 'ids',
but maybe it could be encoded in element attributes.
http://twitter.com/friends/ids/{screen_name}.xml?incl=screen_name
might yield

<?xml version="1.0" encoding="UTF-8"?>
<ids>
<id screen_name="foo">1</id>
<id screen_name="bar">2</id>
</ids>

I'm not sure how that would be rendered as JSON.

Currently this is just [1,2,3,4], which packs a very light payload.
Maybe [{id:1,screen_name:'foo'},{id:2,screen_name:'bar'}]
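
One possible JSON rendering, sketched with Python's standard `json` module. The function name and the `incl` argument are hypothetical; the only firm point is that strict JSON needs double-quoted keys and strings, so the object form would look slightly different from the shorthand above.

```python
import json
from typing import Dict, Iterable, List


def render_ids(users: List[Dict], incl: Iterable[str] = ()) -> str:
    """Render a bare id array by default, or objects when attributes are requested."""
    attrs = list(incl)
    if not attrs:
        payload = [user["id"] for user in users]
    else:
        payload = [
            {"id": user["id"], **{name: user[name] for name in attrs}}
            for user in users
        ]
    # Compact separators keep the payload as small as possible.
    return json.dumps(payload, separators=(",", ":"))


users = [{"id": 1, "screen_name": "foo"}, {"id": 2, "screen_name": "bar"}]
plain = render_ids(users)
rich = render_ids(users, ["screen_name"])
print(len(plain), plain)
print(len(rich), rich)
```

Printing both lengths gives a feel for how much the payload grows once names are attached to every id.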


Damon Clinkscales

Mar 29, 2009, 12:47:53 AM
to twitter-deve...@googlegroups.com
On Sat, Mar 28, 2009 at 9:16 PM, softprops <d.ta...@gmail.com> wrote:
>
> It would be nice if the http://twitter.com/[friends|followers]/ids.format
> uri's could return a bit more useful info like the screen_name.

> .... [ snip ] ...


> <?xml version="1.0" encoding="UTF-8"?>
> <ids>
>  <id screen_name="foo">1</id>
>  <id screen_name="bar">2</id>
> </ids>

They aren't going to do this for performance reasons, even though yes,
it would be useful.

see http://is.gd/ptJ9

-damon

Nick Arnett

Mar 29, 2009, 10:32:22 AM
to twitter-deve...@googlegroups.com
On Sat, Mar 28, 2009 at 7:16 PM, softprops <d.ta...@gmail.com> wrote:

> What is the best way to get a list of a given user's
> friends' names while pulling the least amount of data?

Given the way things are, the fastest way I've found is to get the user's status timeline and pull the names from the tweets, then use the show call to get the ones that didn't show up in the timeline.  No matter how you do it, it is quite slow for people with a lot of friends.  It is nothing you'd want to try to do while the user waits for a response, unless they have very few friends.
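
A sketch of that approach, under the assumption that each status in the timeline embeds a `user` object carrying `id` and `screen_name`. The function name and the `show_user` callable are hypothetical; fetching the timeline itself is left out.

```python
from typing import Callable, Dict, List, Tuple


def names_via_timeline(friend_ids: List[int],
                       statuses: List[Dict],
                       show_user: Callable[[int], Dict]) -> Tuple[Dict[int, str], int]:
    """Harvest names from timeline statuses, then fall back to show for the rest.

    Returns the id -> screen_name map and the number of show calls made.
    """
    wanted = set(friend_ids)
    names: Dict[int, str] = {}
    for status in statuses:
        user = status["user"]
        if user["id"] in wanted:
            names[user["id"]] = user["screen_name"]
    # Friends who have not tweeted recently never appear in the timeline.
    missing = [uid for uid in friend_ids if uid not in names]
    for uid in missing:
        names[uid] = show_user(uid)["screen_name"]
    return names, len(missing)
```

The saving depends entirely on how many friends appear in the timeline pages fetched; quiet friends still cost one show call each.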

Nick

Damon Clinkscales

Mar 29, 2009, 2:52:44 PM
to twitter-deve...@googlegroups.com

An alternative solution may be possible though.

I've recently been reminded that @infochimps has a "massive scrape of
the Twitter social graph" and is willing to make that available, in
whole or in part. However, they are currently awaiting Twitter's
permission on precisely what can be released.

You can read more about this here ->
http://blog.infochimps.org/2008/12/29/massive-scrape-of-twitters-friend-graph/

Assuming that the data is released, even in a limited form, there is
potential there for an id<-->screen_name mapping table which could
serve as a "cache primer" for apps that need that. This could
potentially save a bajillion calls against Twitter's API, which in
turn would have other good effects. One of the most notable places
where this is obviously needed is tying Twitter Search results to
Twitter users. For historical reasons, the user id in the search
result is not the Twitter user_id, so you have to use the screen name.
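
To illustrate the "cache primer" idea, here is a small sketch of a two-way id/screen_name cache that starts from a bulk mapping and only calls out for ids it has never seen. The class, its method names, and the `fetch_name` callable are hypothetical, and the case-insensitive lookup is an assumption about how screen names should be compared.

```python
from typing import Callable, Dict, Optional


class NameCache:
    """Two-way id <-> screen_name cache seeded from a bulk mapping."""

    def __init__(self, primer: Dict[int, str],
                 fetch_name: Callable[[int], str]) -> None:
        self._name_by_id = dict(primer)
        # Assumption: screen names are matched case-insensitively.
        self._id_by_name = {name.lower(): uid for uid, name in primer.items()}
        self._fetch_name = fetch_name
        self.api_calls = 0

    def screen_name(self, user_id: int) -> str:
        """Look up a name, calling the API only on a cache miss."""
        if user_id not in self._name_by_id:
            self.api_calls += 1
            name = self._fetch_name(user_id)
            self._name_by_id[user_id] = name
            self._id_by_name[name.lower()] = user_id
        return self._name_by_id[user_id]

    def user_id(self, screen_name: str) -> Optional[int]:
        """Reverse lookup, e.g. for search results that only give a screen name."""
        return self._id_by_name.get(screen_name.lower())
```

The reverse lookup is the part that would help with search results, where only the screen name is usable.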

-damon
--
http://twitter.com/damon

softprops

Mar 29, 2009, 3:03:05 PM
to Twitter Development Talk
Wow! What a great idea: offloading the burden on Twitter's servers/DBs
to a simple id->name cache hosted by another service on someone
else's servers. I will have to check that out.

Jesse Stay

Mar 29, 2009, 5:38:24 PM
to twitter-deve...@googlegroups.com
If Twitter's going to allow this, why don't they just do it themselves and provide more accurate and up-to-date info?  How often does this cache update? I'm curious how accurate and reliable this would be, since people are constantly modifying their social graph.

Alex and crew have already said they might be able to provide more info once they fully convert over to their new architecture.  My hope is that once they're able to do that I can just pull subsets of each social graph down, such as "number of new followers since x date", or other criteria.  An FQL-type language (similar to Facebook's) would be ideal for something like that.
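
Until something like that exists, "new followers since x date" can be approximated on the client by keeping an earlier followers/ids snapshot and diffing it against a fresh one. This is only a sketch of that workaround; the function name is hypothetical and storing the snapshots is left to the caller.

```python
from typing import Iterable, List, Tuple


def diff_followers(previous: Iterable[int],
                   current: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Compare two id snapshots; return (gained, lost), each sorted."""
    before, after = set(previous), set(current)
    return sorted(after - before), sorted(before - after)


gained, lost = diff_followers([1, 2, 3], [2, 3, 4, 5])
print(gained, lost)
```

The result is only as precise as the snapshot interval: someone who followed and unfollowed between two snapshots never shows up.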

Jesse

Damon Clinkscales

Mar 30, 2009, 1:41:16 AM
to twitter-deve...@googlegroups.com
On Sun, Mar 29, 2009 at 4:38 PM, Jesse Stay <jess...@gmail.com> wrote:

> If Twitter's going to allow this, why don't they just do it themselves and
> provide more accurate and up-to-date info?

Yeah, that'd be nice. But, given everything going on, it's probably
not a priority right now.

> How often does this cache update? I'm curious how accurate and reliable this would be, since
> people are constantly modifying their social graph.

In the case of the id/screen_name thing, the data wouldn't change
much. Ideally, there'd be a way of forcing an update from Twitter in
the case of known/suspected stale data. As to keeping up with the
social graph, I think the current social graph methods are
sufficient/wonderful for that.
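
A sketch of what "forcing an update" might look like on the cache side: entries expire after a maximum age, and a caller that suspects stale data can force a refetch. The class name, the `fetch_name` callable, and the injectable `clock` are assumptions for illustration only.

```python
import time
from typing import Callable, Dict, Tuple


class RefreshingNameCache:
    """id -> screen_name cache with expiry and an explicit force-refresh."""

    def __init__(self, fetch_name: Callable[[int], str],
                 max_age_seconds: float,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch_name = fetch_name
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries: Dict[int, Tuple[str, float]] = {}

    def get(self, user_id: int, force: bool = False) -> str:
        """Return the cached name, refetching if forced, missing, or too old."""
        entry = self._entries.get(user_id)
        expired = entry is None or self._clock() - entry[1] > self._max_age
        if force or expired:
            entry = (self._fetch_name(user_id), self._clock())
            self._entries[user_id] = entry
        return entry[0]
```

Because screen names change rarely, a long maximum age plus an occasional forced refresh would keep the number of API calls low.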

-damon

Jesse Stay

Mar 30, 2009, 3:32:43 AM
to twitter-deve...@googlegroups.com
On Sun, Mar 29, 2009 at 11:41 PM, Damon Clinkscales <sca...@pobox.com> wrote:
>> How often does this cache update? I'm curious how accurate and reliable this would be, since
>> people are constantly modifying their social graph.
>
> In the case of the id/screen_name thing, the data wouldn't change
> much. Ideally, there'd be a way of forcing an update from Twitter in
> the case of known/suspected stale data.  As to keeping up with the
> social graph, I think the current social graph methods are
> sufficient/wonderful for that.

Ah, okay - so it's not necessarily a grab of the social graph then, but rather a user cache.  If that's what it is, I have a similar-sized cache that I could make available as well, assuming Twitter were to start allowing this. I'd be really surprised if they started to allow this though.

Although there is still the problem of keeping the data up-to-date.  People change their images, location, description, Tweets, number of followers/friends, etc. quite often.  I think Twitter could provide a cache of this data a lot faster than they could provide a way to easily force updates on stale data.  It sure would be nice though - I wouldn't have to make as many calls out to Twitter if they had a better way to get just the user updates.

Jesse

TechRavingMad

Mar 30, 2009, 1:41:19 AM
to Twitter Development Talk
You can always provide your own cache. It doesn't take that much to
get a complete name<->ID cache locally. What does take a lot of calls
is keeping it up-to-date. Since the name attached to an ID can change,
the cache is not always accurate (though the ID never changes).

It's a huge task to get that initial scrape; it takes about 2 months
depending on your access, but it's doable.

If we could make more calls per hour you could significantly cut that
time, or if Twitter made just that information available in a
"firehose" format where you could suck down the entire list at once. It'll
be a big file; there are almost 30 million user IDs now.
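
A back-of-the-envelope check of that estimate, assuming one users/show call per ID. The rate limits used here (100 calls per hour by default, 20,000 per hour for whitelisted access) are assumptions not stated in this thread; only the 30 million figure comes from the message above.

```python
def scrape_days(user_count: int, calls_per_hour: int) -> float:
    """Days needed to look up every user at one API call per user."""
    hours = user_count / calls_per_hour
    return hours / 24


USER_COUNT = 30_000_000          # "almost 30 million user IDs" from the message
WHITELISTED_PER_HOUR = 20_000    # assumed whitelisted rate limit
DEFAULT_PER_HOUR = 100           # assumed default rate limit

print(scrape_days(USER_COUNT, WHITELISTED_PER_HOUR))
print(scrape_days(USER_COUNT, DEFAULT_PER_HOUR))
```

Under the whitelisted assumption this works out to 62.5 days, which lines up with the "about 2 months" figure; at the assumed default limit the same scrape would take 12,500 days.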


softprops

Mar 30, 2009, 3:52:49 AM
to Twitter Development Talk
I think that is the point/trade-off. What is the real cost to Twitter
of developers making more calls for small chunks of data vs. fewer
calls for a slightly more custom set of data? It's less HTTP traffic but a
bigger payload. I guess it also depends on how the data is cached. As
Alex mentioned in the link above, "As they are, we fetch data from a
single data store in our architecture to return the lists of IDs. In
order to provide usernames, we'd have to bog down this request by
joining together multiple sources of data." It would require a bit of
rearchitecting on their part before I think we'd see a compromise being
made. The major difficulty, again, is maintaining the freshness of data
with users changing their screen names, among other things. Probably
easier said than done.

It would be great if Twitter did start opening up the caching of user
data to other services, and perhaps provided web hooks that get fired
when that external service's cache should be updated.
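
A sketch of what the receiving end of such a web hook could look like. No such hook exists in the API as discussed here, so the payload shape (a JSON object with `id` and `screen_name`) and the function name are purely hypothetical.

```python
import json
from typing import Dict


def handle_user_update(body: str, cache: Dict[int, str]) -> bool:
    """Apply a hypothetical 'user changed' notification to an id -> name cache.

    Returns True if the cached screen name actually changed.
    """
    event = json.loads(body)
    user_id = int(event["id"])
    new_name = event["screen_name"]
    changed = cache.get(user_id) != new_name
    cache[user_id] = new_name
    return changed
```

A push model like this would remove the need to poll for renamed users, since the cache would only be touched when something changed.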

