How to get a long list of Friends or Followers without hitting Twitter's rate limits?


Ron Hornbaker

Dec 6, 2013, 7:21:04 PM12/6/13
to twitter-...@googlegroups.com
I'm hitting a wall on this one. See my code at https://gist.github.com/ronhornbaker/7817176

I'm trying to get a complete friends list for a user with 54,000 friends. The API max results per request is 200 (https://dev.twitter.com/docs/api/1.1/get/friends/list), so I included that max count in my twitter gem call:

client.friends(twitter_username, {:cursor => cursor, :count => 200} )

and my plan was to sleep for about a minute between calls, to stay under the 15 calls per 15 minutes rate limit.

However, it appears that the client.friends method ignores that :count param: upon running my code, it churns for around 10 seconds, then errors out with "Rate limit exceeded." Inspecting the output file, I see 3,000 rows successfully collected – which makes sense, as that is 200 x 15 requests; firing 15 requests in 10 seconds gets me shut down by Twitter for the next 15 minutes.

Is there no way to throttle the client.friends or client.followers methods? Why does it make its requests rapid-fire, with no way to throttle? Or have I missed something in the documentation?
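For reference, the throttled loop I was hoping to write looks roughly like this. It is sketched against a stubbed page-fetcher so it runs without API credentials; fetch_all_throttled, fetch_page, and the fake pages are all placeholders, and the real call would be client.friends with :cursor and :count.

```ruby
# Sketch of one-page-at-a-time pagination with a sleep between requests.
# fetch_page stands in for client.friends(username, :cursor => c, :count => 200):
# it returns [users_on_page, next_cursor], with next_cursor == 0 on the last page.
def fetch_all_throttled(fetch_page, delay: 60)
  all = []
  cursor = -1
  while cursor != 0
    users, cursor = fetch_page.call(cursor)
    all.concat(users)
    sleep delay unless cursor == 0 # stay under 15 requests / 15 minutes
  end
  all
end

# Fake three pages of "friends" to exercise the loop without the network.
pages = { -1 => [%w[a b], 10], 10 => [%w[c d], 20], 20 => [%w[e], 0] }
stub  = ->(cursor) { pages.fetch(cursor) }
names = fetch_all_throttled(stub, delay: 0)
# names => ["a", "b", "c", "d", "e"]
```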

sferik

Dec 7, 2013, 4:40:18 AM12/7/13
to twitter-...@googlegroups.com
Have you read this yet?

Ron Hornbaker

Dec 7, 2013, 2:45:53 PM12/7/13
to twitter-...@googlegroups.com
Hi Erik, thanks for the quick reply, and for your work with the twitter gem.

I did see that, but it looked to me like the retry would just repeat the client.friends call from the start, and I'd end up getting the first 3,000 friends of the list over and over. I'd love to be wrong about this, so I'll give it a try later today and report back... hopefully the cursor will survive the retry...

-Ron


Ron Hornbaker

Dec 7, 2013, 3:57:29 PM12/7/13
to twitter-...@googlegroups.com
Hi Erik,

As I feared, it appears the cursor does not survive the retry when hitting the rate limit, so I'm still very stuck on this.

Here is my updated code, using your technique to detect and sleep through a rate limit error: https://gist.github.com/ronhornbaker/7817176

When run on the twitter user with 54,000 friends, I ended up getting the first 3,000 friends written to the file repeatedly.
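What I think is happening: the cursor assignment sits after the each loop, so when the rate-limit exception fires mid-enumeration, the retry restarts the begin block with the old cursor and re-fetches pages it already wrote. A toy simulation of that pattern (everything here, including enumerate_all and RateLimited, is invented to mimic the shape of the gem's paginated each):

```ruby
RateLimited = Class.new(StandardError)

# enumerate_all mimics a paginated #each: starting from `cursor`, it yields
# users page by page, raising once it has fetched `limit` pages in one call.
def enumerate_all(cursor, limit)
  fetched = 0
  while cursor != 0
    raise RateLimited if fetched == limit
    fetched += 1
    yield "user#{cursor}"
    cursor = cursor >= 4 ? 0 : cursor + 1
  end
end

rows = []
retried = false
begin
  enumerate_all(1, 2) { |u| rows << u } # always restarts at cursor 1
rescue RateLimited
  unless retried
    retried = true
    retry
  end
end
# rows => ["user1", "user2", "user1", "user2"]
# The retry re-collected the same two users, because the starting
# cursor was never advanced past the pages already fetched.
```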

I'll paste the pertinent method here for convenience:

def fetch_all_friends(twitter_username, max_attempts = 100)
  # in theory, one failed attempt will occur every 15 minutes, so this could be long-running
  # with a long list of friends
  num_attempts = 0
  client = twitter_client
  myfile = File.new("#{twitter_username}_friends_list.txt", "w")
  running_count = 0
  cursor = -1
  while (cursor != 0) do
    begin
      num_attempts += 1
      friends = client.friends(twitter_username, {:cursor => cursor, :count => 200} )
      friends.each do |f|
        running_count += 1
        myfile.puts "\"#{running_count}\",\"#{f.name.gsub('"','\"')}\",\"#{f.screen_name}\",\"#{f.url}\",\"#{f.followers_count}\",\"#{f.location.gsub('"','\"').gsub(/[\n\r]/," ")}\",\"#{f.created_at}\",\"#{f.description.gsub('"','\"').gsub(/[\n\r]/," ")}\",\"#{f.lang}\",\"#{f.time_zone}\",\"#{f.verified}\",\"#{f.profile_image_url}\",\"#{f.website}\",\"#{f.statuses_count}\",\"#{f.profile_background_image_url}\",\"#{f.profile_banner_url}\""
      end
      puts "#{running_count} done"
      cursor = friends.next_cursor
      break if cursor == 0
    rescue Twitter::Error::TooManyRequests => error
      if num_attempts <= max_attempts
        # NOTE: Your process could go to sleep for up to 15 minutes but if you
        # retry any sooner, it will almost certainly fail with the same exception.
        puts "#{running_count} done"
        puts "Hit rate limit, sleeping for #{error.rate_limit.reset_in}..."
        sleep error.rate_limit.reset_in
        retry
      else
        raise
      end
    end
  end
end

And here was the output to the console prior to me stopping the script:

3000 done
Hit rate limit, sleeping for 881...
6000 done
Hit rate limit, sleeping for 876...
9000 done
Hit rate limit, sleeping for 878...

Looks good, except when I checked the file, there were 3 copies of every friend. :(

-Ron

Ron Hornbaker

Dec 8, 2013, 6:40:23 PM12/8/13
to twitter-...@googlegroups.com
FYI, I figured it out, and updated the gist at https://gist.github.com/ronhornbaker/7817176 with my working code.

Had to update the cursor in the rescue block, like so:

def fetch_all_friends(twitter_username, max_attempts = 100)
  # in theory, one failed attempt will occur every 15 minutes, so this could be long-running
  # with a long list of friends
  num_attempts = 0
  client = twitter_client
  myfile = File.new("#{twitter_username}_friends_list.txt", "w")
  running_count = 0
  cursor = -1
  while (cursor != 0) do
    begin
      num_attempts += 1
      # 200 is max, see https://dev.twitter.com/docs/api/1.1/get/friends/list
      friends = client.friends(twitter_username, {:cursor => cursor, :count => 200} )
      friends.each do |f|
        running_count += 1
        myfile.puts "\"#{running_count}\",\"#{f.name.gsub('"','\"')}\",\"#{f.screen_name}\",\"#{f.url}\",\"#{f.followers_count}\",\"#{f.location.gsub('"','\"').gsub(/[\n\r]/," ")}\",\"#{f.created_at}\",\"#{f.description.gsub('"','\"').gsub(/[\n\r]/," ")}\",\"#{f.lang}\",\"#{f.time_zone}\",\"#{f.verified}\",\"#{f.profile_image_url}\",\"#{f.website}\",\"#{f.statuses_count}\",\"#{f.profile_background_image_url}\",\"#{f.profile_banner_url}\""
      end
      puts "#{running_count} done"
      cursor = friends.next_cursor
      break if cursor == 0
    rescue Twitter::Error::TooManyRequests => error
      if num_attempts <= max_attempts
        cursor = friends.next_cursor if friends && friends.next_cursor
        puts "#{running_count} done from rescue block..."

        puts "Hit rate limit, sleeping for #{error.rate_limit.reset_in}..."
        sleep error.rate_limit.reset_in
        retry
      else
        raise
      end
    end
  end
end
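One more cleanup worth noting: the hand-rolled quoting in the myfile.puts line escapes double quotes as \" which most CSV parsers will not accept (CSV doubles quotes instead of backslash-escaping them). Ruby's standard CSV library handles quoting, commas, and embedded newlines for you. A minimal sketch, with made-up row data:

```ruby
require "csv"

# Fields containing quotes, commas, or newlines are quoted per RFC 4180
# by the CSV library, rather than by hand with gsub.
rows = [
  [1, 'Ann "Tester"', "ann_t", "likes, commas"],
  [2, "Bob",          "bob_x", "two\nlines"],
]
csv = CSV.generate do |out|
  rows.each { |r| out << r }
end
# Inner quotes come out doubled ("Ann ""Tester""") and the multi-line
# field stays inside one quoted cell, so the file parses back to 2 rows.
```

In the fetch loop, CSV.open("#{twitter_username}_friends_list.txt", "w") would replace File.new, and each friend's fields would be appended as an array instead of an interpolated string.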