Jonathan Eunice
Oct 3, 2008, 1:58:28 PM
to Twitter Development Talk
I run a script to collect the tweets in my friends_timeline. It uses
since_id to keep track of tweets already retrieved, advancing since_id
on each batch of tweets received. Works pretty well, but unfortunately
misses some tweets during heavy usage periods (like last night's US VP
debate). I suspect a scenario like this:
    friends_timeline since_id=1000
        gets tweets 1001...1050, with some holes in the sequence
    friends_timeline since_id=1050
        gets tweets 1051...1075, with some more holes in sequence
        at the same time tweets 1022, 1027, and 1031 are now available,
        but are no longer being requested, so missed
    iterate many times
        each time losing tweets that were previously requested, but at
        the time not ready
        and missing out on the "late arrivals"
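To make the suspected scenario concrete, here is a minimal simulation in Python. It does not call Twitter at all: `fetch` is a hypothetical stand-in for a friends_timeline request, and the set `visible` is my assumption about how late-arriving tweets might behave on the server side.

```python
def fetch(visible, since_id):
    """Hypothetical stand-in for a friends_timeline call:
    return the currently visible ids greater than since_id."""
    return sorted(i for i in visible if i > since_id)

# Assumed server state: 1022, 1027 and 1031 exist but are not yet visible.
late = {1022, 1027, 1031}
visible = set(range(1001, 1051)) - late

collected = set()
since_id = 1000

# First poll: gets 1001...1050 with holes, then advances since_id.
batch = fetch(visible, since_id)
collected.update(batch)
since_id = max(batch)

# The late tweets show up, along with newer ones.
visible |= late | set(range(1051, 1076))

# Second poll: only asks for ids above 1050, so the late ones are skipped.
batch = fetch(visible, since_id)
collected.update(batch)
since_id = max(batch)

missed = sorted(visible - collected)
print(missed)  # [1022, 1027, 1031]
```

If the real service behaves like this model, every advance of since_id permanently skips whatever was not yet visible below it.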
Is this scenario feasible/likely? And if so, what should I do to guard
against it?
I see no efficient way to tell the API "I have these ids, give me the
ones I don't have," as some other APIs allow.
The brute force approach would be to rescan friends_timeline
periodically to pick up late arrivals. Is it just that simple? Do I
really need to be that brutish? Or am I missing something?
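A less brutish variant of the rescan might be to re-request a window of ids below the last since_id on every poll and drop anything already collected. The sketch below is only that idea in Python. The names `poll`, `fetch` and `overlap` are hypothetical, `fetch` stands in for the real friends_timeline call, and the sketch ignores any per-request cap on the number of tweets returned.

```python
def poll(fetch, seen, high_water, overlap=100):
    """Ask for ids above (high_water - overlap) and keep only unseen ones.

    fetch(since_id) is a hypothetical callable returning the ids that are
    currently visible above since_id. Returns (new_ids, new_high_water).
    """
    batch = fetch(max(high_water - overlap, 0))
    new_ids = sorted(i for i in batch if i not in seen)
    seen.update(new_ids)
    return new_ids, max(max(batch, default=high_water), high_water)

# Demo against a fake timeline in which three ids arrive late.
late = {1022, 1027, 1031}
timeline = set(range(1001, 1051)) - late

def fake_fetch(since_id):
    return [i for i in timeline if i > since_id]

seen = set()
first, high_water = poll(fake_fetch, seen, 1000)

timeline |= late | set(range(1051, 1076))
second, high_water = poll(fake_fetch, seen, high_water)

print(second[:3])  # [1022, 1027, 1031]
```

The cost is re-downloading up to `overlap` ids' worth of tweets per poll, and the `seen` set grows unless ids below the window are pruned. How large the overlap needs to be depends on how late a tweet can arrive, which I don't know.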