Search API - 403 bursts and (maybe) a caching issue.

briantroy

Oct 26, 2009, 3:47:02 PM
to Twitter Development Talk
Everything below ONLY PERTAINS TO THE SEARCH API:

1) Since late last week I've noticed a significant number of 403
errors (403 Error from JSON: since_id too recent, poll less
frequently). These usually indicate I'm hitting a server with an
"older" view of the search index - since it thinks the ID I sent in
since_id is newer than the newest it has. These trouble me because
when I get a 200 after the 403 sometimes I get everything back to my
since_id, sometimes I don't. It appears some indexes have gaps until
they catch up.

QUESTION: Are there any ongoing search indexing issues that you are
aware of?

2) Since late last week I've noticed that some search API requests
appear to get "stuck" returning an empty json result (no new tweets).
This can go on for HOURS (today one got stuck like this for 12 hours).
When I restart my process sometimes this clears up (I get the backlog)
- other times it does not (I continue to get 0 tweets in the json).
All of the requests return HTTP 200 and valid json.

QUESTION: Are there any ongoing caching issues with the search API?

These issues are new in the last 7 days (since about last Thursday).
My IP is whitelisted. I'm sending both a valid user agent and referrer
header. My processes are throttled by the volume of tweets they
receive. I've made no changes to my processing since late September.

Any assistance would be appreciated. My users are comparing what they
see from my service to search.twitter.com and telling me we are
broken.
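
To make the 403 case in item 1 concrete, here is a minimal Python
sketch of one way a polling client could react: leave since_id
untouched and wait longer before retrying whenever the error text says
since_id is too recent. The function name, the delay values and the
string match on the error message are illustrative assumptions, not
something the Search API documents and not necessarily what justSignal
does.

```python
def next_poll_delay(status_code, error_text, current_delay,
                    base_delay=30, max_delay=600):
    """Return the number of seconds to wait before the next poll.

    Assumed policy (not from Twitter's docs): double the delay, up to
    max_delay, on a 403 whose message says since_id is too recent;
    otherwise fall back to base_delay.
    """
    too_recent = (status_code == 403
                  and "since_id too recent" in (error_text or ""))
    if too_recent:
        return min(max(current_delay, base_delay) * 2, max_delay)
    return base_delay
```

With these example values, a client that keeps hitting a stale server
would wait 60, 120, 240, 480 and then 600 seconds between attempts.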

Regards,

Brian Roy
justSignal

briantroy

Oct 26, 2009, 6:31:29 PM
to Twitter Development Talk
Actually, I can confirm my previous supposition. Here is the log for an
empty 200 response with a new max_id:

DEBUG: 06:02:44 PM on Mon October 26th Doing CURL fetch with User
Agent: justsignal/1.0 (+http://justsignal.com) and RFERER:
http://justsignal.com/widgets/20ab5e90bf116397d6fb84ca80321928/widget.html
DEBUG: 06:02:45 PM on Mon October 26th Twitter responded with 200 HTTP
Status Code.
DEBUG: 06:02:45 PM on Mon October 26th MaxID: 5182676703
DEBUG: 06:02:45 PM on Mon October 26th There are: 0 results in this
fetch.
Updating number for api hits for hour: 18 to: 1
THROTTLE-69: 06:02:45 PM on Mon October 26th Slowing collection...
Avg: 0 returning delay: 180
DEBUG: 06:02:45 PM on Mon October 26th Checking for next page... ****
DEBUG: 06:02:45 PM on Mon October 26th There is NOT another page of
results...
DEBUG: 06:02:45 PM on Mon October 26th Old max: 5181314618 New max:
5182676703
DEBUG: 06:02:45 PM on Mon October 26th Old max: 5182676703 New max:
5182676703


As you can see I'm getting 0 tweets and a new max_id... that isn't
good.
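
A client-side guard against this could look like the Python sketch
below: since_id only moves forward to the ID of a tweet that was
actually received, never to a max_id reported alongside an empty
result set. The function name and the assumption that each result is a
dict with an "id" key are hypothetical, chosen only for illustration.

```python
def advance_since_id(current_since_id, tweets, reported_max_id):
    """Return (next_since_id, suspicious) for the next poll.

    Assumption: each tweet is a dict with an "id" key, as in the
    parsed JSON results. Only IDs of tweets actually received move
    since_id forward; the reported max_id alone never does.
    """
    if not tweets:
        # Empty page: keep the old since_id and flag the response if
        # the server still claims a newer max_id.
        return current_since_id, reported_max_id > current_since_id
    received_max = max(int(t["id"]) for t in tweets)
    return max(current_since_id, received_max), False
```

Fed the values from the log above (old max 5181314618, reported max
5182676703, 0 results), this keeps since_id at 5181314618 and flags
the response as suspicious instead of skipping ahead.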

Please advise.

Regards,

Brian Roy
justSignal

Vipul

Oct 26, 2009, 8:26:00 PM
to Twitter Development Talk
We are seeing the same issue at our end. It gets better at night
(PST) and then breaks in the morning.
I don't even see 403s, only 200s. Our search request (every 5
minutes) comes back with zero, one or two results at most, though I
know there are about 100 messages every minute (as we'd been getting
consistently until 10/21 or so).
(No breach of limit at our end!)

briantroy

Oct 26, 2009, 6:28:25 PM
to Twitter Development Talk
This is happening RIGHT NOW for the following:

1) Go to search.twitter.com and enter "tweetsforboobs OR
tweetforboobs" as the search.

2) Go to http://tweetsforboobs.org and see the twitter feed on the
left.

Notice that the last tweet from 2 hours ago (VerticalMeasures) is not
in the twitter feed on tweetsforboobs.org. Also note the ID of the
tweet from VerticalMeasures that is missing from tweetsforboobs.org:
5181937429

Now here is the log file of the Twitter API call:

DEBUG: 06:18:01 PM on Mon October 26th Doing CURL fetch with User
DEBUG: 06:18:01 PM on Mon October 26th Twitter responded with 200 HTTP
Status Code.
DEBUG: 06:18:01 PM on Mon October 26th MaxID: 5182676703
DEBUG: 06:18:01 PM on Mon October 26th There are: 0 results in this
fetch.
Updating number for api hits for hour: 18 to: 6
THROTTLE-69: 06:18:01 PM on Mon October 26th Slowing collection...
Avg: 0 returning delay: 180
DEBUG: 06:18:01 PM on Mon October 26th Checking for next page... ****
DEBUG: 06:18:01 PM on Mon October 26th There is NOT another page of
results...
DEBUG: 06:18:01 PM on Mon October 26th Old max: 5182676703 New max:
5182676703
DEBUG: 06:18:01 PM on Mon October 26th Old max: 5182676703 New max:
5182676703

Note that our max ID is already greater than the ID of the last tweet
from VerticalMeasures, yet we never got that tweet. The ID from the
log snippet (5182676703) is NOT in our database (we never got it), and
it does not match the ID of the tweet before VerticalMeasures:
5180513610

Somehow the API is returning a new (and bigger) max_id on 200
responses with no tweets in them OR on 403s (those are the only two
HTTP codes in the log for today). Either way, that shouldn't be
happening.

Brian Roy
justSignal




Marc W

Oct 27, 2009, 9:56:49 PM
to Twitter Development Talk

A number of people are seeing similar things, especially if you
specify a since_id:

http://groups.google.com/group/twitter-development-talk/browse_thread/thread/e6289b6439c1d26d/e367ca8af09d28d5?lnk=gst&q=searches+returning+no+tweets&pli=1

My current (extremely bad) solution is to just "fire hose" and get all
the tweets every time, and then filter out those I've seen before by
id. Gaaak!
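
For anyone wanting to try the same workaround, the filtering step can
be sketched in Python roughly as follows. This is only an illustration
under assumptions: the function name is made up, each result is taken
to be a dict with an "id" key, and the set of seen IDs is kept by the
caller between polls.

```python
def filter_unseen(tweets, seen_ids):
    """Return tweets whose IDs are not yet in seen_ids, recording them.

    Assumption: each tweet is a dict with an "id" key and seen_ids is
    a set kept by the caller between polls.
    """
    fresh = []
    for tweet in tweets:
        tweet_id = int(tweet["id"])
        if tweet_id not in seen_ids:
            seen_ids.add(tweet_id)
            fresh.append(tweet)
    return fresh
```

The set grows with every new tweet, so a long-running process would
eventually need to prune old IDs; the sketch leaves that out.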

I'll see if opening a support ticket or whatever helps with this
instead.

Mark.



Marc W

Oct 28, 2009, 12:02:36 AM
to Twitter Development Talk
It looks as though it depends on the exact nature of the query.

The following always return up to date results, even with a since_id
(I haven't included those since_ids here)

http://search.twitter.com/search.json?q=hong+kong+OR+kowloon&rpp=100
http://search.twitter.com/search.json?q=%23iphone&rpp=100

but the following will just return 200 OK with no results:

http://search.twitter.com/search.json?q=from%3ADavidFeng+OR+from%3ABeijingWithKids+OR+from%3ABlueJDMBA+OR+from%3Akaiserkuo+OR+from%3Acharlieflint+OR+from%3Aourmaninsh&rpp=100
http://search.twitter.com/search.json?q=%23beijing+OR+%E5%8C%97%E4%BA%AC+OR+beijing+OR+%E5%8D%97%E7%BD%97%E6%95%85%E4%B9%A1+OR+nanluoguxiang+OR+%E4%B8%89%E9%87%8C%E5%B1%AF+OR+%E4%BA%94%E9%81%93%E5%8F%A3+OR+%E6%9C%9B%E4%BA%AC+OR+wudaokou+OR+sanlitun&rpp=100

Interesting: change the first query above to:

http://search.twitter.com/search.json?q=hong+kong+OR+kowloon+OR+tsim+tsa+shui&rpp=100

and then the results STOP coming if there is a since_id ....
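
To reproduce the comparison, the request URLs with and without a
since_id can be generated consistently with a small Python helper like
the one below. It is a sketch only: the helper name is hypothetical,
and it just builds the URL string from the q, rpp and since_id
parameters used in this thread without sending any request.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the URLs above.
SEARCH_URL = "http://search.twitter.com/search.json"


def build_search_url(query, rpp=100, since_id=None):
    """Build a search URL; since_id is appended only when given."""
    params = [("q", query), ("rpp", rpp)]
    if since_id is not None:
        params.append(("since_id", since_id))
    # urlencode escapes spaces as "+" and "#" as "%23".
    return SEARCH_URL + "?" + urlencode(params)
```

Calling build_search_url("hong kong OR kowloon") gives the first URL
listed above; passing since_id=... to the same call produces the
variant to compare against.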


I've filed a support ticket with Twitter (623447) with this info,
and hopefully we'll see some progress on it.

Mark.



janole

Oct 29, 2009, 6:30:49 AM
to Twitter Development Talk
I'm experiencing the same. Empty results from the Search API when
using the since_id parameter.

This is really bad and my users are complaining about the Saved
Searches tabs not updating.

If you're lucky you end up at a caching server with up-to-date
information, but it seems you can't force requests to go to that
server.

Please let us know if this can be fixed easily.

Ole / Gravity Twitter Client for S60/Symbian

su...@mobileways.de / @janole on Twitter
