Search API: High-Volume and Repeated Queries Should Migrate to Streaming API

John Kalucki

Jan 11, 2010, 1:40:32 PM

to twitter-ap...@googlegroups.com

With the Streaming API in full production and a track record of stability, heavy and/or repetitive Search API queries can now be handled more effectively on the Streaming API track resource. This is a great opportunity to improve both your service quality and efficiency and Twitter's own efficiency, which immediately feeds back into better resource availability and a better experience for everyone.

If your application polls for keywords or mentions, is whitelisted on the Search API, or makes more than perhaps 10 queries per minute, you should begin your migration to Streaming. Desktop clients should postpone a migration to Streaming.
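One rough way to check your application against the "perhaps 10 queries per minute" guideline above is to count your own Search API calls over a sliding 60-second window. This is only a sketch: the class name, method names and the strict "more than 10" comparison are assumptions made for illustration, not anything Twitter provides or enforces.

```python
from collections import deque


class QueryRateMonitor:
    """Counts Search API queries in a sliding time window (hypothetical helper)."""

    def __init__(self, threshold_per_minute=10, window_seconds=60.0):
        # Assumption: the threshold mirrors the rough guideline in the post.
        self.threshold = threshold_per_minute
        self.window = window_seconds
        self._stamps = deque()

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def record(self, now):
        """Call once per Search API query, passing a timestamp in seconds."""
        self._stamps.append(now)
        self._evict(now)

    def should_migrate(self, now):
        """True if more than `threshold` queries fall inside the window."""
        self._evict(now)
        return len(self._stamps) > self.threshold
```

In practice you would pass `time.monotonic()` as `now` each time your client issues a query, and treat a sustained `True` as a hint to start planning the move to Streaming.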

This transition begins a fundamental shift towards a high-value, high-result-quality, lower-query-volume Search API. We're announcing the shift as early as possible to give developers as much time as practical to make the transition and avoid any service disruptions as we adjust result quality, rate limits, licensing and terms of use agreements.

The overwhelming majority of the above applications will be better served by the Streaming API:

Complete corpus search: Search is focused on result set quality and there are no guarantees to return all matching tweets. Complete results are only available on the Streaming API. Search results are increasingly filtered and reordered for relevance.

Lower latency results: From tweet creation to delivery on the API, latency is usually within a second.

Predictable rate limits: Streaming is built upon well-defined elevated access roles so that client rate-limit-avoidance heuristics are eliminated.

Higher peak capacity: During a peak event, when tweets spike, the Streaming API is less likely to fall behind or begin aggressive rate limiting. Furthermore, the risk of a large client being blacklisted in a peak capacity emergency is reduced.

More consistent results: Hosting a continuously updated REST API on a large cluster inevitably leads to temporal result skew due to internal propagation delay. This issue is largely eliminated by long-lived connections.

More efficient: Bandwidth and processing are not wasted on identical results. Also, repetitive and long-tail queries are processed more efficiently in the Streaming architecture.

Improved Search experience: Shifting the heaviest users away from Search should dramatically improve the overall Search experience. Resources can be allocated to the search architecture's strength: historical, complex and high value queries.
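To make the contrast with polling concrete, here is a minimal sketch of the consuming side of a long-lived connection: instead of re-running a query and discarding identical results, the client reads each status once as it arrives. It assumes the stream delivers one JSON object per line, that blank lines may appear as keep-alives, and that statuses carry a "text" field; the function name is hypothetical and the connection and authentication steps are deliberately left out.

```python
import json


def iter_statuses(lines):
    """Yield status dicts from an iterable of newline-delimited JSON lines.

    Assumptions (not taken from the post): one JSON object per line,
    blank keep-alive lines, and a "text" field on every status.
    """
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank keep-alive lines
        try:
            message = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        # Pass through only objects that look like statuses.
        if isinstance(message, dict) and "text" in message:
            yield message
```

Usage, with a list of strings standing in for the open connection:

```python
sample = ['{"id": 1, "text": "hello"}\n', '\n', '{"id": 2, "text": "world"}\n']
for status in iter_statuses(sample):
    print(status["id"], status["text"])
```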

The Streaming API has its quirks. We're working to reduce the integration effort and to plug the gaps.
