The two queries you mention are not equivalent. You must pass the returned
"last_offset" value as "offset" when you fetch the next page.
In a process termed "post filtering", the API layer applies filters to
search results (for de-duping, profanity, etc.) that come from the main
search index. If any results are removed by this post filtering, the
last_offset value is incremented to account for them and returned to the
caller. E.g., if there were 30 results and the first 4 were filtered, then
q=QUERY&page=1&perpage=10 will return 10 results with last_offset set to
14. You should then get the next 10 results with
There's another, finer point about consistency. Topsy is a real-time index,
so it's possible that the result set has changed by the time you fetch
page 2. If you want your results to be consistent, you should add the same
maxtime=TIMESTAMP parameter to both the page 1 and page 2 queries.
TIMESTAMP can be the current clock time when the page 1 query is issued.
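For what it's worth, here is a rough Python sketch of the paging loop described above. It is not official client code. The parameter names (q, perpage, offset, maxtime) and last_offset come from this thread, but the response shape (a dict with "results" and "last_offset"), the use of offset=0 for the first page, and the fake_fetch_page stand-in for a real HTTP call are all assumptions made for illustration.

```python
import time
from urllib.parse import parse_qs, urlencode


def build_query(query, perpage, offset=0, maxtime=None):
    """Build the query string; offset=0 for the first page is an assumption."""
    params = {"q": query, "perpage": perpage, "offset": offset}
    if maxtime is not None:
        params["maxtime"] = maxtime
    return urlencode(params)


def fetch_all(fetch_page, query, perpage=10, maxtime=None):
    """Page through results, feeding each last_offset back in as offset.

    fetch_page is a hypothetical callable that takes a query string and
    returns {"results": [...], "last_offset": int}.
    """
    if maxtime is None:
        # Pin the result set to the time the first query is issued.
        maxtime = int(time.time())
    offset = 0
    results = []
    while True:
        page = fetch_page(build_query(query, perpage, offset, maxtime))
        if not page["results"]:
            break
        results.extend(page["results"])
        # Do not compute the next offset as page * perpage: post filtering
        # may have skipped entries, and last_offset accounts for them.
        offset = page["last_offset"]
    return results


def fake_fetch_page(query_string):
    """Stand-in for the real API: 30 indexed results, first 4 filtered."""
    params = parse_qs(query_string)
    offset = int(params["offset"][0])
    perpage = int(params["perpage"][0])
    filtered = set(range(4))
    results = []
    pos = offset
    while pos < 30 and len(results) < perpage:
        if pos not in filtered:
            results.append(pos)
        pos += 1
    return {"results": results, "last_offset": pos}


if __name__ == "__main__":
    first = fake_fetch_page(build_query("QUERY", 10))
    print(first["last_offset"])
    print(len(fetch_all(fake_fetch_page, "QUERY")))
```

With the fake index, the first page reports last_offset 14, matching the example above, and the loop collects the 26 unfiltered results without skipping or repeating any.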
Hope this helps.
On Wednesday, February 15, 2012 9:36:24 AM UTC-8, David Rees (@studgeek)