Ed
Mar 22, 2012, 2:02:38 PM
to Otter API to Topsy
I am interested in collecting 100 results per page, and up to 10 pages
of results, from a Topsy Otter search resource. So I would be
collecting somewhere in the range [0, 10*100] results ("tweets").
There is probably a more efficient way to do this than my current
approach, which is:
1) do an initial search request and read the "total" field inside the
JSON "response" object. I use this to calculate how many pages I need
to request: numPages = min(total / 100, 10)
2) then I make a separate call for each page number in [1, numPages]
and extract the data I need.
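For reference, the page-count arithmetic in step 1 can be sketched like this (a minimal sketch; the 100-per-page and 10-page caps are from the post above, and ceiling division is used so a partial last page is not dropped):

```python
def num_pages(total, per_page=100, max_pages=10):
    """Number of pages needed to cover `total` results, capped at `max_pages`.

    -(-total // per_page) is ceiling division, so e.g. total=150
    yields 2 pages rather than 1.
    """
    return min(-(-total // per_page), max_pages)
```

For example, `num_pages(0)` is 0, `num_pages(150)` is 2, and `num_pages(5000)` hits the cap at 10.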
This approach is very slow...
I am thinking that I should instead be using "offset" and
"last_offset", so I do not have to do this pre-check of how many
pages I can fetch and then loop through the page-number index,
requesting one page at a time...
So, my long-winded question really boils down to:
What is an efficient way to request all Topsy search results, from 0
(if the query returns none) up to 100 per page * 10 pages = 1000
total possible results?
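One shape the answer might take (a sketch, not a tested answer): paginate in a single pass, reading "total" from the first page's response and stopping early, rather than issuing a separate counting request up front. The response field names ("total", "list") are my reading of the Otter search response and should be treated as assumptions; the hypothetical `fetch_page` callable is injected so the loop logic can be exercised without touching the network:

```python
def collect_results(query, fetch_page, per_page=100, max_pages=10):
    """Collect up to per_page * max_pages results in one pass.

    fetch_page(query, page, per_page) is assumed to return the decoded
    JSON "response" object, containing "total" (overall hit count)
    and "list" (the results for that page).
    """
    results = []
    for page in range(1, max_pages + 1):
        response = fetch_page(query, page, per_page)
        results.extend(response.get("list", []))
        # Stop as soon as we have seen everything the API reports.
        if page * per_page >= response.get("total", 0):
            break
    return results
```

In real use, `fetch_page` would wrap an HTTP GET against the Otter search endpoint (e.g. with the `requests` library), but the wrapper and its URL/parameter names are assumptions on my part, not something the API docs in this thread confirm.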
Thanks in advance for any suggestions you have for me on this.
Ed