A gnarly little python loop
Newsgroups: comp.lang.python
Date: Sat, 10 Nov 2012 19:03:42 -0800 (PST)
Subject: Re: A gnarly little python loop
From: Steve Howell <show...@domaintools.com>
On Nov 10, 2:58 pm, Roy Smith <r...@panix.com> wrote:
> I'm trying to pull down tweets with one of the many twitter APIs.  The
> particular one I'm using (python-twitter), has a call:
>
> data = api.GetSearch(term="foo", page=page)
>
> The way it works, you start with page=1.  It returns a list of tweets.
> If the list is empty, there are no more tweets.  If the list is not
> empty, you can try to get more tweets by asking for page=2, page=3, etc.
> I've got:
>
>     page = 1
>     while 1:
>         r = api.GetSearch(term="foo", page=page)
>         if not r:
>             break
>         for tweet in r:
>             process(tweet)
>         page += 1
>
> It works, but it seems excessively fidgety.  Is there some cleaner way
> to refactor this?
I think your code is perfectly readable and clean, but you can flatten
it like so:
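import itertools

def get_tweets(term, get_page):
    # get_page is the paging call, e.g. api.GetSearch; it must return
    # an empty list once there are no more tweets.
    page_nums = itertools.count(1)
    # itertools.imap is Python 2; on Python 3 use the builtin map.
    pages = itertools.imap(lambda page: get_page(term=term, page=page), page_nums)
    valid_pages = itertools.takewhile(bool, pages)
    tweets = itertools.chain.from_iterable(valid_pages)
    return tweets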
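
If you want to try the idea without hitting Twitter, here is a rough, self-contained sketch for Python 3 (where the builtin map is already lazy). FAKE_PAGES and fake_search are made-up stand-ins for api.GetSearch, not part of python-twitter, and I'm assuming the paging call accepts term and page as keyword arguments, as in your example:

```python
import itertools

# Hypothetical stand-in for api.GetSearch: three canned pages of "tweets",
# then an empty list for every later page.
FAKE_PAGES = {1: ["a", "b"], 2: ["c"], 3: ["d", "e"]}

def fake_search(term, page):
    return FAKE_PAGES.get(page, [])

def get_tweets(term, get_page):
    # Page numbers 1, 2, 3, ... generated lazily.
    page_nums = itertools.count(1)
    # map is lazy on Python 3, so a page is fetched only when it is needed.
    pages = map(lambda page: get_page(term=term, page=page), page_nums)
    # Stop at the first empty (falsy) page.
    valid_pages = itertools.takewhile(bool, pages)
    # Flatten the remaining pages into a single stream of tweets.
    return itertools.chain.from_iterable(valid_pages)

tweets = list(get_tweets("foo", fake_search))
print(tweets)  # ['a', 'b', 'c', 'd', 'e']
```

With the real API the caller's loop would collapse to "for tweet in get_tweets("foo", api.GetSearch): process(tweet)", with no page counter or break in sight.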