Has Next in Python Iterators

Kelson Zawack

Oct 21, 2010, 7:08:00 AM
to pytho...@python.org
I have been programming in Python for a while now and by and large love
it. One thing I don't love, though, is that as far as I know iterators
have no has_next-type functionality. As a result, if I want to iterate
until an element that might or might not be present is found, I either
wrap the while loop in a try block or break out of a for loop. Since an
iterator having an end is not actually an exceptional case, and the for
construct is really for iterating through the entirety of a list, both of
these solutions feel like greasy workarounds and thus not very
Pythonic. Is there something I am missing? Is there a reason Python
iterators don't have has_next functionality? What is the standard
solution to this problem?

Felipe Bastos Nunes

Oct 21, 2010, 7:36:41 AM
to Kelson Zawack, pytho...@python.org
Looking at the documentation, iterators only raise StopIteration. I'd
like a hasNext() too. I'll see if it is easy to implement, but maybe it's
not there yet because the for loop already does the job so well.

--
Felipe Bastos Nunes

Ben Finney

Oct 21, 2010, 7:37:18 AM
Kelson Zawack <zawa...@gis.a-star.edu.sg> writes:


> […] if I want to iterate until an element that might or might not be
> present is found I either wrap the while loop in a try block or break
> out of a for loop.

I'm not sure what exception you would catch, but that could be a good
solution.

The general solution would be to break from the ‘for’ loop when you find
the terminating item.

> Since an iterator having an end is not actually an exceptional case

Right. You're not talking about the end of the iterator, though; you're
talking about stopping *before* the end, when a particular item is
reached.

Either that, or you'll need to be clearer about what the problem is.
A simple, complete, working example would help.

> and the for construct is really for iterating through the entirety of a
> list

Or until something else happens to change the flow, such as a ‘break’
statement.

> both of these solutions feel like greasy workarounds and thus not
> very pythonic.

Show us a simple, complete, working example, and let's see how Pythonic
it looks.

> Is there something I am missing? Is there a reason python iterators
> don't have has_next functionality?

Many iterators can't know whether they have a next item without actually
generating that item.

> What is the standard solution to this problem?

I'm not sure that it is a problem. Let's see the example, to make it
more concrete.

--
\ “We must respect the other fellow's religion, but only in the |
`\ sense and to the extent that we respect his theory that his |
_o__) wife is beautiful and his children smart.” —Henry L. Mencken |
Ben Finney

Steven D'Aprano

Oct 21, 2010, 8:26:50 AM
On Thu, 21 Oct 2010 09:36:41 -0200, Felipe Bastos Nunes wrote:

> Looking in the documentation, only the StopIteration raises. I'd like a
> hasNext() too. I'll see if it is easy to implement,


Iterators can be unpredictable. In general, you can't tell whether an
iterator is finished or not until you actually try it. Consider the
following example:

import random

def rand():
    x = random.random()
    while x < 0.5:
        yield x
        x = random.random()  # draw again, so the loop can end

it = rand()


What should it.hasNext() return?

I know what you're thinking: "it's easy to cache the next result, and
return it on the next call". But iterators can also be dependent on the
time that they are called, like in this example:

import time

def evening_time():
    while 1:
        yield time.strftime("%H:%M")  # e.g. "23:20"


it = evening_time()  # it's 23:59
if it.hasNext():  # hypothetical caching hasNext(); returns True
    time.sleep(180)  # Do some other processing for three minutes
    print it.next()  # time is now 00:02, but iterator says it's 23:59

Of course, caching the result of an iterator can work for *some*
iterators. But it's not a general solution suitable for all iterators.
There is no general solution.
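
A minimal sketch of the staleness problem, in Python 3 spelling (next(it) rather than it.next()). To keep it deterministic, a dictionary stands in for the wall clock; the names readings and clock are made up for illustration and are not from the post above.

```python
def readings(clock):
    """Yield the clock's current reading each time a value is requested."""
    while True:
        yield clock["now"]

clock = {"now": "23:59"}
it = readings(clock)

# A caching has_next() would have to fetch the next value early, like this:
cached = next(it)

# "Time passes" between the check and the moment the value is used.
clock["now"] = "00:02"

# A value fetched now reflects the current state; the cached one does not.
current = next(it)
```

Here cached still holds the reading taken at the time of the check, which is the behaviour the example above warns about.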

--
Steven

Steven D'Aprano

Oct 21, 2010, 8:39:25 AM
On Thu, 21 Oct 2010 19:08:00 +0800, Kelson Zawack wrote:

> I have been programming in python for a while now and by and large love
> it. One thing I don't love though is that as far as I know iterators
> have no has_next type functionality. As a result if I want to iterate
> until an element that might or might not be present is found I either
> wrap the while loop in a try block or break out of a for loop.

Yes, they are two solutions to the problem. You could also look at
takewhile and dropwhile from the itertools module.
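
As a rough sketch of what those two helpers do (Python 3 spelling; the sample list is made up for illustration): dropwhile keeps the first item that fails the test, whereas takewhile has to read that item from the underlying iterator in order to know when to stop.

```python
from itertools import dropwhile, takewhile

data = [2, 2, 2, 1, 3, 3]

# dropwhile skips the leading 2s and yields everything from the first non-2 on.
tail = list(dropwhile(lambda x: x == 2, data))

# takewhile yields the leading 2s, but it must read the 1 to know when to
# stop, so the 1 is no longer available from the shared iterator afterwards.
shared = iter(data)
head = list(takewhile(lambda x: x == 2, shared))
left_over = list(shared)
```

Whether the consumed item matters depends on whether anything else reads from the same iterator afterwards.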


> Since an
> iterator having an end is not actually an exceptional case

But it is. An iterator is supposed to yield items. That's what they're
for. When there are no more items, that's an exceptional change of state.

> and the for
> construct is really for iterating through the entirety of a list

Nonsense. That's why Python has continue and break statements, so you can
break out of for loops early. But if you don't like break, you could
always use a while loop.


> both of
> these solutions feel like greasy workarounds and thus not very pythonic.

I can't help how they feel to you, but I assure you they are perfectly
Pythonic.


> Is there something I am missing? Is there a reason python iterators
> don't have has_next functionality?

Yes. What you ask for is impossible to implement for generic iterators.
Not hard. Not inconvenient. Impossible.

Of course you can implement it yourself in your own iterators:


class LookAheadIterator:
    def __init__(self, n):
        self._n = n  # counts down from n to 0
    def next(self):
        if self._n < 0: raise StopIteration
        self._n -= 1
        return self._n + 1
    def __iter__(self):
        return self
    def hasNext(self):
        return self._n >= 0


but there's no general solution that will work for arbitrary iterators.


--
Steven

Paul Rudin

Oct 21, 2010, 9:35:06 AM
Kelson Zawack <zawa...@gis.a-star.edu.sg> writes:

> Since an iterator having an end is not actually an exceptional case...

There's no requirement on iterators to be finite, so in a sense it is.

In general it may be impractical to know whether an iterator has reached
the end without calling next().
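
For instance, itertools.count never finishes. A small Python 3 sketch (the variable names are made up for illustration):

```python
from itertools import count, islice

evens = count(0, 2)                  # 0, 2, 4, ... with no last element
first_five = list(islice(evens, 5))  # take a finite slice of it
following = next(evens)              # the iterator simply carries on
```

For an iterator like this, a has_next() could only ever answer True.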


Alexander Gattin

Oct 23, 2010, 5:46:42 AM
to Steven D'Aprano, pytho...@python.org
Hello,

On Thu, Oct 21, 2010 at 12:26:50PM +0000, Steven
D'Aprano wrote:
> I know what you're thinking: "it's easy to cache
> the next result, and return it on the next
> call". But iterators can also be dependent on
> the time that they are called, like in this
> example:
>
> def evening_time():
>     while 1:
>         yield time.strftime("%H:%M") # e.g. "23:20"

When working with things like this you should use some
lock/sync/transaction anyway, so a prefetching iterator will most
probably be OK unless you've fundamentally screwed up your code's
logic.

With regard to the prefetcher, I'd use something like this:
> import random
>
> def rand_gen():
>     x = random.random()
>     while x < 0.9:
>         yield x
>         x = random.random()
>
> class prefetcher:
>     def __init__(self, i):
>         self.i = i
>         self.s = False  # True once the wrapped iterator is exhausted
>         self._prefetch()
>     def _prefetch(self):
>         try: self.n = self.i.next()
>         except StopIteration: self.s = True
>     def has_next(self): return not self.s
>     def next(self):
>         if self.s: raise StopIteration()
>         else:
>             n = self.n
>             self._prefetch()
>             return n
>     def __iter__(self): return self
>
> rand_pre = prefetcher(rand_gen())
> while rand_pre.has_next(): print rand_pre.next()

--
With best regards,
xrgtn

Kelson Zawack

Oct 25, 2010, 6:33:24 AM
The example I have in mind is a list like [2,2,2,2,2,2,1,3,3,3,3] where
you want to loop until you see not a 2 and then you want to loop until
you see not a 3. In this situation you cannot use a for loop as
follows:

foo_list_iter = iter([2,2,2,2,2,2,1,3,3,3,3])
for foo_item in foo_list_iter:
    if foo_item != 2:
        break

because it will eat the 1 and not allow the second loop to find it.
takewhile and dropwhile have the same problem. It is possible to use
a while loop as follows:

foo_list_item = foo_list_iter.next()
while foo_list_item == 2:
    foo_list_item = foo_list_iter.next()
while foo_list_item == 3:
    foo_list_item = foo_list_iter.next()

but if you can't be sure that the list is not empty, or not all 2s
followed by all 3s, you need to surround this code in a try block.
Unless there is a good reason for having to do this I think it is
undesirable, because it means that the second clause of the loop
invariant, namely that you are not off the end of the list, is being
controlled outside of the loop.

As for the feasibility of implementing a has_next function, I agree that
you cannot write one method that will provide the proper functionality
in all cases, and thus that you cannot create a has_next for
generators. Iterators, however, are a different beast: they are
returned by the thing they are iterating over, and thus any special
cases can be covered by writing a specific implementation for the
iterable in question. This sort of functionality is possible to
implement, because Java does it.

Stefan Behnel

Oct 25, 2010, 7:02:27 AM
to pytho...@python.org
Kelson Zawack, 25.10.2010 12:33:

> The example I have in mind is list like [2,2,2,2,2,2,1,3,3,3,3] where
> you want to loop until you see not a 2 and then you want to loop until
> you see not a 3. In this situation you cannot use a for loop as
> follows:
>
> foo_list_iter = iter([2,2,2,2,2,2,1,3,3,3,3])
> for foo_item in foo_list_iter:
>     if foo_item != 2:
>         break
> because it will eat the 1 and not allow the second loop to find it.
> takewhile and dropwhile have the same problem. It is possible to use
> a while loop as follows:
>
> foo_list_item = foo_list_iter.next()
> while foo_list_item == 2:
>     foo_list_item = foo_list_iter.next()
> while foo_list_item == 3:
>     foo_list_item = foo_list_iter.next()

Why not combine the two into this:

foo_list_iter = iter([2,2,2,2,2,2,1,3,3,3,3])

for item in foo_list_iter:
    if item != 2:
        print item
        break
# not sure what to do with the "1" here, hope you do ...
for item in foo_list_iter:
    if item != 3:
        print item

Also take a look at the itertools module, which provides little helpers like
takewhile(), groupby() and some other nice filters.


> As for the feasibility of implementing a has_next function I agree that
> you cannot write one method that will provide the proper functionality
> in all cases, and thus that you cannot create a has_next for
> generators. Iterators however are a different beast, they are
> returned by the thing they are iterating over and thus any special
> cases can be covered by writing a specific implementation for the
> iterable in question.

Well, then just write the specific iterable wrappers that you need for your
specific case. If you do it well, you can make your code a lot easier to
read that way. And that's usually a very good proof that something like
"has_next()" is not needed at all and just complicates things.


> This sort of functionality is possible to
> implement, because java does it.

Like that was an argument for anything. Java iterators also require you to
implement "remove()". I can't remember ever doing anything else but
throwing an exception in my implementation for that. Bad design isn't a
good reason for a feature to be copied.

Stefan

Jussi Piitulainen

Oct 25, 2010, 9:47:22 AM
Kelson Zawack writes:

> The example I have in mind is list like [2,2,2,2,2,2,1,3,3,3,3]
> where you want to loop until you see not a 2 and then you want to
> loop until you see not a 3. In this situation you cannot use a for
> loop as follows:

...


> because it will eat the 1 and not allow the second loop to find it.
> takewhile and dropwhile have the same problem. It is possible to

...

The following may or may not be of interest to you, or to someone
else. If not, please ignore. Probably I misuse some of the technical
terms.

Instead of trying to consume an initial part of an iterable, one can
make a new generator that consumes the underlying iterator in the
desired way. This works nicely if no other code consumes the same
underlying data. My implementation of "after" is below, after the
examples, and after that is an alternative implementation.

Python 3.1.1 (r311:74480, Feb 8 2010, 14:06:51)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from echoed import *
>>> list(after(after([3,1,4,1,5], is2), is3))
[1, 4, 1, 5]
>>> list(after(after([2,2,3,1,4,1,5], is2), is3))
[1, 4, 1, 5]
>>> list(after(after([2,2,1,4,1,5], is2), is3))
[1, 4, 1, 5]
>>> list(after(after([2,2], is2), is3))
[]
>>> list(after(after([3,3], is2), is3))
[]
>>> list(after(after([], is2), is3))
[]
>>> list(after(after([1,4,1,5], is2), is3))
[1, 4, 1, 5]

The implementation of "after" uses two simple auxiliaries, "echoed"
and "evens", which double every item and drop every other item,
respectively. In a sense, "after" returns the echoes of the desired
items: when the first item of interest is encountered in the echoed
stream, its echo remains there.

def echoed(items):
    for item in items:
        yield item
        yield item

def evens(items):
    for item in items:
        yield item
        # Drop every other item. The default keeps a StopIteration from
        # escaping the generator at the end (a RuntimeError since PEP 479).
        next(items, None)

def after(items, is_kind):
    echoed_items = echoed(items)
    try:
        while is_kind(next(echoed_items)):
            next(echoed_items)
    except StopIteration:
        pass
    return evens(echoed_items)

def is2(x):
    return x == 2

def is3(x):
    return x == 3

Alternatively, and perhaps better, one can push the first item of
interest back into a new generator. The auxiliary "later" below does
that; "past" is the alternative implementation of the desired
functionality, used like "after" above.

def later(first, rest):
    yield first
    for item in rest:
        yield item

def past(items, is_kind):
    items = iter(items)
    try:
        item = next(items)
        while is_kind(item):
            item = next(items)
    except StopIteration:
        return items
    return later(item, items)

The names of my functions may not be the best. Many more disclaimers
apply.
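
For comparison, nesting the standard library's itertools.dropwhile appears to give the same results on the inputs from the session above. This is only a sketch; skip_runs is a made-up name, and is2 and is3 are repeated so the snippet stands alone.

```python
from itertools import dropwhile

def is2(x):
    return x == 2

def is3(x):
    return x == 3

def skip_runs(items):
    """Skip leading 2s, then leading 3s, and return the remaining items."""
    return list(dropwhile(is3, dropwhile(is2, items)))

result = skip_runs([2, 2, 3, 1, 4, 1, 5])
```

dropwhile yields the first non-matching item instead of discarding it, so nothing needs to be pushed back.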

Hrvoje Niksic

Oct 25, 2010, 10:11:26 AM
Kelson Zawack <zawa...@gis.a-star.edu.sg> writes:

> Iterators however are a different beast, they are returned by the
> thing they are iterating over and thus any special cases can be
> covered by writing a specific implementation for the iterable in
> question. This sort of functionality is possible to implement,
> because java does it.

Note that you can wrap any iterator into a wrapper that implements
has_next along with the usual iteration protocol. For example:

class IterHasNext(object):
    def __init__(self, it):
        self.it = iter(it)

    def __iter__(self):
        return self

    def next(self):
        if hasattr(self, 'cached_next'):
            val = self.cached_next
            del self.cached_next
            return val
        return self.it.next()

    def has_next(self):
        if hasattr(self, 'cached_next'):
            return True
        try:
            self.cached_next = self.it.next()
            return True
        except StopIteration:
            return False

>>> it = IterHasNext([1, 2, 3])
>>> it.next()
1
>>> it.has_next()
True
>>> it.next()
2
>>> it.next()
3
>>> it.has_next()
False
>>> it.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in next
StopIteration
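
A Python 3 spelling of the same idea might look like the sketch below. It is an adaptation, not the code from this post: the iterator method is named __next__, a private sentinel replaces the hasattr check, and peek() is an extra, hypothetical method added to show the 2s-then-3s case from earlier in the thread.

```python
class IterHasNext:
    _EMPTY = object()  # marker meaning "nothing cached"

    def __init__(self, it):
        self._it = iter(it)
        self._cached = self._EMPTY

    def __iter__(self):
        return self

    def __next__(self):
        if self._cached is not self._EMPTY:
            val, self._cached = self._cached, self._EMPTY
            return val
        return next(self._it)

    def has_next(self):
        if self._cached is self._EMPTY:
            # Fetch one item ahead; the default avoids a try/except here.
            self._cached = next(self._it, self._EMPTY)
        return self._cached is not self._EMPTY

    def peek(self):
        """Return the next item without consuming it (hypothetical helper)."""
        if not self.has_next():
            raise ValueError("iterator is exhausted")
        return self._cached

it = IterHasNext([2, 2, 2, 2, 2, 2, 1, 3, 3, 3, 3])
while it.has_next() and it.peek() == 2:
    next(it)
while it.has_next() and it.peek() == 3:
    next(it)
remaining = list(it)  # the 1 has not been consumed
```

As with any caching wrapper, the item returned after has_next() was generated at the time of the check, not at the time of the read.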

Paul Rudin

Oct 25, 2010, 1:14:01 PM
Kelson Zawack <zawa...@gis.a-star.edu.sg> writes:

> The example I have in mind is list like [2,2,2,2,2,2,1,3,3,3,3] where
> you want to loop until you see not a 2 and then you want to loop until
> you see not a 3.

"loop until you see not a 2" - you mean yield 2s as long as there are 2s
to be consumed?

"loop until you see not a 3" - you mean yield 3s as long as there are 3s
to be consumed?

So in this case you'd see just the initial 2s? (Since the 1 immediately
follows the 2s and is "not a 3")

def twos_then_threes(iterable):
    i = iter(iterable)
    n = i.next()
    while n==2:
        yield n
        n=i.next()
    while n==3:
        yield n
        n=i.next()

so... no need for any has_next()

or do I misunderstand what you want to do? Probably this can be done
with some itertools.groupby() cunningness...
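
One possible reading of the groupby() idea, as a Python 3 sketch: groupby splits the input into runs of equal consecutive items, so the boundaries between the 2s, the 1 and the 3s come out without any look-ahead. The names data and runs are made up for illustration.

```python
from itertools import groupby

data = [2, 2, 2, 2, 2, 2, 1, 3, 3, 3, 3]

# Each group is a run of equal consecutive items; record its value and length.
runs = [(key, len(list(group))) for key, group in groupby(data)]
```

Whether this fits depends on what should happen to the items in each run; here they are only counted.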


Ian

Oct 25, 2010, 1:54:42 PM
On Oct 25, 4:33 am, Kelson Zawack <zawack...@gis.a-star.edu.sg> wrote:
> The example I have in mind is list like [2,2,2,2,2,2,1,3,3,3,3] where
> you want to loop until you see not a 2 and then you want to loop until
> you see not a 3.  In this situation you cannot use a for loop as
> follows:
>
> foo_list_iter = iter([2,2,2,2,2,2,1,3,3,3,3])
> for foo_item in foo_list_iter:
>     if foo_item != 2:
>         break
> because it will eat the 1 and not allow the second loop to find it.
> takewhile and dropwhile have the same problem.  It is possible to use
> a while loop as follows:
>
> foo_list_item = foo_list_iter.next()
> while foo_list_item == 2:
>     foo_list_item = foo_list_iter.next()
> while foo_list_item == 3:
>     foo_list_item = foo_list_iter.next()
>
> but if you can't be sure the list is not empty/all 2s then all 3s you
> need to surround this code in a try block.  Unless there is a good
> reason for having to do this I think it is undesirable because it
> means that the second clause of the loop invariant, namely that you
> are not off the end of the list, is being controlled outside of the
> loop.

from itertools import chain, imap, izip, tee
from operator import itemgetter

foo_list_iter, next_foo_list_iter = tee([2,2,2,2,2,2,1,3,3,3,3])
next_foo_list_iter = chain([None], next_foo_list_iter)
foo_list_iter = imap(itemgetter(0), izip(foo_list_iter,
                                         next_foo_list_iter))

for foo_item in foo_list_iter:
    if foo_item != 2:
        foo_list_iter = next_foo_list_iter
        break

But in practice I think the best solution is to create an explicit
iterator wrapper that implements hasnext() and use it as needed, as
others in this thread have suggested.
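
For readers on Python 3, where itertools no longer has imap and izip, the tee() trick above could be sketched roughly as follows. The names lead, lag, paired and rest are made up for illustration, and the built-in zip takes the place of izip.

```python
from itertools import chain, tee

data = [2, 2, 2, 2, 2, 2, 1, 3, 3, 3, 3]

lead, lag = tee(data)
lag = chain([None], lag)  # lag now runs one item behind lead

# Walk both copies in step, but only look at the items from lead.
paired = (item for item, _ in zip(lead, lag))

rest = iter(())
for item in paired:
    if item != 2:
        rest = lag  # lag has not yet yielded the item that stopped the loop
        break

remaining = list(rest)
```

The lagging copy still holds the 1, so a second loop over rest would see it.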

Cheers,
Ian
