Best way to check that you are at the beginning (the end) of an iterable?

Laurent

Sep 7, 2011, 5:35:41 PM
Hi there,

What is the simplest way to check that you are at the beginning or at the end of an iterable? I'm using enumerate with Python 3.2 (see below) but I'm wondering if there would be a better way.


    l = ['a', 'b', 'a', 'c']

    for pos, i in enumerate(l):
        if pos == 0:
            print("head =", i)
        else:
            print(i)


I know that Python is not exactly a functional language but wouldn't something like "ishead()" or "istail()" be useful?

Cameron Simpson

Sep 7, 2011, 6:48:24 PM
to comp.lan...@googlegroups.com, pytho...@python.org
There are a few reasons these do not exist out of the box (quite aside
from how easy it is to do on the occasions you actually want it).
Tackling ishead and istail in turn...

The "ishead()" would need to be a top level function (like "len()")
because if it were an iterator method, every iterator would need to have
it implemented; currently the number of methods needed to roll your own
iterator is just two (__iter__ and next). ishead() could be done as a top
level function, though it would need the storage cost of an additional
state value to every iterator (i.e. a "first" boolean or equivalent). So
you'd be proposing more memory cost and possibly a retrospective code
change for all the existing planetwide code, for a minor convenience. As
you note, enumerate gets you a pos value, and it is easy enough to write
a for loop like this:

    first = True
    for i in iterable_thing:
        if first:
            print "head =", i
        else:
            print i
        first = False

Your istail() is much worse.

A generator would need to do lookahead to answer istail() in the general
case. Consider iterating over the lines in a file, or better still the
lines coming from a pipeline. Or iterating over packets received on a
network connection. You can't answer "istail()" there until you have
seen the next line/packet (or EOF/connection close). And that may be an
arbitrary amount of time in the future. You're going to stall your whole
program for such a question?

You can do this easily enough for yourself as an itertools-like thing:
write a wrapper generator that answers ishead() and istail() for
arbitrary iterators. Completely untested example code:

    class BoundSensitiveIterator(object):
        def __init__(self, subiter):
            self.sofar = 0
            # iter() lets the wrapper accept any iterable, not only iterators
            self.subiter = iter(subiter)
            # () = nothing read ahead, (x,) = x read ahead, None = exhausted
            self.pending = ()
        def __iter__(self):
            return self
        def next(self):
            self.sofar += 1
            if self.pending is None:
                raise StopIteration
            if self.pending:
                nxt = self.pending[0]
                self.pending = ()
                return nxt
            return self.subiter.next()
        def ishead(self):
            # maybe <= 1, depending on what you want it to mean
            return self.sofar == 1
        def istail(self):
            if self.pending is None:
                return True
            if self.pending:
                return False
            try:
                # lookahead: this is the call that may stall
                nxt = self.subiter.next()
            except StopIteration:
                self.pending = None
                return True
            else:
                self.pending = (nxt,)
                return False

    I = BoundSensitiveIterator(other_iterable)
    for n in I:
        print n, "ishead =", I.ishead(), "istail =", I.istail()

You can see it adds some performance and storage overhead, and of course
may stall if you ever ask istail() of an "on demand" iterable.
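
For readers on Python 3, the same lookahead idea can be sketched as a
generator rather than a class. This is only an illustration and not part
of the original post: the name mark_ends and the (item, is_first, is_last)
tuple layout are assumptions chosen for the example. Like istail() above,
it reads one item ahead, so it can stall on an "on demand" source.

```python
def mark_ends(iterable):
    """Yield (item, is_first, is_last) using one item of lookahead."""
    it = iter(iterable)
    try:
        current = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    is_first = True
    for upcoming in it:
        # another item exists, so `current` cannot be the last one
        yield current, is_first, False
        is_first = False
        current = upcoming
    # the source is exhausted, so `current` is the last item
    yield current, is_first, True


for item, is_first, is_last in mark_ends("abc"):
    print(item, is_first, is_last)
```

A single-item input yields one tuple with both flags set to True.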

About the only time I do this is my personal "the()" convenience
function:

    def the(list, context=None):
        ''' Returns the first element of an iterable, but requires there to be
            exactly one.
        '''
        icontext="expected exactly one value"
        if context is not None:
            icontext=icontext+" for "+context

        first=True
        for elem in list:
            if first:
                it=elem
                first=False
            else:
                raise IndexError, "%s: got more than one element (%s, %s, ...)" \
                                  % (icontext, it, elem)

        if first:
            raise IndexError, "%s: got no elements" % icontext

        return it

Which I use as a definite article in places where an iterable _should_
yield exactly one result (eg SQL SELECTs that _ought_ to get exactly
one hit). I can see I wrote that a long time ago - it could do with some
style fixes. And a code scan shows it sees little use:-)

Cheers,
--
Cameron Simpson <c...@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Electronic cardboard blurs the line between printed objects and the virtual
world. - overhead by WIRED at the Intelligent Printing conference Oct2006

Laurent

Sep 7, 2011, 7:22:36 PM
to comp.lan...@googlegroups.com, pytho...@python.org
I totally understand the performance issue that a hypothetical "istail" would bring, even if I think it would just be the programmer's responsibility not to use it when it's not certain that an end can be detected.
But I don't see why *adding* something like "ishead" would be so bad (at worst by using a boolean somewhere as you mentioned).

Anyway I was just asking if there is something better than enumerate. So the answer is no? The fact that I have to create a tuple with an incrementing integer for something as simple as checking that I'm at the head just sounds awfully unpythonic to me.

Cameron Simpson

Sep 7, 2011, 8:23:14 PM
to comp.lan...@googlegroups.com, pytho...@python.org
On 07Sep2011 16:22, Laurent <lauren...@gmail.com> wrote:
| I totally understand the performance issue that a hypothetical
| "istail" would bring, even if I think it would just be the programmer's
| responsibility not to use it when it's not certain that an end can
| be detected.

The trouble with these things is that their presence leads to stallable
code, often in libraries. Let the programmer write code dependent on
istail() without thinking of the stall case (or even the gratuitous
execution case, as in a generator with side effects in calling .next())
and have that buried in a utilities function.

Facilities like feof() in C and eof in Pascal already lead to lots of
code that runs happily with flat files and behaves badly in interactive
or piped input. It is _so_ easy to adopt a style like:

    while not eof(filehandle):
        line = filehandle.nextline()
        ...

that it is often thought that having offered the eof() function is a
design error. (Of course in the example above the usual Python idiom
would win out from existing habit, but there are plenty of other
situations where it would just be _easy_ to rely on istail() in whatever
form.)

| But I don't see why *adding* something like "ishead" would be so bad
| (at worst by using a boolean somewhere as you mentioned).

It is not awful, but as remarked:
  - extra storage cost to _every_ iterable, for a rarely used facility
  - extra runtime cost to maintain the state
  - _retroactive_ burden on _every_ iterator implementation presently
    existing; every iterator suddenly needs to implement and offer this
    extra facility to be of general purpose use
  - it is easy to provide the facility on the rare occasions when it is
    needed

Personally, I think point 3 above is the killer and 1 and 2 are serious
counter arguments.

| Anyway I was just asking if there is something better than enumerate. So
| the answer is no? The fact that I have to create a tuple with an
| incrementing integer for something as simple as checking that I'm at
| the head just sounds awfully unpythonic to me.

You can just use a boolean if you like. I have plenty of loops like:

    first = True
    for i in iterable:
        if first:
            blah ...
        ...
        first = False

Cheap and easy. Cheers,
Bye and bye, God caught his eye, - Epitaph for a waiter by David McCord

Steven D'Aprano

Sep 7, 2011, 8:24:58 PM
Laurent wrote:

> Hi there,
>
> What is the simplest way to check that you are at the beginning or at the
> end of an iterable?


I don't think this question is meaningful. There are basically two
fundamental types of iterables, sequences and iterators.

Sequences have random access and a length, so if the "start" and "end" of
the sequence is important to you, just use indexing:

    beginning = sequence[0]
    end = sequence[-1]
    for i, x in enumerate(sequence):
        if i == 0: print("at the beginning")
        elif i == len(sequence)-1: print("at the end")
        print(x)


Iterators don't have random access, and in general they don't have a
beginning or an end. There may not be any internal sequence to speak of:
the iterator might be getting data from a hardware device that provides
values continuously, or some other series of values without a well-defined
beginning or end. Example:

    def time():
        from time import asctime
        while True:
            yield asctime()

    it = time()

What would it even mean to say that I am at the beginning or end of it?

Iterators have no memory, so in one sense you are *always* at the beginning
of the iterator: next() always returns the next item, and the previous item
is lost forever. So the answer to the question "Am I at the beginning of an
iterator?" is always "You are now".
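
A short Python 3 illustration of that point, using only built-ins (the
variable names are made up for the example): once an item has been taken
from an iterator, the iterator offers no way to ask for it again or to
tell how many items came before.

```python
letters = iter(["a", "b", "c"])

first = next(letters)       # consumes "a"; the iterator keeps no record of it
remaining = list(letters)   # only what has not been consumed yet
again = list(letters)       # the iterator is now exhausted

print(first, remaining, again)
```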

For sequences, the question is best handled differently. For iterators, the
question doesn't make sense in general. If you need an iterator that can
report its internal state, write your own:

    import random, time
    class MyIter(object):
        def __init__(self):
            self.start = True
            self.end = False
        def __next__(self):
            if self.start:
                self.start = False
            if self.end:
                raise StopIteration
            if random.random() < 0.01:
                self.end = True
            return time.asctime()
        def __iter__(self):
            return self

--
Steven

Miki Tebeka

Sep 7, 2011, 8:24:53 PM
I guess enumerate is the best way to check for the first item. Note that if someone passes you the iterator as an argument, you have no way of checking whether items have already been consumed from it.

istail can be implemented using itertools.chain, see https://gist.github.com/1202260

Laurent

Sep 7, 2011, 8:53:36 PM
to pytho...@python.org
Yes, of course the use of a boolean variable is obvious, but I'm mixing Python code with HTML using Mako templates. In Mako, for code readability reasons, I try to stick to simple "for" and "if" constructions, and I try to avoid variable declarations inside the HTML, that's all. Thanks anyway.

Tim Chase

Sep 7, 2011, 8:01:33 PM
to pytho...@python.org, Laurent
On 09/07/11 18:22, Laurent wrote:
> Anyway I was just asking if there is something better than
> enumerate. So the answer is no? The fact that I have to create
> a tuple with an incrementing integer for something as simple
> as checking that I'm at the head just sounds awfully
> unpythonic to me.

I've made various generators that are roughly (modulo
edge-condition & error checking) something like

    def with_prev(it):
        prev = None
        for i in it:
            yield prev, i
            prev = i

    def with_next(it):
        it = iter(it)  # accept any iterable, not just an iterator
        prev = next(it)  # empty input is not handled here
        for i in it:
            yield prev, i
            prev = i
        yield prev, None

which can then be used something like your original

    for cur, next in with_next(iterable):
        if next is None:
            do_something_with_last(cur)
        else:
            do_regular_stuff_with_non_last(cur)

    for prev, cur in with_prev(iterable):
        if prev is None:
            do_something_with_first(cur)
        else:
            do_something_with_others(cur)

If your iterable can return None, you could create a custom
object to signal the non-condition:

NO_ITEM = object()

and then use NO_ITEM in place of "None" in the above code.
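
As a rough Python 3 sketch of that sentinel variant (the "missing"
parameter and the early return on empty input are assumptions added for
the example, not part of the generators above):

```python
NO_ITEM = object()  # unique sentinel; cannot collide with real data


def with_next(iterable, missing=NO_ITEM):
    """Yield (current, following); `missing` stands in after the last item."""
    it = iter(iterable)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input yields nothing
    for item in it:
        yield prev, item
        prev = item
    yield prev, missing


# None is ordinary data here, so testing "is None" would misfire
for cur, following in with_next([None, 1, None]):
    print(cur, "last" if following is NO_ITEM else "not last")
```

The comparison uses "is" rather than "==" because the sentinel is
identified by identity, not by value.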

-tkc


Terry Reedy

Sep 7, 2011, 9:06:56 PM
to pytho...@python.org
On 9/7/2011 8:23 PM, Cameron Simpson wrote:
> On 07Sep2011 16:22, Laurent<lauren...@gmail.com> wrote:
> | I totally understand the performance issue that a hypothetical
> | "istail" would bring, even if I think it would just be the programmer's
> | responsibility not to use it when it's not certain that an end can
> | be detected.
>
> The trouble with these things is that their presence leads to stallable
> code, often in libraries. Let the programmer write code dependent on
> istail() without thinking of the stall case (or even the gratuitous
> execution case, as in a generator with side effects in calling .next())
> and have that buried in a utilities function.
>
> Facilities like feof() in C and eof in Pascal already lead to lots of
> code that runs happily with flat files and behaves badly in interactive
> or piped input. It is _so_ easy to adopt a style like:
>
>     while not eof(filehandle):
>         line = filehandle.nextline()
>         ...
>
> that it is often thought that having offered the eof() function is a
> design error. (Of course in the example above the usual Python idiom
> would win out from existing habit, but there are plenty of other
> situations where it would just be _easy_ to rely on istail() in whatever
> form.)
>
> | But I don't see why *adding* something like "ishead" would be so bad
> | (at worst by using a boolean somewhere as you mentioned).
>
> It is not awful, but as remarked:
>   - extra storage cost to _every_ iterable, for a rarely used facility
>   - extra runtime cost to maintain the state
>   - _retroactive_ burden on _every_ iterator implementation presently
>     existing; every iterator suddenly needs to implement and offer this
>     extra facility to be of general purpose use
>   - it is easy to provide the facility on the rare occasions when it is
>     needed
>
> Personally, I think point 3 above is the killer and 1 and 2 are serious
> counter arguments.

The iterator protocol is intentionally as simple as sensibly possible.

> | Anyway I was just asking if there is something better than enumerate. So
> | the answer is no? The fact that I have to create a tuple with an
> | incrementing integer for something as simple as checking that I'm at
> | the head just sounds awfully unpythonic to me.
>
> You can just use a boolean if you like. I have plenty of loops like:
>
>     first = True
>     for i in iterable:
>         if first:
>             blah ...
>         ...
>         first = False
>
> Cheap and easy. Cheers,

Or grab and process the first item separately from the rest.

    it = iter(iterable)
    try:
        first = next(it)
        <process first item>
    except StopIteration:
        raise ValueError("Empty iterable not allowed")
    for i in it:
        <process non-first item>
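
Filled in as runnable Python 3, the pattern might look like the sketch
below. The function name describe and the returned list of strings are
assumptions made for the example; the placeholders above can do anything.

```python
def describe(iterable):
    """Label the first item as the head and pass the rest through."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("Empty iterable not allowed") from None
    lines = ["head = {}".format(first)]
    # the same iterator continues from the second item onwards
    for item in it:
        lines.append(str(item))
    return lines


print(describe(["a", "b", "a", "c"]))
```

No counter or flag is needed, because the first item is consumed before
the loop starts.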

--
Terry Jan Reedy

Laurent

Sep 7, 2011, 9:06:53 PM
Yes, I was just hoping for something already included that I didn't know about (I'm new to Python).

Laurent

Sep 7, 2011, 9:08:51 PM
to comp.lan...@googlegroups.com, pytho...@python.org, Laurent
Interesting. I will check that yield functionality out. Thanks.

Terry Reedy

Sep 7, 2011, 9:08:11 PM
to pytho...@python.org
On 9/7/2011 8:24 PM, Steven D'Aprano wrote:

> I don't think this question is meaningful. There are basically two
> fundamental types of iterables, sequences and iterators.

And non-sequence iterables like set and dict.

> Sequences have random access and a length, so if the "start" and "end" of
> the sequence is important to you, just use indexing:
>
>     beginning = sequence[0]
>     end = sequence[-1]
>     for i, x in enumerate(sequence):
>         if i == 0: print("at the beginning")
>         elif i == len(sequence)-1: print("at the end")
>         print(x)

And finite non-sequences can be turned into sequences with list(iterable).
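
A small Python 3 sketch of that approach, assuming the input is finite
and fits in memory (the helper name positions is invented for the
example):

```python
def positions(iterable):
    """Return (item, is_first, is_last) tuples after materialising the input."""
    items = list(iterable)   # only safe for finite iterables
    last = len(items) - 1
    return [(x, i == 0, i == last) for i, x in enumerate(items)]


squares = (n * n for n in range(3))   # a generator has no len() or indexing
print(positions(squares))
```

The cost is one full copy of the data, which is the trade-off against
the lookahead approaches elsewhere in the thread.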

--
Terry Jan Reedy

Laurent

Sep 7, 2011, 9:05:17 PM

> I don't think this question is meaningful. There are basically two
> fundamental types of iterables, sequences and iterators.
>
> Sequences have random access and a length, so if the "start" and "end" of
> the sequence is important to you, just use indexing:
>
>     beginning = sequence[0]
>     end = sequence[-1]
>     for i, x in enumerate(sequence):
>         if i == 0: print("at the beginning")
>         elif i == len(sequence)-1: print("at the end")
>         print(x)
>
>
> Iterators don't have random access, and in general they don't have a
> beginning or an end. There may not be any internal sequence to speak of:
> the iterator might be getting data from a hardware device that provides
> values continuously, or some other series of values without a well-defined
> beginning or end.

Maybe I should have said "best way to check that you didn't start the iteration process yet" but you see what I mean.

Well I guess I have to unlearn my bad lisp/scheme habits...

Chris Rebert

Sep 7, 2011, 10:27:09 PM
to pytho...@python.org
On Wed, Sep 7, 2011 at 5:24 PM, Miki Tebeka <miki....@gmail.com> wrote:
> I guess enumerate is the best way to check for the first item. Note that if someone passes you the iterator as an argument, you have no way of checking whether items have already been consumed from it.
>
> istail can be implemented using itertools.chain, see https://gist.github.com/1202260

For the archives, if Gist ever goes down:

    from itertools import chain

    def istail(it):
        '''Check if iterator has one more element. Return True/False and
        iterator.'''
        try:
            i = next(it)
        except StopIteration:
            return False, it

        try:
            j = next(it)
            return False, chain([i, j], it)
        except StopIteration:
            return True, chain([i], it)


    t, it = istail(iter([]))
    print t, list(it)
    t, it = istail(iter([1]))
    print t, list(it)
    t, it = istail(iter([1, 2]))
    print t, list(it)

Chris Torek

Sep 8, 2011, 10:21:23 AM
In article <mailman.854.13154413...@python.org>
Cameron Simpson <c...@zip.com.au> wrote:
>Facilities like feof() in C and eof in Pascal already lead to lots of
>code that runs happily with flat files and behaves badly in interactive
>or piped input. It is _so_ easy to adopt a style like:
>
>    while not eof(filehandle):
>        line = filehandle.nextline()
>        ...

Minor but important point here: eof() in Pascal is predictive (uses
a "crystal ball" to peer into the future to see whether EOF is
about to occur -- which really means, reads ahead, causing that
interactivity problem you mentioned), but feof() in C is "post-dictive".
The feof(stream) function returns a false value if the stream has
not yet encountered an EOF, but your very next attempt to read from
it may (or may not) immediately encounter that EOF.

Thus, feof() in C is sort of (but not really) useless. (The actual
use cases are to distinguish between "EOF" and "error" after a
failed read from a stream -- since C lacks exceptions, getc() just
returns EOF to indicate "failed to get a character due to end of
file or error" -- or in some more obscure cases, such as the
nonstandard getw(), to distinguish between a valid -1 value and
having encountered an EOF. The companion ferror() function tells
you whether an earlier EOF value was due to an error.)
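
For comparison, Python file objects report end of file in the same
after-the-fact way: the read that hits EOF returns an empty result, and
nothing is predicted beforehand. A minimal Python 3 sketch, using an
in-memory io.StringIO as a stand-in for a real file or pipe (the
variable names are illustrative):

```python
import io

stream = io.StringIO("ab")   # stand-in for a file or pipe
chars = []
while True:
    ch = stream.read(1)
    if ch == "":             # EOF is reported by the read that comes up empty
        break
    chars.append(ch)

print(chars)
```

Because the loop never asks "is the next read going to fail?", it does
not need lookahead and behaves the same on interactive input.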
--
In-Real-Life: Chris Torek, Wind River Systems
Intel require I note that my opinions are not those of WRS or Intel
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html

Cameron Simpson

Sep 8, 2011, 6:39:44 PM
to Chris Torek, pytho...@python.org
On 08Sep2011 14:21, Chris Torek <nos...@torek.net> wrote:
| In article <mailman.854.13154413...@python.org>
| Cameron Simpson <c...@zip.com.au> wrote:
| >Facilities like feof() in C and eof in Pascal already lead to lots of
| >code that runs happily with flat files and behaves badly in interactive
| >or piped input. It is _so_ easy to adopt a style like:
| >
| >    while not eof(filehandle):
| >        line = filehandle.nextline()
| >        ...
|
| Minor but important point here: eof() in Pascal is predictive (uses
| a "crystal ball" to peer into the future to see whether EOF is
| about to occur -- which really means, reads ahead, causing that
| interactivity problem you mentioned), but feof() in C is "post-dictive".
| The feof(stream) function returns a false value if the stream has
| not yet encountered an EOF, but your very next attempt to read from
| it may (or may not) immediately encounter that EOF.

Thanks. I had forgotten this nuance. Cheers,
"Where am I?"
"In the Village."
"What do you want?"
"Information."
"Whose side are you on?"
"That would be telling. We want information. Information. Information!"
"You won't get it!"
"By hook or by crook, we will."
"Who are you?"
"The new number 2."
"Who is number 1?"
"You are number 6."
"I am not a number, I am a free man!"
[Laughter]

Peter Otten

Sep 9, 2011, 7:04:57 AM
Cameron Simpson wrote:

> About the only time I do this is my personal "the()" convenience
> function:
>
>     def the(list, context=None):
>         ''' Returns the first element of an iterable, but requires there to be
>             exactly one.
>         '''
>         icontext="expected exactly one value"
>         if context is not None:
>             icontext=icontext+" for "+context
>
>         first=True
>         for elem in list:
>             if first:
>                 it=elem
>                 first=False
>             else:
>                 raise IndexError, "%s: got more than one element (%s, %s, ...)" \
>                                   % (icontext, it, elem)
>
>         if first:
>             raise IndexError, "%s: got no elements" % icontext
>
>         return it
>

> Which I use as a definite article in places where an iterable should
> yield exactly one result (eg SQL SELECTs that ought to get exactly
> one hit). I can see I wrote that a long time ago - it could do with some
> style fixes. And a code scan shows it sees little use:-)

A lightweight alternative to that is unpacking:

>>> [x] = ""
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: need more than 0 values to unpack
>>> [x] = "a"
>>> [x] = "ab"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
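
The same trick can be wrapped in a function. The sketch below is written
for Python 3 and borrows the name the from the quoted post; the exact
wording of the ValueError messages differs between Python versions, so
none are shown here.

```python
def the(iterable):
    """Return the sole item of `iterable`; ValueError unless exactly one."""
    [only] = iterable   # unpacking enforces "exactly one item"
    return only


print(the("a"))
print(the(iter([42])))
```

Unlike the hand-written loop, this raises ValueError rather than
IndexError, and it works on any iterable, including one-shot iterators.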

Chris Angelico

Sep 9, 2011, 7:30:03 AM
to pytho...@python.org
On Fri, Sep 9, 2011 at 9:04 PM, Peter Otten <__pet...@web.de> wrote:
>>>> [x] = ""
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> ValueError: need more than 0 values to unpack
>>>> [x] = "a"
>>>> [x] = "ab"
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> ValueError: too many values to unpack
>

Hey look, it's a new operator - the "assign-sole-result-of-iterable" operator!

x ,= "a"

:)

ChrisA