
End of file


Kat

Oct 7, 2004, 3:29:32 AM
Hi ,
How do you identify the last line of a file? I am in a "for" loop and
need to know which is the last line of the file while it is being read
in this loop.

Thanks
Kat

duikboot

Oct 7, 2004, 3:46:47 AM

f = open("test.txt").readlines()
lines = len(f)
print lines
counter = 1
for line in f:
    if counter == lines:
        print "last line: %s" % line
    counter += 1

--
http://www.baandersconsultancy.nl | http://www.nosonis.com

Duncan Booth

Oct 7, 2004, 4:36:43 AM
Kat wrote:

You need to read the next line before you can tell which is the last line.
The easiest way is probably to use a generator:

def lineswithlast(filename):
    prev, line = None, None
    for line in file(filename):
        if prev is not None:
            yield prev, False
        prev = line
    if line:
        yield line, True


for line, last in lineswithlast('somefile.txt'):
    print last, line

Peter Otten

Oct 7, 2004, 5:16:54 AM
Kat wrote:

You might consider moving the special treatment of the last line out of the
for-loop. In that case the following class would be useful. After iterating
over all but the last line you can look up its value in the 'last'
attribute.

import cStringIO as stringio

class AllButLast:
    def __init__(self, iterable):
        self.iterable = iterable
    def __iter__(self):
        it = iter(self.iterable)
        prev = it.next()
        for item in it:
            yield prev
            prev = item
        self.last = prev

def demo(iterable):
    abl = AllButLast(iterable)
    for item in abl:
        print "ITEM", repr(item)
    try:
        abl.last
    except AttributeError:
        print "NO ITEMS"
    else:
        print "LAST", repr(abl.last)


if __name__ == "__main__":
    for s in [
        "alpha\nbeta\ngamma",
        "alpha\nbeta\ngamma\n",
        "alpha",
        "",
        ]:
        print "---"
        demo(stringio.StringIO(s))

Peter

Alex Martelli

Oct 7, 2004, 7:38:43 AM
duikboot <"adijkstra at baanders NOSPAM consultancy.nl"> wrote:

> Kat wrote:
> > Hi ,
> > How do you identify the last line of a file? I am in a "for" loop and
> > need to know which is the last line of the file while it is being read
> > in this loop.
> >
> > Thanks
> > Kat
>
> f = open("test.txt").readlines()
> lines = len(f)
> print lines
> counter = 1
> for line in f:
>     if counter == lines:
>         print "last line: %s" % line
>     counter += 1

A slight variation on this idea is using the enumerate built-in rather
than maintaining the counter by hand. enumerate counts from 0, so:

for counter, line in enumerate(f):
    if counter == lines-1: is_last_line(line)
    else: is_ordinary_line(line)

If the file's possibly too big to read comfortably in memory, of course,
other suggestions based on generators &c are preferable.
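
For readers on a current Python, the read-everything-then-enumerate idea might look like the sketch below. It is written for Python 3, which is not what the thread used; the helper name flag_last_in_memory is made up for illustration, and the approach assumes the whole input fits in memory.

```python
def flag_last_in_memory(lines):
    """Return (line, is_last) pairs; needs the whole input in memory."""
    lines = list(lines)          # materialize so len() is available
    last_index = len(lines) - 1  # enumerate counts from 0
    return [(line, index == last_index) for index, line in enumerate(lines)]


if __name__ == "__main__":
    # A small in-memory stand-in for open("test.txt").readlines()
    for line, is_last in flag_last_in_memory(["alpha\n", "beta\n", "gamma\n"]):
        print(is_last, line.rstrip())
```

An empty input simply produces an empty list, so no special case is needed there.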


Alex

Andreas Kostyrka

Oct 7, 2004, 8:30:03 AM
to Alex Martelli, pytho...@python.org
This should do it "right":

f = file("/etc/passwd")
fi = iter(f)

def inext(i):
    try:
        return i.next()
    except StopIteration:
        return StopIteration

next = inext(fi)
while next <> StopIteration:
    line = next
    next = inext(fi)
    if next == StopIteration:
        print "LAST USER", line.rstrip()
    else:
        print "NOT LAST", line.rstrip()

Alex Martelli

Oct 7, 2004, 9:44:08 AM
Andreas Kostyrka <and...@kostyrka.org> wrote:
...

> > for counter, line in enumerate(f):
> >     if counter == lines-1: is_last_line(line)
> >     else: is_ordinary_line(line)
> >
> > If the file's possibly too big to read comfortably in memory, of course,
> > other suggestions based on generators &c are preferable.

> This should do it "right":
>
> f = file("/etc/passwd")
> fi = iter(f)
>
> def inext(i):
>     try:
>         return i.next()
>     except StopIteration:
>         return StopIteration
>
> next = inext(fi)
> while next <> StopIteration:
>     line = next
>     next = inext(fi)
>     if next == StopIteration:
>         print "LAST USER", line.rstrip()
>     else:
>         print "NOT LAST", line.rstrip()

I think the semantics are correct, but I also believe the control
structure in the "application code" is too messy. A generator lets you
code a simple for loop on the application side of things, and any
complexity stays where it should be, inside the generator. Haven't read
the other posts proposing generators, but something like:

def item_and_lastflag(sequence):
    it = iter(sequence)
    next = it.next()
    for current in it:
        yield next, False
        next = current
    yield next, True

lets you code, application-side:

for line, is_last in item_and_lastflag(open('/etc/passwd')):
    if is_last: print 'Last user:', line.rstrip()
    else: print 'Not last:', line.rstrip()

Not only is the application-side loop crystal-clear; it appears to me
that the _overall_ complexity is decreased, not just moved to the
generator.
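
A Python 3 rendering of this look-ahead generator, offered as a sketch rather than as the code from the post: since PEP 479 (the default from Python 3.7), a StopIteration escaping inside a generator becomes a RuntimeError, so the bare it.next() call would need a guard for empty input. The function name is kept from the post; the `missing` marker and the io.StringIO sample data are additions for illustration.

```python
import io


def item_and_lastflag(iterable):
    """Yield (item, is_last) pairs, looking one item ahead."""
    it = iter(iterable)
    missing = object()            # unique marker meaning "nothing was read"
    pending = next(it, missing)   # the default avoids StopIteration on empty input
    if pending is missing:
        return                    # empty input: yield nothing at all
    for current in it:
        yield pending, False      # something follows, so pending is not last
        pending = current
    yield pending, True           # loop ran dry, so pending is the last item


if __name__ == "__main__":
    sample = io.StringIO("root\ndaemon\nnobody\n")  # stand-in for a real file
    for line, is_last in item_and_lastflag(sample):
        label = "Last user:" if is_last else "Not last:"
        print(label, line.rstrip())
```

Only one line is held back at any time, so this works on files too large to read into memory.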


Alex

Nick Craig-Wood

Oct 7, 2004, 11:29:57 AM
Duncan Booth <duncan...@invalid.invalid> wrote:
> You need to read the next line before you can tell which is the last line.
> The easiest way is probably to use a generator:
>
> def lineswithlast(filename):
>     prev, line = None, None
>     for line in file(filename):
>         if prev is not None:
>             yield prev, False
>         prev = line
>     if line:
>         yield line, True
>
>
> for line, last in lineswithlast('somefile.txt'):
>     print last, line

I like that. I generalised it into a general purpose iterator thing.

You'll note I use the spurious sentinel local variable to mark unused
values rather than None (which is likely to appear in a general
list). I think this technique (which I invented a few minutes ago!)
is guaranteed correct (ie sentinel can never occur in iterator).

def iterlast(iterator):
    "Returns the original sequence with a flag to say whether the item is the last one or not"
    sentinel = []
    prev, next = sentinel, sentinel
    for next in iterator:
        if prev is not sentinel:
            yield prev, False
        prev = next
    if next is not sentinel:
        yield next, True

for line, last in iterlast(file('z')):
    print last, line

--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick

Alex Martelli

Oct 7, 2004, 12:36:11 PM
Nick Craig-Wood <ni...@craig-wood.com> wrote:
...

> You'll note I use the spurious sentinel local variable to mark unused
> values rather than None (which is likely to appear in a general
> list). I think this technique (which I invented a few minutes ago!)
> is guaranteed correct (ie sentinel can never occur in iterator).
...
> sentinel = []

It's fine, but I would still suggest using the Canonical Pythonic Way To
Make a Sentinel Object:

sentinel = object()

Since an immediate instance of type object has no possible use except as
a unique, distinguishable placeholder, this way you're "strongly saying"
``this thing here is a sentinel''. An empty list _might_ be intended
for many other purposes, a reader of your code (assumed to be perfectly
conversant with the language and built-ins of course) may hesitate a
microsecond more (looking around the code for other uses of this
object), while the CPWtMaSO shouldn't leave room for doubt...
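
A small illustration of the object() sentinel idiom, as a sketch in Python 3: the names _MISSING and describe are hypothetical, chosen only for this example. It shows the case where None cannot serve as the marker because None is itself a legitimate value.

```python
_MISSING = object()  # unique marker; only ever compared with `is`


def describe(settings, key):
    """Distinguish 'key absent' from 'key present with value None'."""
    value = settings.get(key, _MISSING)
    if value is _MISSING:
        return "absent"
    return "present: %r" % (value,)


if __name__ == "__main__":
    settings = {"timeout": None, "retries": 3}
    print(describe(settings, "timeout"))
    print(describe(settings, "retries"))
    print(describe(settings, "verbose"))
```

Because no other code can hold a reference to that particular object, the identity test cannot give a false match.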


Alex

Nick Craig-Wood

Oct 8, 2004, 3:29:57 AM
Alex Martelli <ale...@yahoo.com> wrote:
> > sentinel = []
>
> It's fine, but I would still suggest using the Canonical Pythonic Way To
> Make a Sentinel Object:
>
> sentinel = object()
>
> Since an immediate instance of type object has no possible use except as
> a unique, distinguishable placeholder, ``this thing here is a sentinel''.

Yes a good idiom which I didn't know (still learning) - thanks!

This only works in python >= 2.2 according to my tests.

It's also half the speed and 4 times the typing

$ /usr/lib/python2.3/timeit.py 'object()'
1000000 loops, best of 3: 0.674 usec per loop
$ /usr/lib/python2.3/timeit.py '[]'
1000000 loops, best of 3: 0.369 usec per loop

But who's counting ;-)

Alex Martelli

Oct 8, 2004, 4:31:52 AM
Nick Craig-Wood <ni...@craig-wood.com> wrote:

> Alex Martelli <ale...@yahoo.com> wrote:
> > > sentinel = []
> >
> > It's fine, but I would still suggest using the Canonical Pythonic Way To
> > Make a Sentinel Object:
> >
> > sentinel = object()
> >
> > Since an immediate instance of type object has no possible use except as
> > a unique, distinguishable placeholder, ``this thing here is a sentinel''.
>
> Yes a good idiom which I didn't know (still learning) - thanks!

You're welcome.

> This only works in python >= 2.2 according to my tests.

Yes, 2.2 is when Python acquired the 'object' built-in. If you need to
also support ancient versions of Python, it's often possible to do so by
clever initialization -- substituting your own coding if at startup you
find you're running under too-old versions. You presumably already do
that, e.g., for True and False, staticmethod, &c -- in the 2.3->2.4
transition it makes sense to do it for sorted, reversed, set, ... -- for
this specific issue of using object() for a sentinel, for example:

try: object
except NameError:
    def object(): return []

plus the usual optional stick-into-builtins, are all you need in your
application's initialization phase.


> It's also half the speed and 4 times the typing
>
> $ /usr/lib/python2.3/timeit.py 'object()'
> 1000000 loops, best of 3: 0.674 usec per loop
> $ /usr/lib/python2.3/timeit.py '[]'
> 1000000 loops, best of 3: 0.369 usec per loop
>
> But who's counting ;-)

Nobody, I sure hope. 's=[]' is just four characters, while 'sentinel =
[]', the usage you suggested (with proper spacing and a decent name), is
13, and yet it's pretty obvious the clarity of the latter is well worth
the triple typing; and if going to 'sentinel = object()' makes it
clearer yet, the lesser move from 13 to 19 (an extra-typing factor of
less than 1.5) is similarly well justified.
And I think it's unlikely you'll need so many sentinels as to notice the
extra 300 nanoseconds or so to instantiate each of them...


Alex

Nick Craig-Wood

Oct 8, 2004, 9:29:59 AM
Alex Martelli <ale...@yahoo.com> wrote:
> Nick Craig-Wood <ni...@craig-wood.com> wrote:

[snip good advice re backwards compatibility]

> > It's also half the speed and 4 times the typing

[snip]


> > But who's counting ;-)
>
> Nobody, I sure hope. 's=[]' is just four characters, while 'sentinel =
> []', the usage you suggested (with proper spacing and a decent name), is
> 13, and yet it's pretty obvious the clarity of the latter is well worth
> the triple typing; and if going to 'sentinel = object()' makes it
> clearer yet, the lesser move from 13 to 19 (an extra-typing factor of
> less than 1.5) is similarly well justified.
> And I think it's unlikely you'll need so many sentinels as to notice the
> extra 300 nanoseconds or so to instantiate each of them...

My tongue was firmly in cheek as I wrote that as I hope the smiley
above indicated! I timed the two usages just to see (and because
timeit makes it so easy). Indeed what is 6 characters and 300 nS
between friends ;-)

I shall put the object() as sentinel trick in my toolbag where it
belongs!

Alex Martelli

Oct 8, 2004, 11:11:52 AM
Nick Craig-Wood <ni...@craig-wood.com> wrote:
...
> > > But who's counting ;-)
> >
> > Nobody, I sure hope. 's=[]' is just four characters, while 'sentinel =
...
> My tongue was firmly in cheek as I wrote that as I hope the smiley
> above indicated! I timed the two usages just to see (and because
> timeit makes it so easy). Indeed what is 6 characters and 300 nS
> between friends ;-)

Yep, I don't think there was any doubt -- I just dotted some t's and
crossed some i's (...hmmm...?-) for the benefit of hypothetical newbie
readers...;-)


Alex

Andrew Dalke

Oct 8, 2004, 1:39:04 PM
Alex:

> Yes, 2.2 is when Python acquired the 'object' built-in. If you need to
> also support ancient versions of Python, it's often possible to do so by
> clever initialization -- substituting your own coding if at startup you
> find you're running under too-old versions.

I've used

class sentinel:
    pass


I didn't realize

sentinel = object()

was the new and improved way to do it.

Regarding timings, class def is slower than either [] or
object(), but in any case it's only done once in
module scope.

Andrew
da...@dalkescientific.com

Andrew Dalke

Oct 8, 2004, 1:51:08 PM
Me:

> I've used
>
> class sentinel:
>     pass

Also used in the standard lib, in

cookielib.py:

class Absent: pass
...
if path is not Absent and path != "":
    path_specified = True
    path = escape_path(path)
else:
    ...


Only one that I could find. The 'object()' approach
is used in gettext.py a few times like

missing = object()
tmsg = self._catalog.get(message, missing)
if tmsg is missing:
    if self._fallback:
        return self._fallback.lgettext(message)

and once in pickle.py as

self.mark = object()

Pre-object (09-Nov-01) it used

self.mark = ['spam']


Andrew
da...@dalkescientific.com

Greg Ewing

Oct 10, 2004, 11:22:28 PM
Alex Martelli wrote:
> Since an immediate instance of type object has no possible use except as
> a unique, distinguishable placeholder,

That's not true -- you can also assign attributes to such
an object and use it as a record. (It's not a common use,
but it's a *possible* use!)

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

Bengt Richter

Oct 11, 2004, 12:51:28 AM
On Mon, 11 Oct 2004 16:22:28 +1300, Greg Ewing <gr...@cosc.canterbury.ac.nz> wrote:

>Alex Martelli wrote:
>> Since an immediate instance of type object has no possible use except as
>> a unique, distinguishable placeholder,
>
>That's not true -- you can also assign attributes to such
>an object and use it as a record. (It's not a common use,
>but it's a *possible* use!)
>

I originally thought that too, but (is this a 2.3.2 bug?):

Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> obj = object()
>>> obj.x = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'object' object has no attribute 'x'
>>> dir(obj)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__']
>>> type(obj)
<type 'object'>
>>> obj
<object object at 0x008DE3B8>

__setattr__ is there, so how does one make it do something useful?


OTOH, making a thing to hang attributes on is a one liner (though if you
want more than one instance, class Record: pass; rec=Record() is probably better).
<BTW>
why is class apparently not legal as a simple statement terminated by ';' ?
(I wanted to attempt an alternate one-liner ;-)

>>> class Record:pass; rec=Record()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in Record
NameError: name 'Record' is not defined
>>> class Record:pass
...
>>> rec=Record()
</BTW>

>>> rec = type('Record',(),{})()
>>> rec.x=1
>>> vars(rec)
{'x': 1}
>>> type(rec)
<class '__main__.Record'>
>>> rec
<__main__.Record object at 0x00901110>

Regards,
Bengt Richter

Robert Brewer

Oct 11, 2004, 12:57:27 AM
to Bengt Richter, pytho...@python.org
Alex Martelli:

> Since an immediate instance of type object has no possible
> use except as a unique, distinguishable placeholder,

Greg Ewing:


>That's not true -- you can also assign attributes to such
>an object and use it as a record. (It's not a common use,
>but it's a *possible* use!)

Bengt Richter:


> I originally thought that too, but (is this a 2.3.2 bug?):

> >>> obj = object()
> >>> obj.x = 1
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> AttributeError: 'object' object has no attribute 'x'

>8


> OTOH, making a thing to hang attributes on is a one liner
> (though if you want more than one instance, class Record: pass;
> rec=Record() is probably better.

Or even:

class Record(object): pass
rec = Record()

Is there some reason people consistently don't use new-style classes? :(


Robert Brewer
MIS
Amor Ministries
fuma...@amor.org

Andrew Durdin

Oct 11, 2004, 12:58:44 AM
to Bengt Richter, pytho...@python.org
On Mon, 11 Oct 2004 04:51:28 GMT, Bengt Richter <bo...@oz.net> wrote:
> why is class apparently not legal as a simple statement terminated by ';' ?
> (I wanted to attempt an alternate one-liner ;-)
>
> >>> class Record:pass; rec=Record()
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "<stdin>", line 1, in Record
> NameError: name 'Record' is not defined

This is the equivalent of:

class Record:
    pass
    rec = Record()

That is, the whole line after the : is interpreted as the body of the
class. The name Record is not defined within its body, hence the
NameError.

Another [sort-of related] question: why does the following not produce
a NameError for "foo"?

def foo(): print foo
foo()

Andrew Durdin

Oct 11, 2004, 1:05:02 AM
to pytho...@python.org
On Sun, 10 Oct 2004 21:57:27 -0700, Robert Brewer <fuma...@amor.org> wrote:
>
> Is there some reason people consistently don't use new-style classes? :(

I usually use old-style classes accidentally, because it's easier to
do """class Foo:""" than """class Foo(object):""", particularly in the
interactive interpreter.
Is there a plan for the former to produce a new-style class (i.e.
total removal of old-style classes) before Py3k?

Steven Bethard

Oct 11, 2004, 1:58:03 AM
to pytho...@python.org
Andrew Durdin <adurdin <at> gmail.com> writes:

> Another [sort-of related] question: why does the following not produce
> a NameError for "foo"?
>
> def foo(): print foo
> foo()

I'm thinking this was meant to be "left as an exercise to the reader" ;), but
just in case it wasn't, you've exactly illustrated the difference between a
class definition statement and a function definition statement. Executing a
class definition statement executes the class block, while executing a
function definition statement only initializes the function object, without
actually executing the code in the function's block. Hence:

>>> class C(object):
...     print C
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in C
NameError: name 'C' is not defined

The name of a class is bound to the class object at the end of the execution
of a class statement. Since executing a class statement executes the code in
the class's block, this example references C before it has been bound to the
class object, hence the NameError.

>>> def f():
...     print f
...
>>> f()
<function f at 0x009D6670>

The name of a function is bound to the function object when the def statement
is executed. However, the function's code block is not executed until f is
called, at which point the name f has already been bound to the function
object and is thus available from the globals.
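
The same contrast can be reproduced as a script. This is a Python 3 sketch of the two interpreter sessions above, with the NameError caught so the script runs to completion; the variable name class_error and the attribute name myself are made up for illustration.

```python
# The class body runs immediately, before the name C is bound.
try:
    class C:
        myself = C  # C is not defined yet at this point
except NameError as exc:
    class_error = exc
else:
    class_error = None


# The function body runs only when called; by then the name f is bound.
def f():
    return f


if __name__ == "__main__":
    print("class body:", repr(class_error))
    print("function body returns itself:", f() is f)
```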

Steve

Alex Martelli

Oct 11, 2004, 4:05:17 AM
Greg Ewing <gr...@cosc.canterbury.ac.nz> wrote:

> Alex Martelli wrote:
> > Since an immediate instance of type object has no possible use except as
> > a unique, distinguishable placeholder,
>
> That's not true -- you can also assign attributes to such
> an object and use it as a record. (It's not a common use,
> but it's a *possible* use!)

That's not true -- you cannot do what you state (in either Python 2.3 or
2.4, it seems to me). Vide:

protagonist:~ alex$ python2.3
Python 2.3 (#1, Sep 13 2003, 00:49:11)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> x23=object()
>>> x23.foo=23
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'object' object has no attribute 'foo'
>>>

protagonist:~ alex$ python2.4
Python 2.4a3 (#1, Sep 3 2004, 22:25:02)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1640)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> x24=object()
>>> x24.bar=24
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'object' object has no attribute 'bar'
>>>


If an object() could be used as a Bunch I do think it would be
reasonably common. But it can't, so I insist that its role as
sentinel/placeholder is the only possibility.


Alex

Bengt Richter

Oct 11, 2004, 4:29:31 AM
On Mon, 11 Oct 2004 15:58:44 +1100, Andrew Durdin <adu...@gmail.com> wrote:

>On Mon, 11 Oct 2004 04:51:28 GMT, Bengt Richter <bo...@oz.net> wrote:
>> why is class apparently not legal as a simple statement terminated by ';' ?
>> (I wanted to attempt an alternate one-liner ;-)
>>
>> >>> class Record:pass; rec=Record()
>> ...
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in ?
>>   File "<stdin>", line 1, in Record
>> NameError: name 'Record' is not defined
>
>This is the equivalent of:
>
>class Record:
>    pass
>    rec = Record()

D'oh ;-/
Not thinking.


>
>That is, the whole line after the : is interpreted as the body of the
>class. The name Record is not defined within its body, hence the
>NameError.
>
>Another [sort-of related] question: why does the following not produce
>a NameError for "foo"?
>
>def foo(): print foo
>foo()

the foo in the print line isn't looked for until foo is executing,
by which time foo is bound. In contrast to the body of the class definition,
which executes right away to create the contents of the class dict etc.,
so there the class name is not yet bound.

Regards,
Bengt Richter

Bengt Richter

Oct 11, 2004, 4:38:46 AM
On Sun, 10 Oct 2004 21:57:27 -0700, "Robert Brewer" <fuma...@amor.org> wrote:
[...]
>Bengt Richter:
[...]
>> OTOH, making a thing to hang attributes on is a one liner
>> (though if you want more than one instance, class Record: pass;
>> rec=Record() is probably better.
>
>Or even:
>
>class Record(object): pass
>rec = Record()
>
>Is there some reason people consistently don't use new-style classes? :(
>
typing fatigue? I normally do, but slipped up there. Probably general fatigue,
given the botch density in that post ;-/
Maybe I will start putting __metaclass__=type at the top of sources...

Regards,
Bengt Richter

Fredrik Lundh

Oct 11, 2004, 6:48:36 AM
to pytho...@python.org
Robert Brewer wrote:
> Is there some reason people consistently don't use new-style classes? :(

performance:

http://www.python.org/~jeremy/weblog/030506.html

</F>

Andrew Durdin

Oct 11, 2004, 7:25:53 AM
to Steven Bethard, pytho...@python.org
On Mon, 11 Oct 2004 05:58:03 +0000 (UTC), Steven Bethard
>
> The name of a function is bound to the function object when the def statement
> is executed. However, the function's code block is not executed until f is
> called, at which point the name f has already been bound to the function
> object and is thus available from the globals.

What I wasn't expecting was that "foo" would automatically be looked
up in the enclosing scope... for some reason I'd forgotten about
nested scopes, I don't know why.

Greg Ewing

Oct 11, 2004, 11:30:40 PM
Bengt Richter wrote:
> On Mon, 11 Oct 2004 16:22:28 +1300, Greg Ewing <gr...@cosc.canterbury.ac.nz> wrote:
>>That's not true -- you can also assign attributes to such
>>an object and use it as a record. (It's not a common use,
>>but it's a *possible* use!)
>
> I originally thought that too, but (is this a 2.3.2 bug?):
>
> >>> obj = object()
> >>> obj.x = 1
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> AttributeError: 'object' object has no attribute 'x'

I stand corrected! I had forgotten that instances of the
bare object class don't have a __dict__.

So it appears that sentinels are about the only practical
use for them.
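
For reference, the behaviour discussed here can be checked with a short script. This is a Python 3 sketch: the Record class mirrors the one suggested earlier in the thread, while types.SimpleNamespace is a later standard-library addition (Python 3.3+) that nobody in the thread could have used; the variable names are made up for illustration.

```python
from types import SimpleNamespace

plain = object()
try:
    plain.x = 1                # bare object instances have no __dict__
except AttributeError as exc:
    plain_error = exc
else:
    plain_error = None


class Record(object):
    """Instances of a user-defined class do get a __dict__."""


rec = Record()
rec.x = 1                      # works: stored in rec.__dict__

ns = SimpleNamespace(x=1)      # ready-made attribute holder in Python 3.3+

if __name__ == "__main__":
    print(repr(plain_error))
    print(vars(rec), vars(ns))
```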

Mel Wilson

Oct 13, 2004, 9:45:15 AM
In article <YgA9d.9451$M05....@newsread3.news.pas.earthlink.net>,

Andrew Dalke <ada...@mindspring.com> wrote:
>I've used
>
>class sentinel:
>    pass

I liked

class sentinel: "What the sentinel is meant to be for."


Regards. Mel.
