It's ...


Angus Rodgers

Jun 24, 2009, 3:53:49 PM

... my first Python program! So please be gentle (no fifty ton
weights on the head!), but tell me if it's properly "Pythonic",
or if it's a dead parrot (and if the latter, how to revive it).

I'm working from Beazley's /Python: Essential Reference/ (2nd
ed. 2001), so my first newbie question is how best to find out
what's changed from version 2.1 to version 2.5. (I've recently
installed 2.5.4 on my creaky old Win98SE system.) I expect to
be buying the 4th edition when it comes out, which will be soon,
but before then, is there a quick online way to find this out?

Having only got up to page 84 - where we can actually start to
read stuff from the hard disk - I'm emboldened to try to learn
to do something useful, such as removing all those annoying hard
tab characters from my many old text files (before I cottoned on
to using soft tabs in my text editor).

This sort of thing seems to work, in the interpreter (for an
ASCII text file, named 'h071.txt', in the current directory):

stop = 3 # Tab stops every 3 characters
from types import StringType # Is this awkwardness necessary?
detab = lambda s : StringType.expandtabs(s, stop) # Or use def
f = open('h071.txt') # Do some stuff to f, perhaps, and then:
f.seek(0)
print ''.join(map(detab, f.xreadlines()))
f.close()

Obviously, to turn this into a generally useful program, I need
to learn to write to a new file, and how to parcel up the Python
code, and write a script to apply the "detab" function to all the
files found by searching a Windows directory, and replace the old
files with the new ones; but, for the guts of the program, is this
a reasonable way to write the code to strip tabs from a text file?

For writing the output file, this seems to work in the interpreter:

g = open('temp.txt', 'w')
g.writelines(map(detab, f.xreadlines()))
g.close()

In practice, does this avoid creating the whole string in memory
at one time, as is done by using ''.join()? (I'll have to read up
on "opaque sequence objects", which have only been mentioned once
or twice in passing - another instance perhaps being an xrange()?)
Not that that matters much in practice (in this simple case), but
it seems elegant to avoid creating the whole output file at once.

OK, I'm just getting my feet wet, and I'll try not to ask too many
silly questions!

First impressions are: (1) Python seems both elegant and practical;
and (2) Beazley seems a pleasantly unfussy introduction for someone
with at least a little programming experience in other languages.

--
Angus Rodgers

Angus Rodgers

Jun 24, 2009, 4:34:59 PM
On Wed, 24 Jun 2009 20:53:49 +0100, I wrote:

>[...] my first newbie question is how best to find out
>what's changed from version 2.1 to version 2.5.
>[...] is there a quick online way to find this out?

One way seems to be:

<http://www.python.org/doc/2.3/whatsnew/>
<http://www.python.org/doc/2.4/whatsnew/>
<http://www.python.org/doc/2.5/whatsnew/>

... although there doesn't seem to be any
<http://www.python.org/doc/2.2/whatsnew/>

... ah! ...
<http://www.python.org/doc/2.2.3/whatsnew/>
"What's New in Python 2.2"

--
Angus Rodgers

J. Cliff Dyer

Jun 24, 2009, 4:40:29 PM
to Angus Rodgers, pytho...@python.org
On Wed, 2009-06-24 at 20:53 +0100, Angus Rodgers wrote:
> ... my first Python program! So please be gentle (no fifty ton
> weights on the head!), but tell me if it's properly "Pythonic",
> or if it's a dead parrot (and if the latter, how to revive it).
>

Yay. Welcome to Python.


> I'm working from Beazley's /Python: Essential Reference/ (2nd
> ed. 2001), so my first newbie question is how best to find out
> what's changed from version 2.1 to version 2.5. (I've recently
> installed 2.5.4 on my creaky old Win98SE system.) I expect to
> be buying the 4th edition when it comes out, which will be soon,
> but before then, is there a quick online way to find this out?
>

Check here: http://docs.python.org/whatsnew/index.html

It's not designed to be newbie friendly, but it's in there.

> Having only got up to page 84 - where we can actually start to
> read stuff from the hard disk - I'm emboldened to try to learn
> to do something useful, such as removing all those annoying hard
> tab characters from my many old text files (before I cottoned on
> to using soft tabs in my text editor).
>
> This sort of thing seems to work, in the interpreter (for an
> ASCII text file, named 'h071.txt', in the current directory):
>
> stop = 3 # Tab stops every 3 characters
> from types import StringType # Is this awkwardness necessary?

Not anymore. You can just use str for this.

> detab = lambda s : StringType.expandtabs(s, stop) # Or use def

First, use def. lambda is a rarity for use when you'd rather not assign
your function to a variable.

Second, expandtabs is a method on string objects. s is a string object,
so you can just use s.expandtabs(stop)

Third, I'd recommend passing your tabstops into detab with a default
argument, rather than defining it irrevocably in a global variable
(which is brittle and ugly)

def detab(s, stop=3):
    # do stuff

Then you can do

three_space_version = detab(s)
eight_space_version = detab(s, 8)
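
A minimal sketch of the def-based detab with a default argument, as suggested above. It assumes a current Python 3 interpreter rather than the 2.5 discussed in this thread, and the sample string is made up for illustration.

```python
def detab(s, stop=3):
    """Return a copy of s with tabs expanded to tab stops every `stop` columns."""
    return s.expandtabs(stop)

sample = "a\tb"                           # made-up test string
three_space_version = detab(sample)       # uses the default stop of 3
eight_space_version = detab(sample, 8)    # overrides the default

# Calling the method through the type is equivalent to calling it on the object.
via_type = str.expandtabs(sample, 3)
```

The last line is the roundabout spelling from the original post; it gives the same result as the plain method call inside detab.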

> f = open('h071.txt') # Do some stuff to f, perhaps, and then:
> f.seek(0)

f is not opened for writing, so if you do stuff to the contents of f,
you'll have to put the new version in a different variable, so f.seek(0)
doesn't help. If you don't do stuff to it, then you're at the beginning
of the file anyway, so either way, you shouldn't need to f.seek(0).

> print ''.join(map(detab, f.xreadlines()))

Sometime in the history of python, files became iterable, which means
you can do the following:

for line in f:
    print detab(line)

Much prettier than running through join/map shenanigans. This is also
the place to modify the output before passing it to detab:

for line in f:
    # do stuff to line
    print detab(line)

Also note that you can iterate over a file several times:

f = open('foo.txt')
for line in f:
    print line[0] # prints the first character of every line
for line in f:
    print line[1] # prints the second character of every line
> f.close()
>


> Obviously, to turn this into a generally useful program, I need
> to learn to write to a new file, and how to parcel up the Python
> code, and write a script to apply the "detab" function to all the
> files found by searching a Windows directory, and replace the old
> files with the new ones; but, for the guts of the program, is this
> a reasonable way to write the code to strip tabs from a text file?
>
> For writing the output file, this seems to work in the interpreter:
>
> g = open('temp.txt', 'w')
> g.writelines(map(detab, f.xreadlines()))
> g.close()
>

Doesn't help, as map returns a list. You can use itertools.imap, or you
can use a for loop, as above.
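
One lazy alternative to the list-building map of Python 2 is a small generator function. The sketch below assumes Python 3 syntax; the name detab_lines and the sample lines are hypothetical, and a plain iterator stands in for the open file.

```python
def detab_lines(lines, stop=3):
    """Yield each line with its tabs expanded, one line at a time."""
    for line in lines:
        yield line.expandtabs(stop)

source = iter(["a\tb\n", "\tc\n"])   # stand-in for an open file
lazy = detab_lines(source)           # nothing has been read yet
first = next(lazy)                   # pulls exactly one line from source
remaining_in_source = list(source)   # the second line was never touched
```

Because only one line is pulled at a time, the whole output never has to exist in memory at once.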

> In practice, does this avoid creating the whole string in memory
> at one time, as is done by using ''.join()? (I'll have to read up
> on "opaque sequence objects", which have only been mentioned once
> or twice in passing - another instance perhaps being an xrange()?)
> Not that that matters much in practice (in this simple case), but
> it seems elegant to avoid creating the whole output file at once.

The terms to look for, rather than opaque sequence objects are
"iterators" and "generators".

>
> OK, I'm just getting my feet wet, and I'll try not to ask too many
> silly questions!
>
> First impressions are: (1) Python seems both elegant and practical;
> and (2) Beazley seems a pleasantly unfussy introduction for someone
> with at least a little programming experience in other languages.
>

Glad you're enjoying Beazley. I would look for something more
up-to-date. Python's come a long way since 2.1. I'd hate for you to
miss out on all the iterators, booleans, codecs, subprocess, yield,
unified int/longs, decorators, decimals, sets, context managers and
new-style classes that have come since then.


> --
> Angus Rodgers

Cheers,
Cliff


MRAB

Jun 24, 2009, 4:44:27 PM
to pytho...@python.org
Angus Rodgers wrote:
[snip]

> This sort of thing seems to work, in the interpreter (for an
> ASCII text file, named 'h071.txt', in the current directory):
>
> stop = 3 # Tab stops every 3 characters
> from types import StringType # Is this awkwardness necessary?
> detab = lambda s : StringType.expandtabs(s, stop) # Or use def
> f = open('h071.txt') # Do some stuff to f, perhaps, and then:
> f.seek(0)
> print ''.join(map(detab, f.xreadlines()))
> f.close()
>
stop = 3 # Tab stops every 3 characters
detab = lambda s: s.expandtabs(stop)

f = open('h071.txt') # Do some stuff to f, perhaps, and then:
# f.seek(0) # Not necessary
print ''.join(map(detab, f.xreadlines()))
f.close()

> Obviously, to turn this into a generally useful program, I need
> to learn to write to a new file, and how to parcel up the Python
> code, and write a script to apply the "detab" function to all the
> files found by searching a Windows directory, and replace the old
> files with the new ones; but, for the guts of the program, is this
> a reasonable way to write the code to strip tabs from a text file?
>
> For writing the output file, this seems to work in the interpreter:
>
> g = open('temp.txt', 'w')
> g.writelines(map(detab, f.xreadlines()))
> g.close()
>
> In practice, does this avoid creating the whole string in memory
> at one time, as is done by using ''.join()? (I'll have to read up
> on "opaque sequence objects", which have only been mentioned once
> or twice in passing - another instance perhaps being an xrange()?)
> Not that that matters much in practice (in this simple case), but
> it seems elegant to avoid creating the whole output file at once.
>
> OK, I'm just getting my feet wet, and I'll try not to ask too many
> silly questions!
>
> First impressions are: (1) Python seems both elegant and practical;
> and (2) Beazley seems a pleasantly unfussy introduction for someone
> with at least a little programming experience in other languages.
>

STOP = 3 # Tab stops every 3 characters
in_file = open('h071.txt')
out_file = open('temp.txt', 'w')
for line in in_file: # Iterates one line at a time
    out_file.write(line.expandtabs(STOP))
in_file.close()
out_file.close()

Scott David Daniels

Jun 24, 2009, 5:10:54 PM
Angus Rodgers wrote:
> ... my first ... question is how best to find out what's changed from version 2.1
> to version 2.5. (I've recently installed 2.5.4)
Consecutively read:
http://docs.python.org/whatsnew/2.2.html
http://docs.python.org/whatsnew/2.3.html
http://docs.python.org/whatsnew/2.4.html
http://docs.python.org/whatsnew/2.5.html

> stop = 3 # Tab stops every 3 characters

Typically program constants should be full caps. See PEP 8

> from types import StringType # Is this awkwardness necessary?

Nope

> detab = lambda s : StringType.expandtabs(s, stop) # Or use def

Really use def unless you have a solid reason not to. At the moment,
I'd suggest you simply presume you have no such reason.
Also, expandtabs is an instance method, so the roundabout is not needed.

    def detab(s):
        return s.expandtabs(stop)

> f = open('h071.txt') # Do some stuff to f, perhaps, and then:
> f.seek(0)
> print ''.join(map(detab, f.xreadlines()))

Too much, even though that is how you thought the problem out.
First, text files are now iterable (producing a line at a time
much like xreadlines). Second the map above creates a list of
all detabbed lines, then (while that list still exists), it also
creates a string constant which is the "new" contents of the file.

I'd simply use:
    for line in f:
        print detab(line.rstrip())
or even:
    for line in f:
        print line.rstrip().expandtabs(stop)

> ...
> g.writelines(map(detab, f.xreadlines()))
>
> In practice, does this avoid creating the whole string in memory
> at one time, as is done by using ''.join()?

Nope. But you could use a generator expression if you wanted:
g.writelines(detab(line) for line in f)
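
A self-contained sketch of that generator-expression approach, assuming Python 3 and a scratch directory created with tempfile. The file names are borrowed from the original post and the file contents are made up.

```python
import os
import tempfile

def detab(s, stop=3):
    return s.expandtabs(stop)

workdir = tempfile.mkdtemp()                      # scratch directory for the demo
src_path = os.path.join(workdir, "h071.txt")
dst_path = os.path.join(workdir, "temp.txt")

with open(src_path, "w") as f:
    f.write("a\tb\n\tc\n")                        # made-up input

# writelines() accepts any iterable of strings, so the generator expression
# is consumed as it goes rather than being built into a list first.
with open(src_path) as f, open(dst_path, "w") as g:
    g.writelines(detab(line) for line in f)

with open(dst_path) as g:
    result = g.read()
```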

> OK, I'm just getting my feet wet, and I'll try not to ask too many
> silly questions!
>
> First impressions are: (1) Python seems both elegant and practical;
> and (2) Beazley seems a pleasantly unfussy introduction for someone
> with at least a little programming experience in other languages.

Both 1 and 2 are true in spades.

--Scott David Daniels
Scott....@Acm.Org

Angus Rodgers

Jun 24, 2009, 5:12:33 PM
On Wed, 24 Jun 2009 16:40:29 -0400, "J. Cliff Dyer"
<j...@sdf.lonestar.org> wrote:

>On Wed, 2009-06-24 at 20:53 +0100, Angus Rodgers wrote:
>> [...]
>> from types import StringType # Is this awkwardness necessary?
>
>Not anymore. You can just use str for this.
>
>> detab = lambda s : StringType.expandtabs(s, stop) # Or use def
>
>First, use def. lambda is a rarity for use when you'd rather not assign
>your function to a variable.
>
>Second, expandtabs is a method on string objects. s is a string object,
>so you can just use s.expandtabs(stop)

How exactly do I get detab, as a function from strings to strings
(for a fixed tab size)? (This is aside from the point, which you
make below, that the whole map/join idea is a bit of a no-no - in
some other context, I might want to isolate a method like this.)

>Third, I'd recommend passing your tabstops into detab with a default
>argument, rather than defining it irrevocably in a global variable
>(which is brittle and ugly)

No argument there - I was just messing about in the interpreter,
to see if the main idea worked.

>> f = open('h071.txt') # Do some stuff to f, perhaps, and then:
>> f.seek(0)
>
>f is not opened for writing, so if you do stuff to the contents of f,
>you'll have to put the new version in a different variable, so f.seek(0)
>doesn't help. If you don't do stuff to it, then you're at the beginning
>of the file anyway, so either way, you shouldn't need to f.seek(0).

I seemed to find that if I executed f.xreadlines() or f.readlines()
once, I was somehow positioned at the end of the file or something,
and had to do the f.seek(0) - but maybe I did something else silly.

>> print ''.join(map(detab, f.xreadlines()))
>
>Sometime in the history of python, files became iterable, which means
>you can do the following:
>
>for line in f:
> print detab(line)
>
>Much prettier than running through join/map shenanigans. This is also
>the place to modify the output before passing it to detab:
>
>for line in f:
> # do stuff to line
> print detab(line)
>
>Also note that you can iterate over a file several times:
>
>f = open('foo.txt')
>for line in f:
> print line[0] # prints the first character of every line
>for line in f:
> print line[1] #prints the second character of every line
>> f.close()

This all looks very nice.

>> For writing the output file, this seems to work in the interpreter:
>>
>> g = open('temp.txt', 'w')
>> g.writelines(map(detab, f.xreadlines()))
>> g.close()
>>
>
>Doesn't help, as map returns a list.

Pity. Oh, well.

>You can use itertools.imap, or you
>can use a for loop, as above.

This is whetting my appetite!

>The terms to look for, rather than opaque sequence objects are
>"iterators" and "generators".

OK, will do.

>Glad you're enjoying Beazley. I would look for something more
>up-to-date. Python's come a long way since 2.1. I'd hate for you to
>miss out on all the iterators, booleans, codecs, subprocess, yield,
>unified int/longs, decorators, decimals, sets, context managers and
>new-style classes that have come since then.

I'll get either Beazley's 4th ed. (due next month, IIRC), or Chun,
/Core Python Programming/ (2nd ed.), or both, unless someone has
a better suggestion. (Eventually I'll migrate from Windows 98SE(!),
and will need info on Python later than 2.5, but that's all I need
for now.)

--
Angus Rodgers

Angus Rodgers

Jun 24, 2009, 5:15:43 PM
On Wed, 24 Jun 2009 22:12:33 +0100, I wrote:

>How exactly do I get detab, as a function from strings to strings
>(for a fixed tab size)?

(It's OK - this has been explained in another reply. I'm still a
little hazy about what exactly objects are in Python, but the haze
will soon clear, I'm sure, especially after I have written more
than one one-line program!)

--
Angus Rodgers

Angus Rodgers

Jun 24, 2009, 5:19:05 PM
Spam? Say it ain't so! Where did that Viking helmet come from?
--
Angus Rodgers

Angus Rodgers

Jun 24, 2009, 5:43:01 PM
On Wed, 24 Jun 2009 14:10:54 -0700, Scott David Daniels
<Scott....@Acm.Org> wrote:

>Angus Rodgers wrote:
>
>> from types import StringType # Is this awkwardness necessary?
>Nope

I'm starting to see some of the mental haze that was confusing me.

>Also, expandtabs is an instance method, so the roundabout is not needed.
>
>    def detab(s):
>        return s.expandtabs(stop)

I'd forgotten where Beazley had explained that "methods such as
... s.expandtabs() always return a new string as opposed to mod-
ifying the string s." I must have been hazily thinking of it as
somehow modifying s, even though my awkward code itself depended
on a vague understanding that it didn't. No point in nailing
this polly to the perch any more!
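
The point that string methods return a new string can be checked directly. A tiny sketch, assuming Python 3 and a made-up sample string:

```python
s = "x\ty"                  # made-up sample string
t = s.expandtabs(3)         # builds and returns a new string

unchanged = (s == "x\ty")   # the original still contains its tab
```

Strings are immutable, so s can only ever be rebound to a new value, never modified in place.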

>I'd simply use:
>    for line in f:
>        print detab(line.rstrip())
>or even:
>    for line in f:
>        print line.rstrip().expandtabs(stop)

I'll read up on iterating through files, somewhere online for
the moment, and then get a more up-to-date textbook.

And I'll try not to ask too many silly questions like this, but
I wanted to make sure I wasn't getting into any bad programming
habits right at the start - and it's a good thing I did, because
I was!

>Nope. But you could use a generator expression if you wanted:
> g.writelines(detab(line) for line in f)

Ah, so that actually does what I was fondly hoping my code would
do. Thanks! I must learn about these "generator" thingies.

--
Angus Rodgers

Aahz

Jun 24, 2009, 5:54:34 PM
In article <mailman.2074.1245876...@python.org>,

J. Cliff Dyer <j...@sdf.lonestar.org> wrote:
>
>Glad you're enjoying Beazley. I would look for something more
>up-to-date. Python's come a long way since 2.1. I'd hate for you to
>miss out on all the iterators, booleans, codecs, subprocess, yield,
>unified int/longs, decorators, decimals, sets, context managers and
>new-style classes that have come since then.

While those are all nice, they certainly aren't essential to learning
Python.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha

Angus Rodgers

Jun 24, 2009, 7:03:55 PM
On Wed, 24 Jun 2009 22:43:01 +0100, I wrote:

>No point in nailing this polly to the perch any more!

Indeed not, so please skip what follows (I've surely been enough
of an annoying newbie, already!), but I've just remembered why I
wrote my program in such an awkward way. I wanted to be able to
import the type name t (StringType in this case) so that I could
simply use t.m() as the name of one of its methods [if "method"
is the correct term]; but in this case, where m is expandtabs(),
an additional parameter (the tab size) is needed; so, I used the
lambda expression to get around this, entirely failing to realise
that (as was clearly shown in the replies I got), if I was going
to use "lambda" at all (not recommended!), then it would be a lot
simpler to write the function as lambda s : s.m(), with or without
any additional parameters needed. (It didn't really have anything
to do with a separate confusion as to what exactly "objects" are.)

>I wanted to make sure I wasn't getting into any bad programming
>habits right at the start

I'm just trying to make sure I really understand how I screwed up.

(In future, I'll try to work through a textbook with exercises.
But I thought I'd better try to get some quick feedback at the
start, because I knew that I was fumbling around, and that it
was unlikely to be necessary to use such circumlocutions.)

--
Angus Rodgers

J. Clifford Dyer

Jun 24, 2009, 7:41:44 PM
to Aahz, pytho...@python.org
On Wed, 2009-06-24 at 14:54 -0700, Aahz wrote:
> In article <mailman.2074.1245876...@python.org>,
> J. Cliff Dyer <j...@sdf.lonestar.org> wrote:
> >
> >Glad you're enjoying Beazley. I would look for something more
> >up-to-date. Python's come a long way since 2.1. I'd hate for you to
> >miss out on all the iterators, booleans, codecs, subprocess, yield,
> >unified int/longs, decorators, decimals, sets, context managers and
> >new-style classes that have come since then.
>
> While those are all nice, they certainly aren't essential to learning
> Python.

Mostly, no, you are correct. With some exceptions:

1) You have to know iterators at a basic level (not enough to understand
how the iterator protocol works, but enough to know what 'for line in
f:' does).

2) Sets are as essential as any other data structure. If you are
learning both lists and tuples, you should be learning sets as well.

3) If you're learning object-oriented programming, new-style classes
should be the only classes you use.

4) You should know how a decorator works, in case you run across one in
the wild.

5) Booleans are a basic type. You should know them.

Codecs, the subprocess module, yield, decimals and context managers can
certainly come later. (All this of course, is assuming the Python 2.x
world, which I think is still the right way to learn, for now)


Cheers,
Cliff

Scott David Daniels

Jun 25, 2009, 12:29:55 AM
Scott David Daniels wrote:
> Angus Rodgers wrote:
>> ... my first ... question is how best to find out what's changed from
>> version 2.1 to version 2.5. (I've recently installed 2.5.4)
> Consecutively read:
> http://docs.python.org/whatsnew/2.2.html
As someone else pointed out:
http://www.python.org/doc/2.2.3/whatsnew/whatsnew22.html

I forgot to add Richard Gruet's excellent resources, homed here:
http://rgruet.free.fr/

You want to check out PQR 2.5 or 2.6 reference; he encodes when
(what version) a feature came into the language. So it is a great
overview to read through as you try to take yourself from 2.1 to 2.5.
I'd go for the "modern" style unless the colors annoy you.

--Scott David Daniels
Scott....@Acm.Org

Angus Rodgers

Jun 25, 2009, 3:05:58 AM
Someone has gently directed me to the Tutor mailing list:
<http://mail.python.org/mailman/listinfo/tutor>
which I hadn't known about. I've joined, and will try to
confine my initial blundering experiments to there. Sorry
about the spam spam spam spam, lovely spam, wonderful spam!
--
Angus Rodgers

Kirk Strauser

Jun 25, 2009, 11:31:47 AM
At 2009-06-24T19:53:49Z, Angus Rodgers <twi...@bigfoot.com> writes:

> stop = 3 # Tab stops every 3 characters
> from types import StringType # Is this awkwardness necessary?
> detab = lambda s : StringType.expandtabs(s, stop) # Or use def
> f = open('h071.txt') # Do some stuff to f, perhaps, and then:
> f.seek(0)
> print ''.join(map(detab, f.xreadlines()))
> f.close()

An equivalent in modern Pythons:

>>> print ''.join(line.expandtabs(3) for line in file('h071.txt'))

In short: expandtabs is a method on strings, there's no need to seek to the
beginning, and files are closed when they are garbage collected (although I
can't make myself not close output files after years of doing so), and map()
is largely deprecated in favor of list comprehensions and generator
functions.
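
For anyone who would rather not rely on garbage collection to close files, the context managers mentioned earlier in the thread offer a middle way. A sketch assuming Python 3 (on 2.5 the with statement needs `from __future__ import with_statement`); the scratch path and file contents are made up.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "h071.txt")   # hypothetical scratch file
with open(path, "w") as out:
    out.write("a\tb\n")

with open(path) as f:
    text = "".join(line.expandtabs(3) for line in f)

closed_after_block = f.closed   # the with statement closed the file on exit
```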
--
Kirk Strauser
The Day Companies

Angus Rodgers

Jun 25, 2009, 12:53:51 PM
On Thu, 25 Jun 2009 10:31:47 -0500, Kirk Strauser
<ki...@daycos.com> wrote:

>At 2009-06-24T19:53:49Z, Angus Rodgers <twi...@bigfoot.com> writes:
>
>> print ''.join(map(detab, f.xreadlines()))
>
>An equivalent in modern Pythons:
>
>>>> print ''.join(line.expandtabs(3) for line in file('h071.txt'))

I guess the code below would also have worked in 2.1?
(It does in 2.5.4.)

print ''.join(line.expandtabs(3) for line in \
              file('h071.txt').xreadlines())

--
Angus Rodgers

Angus Rodgers

Jun 25, 2009, 12:56:47 PM

Possibly silly question (in for a penny ...): does the new feature,
by which a file becomes iterable, operate by some kind of coercion
of a file object to a list object, via something like x.readlines()?
<runs for cover>
--
Angus Rodgers

Angus Rodgers

Jun 25, 2009, 12:58:35 PM
On Thu, 25 Jun 2009 17:56:47 +0100, I found a new way to disgrace
myself, thus:

>[...] something like x.readlines()?
^
I don't know how that full stop got in there. Please ignore it!
--
Angus Rodgers

Angus Rodgers

Jun 25, 2009, 1:07:19 PM
On Thu, 25 Jun 2009 17:56:47 +0100, I burbled incoherently:

>[...] does the new feature,
>by which a file becomes iterable, operate by some kind of coercion
>of a file object to a list object, via something like x.readlines()?

Sorry to follow up my own post yet again (amongst my weapons is
a fanatical attention to detail when it's too late!), but I had
better rephrase that question:

Scratch "list object", and replace it with something like: "some
kind of iterator object, that is at least already implicit in 2.1
(although the term 'iterator' isn't mentioned in the index to the
2nd edition of Beazley's book)". Something like that! 8-P

--
Angus Rodgers

MRAB

Jun 25, 2009, 1:22:48 PM
to pytho...@python.org
That uses a generator expression, which was introduced in 2.4.
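
The difference between the two forms is only the brackets. A short sketch, assuming Python 3 and made-up sample lines, with a plain list standing in for the file:

```python
lines = ["a\tb\n", "\tc\n"]                           # stand-in for the lines of a file

as_list = [line.expandtabs(3) for line in lines]      # list comprehension: builds a list
as_gen = (line.expandtabs(3) for line in lines)       # generator expression: lazy

joined_from_list = "".join(as_list)
joined_from_gen = "".join(as_gen)                     # join() consumes the generator
leftover = list(as_gen)                               # nothing left after one pass
```

The square-bracket form also works on versions before 2.4, at the cost of building the whole list.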

Angus Rodgers

Jun 26, 2009, 7:52:55 AM

Sorry, I forgot that list comprehensions need square brackets.

The following code works in 2.1 (I installed version 2.1.3, on
a different machine, to check!):

f = open('h071.txt') # Can't use file('h071.txt') in 2.1
print ''.join([line.expandtabs(3) for line in f.xreadlines()])

(Of course, in practice I'll stick to doing it the more sensible
way that's already been explained to me. I'm ordering a copy of
Wesley Chun, /Core Python Programming/ (2nd ed., 2006), to learn
about version 2.5.)
--
Angus Rodgers

Gabriel Genellina

Jun 27, 2009, 2:32:12 AM
to pytho...@python.org
On Thu, 25 Jun 2009 14:07:19 -0300, Angus Rodgers <twi...@bigfoot.com>
wrote:

Iterators were added in Python 2.2. An iterator is an object that can be
iterated over; that is, an object for which "for item in some_iterator:
..." works.
Files are their own iterators, yielding one line at a time.
See PEP 234 http://www.python.org/dev/peps/pep-0234/
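
A quick way to see that a file is its own iterator, sketched for Python 3 with io.StringIO as an in-memory stand-in for a real text file (the contents are made up):

```python
import io

f = io.StringIO("first\nsecond\n")   # in-memory stand-in for an open text file

same_object = iter(f) is f           # a file-like object is its own iterator
line1 = next(f)                      # each step yields one line
rest = list(f)                       # whatever has not been read yet
```

So no list of lines is built behind the scenes; the file hands over one line per step.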

--
Gabriel Genellina

Angus Rodgers

Jun 27, 2009, 5:03:43 AM
On Sat, 27 Jun 2009 03:32:12 -0300, "Gabriel Genellina"
<gags...@yahoo.com.ar> wrote:

>Iterators were added in Python 2.2.

Just my luck. :-)

You've got to love a language whose documentation contains
sentences beginning like this:

"Among its chief virtues are the following four -- no, five
-- no, six -- points: [...]"

--
Angus Rodgers

Beni Cherniavsky

Jun 30, 2009, 4:24:15 PM
On Jun 24, 11:40 pm, "J. Cliff Dyer" <j...@sdf.lonestar.org> wrote:
> Also note that you can iterate over a file several times:
>
> f = open('foo.txt')
> for line in f:
> print line[0] # prints the first character of every line
> for line in f:
> print line[1] #prints the second character of every line
>
No, you can't. The second loop prints nothing!
A file by default advances forward. Once you reach the end, you stay
there.

You could explicitly call f.seek(0, 0) to rewind it. Note that not
all file objects are seekable (e.g. pipes and sockets aren't).
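
A small sketch of the exhausted-file behaviour and the rewind, assuming Python 3 and a scratch file made with tempfile (the file contents are made up):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.txt")   # scratch file for the demo
with open(path, "w") as out:
    out.write("ab\ncd\n")

with open(path) as f:
    first_pass = [line[0] for line in f]
    second_pass = [line[1] for line in f]     # file is at its end: nothing left
    f.seek(0)                                 # rewind to the beginning
    third_pass = [line[1] for line in f]
```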

The cleaner way to read a regular file twice is to *open* it each time:

for line in open('foo.txt'):
    print line[0] # prints the first character of every line

for line in open('foo.txt'):
    print line[1] # prints the second character of every line

Quick recap for Angus:
for loops work on "iterables" - objects that can be asked for an
"iterator".
Python iterators are unseekable - once exhausted they stay empty.
Most iterables (e.g. lists) return a new iterator every time you ask,
so you can iterate over the same data many times.
But if you already have an iterator, you can use it in a for loop -
when asked for an iterator, it will offer itself (in other words an
iterator is a degenerate kind of iterable).
This is what happened with the file object - it's an iterator and
can't be reused.
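
The recap can be tried out on a list, which is an iterable but not an iterator. A sketch assuming Python 3, with made-up data:

```python
data = ["x", "y"]                        # a list is an iterable, not an iterator

it = iter(data)                          # ask the list for an iterator
first_loop = [item for item in it]
second_loop = [item for item in it]      # the iterator is exhausted: empty

list_twice = ([item for item in data], [item for item in data])
iterator_is_own_iterator = iter(it) is it        # an iterator offers itself
fresh_each_time = iter(data) is not iter(data)   # the list offers a new one each time
```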

Reusing the same iterator between for loops is sometimes useful if you
exit the first loop mid-way:

import sys

f = open('foo.mail')
# skip headers until you see an empty line
for line in f:
    if not line.strip():
        break
# print remainder of file
for line in f:
    sys.stdout.write(line)


P.S. Warning: after you use ``for line in f``, it's dangerous to use
``f.read()`` and ``f.readline()`` (buffering mess - just don't.)


> Glad you're enjoying Beazley.  I would look for something more
> up-to-date.  Python's come a long way since 2.1.  I'd hate for you to
> miss out on all the iterators, booleans, codecs, subprocess, yield,
> unified int/longs, decorators, decimals, sets, context managers and
> new-style classes that have come since then.
>

Seconded - 2.1 is ancient.
If you continue with the book, here is a quick list of the most
fundamental improvements to keep in mind:

1. Iterators, generators, generator expressions.
2. Working nested scopes.
3. Decorators.
4. with statement.
5. set & bool types.
6. Descriptors (if confusing, just understand properties).
7. from __future__ import division, // operator.

and the most refreshing modules added:

- subprocess
- ctypes
- itertools
- ElementTree
- optparse

and not new but I just love introducing it to people:

- doctest

J. Cliff Dyer

Jul 1, 2009, 11:48:23 AM
to Beni Cherniavsky, pytho...@python.org
On Tue, 2009-06-30 at 13:24 -0700, Beni Cherniavsky wrote:
> On Jun 24, 11:40 pm, "J. Cliff Dyer" <j...@sdf.lonestar.org> wrote:
> > Also note that you can iterate over a file several times:
> >
> > f = open('foo.txt')
> > for line in f:
> > print line[0] # prints the first character of every line
> > for line in f:
> > print line[1] #prints the second character of every line
> >
> No, you can't. The second loop prints nothing!
> A file by default advances forward. Once you reach the end, you stay
> there.

You are, of course, absolutely right. Sorry for the misinformation.

:(
