I am new to Python and have one simple question to which I cannot find
a satisfactory solution.
I want to read text line-by-line from a text file, but want to ignore
only the first line. I know how to do it in Java (Java has been my
primary language for the last couple of years) and following is what I
have in Python, but I don't like it and want to learn the better way
of doing it.
file = open(fileName, 'r')
lineNumber = 0
for line in file:
    if lineNumber == 0:
        lineNumber = lineNumber + 1
    else:
        lineNumber = lineNumber + 1
        print line
Can anyone show me a better way of doing this kind of task?
Thanks in advance.
> I want to read text line-by-line from a text file, but want to ignore
> only the first line. I know how to do it in Java (Java has been my
> primary language for the last couple of years) and following is what I
> have in Python, but I don't like it and want to learn the better way of
> doing it.
>
> file = open(fileName, 'r')
> lineNumber = 0
> for line in file:
>     if lineNumber == 0:
>         lineNumber = lineNumber + 1
>     else:
>         lineNumber = lineNumber + 1
>         print line
>
> Can anyone show me a better way of doing this kind of task?
input_file = open(filename)
lines = iter(input_file)
lines.next()  # Skip line.
for line in lines:
    print line
input_file.close()
Ciao,
Marc 'BlackJack' Rintsch
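For readers on Python 3, where `lines.next()` is spelled `next(lines)` and `print` is a function, the same idea might look like the sketch below. The helper name `lines_after_first` and the `io.StringIO` stand-in for an opened file are illustrative assumptions, not part of the original post.

```python
import io

def lines_after_first(stream):
    """Yield every line of `stream` except the first one."""
    lines = iter(stream)
    # Skip the first line; the None default avoids StopIteration on empty input.
    next(lines, None)
    for line in lines:
        yield line

# io.StringIO stands in for an opened text file (assumption for the demo).
sample = io.StringIO("header\nfirst\nsecond\n")
result = list(lines_after_first(sample))
print(result)
```

With a real file, wrapping the call in `with open(filename) as input_file:` closes the file even if an exception occurs.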
fileInput = open(filename, 'r')
for lnNum, line in enumerate(fileInput):
    if not lnNum:
        continue
    print line
Why don't you read and discard the first line before processing the
rest of the file?
file = open(filename, 'r')
file.readline()
for line in file: print line,
(It works).
LineList=open(filename,'r').readlines()[1,]
for line in LineList:
    blah blah
You don't want to do that if the file is very large. Also,
you meant [1:] rather than [1,]
That's bad practice, since it loads the entire file into memory first.
It will also raise a TypeError (it should be '.readlines()[1:]').
A file object is its own iterator so you can
do more simply:
input_file = open(filename)
input_file.next()  # Skip line.
for line in input_file:
    print line,
input_file.close()
Since the line read includes the terminating
EOL character(s), print it with a "print ... ,"
to avoid adding an additional EOL.
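The effect of the extra EOL can be seen in a small sketch. It is written in Python 3 syntax, where the trailing-comma form `print line,` roughly corresponds to `print(line, end="")`. The sample lines and the `io.StringIO` buffers used to capture output are assumptions for illustration.

```python
import io

lines = ["alpha\n", "beta\n"]  # lines as read from a file, EOL included

doubled = io.StringIO()
for line in lines:
    # print appends its own newline, so each line is followed by a blank one.
    print(line, file=doubled)

single = io.StringIO()
for line in lines:
    # end="" suppresses print's newline; only the line's own EOL is written.
    print(line, end="", file=single)

print(repr(doubled.getvalue()))
print(repr(single.getvalue()))
```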
If the OP needs line numbers elsewhere in the
code something like the following would work.
infile = open(fileName, 'r')
for lineNumber, line in enumerate(infile):
    # enumerate returns numbers starting with 0.
    if lineNumber == 0: continue
    print line,
This also seems like a good time to mention (untested):
from itertools import islice
for line in islice(infile, 1, None):
    print line,
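A self-contained version of the islice idea, in Python 3 syntax, could be sketched as follows. Here `infile` is an `io.StringIO` standing in for the opened file, which is an assumption made so the snippet runs on its own.

```python
import io
from itertools import islice

# Stand-in for an opened text file (assumption for the demo).
infile = io.StringIO("header\nfirst\nsecond\n")

# islice(iterable, 1, None) skips the first item and yields all the rest
# lazily, so the file is not loaded into memory as a whole.
body = list(islice(infile, 1, None))
print(body)

# On empty input, islice simply yields nothing.
nothing = list(islice(io.StringIO(""), 1, None))
print(nothing)
```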
==================
actually:
import os
file = open(filename, 'r')
for line in file:
    dummy=line
    for line in file:
        print line
is cleaner and faster.
If you need line numbers, pre-parse things, whatever, add where needed.
Steve
nors...@hughes.net
That's not cleaner, that's a 'WTF?'! A ``for`` line over `file` that
does *not* iterate over the file but is just there to skip the first line
and a completely useless `dummy` name. That's seriously ugly and
confusing.
Ciao,
Marc 'BlackJack' Rintsch
I believe that file.readline() will work better than file.next() for
most purposes since the latter will raise StopIteration on an empty file
whereas file.readline() merely returns ''.
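The difference on an empty file can be checked with a short sketch. It uses Python 3 syntax, where `file.next()` is spelled `next(file)`, and an empty `io.StringIO` as a stand-in for an empty file (an assumption for the demo).

```python
import io

empty = io.StringIO("")  # stand-in for an empty file

# readline() signals end-of-file by returning an empty string.
readline_result = empty.readline()

# next() signals end-of-file by raising StopIteration.
try:
    next(empty)
    raised = False
except StopIteration:
    raised = True

# A default argument makes next() as forgiving as readline().
default_result = next(empty, None)

print(repr(readline_result), raised, default_result)
```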
> On 28 Aug 2008 19:32:45 GMT, Marc 'BlackJack' Rintsch <bj_...@gmx.net>
> declaimed the following in comp.lang.python:
> Nice to see someone else was as, uhm, offended by that code sample
> as I was -- I just lacked the vocabulary to put it across cleanly, so
> didn't respond.
>
> Yes, the "dummy" statement could be completely dropped to the same
> effect -- still leaving the useless outer loop...
Nevertheless, I've just done some timeit tests on the two code snippets,
and to my *great* surprise the second ugly snippet is consistently a
smidgen faster even with the pointless import and dummy statement left in.
That is so counter-intuitive that I wonder whether I've done something
wrong, or if it's some sort of freakish side-effect of disk caching or
something. But further investigation will have to wait for later.
If anyone wants to run their own timing tests, don't forget to close the
file explicitly, otherwise timeit() will (I think...) simply iterate over
the EOF for all but the first iteration.
--
Steven
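For anyone repeating the timing test, one possible setup is sketched below in Python 3 syntax. It is not the code that was timed in this thread: the function names, the temporary file, its size, and the repeat count are all assumptions. Each timed call reopens and closes the file, so every run reads the whole file rather than iterating over EOF.

```python
import os
import tempfile
import timeit

def skip_with_next(path):
    """Skip the first line with next(), then count the remaining lines."""
    with open(path) as f:
        next(f, None)
        return sum(1 for _ in f)

def skip_with_nested_loop(path):
    """The nested-loop variant: the outer loop only consumes the first line."""
    count = 0
    with open(path) as f:
        for _ in f:
            for _ in f:
                count += 1
    return count

# Build a throwaway file: one header line plus 1000 data lines.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("header\n" + "data\n" * 1000)
    path = tmp.name

try:
    count_next = skip_with_next(path)
    count_nested = skip_with_nested_loop(path)
    t_next = timeit.timeit(lambda: skip_with_next(path), number=50)
    t_nested = timeit.timeit(lambda: skip_with_nested_loop(path), number=50)
finally:
    os.remove(path)

print(count_next, count_nested)
print("next(): %.4fs  nested loop: %.4fs" % (t_next, t_nested))
```

The absolute numbers depend on the machine and on disk caching, so only the relative difference over repeated runs is worth reading.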