'While' question

Ben Keshet

Aug 21, 2008, 6:01:25 PM

to pytho...@python.org

Hi -

I am now writing my second Python script ever and need some help with
'while'. I am reading text from a set of files and manipulating the
data. I use "while 'word' not in line" to find words in the text.
Sometimes the files are empty, so the while loop never finds 'word' and
runs forever. I have two questions:
1) How do I overcome this and make the script skip the empty files?
(Should I use another command?)
2) How do I interrupt the code without closing Python? (I have ActivePython.)

I do know that the strings I am searching for are within the first, say,
50 lines.

Thanks!

Code:

while 'PRIMARY' not in line:
    line = f.readline()[:-1]
# copy scores
while 'es' not in line:
    line = f.readline()[:-1]
    out_file.write(line)
    out_file.write(' ')
print
out_file.write('\n')
f.close()
out_file.close()

For example, 'PRIMARY' and 'es' do not exist when the file I am reading
(f) is empty.

Wojtek Walczak

Aug 21, 2008, 6:11:41 PM

On Thu, 21 Aug 2008 18:01:25 -0400, Ben Keshet wrote:
> somehow. I use 'while 'word' not in line' to recognize words in the
> texts. Sometimes, the files are empty, so while doesn't find 'word' and
> runs forever. I have two questions:
> 1) how do I overcome this, and make the script skip the empty files?
> (should I use another command?)
> 2) how do I interrupt the code without closing Python? (I have ActivePython)

Try the docs first. You need to read about 'continue' and
'break' statements: http://docs.python.org/tut/node6.html

HTH.

--
Regards,
Wojtek Walczak,
http://tosh.pl/gminick/

Ben Keshet

Aug 21, 2008, 7:01:21 PM

to Wojtek Walczak, Python list

Thanks for the reference. I tried it with a general example and got it
to work - I used an index that counts up to a threshold that is set to
break. It does not work though with my real code. I suspect this is
because I cannot really read any lines from an empty file, so the code
gets stuck even before I get to j=j+1:

line = f.readline()[:-1]
j=0

while 'PRIMARY' not in line:
    line = f.readline()[:-1]

    j=j+1
    if j==30:
        break

Any suggestions?

BK

Larry Bates

Aug 21, 2008, 7:29:34 PM

You might consider turning this around into something like:

for j, line in enumerate(f):
    if 'PRIMARY' in line:
        continue

    if j == 30:
        break

IMHO this is MUCH easier to understand.

-Larry

John Machin

Aug 21, 2008, 7:41:09 PM

On Aug 22, 9:01 am, Ben Keshet <kesh...@umbc.edu> wrote:
> Thanks for the reference. I tried it with a general example and got it
> to work - I used an index that counts up to a threshold that is set to
> break. It does not work though with my real code. I suspect this is
> because I cannot really read any lines from an empty file, so the code
> gets stuck even before I get to j=j+1:
>
> line = f.readline()[:-1]
> j=0
> while 'PRIMARY' not in line:
>     line = f.readline()[:-1]
>     j=j+1
>     if j==30:
>         break
>
> Any suggestions?
>

(1) don't top-post
(2) use a 'for' statement
(3) readline is antique
(4) don't throw away the last character in the line without knowing
what it is

for line in f:
    line = line.rstrip('\n')
    # do something useful here
    if 'PRIMARY' in line:
        break
    # do more useful stuff here

A quick rule of thumb for Python: if your code looks ugly or strained
or awkward, it's probably also wrong.

HTH,
John
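
As a small illustration of points (3) and (4), and of why the original while loop never ends on an empty file: at end of file, readline() returns an empty string, so the test 'PRIMARY' not in line stays true forever. This is only a sketch, written for Python 3 (the snippets in the thread use Python 2 syntax). io.StringIO stands in for the real files, and the sample contents are made up.

```python
import io

# An empty "file": readline() returns '' at end of file, on every call.
empty = io.StringIO("")
at_eof = empty.readline()
still_searching = 'PRIMARY' not in at_eof[:-1]   # True on every pass of a while loop

# A final line without a trailing newline: [:-1] eats a real character.
last_line = io.StringIO("PRIMARY")
sliced = last_line.readline()[:-1]

# rstrip('\n') removes a newline only if one is actually there.
last_line.seek(0)
stripped = last_line.readline().rstrip('\n')

# A for loop over an empty file simply runs zero times.
iterations = sum(1 for _ in io.StringIO(""))
```

Here sliced loses its last letter while stripped keeps the whole word, and iterations is 0, which is why the 'for' version skips empty files without any extra check.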

Bruno Desthuilliers

Aug 22, 2008, 4:28:09 AM

John Machin wrote:
(snip)

> A quick rule of thumb for Python: if your code looks ugly or strained
> or awkward, it's probably also wrong.

+1 QOTW

Ben Keshet

Aug 22, 2008, 10:42:13 AM

to Python list, wlf...@ix.netcom.com, John Machin

Thanks. I tried to use 'for' instead of 'while', as both of you
suggested. It runs as well as my previous version, but it breaks out
completely instead of just skipping the empty file. I suspect the
reason is that this part is inside another 'for', so it stops
everything. I just want it to break out of only one 'for', that is, to
go back to the 5th line in the example code (directory C has one empty
file):

receptors = ['A', 'B', 'C']
for x in receptors:
    print x
    for i in range(3):
        for r in (7, 9, 11, 13, 15, 17):
            f = open('c:/Linux/Dock_method_validation/%s/validation/'
                     'ligand_ran_line_%s_%sA_secondary_scored.mol2'
                     % (x, i, r), 'r')
            line = f.readline()[:-1]
            out_file = open('c:/Linux/Dock_method_validation/%s/validation/'
                            'pockets.out' % (x), 'a')
            out_file.write('%s ' % i)
            out_file.write('%s ' % r)
            # skip to scores
            j = 0
            for line in f:
                line = line.rstrip()
                if "PRIMARY" not in line:
                    j += 1
                    if j == 20:
                        break
                else:
                    for line in f:
                        if "TRIPOS" not in line:
                            line = line.rstrip()
                            out_file.write(line)
                        else:
                            break
            f.close()
            out_file.close()

Any suggestions as to how to control the "extent of break"? Should I do
something else instead? Thank you!

Wojtek Walczak

Aug 22, 2008, 4:10:43 PM

On Fri, 22 Aug 2008 10:42:13 -0400, Ben Keshet wrote:
> Thanks. I tried to use 'for' instead of 'while', as both of you
> suggested. It runs as well as my previous version, but it breaks out
> completely instead of just skipping the empty file. I suspect the
> reason is that this part is inside another 'for', so it stops
> everything. I just want it to break out of only one 'for', that is, to
> go back to the 5th line in the example code (directory C has one empty
> file):

> for line in f:
  ^^^
>     line = line.rstrip()
>     if "PRIMARY" not in line:
>         j += 1
>         if j == 20:
>             break
>     else:
>         for line in f:
          ^^^
You're iterating over the same object (f) in both the inner and the
outer loop. Don't do that. It's hard to predict the behavior of such
code.

Regarding the break statement: it exits only the innermost enclosing
loop, and execution resumes in the outer loop/block.

It would be great if you could reduce your code to a short piece
that illustrates your problem and that we could all run.
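
A reduced, runnable sketch of both remarks, assuming Python 3 and using io.StringIO objects in place of the real files (the function name scan, the file names, and the contents are invented for illustration). The first part shows that break leaves only the loop it sits in; the second shows what happens when an inner loop iterates over the same file object as the outer one.

```python
import io

def scan(files):
    """Return (name, line) for each file that has a 'PRIMARY' line early on."""
    found = []
    for name, handle in files:                  # outer loop: one pass per file
        for count, text in enumerate(handle):   # inner loop: lines of one file
            if 'PRIMARY' in text:
                found.append((name, text.rstrip('\n')))
                break                           # leaves the inner loop only
            if count == 19:
                break                           # give up after 20 lines
        # after either break, execution continues here with the next file
    return found

files = [
    ('a', io.StringIO("header\nPRIMARY 1.5\nrest\n")),
    ('empty', io.StringIO("")),
    ('c', io.StringIO("PRIMARY 2.0\n")),
]
result = scan(files)

# Nested loops over the SAME file object share one read position.
shared = io.StringIO("1\n2\n3\n4\n")
outer_seen = []
for text in shared:
    outer_seen.append(text.strip())
    for text in shared:     # consumes everything that is left
        pass
```

In this sketch the empty file is skipped without stopping the outer loop, and outer_seen ends up holding only the first line, because the inner loop has already used up the rest of the file.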

Ben Finney

Aug 22, 2008, 10:17:37 PM

Bruno Desthuilliers <bruno.42.de...@websiteburo.invalid> writes:

Merely a special case of the truism that "Your code is probably wrong
(regardless of any other properties it may show)".

That doesn't make John's quote any less worthy of QOTW, though :-)

--
\ “The problem with television is that the people must sit and |
`\ keep their eyes glued on a screen: the average American family |
_o__) hasn't time for it.” —_The New York Times_, 1939 |
Ben Finney

Ben Keshet

Aug 22, 2008, 6:49:10 PM

to Wojtek Walczak, Python list

I ended up using another method as someone suggested to me. I am still
not sure why the previous version got stuck on empty files, while this
one doesn't:

receptors = ['A' 'B']
for x in receptors:
    # open out_file for appending for each 'x' in receptors, close at
    # same level
    out_file = open('c:/Linux/Dock_method_validation/%s/validation/'
                    'pockets.out' % (x), 'a')

    for i in range(10):

        for r in (7, 9, 11, 13, 15, 17):
            f = open('c:/Linux/Dock_method_validation/%s/validation/'
                     'ligand_ran_line_%s_%sA_secondary_scored.mol2'
                     % (x, i, r), 'r')

            # assume 'PRIMARY' should be found first
            # set flag for string 'PRIMARY'
            primary = False
            # iterate on file object, empty files will be skipped
            for line in f:
                if 'PRIMARY' in line:
                    primary = True
                    out_file.write(line.strip())
                # copy scores
                elif 'TRIPOS' not in line and primary:
                    out_file.write(line.strip())
                    out_file.write(' ')
                elif 'TRIPOS' in line and primary:
                    break
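
The flag logic above can be tried in isolation. The following is a sketch only, assuming Python 3: copy_scores is a hypothetical helper, the sample text is invented, and it joins the pieces with single spaces, which differs slightly from the write calls in the post. It shows why an empty file causes no trouble: the for loop body never runs, so nothing is copied and nothing can hang.

```python
import io

def copy_scores(lines):
    """Collect the 'PRIMARY' line and what follows it, up to a 'TRIPOS' line."""
    pieces = []
    primary = False                 # becomes True once 'PRIMARY' has been seen
    for text in lines:
        if 'PRIMARY' in text:
            primary = True
            pieces.append(text.strip())
        elif primary and 'TRIPOS' in text:
            break                   # end of the score block
        elif primary:
            pieces.append(text.strip())
    return ' '.join(pieces)

sample = io.StringIO("junk\nPRIMARY 1\n0.5\n0.7\nTRIPOS\nignored\n")
from_sample = copy_scores(sample)
from_empty = copy_scores(io.StringIO(""))
```

With the invented sample, from_sample is the text between the markers and from_empty is an empty string.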

Scott David Daniels

Aug 23, 2008, 2:37:09 PM

Ben Keshet wrote:
...

> I ended up using another method as someone suggested to me. I am still
> not sure why the previous version got stuck on empty files, while this
> one doesn't:
>
> receptors = ['A' 'B']

*** Alarm bells *** Do you mean ['AB'], or do you mean ['A', 'B']?
> ...(more code one way) ...
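
The reason for the alarm: Python joins adjacent string literals into one string, so a missing comma silently changes the list. A minimal check (the variable names here are only for illustration):

```python
one_item = ['A' 'B']        # no comma: the two literals are joined
two_items = ['A', 'B']      # comma: two separate receptors

visited = [name for name in one_item]   # what "for x in receptors" would see
```

With the comma missing, the loop runs once with the value 'AB' instead of twice with 'A' and 'B'.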

Don't be afraid of defining functions, you are nested too deeply to
easily understand, and hence likely to make mistakes. For similar
reasons, I don't like names like x and i unless there are no better
names. Also, since you don't seem to "really" need to write, I used
print. The comments would be better if I knew the field a bit (or
your code had had better names).

Try something like (obviously I couldn't test it, so untested code):

OUTPUT = 'c:/Linux/Dock_method_validation/%s/validation/pockets.out'
INPUT = ('c:/Linux/Dock_method_validation/%s/validation/'
         'ligand_ran_line_%s_%sA_secondary_scored.mol2')

def extract_dist(dest, receptor, line, ligand):
    '''Get distances after "PRIMARY" from the appropriate file
    '''
    source = open(INPUT % (receptor, line, ligand), 'r')
    gen = iter(source) # get a name for walking through the file.
    try:
        # Find the start
        for j, text in enumerate(gen):
            if 'PRIMARY' in text:
                print >>dest, text.strip(),
                break
            if j == 19: # Stop looking after 20 lines.
                return # nothing here, go on to the next one
        # copy scores up to TRIPOS
        for text in gen:
            if 'TRIPOS' in text:
                break
            print >>dest, text.strip(),
        print
        print >>dest
    finally:
        source.close()

for receptor in 'A', 'B':
    # open out_file for appending per receptor, close at same level
    out_file = open(OUTPUT % receptor, 'a')
    for line in range(10):
        for ligand in (7, 9, 11, 13, 15, 17):
            extract_dist(out_file, receptor, line, ligand)
    out_file.close()

--Scott David Daniels
Scott....@Acm.Org
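
The two-phase idea in the function above (one shared iterator, a first loop that finds the start, a second loop that carries on from there) can be exercised without any files. This is a sketch under stated assumptions, not Scott's code: it targets Python 3, scores_after_primary and its limit parameter are made-up names, it returns a list instead of printing, and it uses itertools.islice plus a for/else clause in place of the explicit counter.

```python
import io
from itertools import islice

def scores_after_primary(lines, limit=20):
    """Return stripped lines from 'PRIMARY' up to (not including) 'TRIPOS'."""
    walker = iter(lines)                  # one shared iterator for both loops
    collected = []
    for text in islice(walker, limit):    # phase 1: first `limit` lines only
        if 'PRIMARY' in text:
            collected.append(text.strip())
            break
    else:
        # Loop ended without break: empty file, or no 'PRIMARY' early enough.
        return collected
    for text in walker:                   # phase 2: resumes after 'PRIMARY'
        if 'TRIPOS' in text:
            break
        collected.append(text.strip())
    return collected

normal = scores_after_primary(io.StringIO("x\nPRIMARY\n1\n2\nTRIPOS\n3\n"))
empty = scores_after_primary(io.StringIO(""))
too_late = scores_after_primary(io.StringIO("junk\n" * 25 + "PRIMARY\n1\n"))
```

The else branch of the first loop also covers files shorter than the limit that never mention 'PRIMARY', so the second loop is reached only when there is something to copy.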