
getting the line just before or after a pattern searched


s999999...@yahoo.com

Feb 16, 2006, 10:24:28 PM
hi

i have a file something like this

abcdefgh
ijklmnopq
12345678
rstuvwxyz
.....
.....
.....
12345678
.....

Whenever I search the file and reach 12345678, how do I get the line
just above and below (or more than one line above/below) the pattern
12345678 and save them to variables? Thanks.

Alex Martelli

Feb 16, 2006, 11:10:48 PM
<s999999...@yahoo.com> wrote:

If the file's of reasonable size, read it all into memory into a list of
lines with a .readlines method call, then loop with an index over said
list and when you find your pattern at index i use i-1, i+1, and so on
(just check that the resulting i+N or i-N is >=0 and <len(thelist)).
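
A minimal sketch of this in-memory approach, assuming Python 3 and an exact whole-line match. The helper name context_lines, its before/after parameters, and the file name in the comment are hypothetical, chosen only for illustration:

```python
def context_lines(lines, pattern, before=1, after=1):
    """Return (lines_above, lines_below) for every line equal to pattern."""
    results = []
    for i, line in enumerate(lines):
        if line.rstrip("\n") == pattern:
            start = max(i - before, 0)           # clamp so the index stays >= 0
            above = lines[start:i]
            below = lines[i + 1:i + 1 + after]   # slicing past the end is safe
            results.append((above, below))
    return results

# With a real file: lines = open("input.dat").readlines()  (hypothetical name)
sample = ["abcdefgh\n", "ijklmnopq\n", "12345678\n", "rstuvwxyz\n"]
for above, below in context_lines(sample, "12345678"):
    print(above, below)
```

Slices are used instead of single indexes so that a match on the first or last line simply yields a shorter (possibly empty) list rather than an IndexError.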

If the file's too big to keep in memory at once, you need more clever
approaches -- keep a collections.deque of the last M lines read to be
able to get "N lines ago" for N<M, and for getting lines "after" the one
of interest (which you will read only in "future" iterations) keep track
of relative linenumbers that "will" interest you, decrement them with
each line read, trigger when they reach 0. But unless you're dealing
with files of many, MANY hundreds of megabytes, on any typical modern
machine you should be OK with the first, WAY simpler approach.
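
For the large-file case, the deque-plus-countdown idea could be sketched roughly as follows. This is one possible reading of the description above, assuming Python 3; the generator name stream_context and its parameters are made up for the example:

```python
from collections import deque

def stream_context(lines, pattern, before=1, after=1):
    """Yield (match, lines_above, lines_below) while reading lines only once."""
    recent = deque(maxlen=before)   # the last `before` lines seen so far
    pending = []                    # entries: [countdown, match, above, below]
    for line in lines:
        text = line.rstrip("\n")
        # Hand this line to every earlier match still waiting for "after" lines.
        for entry in pending:
            entry[3].append(text)
            entry[0] -= 1
        # Emit the matches whose countdown has reached zero (oldest first).
        while pending and pending[0][0] == 0:
            _, match, above, below = pending.pop(0)
            yield match, above, below
        if text == pattern:
            if after == 0:
                yield text, list(recent), []
            else:
                pending.append([after, text, list(recent), []])
        recent.append(text)
    # End of input: flush matches that never received all their "after" lines.
    for _, match, above, below in pending:
        yield match, above, below

data = ["a", "b", "12345678", "c", "d", "12345678"]
for match, above, below in stream_context(data, "12345678"):
    print(above, match, below)
```

Passing an open file object as lines keeps memory use bounded by before + after lines per outstanding match, since nothing else is retained.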


Alex

bon...@gmail.com

Feb 16, 2006, 11:52:40 PM

I would try something similar to the pairwise recipe here:

http://www.python.org/doc/2.4.2/lib/itertools-recipes.html
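
One way to adapt that recipe, as a rough Python 3 sketch: widen the window from pairs to triples so each line is seen together with its neighbours. The name triplewise is an assumption for this example and is not part of the linked recipes:

```python
from itertools import tee

def triplewise(iterable):
    """Yield overlapping triples: (s0, s1, s2), (s1, s2, s3), ..."""
    a, b, c = tee(iterable, 3)
    next(b, None)   # advance the second iterator by one item
    next(c, None)   # advance the third iterator by two items
    next(c, None)
    return zip(a, b, c)

lines = ["abcdefgh", "ijklmnopq", "12345678", "rstuvwxyz"]
hits = [(above, below)
        for above, current, below in triplewise(lines)
        if current == "12345678"]
print(hits)
```

Note that a match on the very first or very last line never sits in the middle of a triple, so this sketch silently skips it.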

Raymond Hettinger

Feb 17, 2006, 6:25:27 AM

You could use the re module to search everything at once:

>>> import re
>>> print(re.findall('\n([^\n]*)\n12345678\n([^\n]*)', open('input.dat').read()))
[('ijklmnopq ', 'rstuvwxyz '), ('..... ', '.....')]


Raymond

Daniel Marcel Eichler

Feb 17, 2006, 7:48:14 AM
to pytho...@python.org
s999999...@yahoo.com wrote:

source = open(bla).read().split('\n')   # bla: path of the file to search
for i, line in enumerate(source):
    if line == '12345678':
        # i+2 because the slice end is exclusive: lines i-1, i and i+1
        print('\n'.join(source[i-1:i+2]))

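Something like this, for example. Of course, you must also make sure that
i-1 isn't smaller than zero; a negative start index counts from the end of
the list, so the slice would not contain the lines you expect.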


Kind regards

Daniel

Magnus Lycka

Feb 22, 2006, 11:18:56 AM

For gigantic files (i.e. can't put all in RAM at once)
just make sure you always remember the previously read line.
Something like this:

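old = None
f = open('mybigfile.txt')
for line in f:
    if line == '12345678\n':
        following = next(f, '')    # '' if the match is the last line
        print(old or '', end='')   # nothing to show if the match is line 1
        print(line, end='')
        print(following, end='')
        line = following           # the consumed line is now the previous one
    old = line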
