Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

grepping lines above and below a pattern found

2 views
Skip to first unread message

mike

unread,
Apr 21, 2004, 5:48:13 AM4/21/04
to
hi

say i have this in a text file

"Use this form to
post your messages.
Remember that
it can be viewed by
millions of people worldwide.
For questions about posting,
check our FAQ"

what is the most commonly used way to print ,say , 2 lines above and 2
lines below a certain pattern found such that my output is

"post your messages.
Remember that
it can be viewed by
millions of people worldwide.
For questions about posting, "

for the matched pattern "viewed" .

thanks...

Gunnar Hjalmarsson

unread,
Apr 21, 2004, 6:14:08 AM4/21/04
to
mike wrote:
> say i have this in a text file
>
> "Use this form to
> post your messages.
> Remember that
> it can be viewed by
> millions of people worldwide.
> For questions about posting,
> check our FAQ"
>
> what is the most commonly used way to print ,say , 2 lines above
> and 2 lines below a certain pattern found

Don't know which way is most common. If the file isn't too big, an
easy approach is to slurp all the lines into an array. Otherwise you
can read the file line by line in a while loop, and keep the two
latest lines saved temporarily.

Which ways have you considered?

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Paul Lalli

unread,
Apr 21, 2004, 7:35:38 AM4/21/04
to


Assuming the file is not so huge that putting it all in memory at once
will hurt you....


#!/usr/bin/perl
use strict;
use warnings;

{
local $/;
open FILE, 'file.txt' or die "Cannot open file: $!";
my $contents = <FILE>;
my $word = 'viewed';
my @matches = $contents =~ /((?:.*\n){2}.*\b$word\b.*\n(?:.*\n){2})/g;
foreach (@matches){
print "$_\n";
}
}


Paul Lalli

Abhinav

unread,
Apr 21, 2004, 9:01:45 AM4/21/04
to

mike wrote:

Is this allowed ?

system "grep -A2 -B2 $pattern $file";

Thanks
Abhinav

David K. Wall

unread,
Apr 21, 2004, 2:55:30 PM4/21/04
to
Abhinav <matrix_...@yahoo.dot.com> wrote:

Might not work well on an OS without grep. :-)

I'm tired of the question, anyway. How's this for a fairly general
solution? The interface could probably be improved -- and I suppose
it would be easier to just use Tie::File and loop through the lines.

use strict;
use warnings;

print join "...\n",
lines_around(
qr/\bviewed\b/i,
*DATA,
{before => 2, after => 2}
);


sub lines_around {
# pattern in filehandle with options...
my ($pattern, $filehandle, $options) = @_;

$options->{before} = defined $options->{before}
? $options->{before}
: 1;
$options->{after} = defined $options->{after}
? $options->{after}
: 1;

# return matching line, default 1
$options->{match} = defined $options->{match}
? $options->{match}
: 1;

# return first match only, default 0
$options->{first_only} = defined $options->{first_only}
? $options->{first_only}
: 0;

my (@buffer, @matches);
while (my $line = <$filehandle>) {
shift @buffer if @buffer > $options->{before};
push @buffer, $line;
next unless $line =~ /$pattern/;
pop @buffer unless $options->{match};
my @matched_lines;
push @matched_lines, @buffer;
push @matched_lines, scalar <DATA>
for 1 .. $options->{after};
return join('', @matched_lines) if $options->{first_only};
push @matches, join '', @matched_lines;
@buffer = ();
}
return @matches;
}

__DATA__


Use this form to
post your messages.
Remember that

it can be VIEWED by

millions of people worldwide.
For questions about posting,
check our FAQ

Use this form to

post your messages.
Remember that

it can be VIEWED by

John W. Krahn

unread,
Apr 21, 2004, 5:44:36 PM4/21/04
to

my $above = 2;
my $below = 2;

my @buffer;

while ( <> ) {
@buffer = ( @buffer, $_ )[ -( $above + 1 ) .. -1 ];
/viewed/ && print @buffer, map scalar <>, 1 .. $below;
}

John
--
use Perl;
program
fulfillment

Tad McClellan

unread,
Apr 21, 2004, 5:31:34 PM4/21/04
to
mike <s999999...@yahoo.com> wrote:


------------------------------
#!/usr/bin/perl
use warnings;
use strict;
use Tie::File;

my $filename = 'text.file';

tie my @array, 'Tie::File', $filename, autochomp => 0 or
die "could not tie file $!";

foreach my $i ( 0 .. $#array ) {
print @array[ $i-2 .. $i+2 ] if $array[$i] =~ /viewed/;
}

untie @array;
------------------------------


--
Tad McClellan SGML consulting
ta...@augustmail.com Perl programming
Fort Worth, Texas

Brad Baxter

unread,
Apr 21, 2004, 10:21:23 PM4/21/04
to
On Wed, 21 Apr 2004, mike wrote:

perl
-ne'/viewed/&&($x=$.);push@a,$_;$.>5&&shift@a;$x&&$.-$x>1&&last}{$x&&print@a'
file

Yes, it diverges from spec 2 lines from the end, but I didn't think
anybody would care. :-)

Brad

Clyde Ingram

unread,
Apr 22, 2004, 6:12:52 AM4/22/04
to

"Tad McClellan" <ta...@augustmail.com> wrote in message
news:slrnc8dq1m...@magna.augustmail.com...

> mike <s999999...@yahoo.com> wrote:
>
> > what is the most commonly used way to print ,say , 2 lines above and 2
> > lines below a certain pattern found such that my output is

> tie my @array, 'Tie::File', $filename, autochomp => 0 or

Nice .. didn't know about this.
The OP might have some use for my inelegant meanderings below, on this
theme.
(Kindly forgive the overlong posting. I expect others have far more elegant
solutions too.)

Trying a "grep" on own source for pattern "[Tt]i." (running on Windoze XPee)
with:

trial_cgrep.pl [Tt]i. 2 trial_cgrep.pl

shows (with "+" marking matched records):

(2) :use warnings;
(3) :use strict;
(4)+:use Tie::File;
(5) :use English;
(6) :
(24) :chomp @file_list;
(25) :
(26)+:die "$nr_lines: nr_lines should be zero or positive integer"
(27) : unless ($nr_lines =~ /^\d+$/);
(28) :
(36) :for my $filename (@file_list) {
(37) :
(38)+: tie my @array, 'Tie::File', $filename, autochomp => 0 or
(39)+: die "could not tie file $!";
(40) :
(41) : # Build a look-up table of record numbers already printed, so that
we
(72) : }
(73) :
(74)+: untie @array;
(75) :}
(76) :

FWIW, the source is below.

Regards,
Clyde

#!/usr/bin/perl
use warnings;
use strict;
use Tie::File;

use English;

my $usage = "$PROGRAM_NAME pattern nr_lines file1 file2 file3 ...";
my $help = "
pattern: Search pattern.
nr_lines: Number of records to show, before and after matching
records.
file1 ...: List of files to search.
";

my $MIN_NR_ARGS_EXPECTED = 3;
my $nr_args_supplied = 0 + @ARGV;

die "Expected at least $MIN_NR_ARGS_EXPECTED arguments. " .
"Got only $nr_args_supplied.\n" .
"Usage: $usage\n$help" unless ($nr_args_supplied >=
$MIN_NR_ARGS_EXPECTED);

my $pattern = shift;
my $nr_lines = shift;
my @file_list = @ARGV;
chomp @file_list;

die "$nr_lines: nr_lines should be zero or positive integer"
unless ($nr_lines =~ /^\d+$/);

foreach (@file_list) {

die "$_: No such readable file"
unless (-r $_);
}


for my $filename (@file_list) {

tie my @array, 'Tie::File', $filename, autochomp => 0 or
die "could not tie file $!";

# Build a look-up table of record numbers already printed, so that we
# do not print the same record twice
my %printed_nr = ();

foreach my $i ( 0 .. $#array ) {

if ($array[$i] =~ /$pattern/) {

for my $j ($i-$nr_lines .. $i+$nr_lines) {

next if $j < 0; # Avoid falling off start of array
next if $j > $#array; # .. and end of array
next if $printed_nr{$j}; # Skip record if already printed

# Prepare print prefix for record being printed
my @prefix_parts = ();

# Omit the filename if only one supplied.
push (@prefix_parts, "$filename ") if (@file_list > 1);

# Record number. e.g. (42)
push (@prefix_parts, "(" . ($j+1) . ")");

# "+" indicator, if this record matches the pattern, else
SPACE
push (@prefix_parts, (($array[$j] =~ $pattern) ? "+" : "
" ));

my $prefix = join("", @prefix_parts) . ":";

print $prefix . $array[$j];
$printed_nr{$j} = 1;
}
}
}

untie @array;
}


ENDS


Anno Siegel

unread,
Apr 22, 2004, 7:00:51 AM4/22/04
to
Tad McClellan <ta...@augustmail.com> wrote in comp.lang.perl.misc:

That's a good way to put Tie::File to use. It doesn't address two
problems that come into view when the basic functionality is there:

- Dealing with matches that are too close to the beginning or end of
the file to fill a whole window

- What to do with overlapping windows

The OP hasn't addressed these issues, and I don't want to suggest
solutions (some have been suggested in this thread). I just want
to point out that this solution, pretty as it is, isn't the whole
story.

Anno

Tad McClellan

unread,
Apr 22, 2004, 8:56:22 AM4/22/04
to
Clyde Ingram <clydenos...@nospamorhamgetofftheline.freeservenospamorham.co.uk> wrote:
>
> "Tad McClellan" <ta...@augustmail.com> wrote in message
> news:slrnc8dq1m...@magna.augustmail.com...
>> mike <s999999...@yahoo.com> wrote:
>>
>> > what is the most commonly used way to print ,say , 2 lines above and 2
>> > lines below a certain pattern found such that my output is
>
>> tie my @array, 'Tie::File', $filename, autochomp => 0 or
>
> Nice .. didn't know about this.


I knew about it because I re-read the Perl FAQs from time to time. :-)

How do I change one line in a file/
delete a line in a file/
insert a line in the middle of a file/
append to the beginning of a file?


> my $nr_args_supplied = 0 + @ARGV;

^^^
^^^

The LHS already ensures scalar context for @ARGV, so you don't need that.


> my $pattern = shift;
> my $nr_lines = shift;
> my @file_list = @ARGV;

my( $pattern, $nr_lines, @file_list ) = @ARGV;


> chomp @file_list;


You are expecting command line arguments with newlines in them?


> die "$_: No such readable file"
> unless (-r $_);


That should probably be:

die "$_: No such readable file"

unless (-f $_ and -r _);

Brad Baxter

unread,
Apr 22, 2004, 9:47:32 AM4/22/04
to
On Wed, 21 Apr 2004, Brad Baxter wrote:
>
> perl
> -ne'/viewed/&&($x=$.);push@a,$_;$.>5&&shift@a;$x&&$.-$x>1&&last}{$x&&print@a'
> file
>
> Yes, it diverges from spec 2 lines from the end, but I didn't think
> anybody would care. :-)

It also has a problem with overlapping windows.

perl
-ne'/viewed/&&\!$x&&($x=$.);push@a,$_;$.>5&&shift@a;$x&&$.-$x>1&&last}{$x&&print@a'
file

(\! to hide !$ from the shell ... YMMV)

Brad

Clyde Ingram

unread,
Apr 22, 2004, 10:16:14 AM4/22/04
to
Tad,

> I knew about it because I re-read the Perl FAQs from time to time. :-)
>

> > my $nr_args_supplied = 0 + @ARGV;
> ^^^

> The LHS already ensures scalar context for @ARGV, so you don't need that.

Ah, quite right.

> > my $pattern = shift;
> > my $nr_lines = shift;
> > my @file_list = @ARGV;
>
> my( $pattern, $nr_lines, @file_list ) = @ARGV;

Oh yes . . .

> > chomp @file_list;
>
> You are expecting command line arguments with newlines in them?

Er. . ., no

> > die "$_: No such readable file"
> > unless (-r $_);
>
> That should probably be:
>
> die "$_: No such readable file"
> unless (-f $_ and -r _);

Um, certainly. . .

Thanks for the improvements, Tad.
That was just a quick knock up script. (I did say it was inelegant.)
Always nice to be shot down so constructively.

Regards,
Clyde


David Combs

unread,
May 21, 2004, 12:01:03 AM5/21/04
to
In article <Xns94D297D2B5...@216.168.3.30>,

David K. Wall <dw...@fastmail.fm> wrote:
>Abhinav <matrix_...@yahoo.dot.com> wrote:
>
>> mike wrote:
>>
>>> say i have this in a text file
>>>
>>> "Use this form to
>>> post your messages.
>>> Remember that
>>> it can be viewed by
>>> millions of people worldwide.
>>> For questions about posting,
>>> check our FAQ"
>>>
>>> what is the most commonly used way to print ,say , 2 lines above
>>> and 2 lines below a certain pattern found such that my output is
>>>
>>> "post your messages.
>>> Remember that
>>> it can be viewed by
>>> millions of people worldwide.
>>> For questions about posting, "
>>>
>>> for the matched pattern "viewed" .
>>>
>>> thanks...
>> Is this allowed ?
>>
>> system "grep -A2 -B2 $pattern $file";
>
>Might not work well on an OS without grep. :-)

Might not work well on an OS without GNU grep. :-)

David

David Combs

unread,
May 21, 2004, 12:12:34 AM5/21/04
to

John: Perhaps you could add a few comments explaining
your (as usual) nifty code?

Once you've done that, maybe it can be inserted
in to some tutorial part of the pod-files --
looks like it'd be useful, and educational, for
lots of people.

Thanks,

David


Anno Siegel

unread,
May 21, 2004, 4:46:47 AM5/21/04
to
John W. Krahn <kra...@acm.org> wrote in comp.lang.perl.misc:

[...]

> my $above = 2;
> my $below = 2;
>
> my @buffer;
>
> while ( <> ) {
> @buffer = ( @buffer, $_ )[ -( $above + 1 ) .. -1 ];
> /viewed/ && print @buffer, map scalar <>, 1 .. $below;

/viewed/ && print @buffer,
grep defined, map scalar <DATA>, 1 .. $below;

...in case of matches near the end of the file.

> }

Anno

0 new messages