using awk to remove header and footer lines

Kevin Outman

Jun 5, 1998, 3:00:00 AM

Looking for help:

I have a file with a known number of lines on top (41) and bottom (16),
but an unknown number of lines in the middle. I want to remove these
lines on the top and bottom and keep the lines in the middle. The lines
to remove have no particular consistent strings to match.
I know I can easily remove the lines on top using the "tail +42
filename" command, but as far as I know there is no equivalent way to
remove the lines on the bottom. Is there an easy way to remove these
lines??
I tried counting the lines in the file (after removing the lines on
top), subtracting 16 then using this number in awk to control the
printing of lines (something like this):

ALLLINES=`cat filename | wc -l`
LINENUM=`expr $ALLLINES - 16`
awk '{
if ( NR <= LINE ) {
print $0
}
}' LINE=$LINENUM filename

This worked at home on my Linux system, but not at work on an SGI
running IRIX 5.3 where it would output the desired lines, skip the first
3 lines of the bottom section and then continue outputting the remaining
13 lines of the bottom section. I don't get it! What am I missing?? Is
there some difference in the versions of awk which would explain this?
More importantly, how can I get rid of these bottom lines?

Thanks ,

Kevin Outman

Mark Katz

Jun 5, 1998, 3:00:00 AM

In article <3577F59E...@iinet.net.au>
koes...@opera.iinet.net.au "Kevin Outman" writes:

>I have a file with a known number of lines on top (41) and bottom (16),
>but an unknown number of lines in the middle. I want to remove these
>lines on the top and bottom and keep the lines in the middle. The lines
>to remove have no particular consistent strings to match.
>I know I can easily remove the lines on top using the "tail +42
>filename" command, but as far as I know there is no equivalent way to
>remove the lines on the bottom. Is there an easy way to remove these
>lines??

You may be interested in a mail I sent today to the sed'ers (informal)
newsgroup run so ably by Al Aab

Hope it helps

Mark
--------------------------------------------------------
>
> Seems a pity one can't use '$-10q' in sed
>
> Incidentally my awk solution was
> ------
> BEGIN {ig=10}
> {
> if (NR > ig) print x[NR%ig]
> x[NR%ig]=$0 }
> -------
> Rgds
> Mark
> --------
> ::From: k...@halcyon.com (Ken Pizzini)
> ::Newsgroups: alt.comp.editors.batch
> ::Subject: Re: sed question
> ::Date: 4 Jun 1998 18:48:21 -0700
> ::
> ::In article <slrn6ndn81....@niesel.dkrz.de> you write:
> ::>How can I get all the lines of a file except the 10 last lines?
> ::>Many thanks in advance.
> ::
> ::The simplest way I can think of is:
> :: echo '1,$-10p' | ed - file
> ::
> ::A more cumbersome way, that works better on large files is:
> :: sed -n -e ': loop;1,10!{P;N;D;};N;b loop' file
> ::(The ;'s may need to be replaced with newlines with some
> ::implementations of sed.)
> :: --Ken Pizzini
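
Adapted to the original question (41 header lines, 16 footer lines), the modulo ring buffer above might look like the sketch below. The function name strip_head_foot and the variable names head, foot, n and buf are illustrative and do not come from the thread; the sketch assumes foot is at least 1, since it is used as the modulus.

```shell
# Sketch only: a modulo ring buffer that drops a fixed header and footer.
# Usage: strip_head_foot HEADER_LINES FOOTER_LINES FILE
strip_head_foot() {
    awk -v head="$1" -v foot="$2" '
        NR <= head { next }          # drop the header outright
        {
            n++                      # lines seen after the header
            # The slot about to be overwritten holds the line read
            # "foot" lines ago, which can no longer be in the footer.
            if (n > foot) print buf[n % foot]
            buf[n % foot] = $0       # store the current line in that slot
        }
    ' "$3"
}
```

Called as "strip_head_foot 41 16 filename", it holds at most 16 lines in memory and never needs to know the file length in advance.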
--
Mark Katz
ISPC, London - Innovation in data-delivery tools
Tel: (44) 181-455 4665, Fax (44) 181-458 9554
** Visit our website on http://www.efiche.com/efiche **

Bernard Murray

Jun 5, 1998, 3:00:00 AM

In article <3577F59E...@iinet.net.au>, koes...@opera.iinet.net.au wrote:

> Looking for help:

>
> I have a file with a known number of lines on top (41) and bottom (16),
> but an unknown number of lines in the middle. I want to remove these
> lines on the top and bottom and keep the lines in the middle. The lines
> to remove have no particular consistent strings to match.
> I know I can easily remove the lines on top using the "tail +42
> filename" command, but as far as I know there is no equivalent way to
> remove the lines on the bottom. Is there an easy way to remove these
> lines??

> I tried counting the lines in the file

[snip]

Since there were a couple of vi posts earlier dare I suggest
using ex and a here-file script? You can cut off the top and
bottom in one go. I am an amateur at this but...

If your unadulterated file is N lines long then substitute
(N - 15) for {lastline} below, so that exactly the last 16
lines (N-15 through N) are deleted

ex - {your filename} << END_OF_FILE
{lastline},\$d
1,41d
wq
END_OF_FILE

Or instead of having to calculate {lastline} you could hop
to the bottom of the file, then back up 15 lines and then
delete to the end (that is, write \$-15 where the script
above has {lastline}).
Just thought I'd make life a little more interesting
(and somewhat arcane)...
Bernard
--
Bernard Murray, PhD
Dept. Cell. Mol. Pharmacol., UCSF, San Francisco, USA

d...@cts.com

Jun 6, 1998, 3:00:00 AM

Kevin Outman <koes...@iinet.net.au> wrote:

>I have a file with a known number of lines on top (41) and bottom (16),
>but an unknown number of lines in the middle. I want to remove these
>lines on the top and bottom and keep the lines in the middle. The lines
>to remove have no particular consistent strings to match.

>Thanks ,
>
>Kevin Outman
>
>

In AWK, deleting the first 41 lines is easy (as it would be in SED).
However, you can't print a line until you know for sure that it isn't
one of the last 16.

Therefore I save 16 lines, and after that I can begin printing the
earliest ones that I've saved because I know that they aren't amongst
the last 16. After I've printed a line, I delete it from memory.

# WARNING: Untested code (beware of off-by-one errors)

# Delete the first 41 lines
NR <= 41 {next}

{
    # Save each line temporarily
    line[++i] = $0
    # At this point, 'i' is the number of lines that have been saved
    # (though some may have been deleted subsequently)

    # Have more than 16 lines been saved so far?
    if (i > 16) {
        # Yes -- print the line that appeared 16 lines ago
        print line[i-16]

        # Delete to save memory
        delete line[i-16]
    }
}
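
Since the code is flagged as untested, one way to check it for off-by-one errors is to feed it a file in which every line contains its own line number. The following is a minimal sketch of such a check; the 100-line sample file and the shell variable names (sample, result, first, last) are assumptions made for the test, not part of the original post.

```shell
# Build a 100-line sample in which line N contains the number N.
sample=$(mktemp)
awk 'BEGIN { for (n = 1; n <= 100; n++) print n }' > "$sample"

# Run the delayed-print script from the post on the sample.
result=$(awk '
    NR <= 41 { next }
    {
        line[++i] = $0
        if (i > 16) {
            print line[i-16]
            delete line[i-16]
        }
    }
' "$sample")
rm -f "$sample"

# With 41 header and 16 footer lines removed from 100 lines,
# lines 42 through 84 should remain.
first=$(printf '%s\n' "$result" | sed -n '1p')
last=$(printf '%s\n' "$result" | sed -n '$p')
echo "first=$first last=$last"    # prints: first=42 last=84
```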

david pointon

Jun 8, 1998, 3:00:00 AM

Kevin Outman wrote:
>
> Looking for help:

>
> I have a file with a known number of lines on top (41) and bottom (16),
> but an unknown number of lines in the middle. I want to remove these
> lines on the top and bottom and keep the lines in the middle. The lines
> to remove have no particular consistent strings to match.

> I know I can easily remove the lines on top using the "tail +42
> filename" command, but as far as I know there is no equivalent way to
> remove the lines on the bottom. Is there an easy way to remove these
> lines??

> I tried counting the lines in the file (after removing the lines on
> top), subtracting 16 then using this number in awk to control the
> printing of lines (something like this):
>
> ALLLINES=`cat filename | wc -l`
> LINENUM=`expr $ALLLINES - 16`
> awk '{
> if ( NR <= LINE ) {
> print $0
> }
> }' LINE=$LINENUM filename
>
> This worked at home on my Linux system, but not at work on an SGI
> running IRIX 5.3 where it would output the desired lines, skip the first
> 3 lines of the bottom section and then continue outputting the remaining
> 13 lines of the bottom section. I don't get it! What am I missing?? Is
> there some difference in the versions of awk which would explain this?
> More importantly, how can I get rid of these bottom lines?
>
> Thanks ,
>
> Kevin Outman

It's not an awk(1) solution I know but here goes anyway. How about ...
sed "1,41/d; `expr \`wc -l < file\``,$d" file > newfile
mv newfile file

Note that this is untested but I see no reason why it shouldn't work.

HTH,
Dave P

--
Dave Pointon | 'Now I saw, though too late, the folly of beginning
Sun Microsystems (UK) | a work without first counting the cost and, without
david....@uk.sun.com | judging rightly of our strength to carry it through.'
(01753) 566837 | Robinson Crusoe
================================================================================
require disclaimer.pl

david pointon

Jun 8, 1998, 3:00:00 AM

david pointon wrote:
>
> Kevin Outman wrote:
<snip>

>
> It's not an awk(1) solution I know but here goes anyway. How about ...
> sed "1,41/d; `expr \`wc -l < file\``,$d" file > newfile
> mv newfile file
>
> Note that this is untested but I see no reason why it shouldn't work.
>
> HTH,
> Dave P
>

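Maybe I should've proofread my solution ;-|

It should read :
sed "1,41d; `expr \`wc -l < file\` - 15`,\$d" file > newfile

Note the missing slash. The second address also has to be the line
count minus 15 (the first of the last 16 lines), and the $ needs a
backslash so the shell doesn't expand $d inside the double quotes.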

HTH even more :-)
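
For shells that support $(...) and $((...)), the same idea might be sketched as below. The function name trim_with_sed is illustrative. The sketch subtracts 15 from the line count so that the second range starts at the first of the 16 footer lines, escapes the $ address so the shell does not expand it, and assumes the file has at least 57 lines.

```shell
# Sketch only: delete lines 1-41 and the last 16 lines with one sed call.
# Usage: trim_with_sed FILE
trim_with_sed() {
    # First line of the 16-line footer: total line count minus 15.
    first_footer=$(( $(wc -l < "$1") - 15 ))
    # \$ keeps the shell from treating $d as a variable inside the quotes.
    sed "1,41d; ${first_footer},\$d" "$1"
}
```

Usage would be "trim_with_sed file > newfile" followed by "mv newfile file", as in the post above.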

Peter Swedock

Jun 11, 1998, 3:00:00 AM

Kevin Outman (koes...@iinet.net.au) wrote:
: Looking for help:

: I have a file with a known number of lines on top (41) and bottom (16),
: but an unknown number of lines in the middle. I want to remove these
: lines on the top and bottom and keep the lines in the middle. The lines
: to remove have no particular consistent strings to match.

What about the lines to stay: do the lines that you want to keep have
anything about them that is consistent??

: I know I can easily remove the lines on top using the "tail +42

: filename" command, but as far as I know there is no equivalent way to
: remove the lines on the bottom. Is there an easy way to remove these
: lines??
: I tried counting the lines in the file (after removing the lines on
: top), subtracting 16 then using this number in awk to control the
: printing of lines (something like this):

What about taking the total line count minus 57, giving you the number of
lines (call it TN) you want to keep... then using sed to print lines 42
through 41+TN.

Petr
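
This suggestion could be sketched in the shell roughly as follows, reading "NR" as the total line count of the file. The function name keep_middle and the variable names total and tn are illustrative. Because line 42 is the first line kept, the last line kept is 41+TN; the guard for short files is an addition that the post does not mention.

```shell
# Sketch only: print the TN lines that follow the 41-line header.
# Usage: keep_middle FILE
keep_middle() {
    total=$(( $(wc -l < "$1") ))   # total number of lines in the file
    tn=$(( total - 57 ))           # TN: 41 header + 16 footer lines removed
    # Nothing to keep if the file is no longer than header plus footer.
    [ "$tn" -gt 0 ] || return 0
    sed -n "42,$(( 41 + tn ))p" "$1"
}
```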

--
GTEI, Powered By BBN.

No, no, no... it should read:

"A well regulated militia, being neccessary to the welfare of the state,
the right-wing people who keep and bear arms, shall not be unhinged."

Edwin Luebben

Jun 17, 1998, 3:00:00 AM

david pointon wrote:
>
> Looking for help:
> I have a file with a known number of lines on top (41) and bottom (16),
> but an unknown number of lines in the middle. I want to remove these
> lines on the top and bottom and keep the lines in the middle. The lines
> to remove have no particular consistent strings to match.

> I know I can easily remove the lines on top using the "tail +42
> filename" command, but as far as I know there is no equivalent way to
> remove the lines on the bottom. Is there an easy way to remove these
> lines??
> I tried counting the lines in the file (after removing the lines on
> top), subtracting 16 then using this number in awk to control the
> printing of lines (something like this):

> ALLLINES=`cat filename | wc -l`
> LINENUM=`expr $ALLLINES - 16`
> awk '{
> if ( NR <= LINE ) {
> print $0
> }
> }' LINE=$LINENUM filename
> This worked at home on my Linux system, but not at work on an SGI
> running IRIX 5.3 where it would output the desired lines, skip the first
> 3 lines of the bottom section and then continue outputting the remaining
> 13 lines of the bottom section. I don't get it! What am I missing?? Is
> there some difference in the versions of awk which would explain this?
> More importantly, how can I get rid of these bottom lines?
> Thanks ,
> Kevin Outman

Howd'ya like this pure awk way:

NR > 41 {                       # skip header lines
    ringbuffer[NR] = $0         # buffer line
}
NR > (41+16) {                  # wait another 16 lines
    print ringbuffer[NR-16]     # print line that's neither header nor footer
    delete ringbuffer[NR-16]    # not strictly needed, but better for larger files
}

Be careful! Not tested at all.

Edwin Luebben

Groupe Cegedim

The A I S p l u s Developer Team

Mamoon R. Ansari

Jun 18, 1998, 3:00:00 AM

Peter Swedock <pswe...@bbnplanet.com> wrote in article
<BIUf1.78$Fr5.8...@cam-news-reader1.bbnplanet.com>...

> Kevin Outman (koes...@iinet.net.au) wrote:
> : Looking for help:
>
> : I have a file with a known number of lines on top (41) and bottom (16),
> : but an unknown number of lines in the middle. I want to remove these
> : lines on the top and bottom and keep the lines in the middle. The lines
> : to remove have no particular consistent strings to match.
>

> What about the lines to stay: do the lines that you want to keep have
> anything about them that is consistent??
>

> : I know I can easily remove the lines on top using the "tail +42

> : filename" command, but as far as I know there is no equivalent way to
> : remove the lines on the bottom. Is there an easy way to remove these
> : lines??
> : I tried counting the lines in the file (after removing the lines on
> : top), subtracting 16 then using this number in awk to control the
> : printing of lines (something like this):
>
>

> Whatabout NR - 57 giving you the total number (call it TN) you want to
> keep... then using sed find lines 42 through 42+TN.
>
> Petr
>
> --
> GTEI, Powered By BBN.
>
> No, no, no... it should read:
>
> "A well regulated militia, being neccessary to the welfare of the state,
> the right-wing people who keep and bear arms, shall not be unhinged."
>

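Use an ex script as follows:

ex - datafile < scriptname

in the script, do

1,41d
$-15,$d
x

That should do it. The first command deletes the 41 header lines, the
second deletes the last 16 lines, and x writes the result back to
datafile, so work on a copy if you need to keep the original.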

Mamoon Ansari
Systems Administrator
IT Data Solutions
Michigan Tech University
mran...@mtu.edu