how to read the last field in a line?

Sharkie

Nov 6, 2003, 2:26:13 AM

This is probably a basic question. I use awk occasionally in various
shell scripts, and it has proven very handy.

One problem I have run into on several occasions is having to print
the last field, or a certain field counted from the right rather than
from the left. For example, if I have this file:

this_line_consist_of_many_words
short_line2
this_line_has_even_more_words_separated_by_underscore
...

Using awk with underscore as the delimiter, I can easily print any word
using $1, $2, $3, etc.

$ awk 'BEGIN{OFS=FS="_"}{print $1" "$2" "$3}' temp.txt

But how can I print the last word (field), or the 2nd from the right,
if the number of fields varies from line to line?

I'm on Sun Solaris 7. I heard about ARGV and ARGC, but they don't return
anything (unless I'm not using them right).

William Park

Nov 6, 2003, 2:47:41 AM

Sharkie <shar...@yahoo.com> wrote:
> This is probably a basic question. I use awk occasionally in various
> shell scripts, and it has shown to be very handy.
>
> One problem I had encountered on several occasions, is where I have to
> print the last field, or certain field but counted from the right,
> as opposed to from the left. So f.ex. if I have this file:
>
> this_line_consist_of_many_words
> short_line2
> this_line_has_even_more_words_separated_by_underscore
> ...
>
>
> using awk and underscore as delimiter I can easily print out any words
> using $1, $2, $3, etc.
>
> $ awk 'BEGIN{OFS=FS="_"}{print $1" "$2" "$3}' temp.txt
>
> But how can i print the last word (field), or 2nd from the right,
> if number of fields varies for each line?

Hint:
awk '{print $NF, $NF-2}'

--
William Park, Open Geometry Consulting, <openge...@yahoo.ca>
Linux solution for data management and processing.

Aharon Robbins

Nov 6, 2003, 2:20:48 AM

In article <bocuas$1cii89$1...@ID-99293.news.uni-berlin.de>,

Actually:

awk '{ print $NF, $(NF-2) }'
--
Aharon (Arnold) Robbins --- Pioneer Consulting Ltd. arnold AT skeeve DOT com
P.O. Box 354 Home Phone: +972 8 979-0381 Fax: +1 530 688 5518
Nof Ayalon Cell Phone: +972 51 297-545
D.N. Shimshon 99785 ISRAEL
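
A note on the two answers above: $NF-2 parses as ($NF) - 2, so it
subtracts 2 from the value of the last field, while $(NF-2) selects the
field two positions to the left of the last one. The sketch below wraps
the idea in a shell function. The name from_right and its argument n are
made up for illustration, and the sample lines come from the original
question.

```shell
# from_right N: hypothetical helper; prints the Nth field counted from the
# right (1 = last field) of underscore-separated input. Lines with fewer
# than N fields are skipped.
from_right() {
    awk -F_ -v n="$1" 'NF >= n { print $(NF - n + 1) }'
}

# Last field of each line, then the third field from the right.
printf '%s\n' this_line_consist_of_many_words short_line2 | from_right 1
printf '%s\n' this_line_consist_of_many_words short_line2 | from_right 3

# Without parentheses, $NF-2 means ($NF) - 2: the last field is converted
# to a number (0 for a non-numeric string) and 2 is subtracted from it.
printf '%s\n' this_line_consist_of_many_words | awk -F_ '{ print $NF-2 }'
```

The NF >= n guard matters: on a short line, NF - n + 1 can reach 0 (which
selects the whole record) or go negative (which is an error).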

Philipp Buehler

Nov 6, 2003, 4:41:50 AM

* Aharon (arn...@skeeve.com) wrote:
> awk '{ print $NF, $(NF-2) }'

Btw.. any "elegant" (lean) solution to print all fields
except first 2 or such?
Besides taking $NF and for-looping from '2' to '$NF'?

Just curious ;)

ciao
--
"Unix was the first OS where you could carry the media and system
documentation around in a briefcase. This was fixed in BSD4.2."

William Park

Nov 6, 2003, 5:23:13 AM

Philipp Buehler <pb+usen...@usenet-nov-2003.fips.de> wrote:
> * Aharon (arn...@skeeve.com) wrote:
>> awk '{ print $NF, $(NF-2) }'
>
> Btw.. any "elegant" (lean) solution to print all fields
> except first 2 or such?
> Besides taking $NF and for-looping from '2' to '$NF'?
>
> Just curious ;)

Not without patching the source... :-)

Walter Peters

Nov 6, 2003, 6:22:47 AM

Philipp Buehler wrote on 06.11.2003 10:41:

> * Aharon (arn...@skeeve.com) wrote:
>
>> awk '{ print $NF, $(NF-2) }'
>
>
> Btw.. any "elegant" (lean) solution to print all fields
> except first 2 or such?
> Besides taking $NF and for-looping from '2' to '$NF'?

awk '{$1="";$2=""; print $0}'
works for me ;)

--
Walter

Michael Zawrotny

Nov 6, 2003, 8:32:07 AM

That's ok for some purposes, but not necessarily all. After assigning
to one of the fields, $0 is re-constructed, which may not exactly
preserve whitespace. On my system, gawk 3.1.1 and mawk 1.3.3 both
print leading space with the above. In addition, on a tab separated
file, the remaining tabs are converted to space characters. Whether
or not the above behavior is acceptable obviously depends on what the
application is.

Mike

--
Michael Zawrotny
Institute of Molecular Biophysics
Florida State University | email: zawr...@sb.fsu.edu
Tallahassee, FL 32306-4380 | phone: (850) 644-0069
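
To see the effect Michael describes, here is a minimal sketch.
blank_first_two is a hypothetical wrapper name around Walter's
one-liner, and the input is a single tab-separated line.

```shell
# blank_first_two: hypothetical wrapper around the one-liner from the thread.
blank_first_two() {
    awk '{ $1 = ""; $2 = ""; print $0 }'
}

# Assigning to a field makes awk rebuild $0 by joining all fields with OFS
# (a single space by default). The empty $1 and $2 are still joined, so the
# output starts with two spaces, and the original tabs are gone.
# od -c makes the whitespace visible.
printf 'a\tb\tc\td\n' | blank_first_two | od -c
```

Setting OFS to a tab in a BEGIN block would keep tabs in the output, but
the leading separators would still be there.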

Philipp Buehler

Nov 6, 2003, 10:46:21 AM

* Walter (w.nospa...@arcor.de) wrote:
>> Btw.. any "elegant" (lean) solution to print all fields
>> except first 2 or such?
>> Besides taking $NF and for-looping from '2' to '$NF'?
>
> awk '{$1="";$2=""; print $0}'

This has side effects on printing and sometimes on reassignment.

Kenny McCormack

Nov 6, 2003, 11:02:16 AM

In article <slrnbqkm5j.kmj...@tubb.sysfive.com>,

Philipp Buehler <pb+usen...@usenet-nov-2003.fips.de> wrote:
>* Walter (w.nospa...@arcor.de) wrote:
>>> Btw.. any "elegant" (lean) solution to print all fields
>>> except first 2 or such?
>>> Besides taking $NF and for-looping from '2' to '$NF'?
>>
>> awk '{$1="";$2=""; print $0}'
>
>this has sideeffects in printing and sometimes reassignment..

How about:

#!gawk
{
    for (i=0; i<2; i++)    # Adjust to set the number of fields to delete
        sub(/^[ \t]*[^ \t]*[ \t]*/,"")
    print
}

There's probably also a way to do it with tricky application of the match
function, using features found in advanced versions of AWK (e.g., GAWK
& TAWK), where you would end up with an integer which points to the 3rd
field in $0, without modifying $0.

And, FWIW, you can probably do it in standard AWK with multiple invocations
of the match() function, but the advanced versions would allow you to do it
in 1.
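
Kenny's loop can be tried directly from the shell. The sketch below
wraps it in a function: drop_fields and its argument n are hypothetical
names, and it assumes the default field separator (runs of spaces and
tabs).

```shell
# drop_fields N: hypothetical helper; removes the first N whitespace-separated
# fields from each line by editing $0 in place with sub().
drop_fields() {
    awk -v n="$1" '{
        for (i = 0; i < n; i++)
            sub(/^[ \t]*[^ \t]*[ \t]*/, "")   # strip one field and the separator after it
        print
    }'
}

# The separators between the remaining fields are left untouched.
printf 'a b\tc   d\n' | drop_fields 2
```

Because sub() works on the text of $0 and no field is assigned to, the
rest of the line is not rejoined with OFS, so the three spaces between
c and d survive.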

Patrick TJ McPhee

Nov 6, 2003, 11:27:17 AM

In article <slrnbqk0s7.kmj...@tubb.sysfive.com>,
Philipp Buehler <pb+usen...@usenet-nov-2003.fips.de> wrote:
% * Aharon (arn...@skeeve.com) wrote:
% > awk '{ print $NF, $(NF-2) }'
%
% Btw.. any "elegant" (lean) solution to print all fields
% except first 2 or such?
% Besides taking $NF and for-looping from '2' to '$NF'?

It depends on what you mean by elegant. You can create a pattern
which matches the first 2 (or n) fields plus the field separators,
use match() to get the offset, then use substr to print the rest.
For the default field separator:

match($0, /^[[:space:]]*[^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+/)
print substr($0, RSTART+RLENGTH)

You can construct the RE with a loop once you have a pattern for the field
and a pattern for the field separator, but you don't seem to like loops...

The advantage of this is that it preserves the field separators, which
can't be done in general using looping solutions.
--

Patrick TJ McPhee
East York Canada
pt...@interlog.com

Philipp Buehler

Nov 7, 2003, 4:05:24 AM

* Kenny (gaz...@yin.interaccess.com) wrote:
>>>> Besides taking $NF and for-looping from '2' to '$NF'?

^^^^^^^^^^^^

> #!gawk

ugh :)

> for (i=0; i<2; i++) # Adjust to set the number of fields to delete

well, see above. Obviously there's no "shortcut" w/o constraints.
Case closed on my side :)

Philipp Buehler

Nov 7, 2003, 4:07:19 AM

* Patrick (pt...@interlog.com) wrote:
> match($0, /^[[:space:]]*[^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+/)
> print substr($0, RSTART+RLENGTH)

oh.. that's clever

> The advantage of this is that it preserves the field separators, which
> can't be done in general using looping solutions.

yeah, point being here. Further processing could rely on FS. hmm :)

Lisa

Feb 4, 2006, 2:17:32 PM

Philipp Buehler wrote:
> * Aharon (arn...@skeeve.com) wrote:
>
>> awk '{ print $NF, $(NF-2) }'
>
>
> Btw.. any "elegant" (lean) solution to print all fields
> except first 2 or such?
> Besides taking $NF and for-looping from '2' to '$NF'?
>
> Just curious ;)
>
> ciao

Wow, last post here, and I'm looking for a way to grab "the rest of the
fields" without for-looping (if it's possible).
I'm reading a config file, and after the first 4 fields, the 5th field
is a comment (which contains spaces), so I want to return $5..$(NF) to
the calling script.
So far I see $1="";$2=""...etc. Any other solution?
There is nothing in the Sed & Awk book or the awk FAQ.

Thanks
Lisa

Jürgen Kahrs

Feb 4, 2006, 3:01:56 PM

Lisa wrote:

> As I'm reading a config file and after I get the first 4 fields, the 5th
> field is a comment (which contains spaces) so I want $5..$(NF) to return
> to the calling script.
> So far I see $1="";$2=""...etc... any other solution?

No other solution. But you can write $1=$2="".
AWK (like C) is an ALGOLoid language.

Kenny McCormack

Feb 4, 2006, 3:30:29 PM

In article <44kfhkF...@individual.net>,

Note that the $1=$2=...="" approach is equivalent to the for (i=5; ...)
approach - both involve looping at one end or the other.

The best way is to somehow find out the char position where you want to
start (i.e., where the chosen field is located) and then do substr(str,pos)

The details of how you get that starting position are, of course,
problem-specific.
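
For reference, the plain for-loop that the thread keeps trying to avoid
looks roughly like this. fields_from and n are hypothetical names. The
fields are rejoined with OFS, so the original spacing is not preserved,
which is the trade-off against the substr() approach.

```shell
# fields_from N: hypothetical helper; prints fields N..NF of each line,
# joined with OFS (a single space by default).
fields_from() {
    awk -v n="$1" '{
        out = ""
        for (i = n; i <= NF; i++)
            out = out (i > n ? OFS : "") $i   # separator before every field but the first
        print out
    }'
}

# Mixed spaces and tabs in the input collapse to single spaces in the output.
printf 'a b  c\td e f\n' | fields_from 3
```

A line with fewer than n fields yields an empty output line here; add an
NF >= n pattern in front of the action if such lines should be skipped.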

Lisa

Feb 4, 2006, 3:38:56 PM

Think I got it... I was reading a config file in which the 5th field is
the last field and can contain spaces (a comment field) - so after
reading everywhere, this above substr gave me the answer:

print substr($0, match( $0, $5 ));

If the field is empty I'm hosed. Actually, if any field is empty in my
config file, I'm hosed! Anyway, this seems to work for now.

Caio-caio,
Lisa

Joe User

Feb 4, 2006, 4:15:44 PM

If you use gawk this untested snippet should work (run gawk with
--re-interval flag):

s = $0
Field5on = ""
if (sub(/([^[:space:]]+[[:space:]]+){4}/, "", s)) Field5on = s

You could do this without re-interval, by rewriting the regular
expression. Setting OFS and FS affects this.

Anything you do with NF causes $0 to be recomputed, so you lose spacing
information. But, I guess you could do something like this, if that did
not matter (untested):

$4 = $4 # Force recomputation of $0
s = $0
NF = 4 # Truncate $0
Field5on = substr(s, length($0)+1)

--
Huey Long was once asked if he thought America would ever
become fascist. He responded, "Of course it will, but
we'll call it anti-fascism."

Ed Morton

Feb 4, 2006, 8:39:28 PM

Joe User wrote:
> On Sat, 04 Feb 2006 12:17:32 -0700, Lisa wrote:
>
>
>>Philipp Buehler wrote:
>>
>>>* Aharon (arn...@skeeve.com) wrote:
>>>
>>>
>>>> awk '{ print $NF, $(NF-2) }'
>>>
>>>
>>>Btw.. any "elegant" (lean) solution to print all fields except first 2
>>>or such?
>>>Besides taking $NF and for-looping from '2' to '$NF'?
>>>
>>>Just curious ;)
>>>
>>>ciao
>>
>>wow, last post here and I'm looking for a way to grab "the rest of the
>>fields" without for-looping (if its possible) As I'm reading a config file
>>and after I get the first 4 fields, the 5th field is a comment (which
>>contains spaces) so I want $5..$(NF) to return to the calling script.
>>So far I see $1="";$2=""...etc... any other solution? Nothing in Sed&Awk
>>book or the awk FAQ
>
>
> If you use gawk this untested snippet should work (run gawk with
> --re-interval flag):
>
> s = $0
> Field5on = ""
> if (sub(/([^[:space:]]+[[:space:]]+){4}/, "", s) Field5on = s;

That's close but it'd require the record to start with non-space
characters. I'd use this with a POSIX awk:

awk 'sub(/^[[:space:]]*([^[:space:]]*[[:space:]]*){4}/,"")'

With gawk:
gawk --re-interval '...'
or:
gawk --posix '...'

Note that "gensub()" is not available with "--posix" but is available
with "--re-interval", so if you need an interval expression (e.g.
{1,} or {8} or {2,4}) with gensub(), you must use --re-interval
rather than --posix. For that reason --re-interval is generally the
preferred method.

Regards,

Ed.

Patrick TJ McPhee

Feb 5, 2006, 9:10:23 PM

In article <r_-dnRXm7oG...@adelphia.com>,
Lisa <"Lisa "@nowhere.com> wrote:

% Think I got it... I was reading a config file in which the 5th field is
% the last field and can contain spaces (a comment field) - so after
% reading everywhere, this above substr gave me the answer:
%
% print substr($0, match( $0, $5 ));

The problem with this is that it fails if $5 contains characters which
are special in regular expressions, or if the contents of $5 appear
earlier in the line.

What will work is something like

BEGIN {
    # the default FS ignores leading whitespace
    firstfour = "^[ \t]*"

    # now add in a RE that matches non-whitespace followed by whitespace,
    # once for each field you want to skip
    for (i = 1; i <= 4; i++)
        firstfour = firstfour "[^ \t]+[ \t]+"
}

# the fifth field starts just after the string that matches firstfour
{
    match($0, firstfour)
    print substr($0, RSTART+RLENGTH)
}
--

Patrick TJ McPhee
North York Canada
pt...@interlog.com
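
Patrick's approach generalizes to any starting field. A sketch, wrapped
in a shell function: rest_from and its argument n are hypothetical
names, and fields are assumed to be separated by spaces or tabs. The
first command shows the failure mode of match($0, $5) when the text of
$5 also occurs earlier in the line.

```shell
# Naive version from the thread: $5 is "a", which already matches at
# position 1, so the whole line is printed instead of the tail.
printf 'a b c d a tail  end\n' | awk '{ print substr($0, match($0, $5)) }'

# rest_from N: hypothetical helper; prints everything from field N to the end
# of the line, keeping the original separators. Lines on which the first N-1
# fields are not found are skipped.
rest_from() {
    awk -v n="$1" '
        BEGIN {
            re = "^[ \t]*"                 # default FS ignores leading whitespace
            for (i = 1; i < n; i++)
                re = re "[^ \t]+[ \t]+"    # one field plus its trailing separator
        }
        match($0, re) { print substr($0, RSTART + RLENGTH) }
    '
}

printf 'a b c d a tail  end\n' | rest_from 5
```

The regular expression is built once in BEGIN and never contains text
from the input, so special characters in the data cannot change its
meaning.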

Bill Seivert

Feb 6, 2006, 11:02:12 PM

If your fifth field began with a pound sign (#), you could probably
do something simple like comment=$0; sub (/.*#/, "", comment).

That would probably require reformatting your config files, but that
should be a one-time operation. The benefit is that if you needed
to add a fifth field that is not part of the comment, just put it in
front of the pound sign. Also, to keep the non-comment fields, use
noncomment=$0; sub (/#.*/, "", noncomment);

Bill Seivert

Ed Morton

Feb 6, 2006, 11:16:15 PM

Bill Seivert wrote:
>
>
> Lisa wrote:
>
>> Philipp Buehler wrote:
>>
>>> * Aharon (arn...@skeeve.com) wrote:
>>>
>>>> awk '{ print $NF, $(NF-2) }'
>>>
>>>
>>>
>>>
>>> Btw.. any "elegant" (lean) solution to print all fields
>>> except first 2 or such? Besides taking $NF and for-looping from '2'
>>> to '$NF'?
>>>
>>> Just curious ;)
>>>
>>> ciao
>>
>>
>> wow, last post here and I'm looking for a way to grab "the rest of the
>> fields" without for-looping (if its possible)
>> As I'm reading a config file and after I get the first 4 fields, the
>> 5th field is a comment (which contains spaces) so I want $5..$(NF) to
>> return to the calling script.
>> So far I see $1="";$2=""...etc... any other solution?
>> Nothing in Sed&Awk book or the awk FAQ
>>
>> Thanks
>> Lisa
>
>
> If your fifth field began with a pound sign (#), you could probably
> do something simple like comment=$0; sub (/.*#/, "", comment).

ITYM:

sub (/[^#]*#/, "", comment)

to handle "#"s within comments, e.g. the incredibly useful:

i = 1 # set i to #1

> That would probably require reformatting your config files, but that
> should be a one-time operation. The benefit is that if you needed
> to add a fifth field that is not part of the comment, just put it in
> front of the pound sign. Also, to keep the non-comment fields, use
> noncomment=$0; sub (/#.*/, "", noncomment);

or something like:

sub(comment"$","",noncomment)

just in case you want to change the way you identify comments in future...

Ed.

Bill Seivert

Feb 6, 2006, 11:32:51 PM

Ed Morton wrote:
> Bill Seivert wrote:
>
snip

>>> wow, last post here and I'm looking for a way to grab "the rest of
>>> the fields" without for-looping (if its possible)
>>> As I'm reading a config file and after I get the first 4 fields, the
>>> 5th field is a comment (which contains spaces) so I want $5..$(NF) to
>>> return to the calling script.
>>> So far I see $1="";$2=""...etc... any other solution?
>>> Nothing in Sed&Awk book or the awk FAQ
>>>
>>> Thanks
>>> Lisa
>>
>>
>>
>> If your fifth field began with a pound sign (#), you could probably
>> do something simple like comment=$0; sub (/.*#/, "", comment).
>
>
> ITYM:
>
> sub (/[^#]*#/, "", comment)

ITYM:

sub (/^[^#]*#/, "", comment)

Thanks, Ed, sometimes I forget to anchor my REs.

Bill Seivert

Ed Morton

Feb 7, 2006, 12:55:08 AM

Bill Seivert wrote:
>
>
> Ed Morton wrote:
>
>> Bill Seivert wrote:
>>
> snip
>
>
>>>> wow, last post here and I'm looking for a way to grab "the rest of
>>>> the fields" without for-looping (if its possible)
>>>> As I'm reading a config file and after I get the first 4 fields, the
>>>> 5th field is a comment (which contains spaces) so I want $5..$(NF)
>>>> to return to the calling script.
>>>> So far I see $1="";$2=""...etc... any other solution?
>>>> Nothing in Sed&Awk book or the awk FAQ
>>>>
>>>> Thanks
>>>> Lisa
>>>
>>>
>>>
>>>
>>> If your fifth field began with a pound sign (#), you could probably
>>> do something simple like comment=$0; sub (/.*#/, "", comment).
>>
>>
>>
>> ITYM:
>>
>> sub (/[^#]*#/, "", comment)
>
>
>
> ITYM:
>
> sub (/^[^#]*#/, "", comment)

It'll match the same strings either way, i.e. any sequence of zero or
more non-# characters followed by a #.

> Thanks, Ed, sometimes I forget to anchor my REs.

It doesn't hurt, but you don't actually need to anchor it to the start
of the string in this case.

Ed.

Bill Seivert

Feb 7, 2006, 11:50:22 PM

But, Ed, in your previous post you mentioned

"to handle "#"s within comments, e.g. the incredibly useful:

i = 1 # set i to #1
"

which your unanchored RE would remove " set i to #", leaving comment as
" i = 1 #1", I think.

My anchored version should leave comment as " set i to #1".

Though I haven't tried either.

Bill Seivert
's/$\..$.$.$$/\1e\2/'

Ed Morton

Feb 8, 2006, 12:00:06 AM

Bill Seivert wrote:

They both would since the string is analysed left to right so both REs
match from the first non-# character (i.e. the first character at the
start of the string) to the first "#". Look:

$ echo "i = 1 # set i to #1" | awk 'sub (/[^#]*#/,"")'
 set i to #1
$ echo "i = 1 # set i to #1" | awk 'sub (/^[^#]*#/,"")'
 set i to #1

Maybe the confusion's because within the square brackets the "^" is the
negation symbol whereas outside it's the "start of string" symbol so
maybe you thought I was anchoring the # to something instead of negating it?

Regards,

Ed.
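
Since only the first # matters, the split can also be done without any
regular expression by using index(), which returns the position of the
first occurrence of a string. This is a different technique from the
sub() calls discussed above, shown only as a sketch. split_at_hash and
the code=/comment= labels are made up for illustration.

```shell
# split_at_hash: hypothetical helper; splits each line at its first "#" and
# prints both halves in brackets so the surrounding whitespace is visible.
split_at_hash() {
    awk '{
        pos = index($0, "#")              # 0 when the line has no "#"
        if (pos == 0) {
            print "code=[" $0 "]"
            next
        }
        print "code=[" substr($0, 1, pos - 1) "] comment=[" substr($0, pos + 1) "]"
    }'
}

printf 'i = 1 # set i to #1\n' | split_at_hash
```

Later # characters stay inside the comment part, which is the same
result both versions of the sub() call give.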