Nested address ranges in awk

haakon

Nov 18, 2009, 10:14:59 AM

Here's what I want to do:

I want to grab a block of text from 'from_pattern' to 'to_pattern',
and with that block of text I want to print to screen the last column
of those lines containing the secondary patterns 'local_pattern1' and
'local_pattern2'. I.e. something like this:

#
/from_pattern/,/to_pattern/ {
/local_pattern1/ printf ($NF" ")
/local_pattern2/ printf ($6"\n")
}
#

This keeps giving me syntax errors though. Is it even possible to use
nested ranges in awk or do I have to use sed for this?

Kenny McCormack

Nov 18, 2009, 10:22:39 AM

In article <525cce5c-9f6b-43df...@r24g2000yqd.googlegroups.com>,

haakon <haakon...@gmail.com> wrote:
>Here's what I want to do:
>
>I want to grab a block of text from 'from_pattern' to 'to_pattern',
>and with that block of text I want to print to screen the last column
>of those lines containing the secondary patterns 'local_pattern1' and
>'local_pattern2'. I.e. something like this:
>
>#
>/from_pattern/,/to_pattern/ {
> /local_pattern1/ printf ($NF" ")
> /local_pattern2/ printf ($6"\n")
>}
>#

No. Once you are inside {}, you have to use ordinary code. There is only
one level of "automatic" pattern matching.

So, the above becomes:

/from_pattern/,/to_pattern/ {
if (/local_pattern1/) printf ($NF" ")
if (/local_pattern2/) printf ($6"\n")
}

Side comments:
1) GAWK (and most "normal/standard" AWKs) allow the: if (/foo/)
syntax - i.e., a reg exp appearing by itself is shorthand for:
$0 ~ /foo/
But TAWK requires the "$0 ~" to be explicitly spelled out.

2) The usual comments about mis-use of printf().
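
A small sketch of side comment 1, for anyone who wants to try it: the two functions below count matching lines, once with the bare-regex shorthand and once with the match spelled out against $0. The function names and the sample input are made up for illustration; on an awk that accepts the shorthand, both should report the same count.

```shell
# Count matching lines using the bare-regex shorthand inside an action.
count_short() {
  awk 'BEGIN { n = 0 } { if (/foo/) n++ } END { print n }'
}

# Count matching lines with the match written out explicitly against $0.
count_explicit() {
  awk 'BEGIN { n = 0 } { if ($0 ~ /foo/) n++ } END { print n }'
}

# Hypothetical three-line input; two of the lines contain "foo".
printf '%s\n' 'foo bar' 'baz' 'food' | count_short
printf '%s\n' 'foo bar' 'baz' 'food' | count_explicit
```

The explicit form is the safer choice if the script may end up running under an awk that insists on "$0 ~".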

haakon

Nov 18, 2009, 10:23:40 AM

Corrected a typo..

Here's what I want to do:

I want to grab a block of text from 'from_pattern' to 'to_pattern',
and with that block of text I want to print to screen the last column
of those lines containing the secondary patterns 'local_pattern1' and
'local_pattern2'. I.e. something like this:

#
/from_pattern/,/to_pattern/ {
/local_pattern1/ printf ($NF" ")

/local_pattern2/ printf ($NF"\n")}

haakon

Nov 18, 2009, 10:26:15 AM

On 18 Nov, 16:22, gaze...@shell.xmission.com (Kenny McCormack) wrote:
> In article <525cce5c-9f6b-43df-8620-92f78ed83...@r24g2000yqd.googlegroups.com>,

Thank you!

About side comment 2) What would be a better way of using printf here?

Janis Papanagnou

Nov 18, 2009, 1:10:26 PM

He probably means that you should specify a format string:

printf ("%s ", $NF)
printf ("%s\n", $6)

(But maybe he's nit-picking about the brackets, dunno.)

Janis

mss

Nov 18, 2009, 2:42:22 PM

On 2009-11-18, Kenny McCormack <gaz...@shell.xmission.com> wrote:

> No. Once you are inside {}, you have to use ordinary code. There is only
> one level of "automatic" pattern matching.

Just thinking aloud here about an example recently posted by Janis...

By 'automatic', do you mean only between BEGIN{} & END{} and then,
only without the middle {}, ie:

BEGIN{}

'automatic'

END{}

but not:

BEGIN{}

{automatic}

END{}

--
later on,
Mike

Grant

Nov 18, 2009, 3:12:09 PM

What about (untested):

#
!(/from_pattern/,/to_pattern/) { next }
/local_pattern1/ { printf "%s ", $NF }
/local_pattern2/ { printf "%s\n", $NF }
#

Grant.
--
http://bugsplatter.id.au

Janis Papanagnou

Nov 18, 2009, 3:14:38 PM

If your 'automatic' is a condition, e.g. something like /pattern/ or
flag1||flag2 then you are correct. The key part in Kenny's posting is
"ordinary code". Mind that an awk program is basically[*] a sequence of

condition { action }

constructs. The "action" is conventional "ordinary code" as found in
many imperative (procedural) programming languages; the "condition", OTOH,
is an expression that evaluates to a predicate (effectively a boolean
value). /pattern/ is such an expression (it can also be viewed as an
implicit shortcut for $0~/pattern/). But inside an action neither
/pattern/ nor $0~/pattern/ acts as a guard for the statement that
follows it; there it is just an expression, so you need to embed it in
a control construct such as if($0~/pattern/) to get the conditional effect.

N.B.: When I see, in awk, something like

{
if ($0~/p/) { x="hi" }
if ($0~/q/) { y="there" }
if (r) { z="!" }
}

I often rewrite it to

/p/ { x="hi" }
/q/ { y="there" }
r { z="!" }

which has less syntactical ballast and is therefore much clearer. For
people who are set in their imperative programming habits and new to
awk, that syntax might not be obvious, even though it's one of the basics
of awk and typically easily explained.
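
To make the equivalence concrete, here is a minimal sketch that runs both styles on the same input. The function names, the sample records and the END rule that prints the variables are additions for illustration only; the flag variable r from the snippets above is left out to keep it short.

```shell
# Hypothetical input: three records, two of which match a pattern.
records() {
  printf '%s\n' 'p here' 'nothing' 'q here'
}

# Imperative style: one unconditional action containing if statements.
imperative() {
  awk '{
         if ($0 ~ /p/) { x = "hi" }
         if ($0 ~ /q/) { y = "there" }
       }
       END { print x, y }'
}

# Pattern-action style: each condition moves in front of its own action.
declarative() {
  awk '/p/ { x = "hi" }
       /q/ { y = "there" }
       END { print x, y }'
}

records | imperative
records | declarative
```

Both pipelines should print the same line, since the rules fire on exactly the same records.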

This syntactically clearer way to write programs has its limits, though,
where deeply nested control constructs are involved. OTOH, I've also
often seen convoluted awk programs with deeply nested control structures
that could be simplified to something that was again representable in
this cute awk'ish way.

Let me close my rant with a quote from an interview with A.Robbins at
awk.info (http://awk.info/?news/robbinsTalks):

Q: In retrospect, what are the best/worst features of gawk?

A: The best feature is the pattern/action paradigm. The
implicit read-a-record loop is wonderful. This is the
language's data-driven nature, as opposed to the
imperative nature of most languages. [...]

(Which is apparently not only related to gawk but to awk in general.)

Janis

[*] There are also function definitions.

Ed Morton

Nov 18, 2009, 3:52:04 PM

On Nov 18, 2:12 pm, Grant <g_r_a_n...@bugsplatter.id.au> wrote:

> On Wed, 18 Nov 2009 07:23:40 -0800 (PST), haakon <haakon.sk...@gmail.com> wrote:
> >Corrected a typo..
>
> >Here's what I want to do:
>
> >I want to grab a block of text from 'from_pattern' to 'to_pattern',
> >and with that block of text I want to print to screen the last column
> >of those lines containing the secondary patterns 'local_pattern1' and
> >'local_pattern2'. I.e. something like this:
>
> >#
> >/from_pattern/,/to_pattern/ {
> > /local_pattern1/ printf ($NF" ")
> > /local_pattern2/ printf ($NF"\n")}
>
> >#
>
> >This keeps giving me syntax errors though. Is it even possible to use
> >nested ranges in awk or do I have to use sed for this?
>
> What about (untested):
>
> #
> !(/from_pattern/,/to_pattern/) { next }

The above, if it works (I'm truly not sure what awk does with a
negated range expression like that), would presumably mean "if the
current record is NOT within the desired range then do NOT continue to
process this record" so that's a double negative which makes the code
hard to understand. Rather than:

!(/from_pattern/,/to_pattern/) { next }
/local_pattern1/ { printf "%s ", $NF }
/local_pattern2/ { printf "%s\n", $NF }

for clarity I'd write it as:

/from_pattern/ { inRange=1 }
inRange && /local_pattern1/ { printf "%s ", $NF }
inRange && /local_pattern2/ { printf "%s\n", $NF }
/to_pattern/ { inRange=0 }

or as the OP originally structured it:

/from_pattern/,/to_pattern/ {
if (/local_pattern1/) { printf "%s ", $NF }
if (/local_pattern2/) { printf "%s\n", $NF }
}

The "if"s don't seem like too much of a burden there.
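
For anyone who wants to compare the flag version and the range version, a rough sketch with made-up data follows. START, END, alpha and beta are placeholders standing in for from_pattern, to_pattern, local_pattern1 and local_pattern2, and the function names are hypothetical as well.

```shell
# Hypothetical sample data: matching lines both inside and outside the block.
sample() {
  printf '%s\n' \
    'alpha outside 1' \
    'START' \
    'alpha inside 2' \
    'beta inside 3' \
    'END' \
    'beta outside 4'
}

# Range pattern with explicit if tests inside the action.
with_range() {
  awk '/START/,/END/ {
         if (/alpha/) { printf "%s ", $NF }
         if (/beta/)  { printf "%s\n", $NF }
       }'
}

# Flag variable set and cleared by separate pattern-action rules.
with_flag() {
  awk '/START/            { inRange = 1 }
       inRange && /alpha/ { printf "%s ", $NF }
       inRange && /beta/  { printf "%s\n", $NF }
       /END/              { inRange = 0 }'
}

sample | with_range
sample | with_flag
```

On this input both variants should print only the last fields of the two lines inside the block and ignore the matching lines outside it.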

Regards,

Ed.

mss

Nov 18, 2009, 4:52:23 PM

On 2009-11-18, Janis Papanagnou <janis_pa...@hotmail.com> wrote:

Great post. Much brain food to chew on here, thank you.

> {
> if ($0~/p/) { x="hi" }
> if ($0~/q/) { y="there" }
> if (r) { z="!" }
> }
>
> I often rewrite it to
>
> /p/ { x="hi" }
> /q/ { y="there" }
> r { z="!" }

Yes, understood (I'm on the right track then).

> might not be obvious...

Actually, its usage is obvious to me (nomenclature notwithstanding);
it's context that I seem to be lacking. From what I've studied up to
this point, the main difference I note is the lack of opening/closing
curly brackets when using the '/p/ {a}' paradigm.

Seems like a superb way to code an expression, 'cute' or not. A simple
boolean notation denoting 'if condition then do_something'.

Thanks again, Janis. Appreciate all the insight-laden responses here.

--
later on,
Mike

Kenny McCormack

Nov 18, 2009, 10:42:51 PM

In article <he1kje$a0r$1...@svr7.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
...

>Let me close my rant with a quote from an interview with A.Robbins at
>awk.info (http://awk.info/?news/robbinsTalks):
>
> Q: In retrospect, what are the best/worst features of gawk?
>
> A: The best feature is the pattern/action paradigm. The
> implicit read-a-record loop is wonderful. This is the
> language's data-driven nature, as opposed to the
> imperative nature of most languages. [...]

So, what does Arnold say is/are the *worst* feature(s) of gawk?

Janis Papanagnou

Nov 19, 2009, 12:18:35 AM

In a nutshell... (see above link for the complete response)

* lack of an explicit concatenation operator

* lack of real multi-dimensional arrays

* i18n on "awk level" (a "waste of time")

* IGNORECASE ("a huge pain to get right")

* lack of extensibility concept

Some of the features (or lack thereof) are just inherited from
the original awk language.

Janis

w_a_x_man

Nov 19, 2009, 12:09:19 PM

printf ($NF" ") will crash the program if the last field is "%s".
Correct:
printf "%s ", $NF

And printf ($6"\n") should simply be
print $6
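
A quick way to check the safe forms is sketched below. The function names and the input lines are hypothetical, chosen so that the last field looks like a format specifier. With the format string kept separate, or with plain print, the field comes out as literal data; what printf($NF "\n") does with such input differs between awk implementations, so it is not shown here.

```shell
# Print the last field of every line, treating it strictly as data.
last_field_printf() {
  awk '{ printf "%s\n", $NF }'
}

# Same result with plain print, which appends the output record separator.
last_field_print() {
  awk '{ print $NF }'
}

# Hypothetical input whose last fields contain percent signs.
printf '%s\n' 'progress %s' 'load 100%d' | last_field_printf
printf '%s\n' 'progress %s' 'load 100%d' | last_field_print
```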

Kenny McCormack

Nov 19, 2009, 3:45:41 PM

In article <bff1b74e-f208-4ed0...@x31g2000yqx.googlegroups.com>,
w_a_x_man <w_a_...@yahoo.com> wrote:
...

>> About side comment 2) What would be a better way of using printf here?
>
>printf ($NF" ") will crash the program if the last field is "%s".
>Correct:
>printf "%s ", $NF
>
>And printf ($6"\n") should simply be
>print $6

Give the man a cigar!

Ed Morton

Nov 19, 2009, 3:57:52 PM

On Nov 19, 2:45 pm, gaze...@shell.xmission.com (Kenny McCormack)
wrote:

> In article <bff1b74e-f208-4ed0-ab80-17834aea3...@x31g2000yqx.googlegroups.com>,w_a_x_man <w_a_x_...@yahoo.com> wrote:
>
> ...
>
> >> About side comment 2) What would be a better way of using printf here?
>
> >printf ($NF" ") will crash the program if the last field is "%s".
> >Correct:
> >printf "%s ", $NF
>
> >And printf ($6"\n") should simply be
> >print $6
>
> Give the man a cigar!

You might have to take it back because his response to the question of
"how do I use awk to skip leading fields" in another comp.lang.awk
thread was:

ruby -pne 'sub(/^\s*\S+\s+/, "")' file

;-).

Ed.

Kenny McCormack

Nov 19, 2009, 4:30:25 PM

In article <bdc3a271-faa7-4124...@c3g2000yqd.googlegroups.com>,
Ed Morton <morto...@gmail.com> wrote:
...

>You might have to take it back because his response to the question of
>"how do I use awk to skip leading fields" in another comp.lang.awk
>thread was:
>
> ruby -pne 'sub(/^\s*\S+\s+/, "")' file
>
>;-).
>
> Ed.

Yes, I saw that. And I thought about doing a topicality flame on it,
but didn't bother.

Really, though, he just left off the AWK wrapper. He meant to say:

awk 'BEGIN {system("ruby -pne '"'"'sub(/^\s*\S+\s+/, \"\")'"'"' file")}'