difficulty with gawk sub()

Anton Treuenfels

May 5, 2015, 9:55:46 AM
So I have a string like this:

"^(+|-|/|*|"

which is "automagically" generated during the initialization phase of an
expression parser. Ultimately it's to be used as a regular expression
pattern. As the form it's in is not a regular expression, I attempted to
remedy that by doing this:

sub( /|$/, ")", theVariableThatHoldsTheString )

with the goal of replacing that last "|" with a ")", i.e., to look like this:

"^(+|-|/|*)"

But what I got was this:

")^(+|-|/|*|"

So...is there something wrong with the regular expression I used? I thought
it said "a '|' character followed by the end of the string". Is an anchored
match not a useful concept with sub() (and presumably gsub())? Something
else?

- Anton Treuenfels

Luuk

May 5, 2015, 11:56:09 AM
~/tmp> echo "^(+|-|/|*|" | awk '{ sub(/\|$/,")", $0); print $0 }'
^(+|-|/|*)


Ed Morton

May 5, 2015, 7:36:44 PM
`|` is an ERE metacharacter meaning `or` so your expression `/|$/` means
`nothing or end of string`. You need to escape it or put it in a bracket
expression to make it a literal: `/\|$/` or `/[|]$/`.
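
As a minimal sketch of the two fixes named above, the commands below apply each one to the string from the original post. The variable names `s`, `escaped`, and `bracketed` are only illustrative, and plain `awk` is assumed to be on the PATH.

```shell
# The generated string from the original post, trailing "|" still attached.
s='^(+|-|/|*|'

# Fix 1: escape the bar so it is treated as a literal character.
escaped=$(printf '%s\n' "$s" | awk '{ sub(/\|$/, ")"); print }')

# Fix 2: put the bar in a bracket expression, which also makes it literal.
bracketed=$(printf '%s\n' "$s" | awk '{ sub(/[|]$/, ")"); print }')

# Both lines should read ^(+|-|/|*)
printf '%s\n%s\n' "$escaped" "$bracketed"
```

Either form anchors a literal `|` to the end of the string, so only the final bar is replaced.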

Ed.
>
> - Anton Treuenfels

Anton Treuenfels

May 5, 2015, 11:16:58 PM

"Ed Morton" <morto...@gmail.com> wrote in message
news:mibk4a$ksh$1...@dont-email.me...
Thanks - that eventually did occur to me (several hours later). Escaping, at
any rate, not making it a single member class.

I still wonder a bit that anything other than an error occurs when the
original expression is parsed. I did not know a complete lack of anything
had a defined meaning - though obviously it explains the result.

- Anton Treuenfels

Janis Papanagnou

May 6, 2015, 4:39:15 AM
On 06.05.2015 at 05:16, Anton Treuenfels wrote:
>
> "Ed Morton" <morto...@gmail.com> wrote in message
> news:mibk4a$ksh$1...@dont-email.me...
>>
>> `|` is an ERE metacharacter meaning `or` so your expression `/|$/`
>> means `nothing or end of string`. You need to escape it or put it in a
>> bracket expression to make it a literal: `/\|$/` or `/[|]$/`.
>>
>
> I still wonder a bit that anything other than an error occurs when the
> original expression is parsed. I did not know a complete lack of
> anything had a defined meaning - though obviously it explains the result.

As a simple example, matching "beaf" and "deadbeaf" can be expressed
with an "empty" subclause (avoiding repetitions of common parts).
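
A hedged sketch of that idea: in gawk the pattern could presumably be written with an empty alternative, as in `(dead|)beaf`. POSIX, however, says that a `|` placed immediately before a `)` produces undefined results, so the example below uses the equivalent optional group `(dead)?beaf`, which should behave the same in any awk. The test word `livebeaf` and the variable name `matches` are made up for illustration.

```shell
# Feed three candidate words to awk; only lines matching the pattern are printed.
# "(dead)?" stands in for the empty alternative "(dead|)": the prefix may be absent.
matches=$(printf '%s\n' beaf deadbeaf livebeaf | awk '/^(dead)?beaf$/')

# Prints "beaf" and "deadbeaf", one per line; "livebeaf" is rejected.
printf '%s\n' "$matches"
```

The common part `beaf` is written once, and the optional prefix covers the "nothing here" case that the empty subclause expresses.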

Janis

>
> - Anton Treuenfels
>
