`or' in sed.

John

May 7, 2007, 9:43:36 AM

Isn't the or construction (foo|bar) supposed to work in sed? For
instance,

echo "foo" | sed 's/(foo|bar)/foobar/'

prints "foo", not "foobar". How does one do `or' with sed? And why
don't grep and sed use the same regular expressions? Grep seems to
understand the above or construction.

Stephane CHAZELAS

May 7, 2007, 10:08:33 AM

2007-05-7, 06:43(-07), John:

Both grep and sed use basic regular expressions.

() and | are a feature of extended regular expressions (as in
grep -E or awk (though awk has a special flavour of them)) not
basic regexps. Some sed implementations have \|, but that's not
standard. Note that () are only for grouping (and backrefs) (use
\(\) in BREs), they shouldn't be necessary here.

echo foo | awk '{gsub(/foo|bar/, "foobar");print}'

Or you can use perl regexps that are yet another flavour of
regexps:

echo foo | perl -pe 's/foo|bar/foobar/'
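
If you would rather stay with sed, here is a minimal sketch of two ways to do it. It assumes your sed accepts -E to switch to extended regexps (GNU and BSD sed do; older implementations may not). The function names or_ere and or_bre are made up for illustration. The second one sticks to plain BREs and uses "t" to skip the second substitution once the first has succeeded. Unlike a real alternation, it always prefers foo over bar when both occur on a line.

```shell
# Assumption: this sed supports -E (extended regexps).
or_ere() {
    printf '%s\n' "$1" | sed -E 's/(foo|bar)/foobar/'
}

# BRE-only fallback: try foo first; "t" with no label jumps to the end
# of the script if that substitution succeeded, so bar is only tried
# when foo was not found.
or_bre() {
    printf '%s\n' "$1" | sed -e 's/foo/foobar/' -e t -e 's/bar/foobar/'
}

or_ere foo    # foobar
or_ere bar    # foobar
or_bre bar    # foobar
or_bre baz    # baz (unchanged)
```

Without the "t", the second substitution would hit the "bar" inside the freshly produced "foobar" and give "foofoobar".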

--
Stéphane

bsh

May 7, 2007, 6:09:17 PM

On May 7, 6:43 am, John <jan...@gmail.com> wrote:
> Isn't the or construction (foo|bar) supposed to work in sed? For
> instance,
> echo "foo" | sed 's/(foo|bar)/foobar/'
> prints "foo", not "foobar". How does one do `or' with sed?

> Grep seems to understand the above or construction.

No, it doesn't. Only egrep(1) does, unless you are using a
relabelled GNU grep.
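
A quick way to check this on your own system is sketched below. The helper names matches_bre and matches_ere are hypothetical, and it assumes a grep that supports the POSIX -E and -q options. With a basic regexp, ( | ) are ordinary characters, so the pattern only matches the literal text "(foo|bar)".

```shell
# BRE: the parentheses and the bar are literal characters.
matches_bre() {
    printf '%s\n' "$1" | grep -q '(foo|bar)'
}

# ERE (what egrep does): grouping and alternation.
matches_ere() {
    printf '%s\n' "$1" | grep -qE '(foo|bar)'
}

if matches_bre foo; then echo "BRE matched foo"; else echo "BRE did not match foo"; fi
if matches_bre '(foo|bar)'; then echo "BRE matched the literal text"; fi
if matches_ere foo; then echo "ERE matched foo"; fi
```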

> And why don't grep and sed use the same regular expressions?

They (mostly) do. What you mean is: why don't they use the
same RE _syntax_?

Grep(1) was created as a standalone software tool from the
RE component of ed(1) by Ken Thompson (q.v. Thompson's
Algorithm) for Unix v4 (1973); sed(1) was authored by Lee E.
McMahon and shipped with Unix v7 (1979). POSIX (and other
standards) and RFCs were then mostly a promise for the future.

> > echo foo | awk '{gsub(/foo|bar/, "foobar");print}'

> > echo foo | perl -pe 's/foo|bar/foobar/'

As efficient as awk(1) is, it has a hundred times the parsing
overhead of sed(1) (excluding the building of the DFAs), and
perl(1) has up to ten times that again:

"Timing Trials, or, the Trials of Timing: Experiments with
Scripting and User-Interface Languages"
http://cm.bell-labs.com/cm/cs/who/bwk/interps/pap.html

Actually, it is quite easy to simulate the logical ANDs and
ORs of extended regular expression alternations.

# simulate OR across arbitrary number of expressions
# (not tested)
/foo/b do
/bar/b do
b dont
:do
# the empty regex reuses the last regex applied (foo or bar)
s//foobar/
:dont

# simulate AND across arbitrary number of expressions
# (not tested)
/foo/!b dont
/bar/!b dont
# both matched; the empty regex reuses the last one applied (bar)
s//foobar/
:dont

The above is the general solution; applicable cases may
use "/foo.*bar/{...}" and/or "/bar.*foo/{...}" or the "t" command
(RTFM) of sed(1). Also, De Morgan's law -- which states that
a || b is equivalent to !(!a && !b) -- is of course always applicable
to the control logic of such code.
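
For anyone who wants to try this from the shell, here is a sketch of the two scripts wrapped in functions, plus an OR written through negation, i.e. a || b as !(!a && !b). It assumes the substitution is meant to be "s//foobar/", where the empty regex reuses the last regex applied at run time. The function names sed_or, sed_and and sed_or_negated are invented for the example.

```shell
# OR: branch to "do" as soon as either address matches.
sed_or() {
    printf '%s\n' "$1" | sed -e '/foo/b do' -e '/bar/b do' -e 'b dont' \
        -e ':do' -e 's//foobar/' -e ':dont'
}

# AND: bail out as soon as either address fails to match.
sed_and() {
    printf '%s\n' "$1" | sed -e '/foo/!b dont' -e '/bar/!b dont' \
        -e 's//foobar/' -e ':dont'
}

# OR through negation: skip the line only if it matches
# neither foo nor bar ("b" with no label jumps to the end).
sed_or_negated() {
    printf '%s\n' "$1" | sed -e '/foo/!{' -e '/bar/!b' -e '}' -e 's//foobar/'
}

sed_or bar            # foobar
sed_or baz            # baz (unchanged)
sed_and 'foo bar'     # foo foobar
sed_and foo           # foo (unchanged)
sed_or_negated bar    # foobar
```

Note that in the AND case the empty regex stands for bar, the last address tested, so it is the "bar" that gets replaced.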

=Brian
