When I write a regexp that has a `$` in the middle of it I write it as
either of:
sed 's/foo\$bar/stuff/'
sed 's/foo[$]bar/stuff/'
so that it's clear the `$` should be treated literally. Because of that
habit, I had never noticed that an unescaped `$` mid-regexp is treated
differently in BREs vs EREs, e.g.:
$ echo 'foo$bar' | sed 's/foo$bar/stuff/'
stuff
$ echo 'foo$bar' | sed -E 's/foo$bar/stuff/'
foo$bar
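To confirm that only the unescaped `$` behaves differently, the same input can be run through all three spellings in both modes. This is just a minimal sketch, assuming a sed that accepts `-E` (GNU and BSD sed both do); the variable names are mine and purely illustrative:

```shell
input='foo$bar'

# Unescaped `$` mid-regexp: the case where BRE and ERE appear to differ.
bre_plain=$(printf '%s\n' "$input" | sed 's/foo$bar/stuff/')
ere_plain=$(printf '%s\n' "$input" | sed -E 's/foo$bar/stuff/')

# Backslash-escaped `$`: intended to be literal in both flavors.
bre_esc=$(printf '%s\n' "$input" | sed 's/foo\$bar/stuff/')
ere_esc=$(printf '%s\n' "$input" | sed -E 's/foo\$bar/stuff/')

# `$` inside a bracket expression: also intended to be literal in both.
bre_brk=$(printf '%s\n' "$input" | sed 's/foo[$]bar/stuff/')
ere_brk=$(printf '%s\n' "$input" | sed -E 's/foo[$]bar/stuff/')

# One row per spelling, showing the BRE and ERE results side by side.
printf '%-10s BRE=%-8s ERE=%s\n' \
    'foo$bar'   "$bre_plain" "$ere_plain" \
    'foo\$bar'  "$bre_esc"   "$ere_esc" \
    'foo[$]bar' "$bre_brk"   "$ere_brk"
```

Going by the results above, I'd expect only the unescaped ERE case to leave the input unchanged, with every other combination printing `stuff`.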
As far as I can see, the relevant quotes from the POSIX spec
(https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html)
for BREs are:
-----
$
The <dollar-sign> shall be special when used as an anchor.
A <dollar-sign> ( '$' ) shall be an anchor when used as the last
character of an entire BRE. The implementation may treat a <dollar-sign>
as an anchor when used as the last character of a subexpression. The
<dollar-sign> shall anchor the expression (or optionally subexpression)
to the end of the string being matched; the <dollar-sign> can be said to
match the end-of-string following the last character.
-----
and for EREs (emphasis mine):
-----
$
The <dollar-sign> shall be special when used as an anchor.
A <dollar-sign> ( '$' ) outside a bracket expression shall anchor the
expression or subexpression it ends to the end of a string; such an
expression or subexpression can match only a sequence ending at the last
character of a string. For example, the EREs "ef$" and "(ef$)" match
"ef" in the string "abcdef", but fail to match in the string "cdefab",
and **the ERE "e$f" is valid, but can never match because the 'f'
prevents the expression "e$" from matching ending at the last character**.
-----
So, the BRE section doesn't explicitly state what `$` means when it's
not at the end of a regexp, but given the "special when used as an
anchor" statement it makes sense to read that as meaning it's literal
otherwise, and that is how the various tools I've tried interpret it.
The ERE section, however, has that same statement about `$` being
special when used as an anchor, but then goes on to state that when it's
mid-regexp, e.g. `e$f`, it should NOT be treated literally, even though
treating it as an anchor there means the regexp can never match anything.
That ERE specification seems odd - why interpret `$` in a way that's
different from BREs and results in a regexp that can never match
anything instead of simply treating it as literal, same as BREs do?
Does anyone have any insight into why a `$` mid-regexp is treated that
way in EREs?
Ed.