Procmail regular expression: word end boundary matching


T o n g

Dec 29, 2011, 12:30:01 AM
Hi,

I can't seem to get word end boundary matching to work in procmail.
Here is my test rc file:

:0 HB
* 1^0 \<test\>
* 1^0 test
/dev/null

and there is a ' test ' in my test email.

However, from the dry run log, I can see that '\<test\>' did not match
yet 'test' did.

Anything I'm missing?

Thanks

PS.

1. my procmail

$ procmail -v
procmail v3.22 2001/09/10
. . .

2. About word end boundary matching, I found the following from the web:

\<
A shorthand for the character class [^a-zA-Z0-9_] except it can also
match newlines.
This is an incompatible imitation of the "word end boundary" operator
found in some extended regular expression implementations. Note that \<
and \> are actually identical.
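
To make the quoted description concrete, here is a minimal Python sketch of what it implies: `\<` and `\>` each consume one non-word character, unlike the zero-width `\b` found in many regex engines. This mirrors only the wording quoted above, not procmail's actual matcher; `NON_WORD` and `documented_word_pattern` are hypothetical names made up for the illustration.

```python
import re

# Assumption: this models only the quoted description of \< and \>
# (a character class that consumes one non-word character, newline
# included). It is not procmail's real regex engine.
NON_WORD = r"[^a-zA-Z0-9_]"


def documented_word_pattern(word: str) -> re.Pattern:
    """Build the pattern that the quoted text implies for \\<word\\>."""
    return re.compile(NON_WORD + re.escape(word) + NON_WORD)


body = "there is a test in this mail\n"

# The class-based form swallows the surrounding spaces ...
consuming = documented_word_pattern("test").search(body)
# ... while a zero-width word boundary matches the word alone.
zero_width = re.search(r"\btest\b", body)

print(repr(consuming.group(0)))   # ' test '
print(repr(zero_width.group(0)))  # 'test'

# A longer word containing "test" is rejected by the class-based form.
print(documented_word_pattern("test").search("a testing mail"))  # None
```

If the description is accurate, a ' test ' surrounded by spaces should satisfy `\<test\>`, which is why the failed match in the log looks surprising.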

--
Tong (remove underscore(s) to reply)
http://xpt.sourceforge.net/techdocs/
http://xpt.sourceforge.net/tools/


--
To UNSUBSCRIBE, email to debian-us...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org
Archive: http://lists.debian.org/jdgtf8$f6s$1...@dough.gmane.org

Bob Proulx

Dec 29, 2011, 1:20:01 AM
T o n g wrote:
> I can't seem to get word end boundary matching to work in procmail.
> Here is my test rc file:
>
> :0 HB
> * 1^0 \<test\>
> * 1^0 test
> /dev/null
>
> and there is a ' test ' in my test email.
>
> However, from the dry run log, I can see that '\<test\>' did not match
> yet 'test' did.

I think it is a bug in the \< expansion. Try adding one extra
backslash in front of the left \< only, not in front of the right \>.

:0 HB
* 1^0 \\<test\>
* 1^0 test
/dev/null

> 2. About word end boundary matching, I found the following from the web:

That documentation is in the procmailrc man page.

> \<
> A shorthand for the character class [^a-zA-Z0-9_] except it can also
> match newlines.
> This is an incompatible imitation of the "word end boundary" operator
> found in some extended regular expression implementations. Note that \<
> and \> are actually identical.

I think that is outdated: from testing the above, the documentation
in the man page doesn't seem to match the behavior. I think the ERE
engine may have been swapped out for a different one after the
documentation was written.
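
One possible reading of the behavior, not confirmed anywhere in this thread: the procmailrc man page lists a leading `\` among the special characters at the start of a condition, where it quotes the others (`!`, `$`, `?`, `<`, `>`). If that leading backslash is consumed before the regex is parsed, `\<test\>` would reach the matcher as `<test\>`, and `\\<test\>` would reach it as `\<test\>`. The Python toy below models only that assumption; `toy_condition_to_regex` is a hypothetical helper, treats every other character as a literal, and is not procmail code.

```python
import re

# Assumption: \< and \> consume one non-word character, as the quoted
# documentation describes. This is a toy model, not procmail's engine.
NON_WORD = r"[^a-zA-Z0-9_]"


def toy_condition_to_regex(condition: str) -> str:
    """Translate a procmail-style condition into a Python regex (toy)."""
    # Assumed step 1: one leading backslash is eaten as the
    # start-of-condition quote character.
    if condition.startswith("\\"):
        condition = condition[1:]

    # Assumed step 2: remaining \< and \> become a consuming class;
    # any other character is treated as a literal for simplicity.
    out = []
    i = 0
    while i < len(condition):
        two = condition[i:i + 2]
        if two in ("\\<", "\\>"):
            out.append(NON_WORD)
            i += 2
        elif len(two) == 2 and two[0] == "\\":
            out.append(re.escape(two[1]))
            i += 2
        else:
            out.append(re.escape(condition[i]))
            i += 1
    return "".join(out)


mail = "there is a test in this mail\n"

for cond in (r"\<test\>", r"\\<test\>", "test"):
    hit = re.search(toy_condition_to_regex(cond), mail)
    print(cond, "->", "match" if hit else "no match")
```

Under these assumptions the toy reproduces the observations reported in this thread: the single-backslash form fails, while the doubled form and the bare word match.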

Bob

T o n g

Dec 29, 2011, 3:10:01 PM
On Wed, 28 Dec 2011 23:19:27 -0700, Bob Proulx wrote:

> I think it is a bug in the \< expansion.

OMG, I thought it would be very hard for anyone to find out the answer.

THANKS A LOT. works like a charm.

--
Tong (remove underscore(s) to reply)
http://xpt.sourceforge.net/techdocs/
http://xpt.sourceforge.net/tools/



Bob Proulx

Dec 29, 2011, 4:20:02 PM
T o n g wrote:
> Bob Proulx wrote:
> > I think it is a bug in the \< expansion.
>
> OMG, I thought it would be very hard for anyone to find out the answer.
>
> THANKS A LOT. works like a charm.

That seems to be an okay workaround but it still looks like a bug.
Since I use procmail a lot it motivated me to look at least that far
into the problem. Hopefully at some point the bug will be fixed and
then the two backslashes would need to be collapsed back to one again.

I reported the problem here:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653624

Bob