[Boost-users] Boost.Regex - problem with non-marking parenthesis


Stuart Dootson

Dec 13, 2005, 9:35:03 AM
to boost...@lists.boost.org
I have a regular expression that compiles & matches as expected, viz:

boost::regex rxBlah("([a-z](_?[a-z0-9])*)_dvs",
boost::regex_constants::icase);

I would like to make the inner parentheses non-marking, so I modified
the regex as below:

boost::regex rxBlah("([a-z](?:_?[a-z0-9])*)_dvs",
boost::regex_constants::icase);

However, this throws a bad_expression exception on compilation, with
the message "Invalid preceding regular expression". I've tested the
second regex in another regex tool (The Regex Coach,
http://weitz.de/regex-coach/) and it was accepted, and performed as
expected - when presented with the text "a_dvs", it showed one
capture, the text "a".

So - help, what have I done wrong?

Stuart Dootson

_______________________________________________
Boost-users mailing list
Boost...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

John Maddock

Dec 13, 2005, 11:09:27 AM
to boost...@lists.boost.org
> I have a regular expression that compiles & matches as expected, viz:
>
> boost::regex rxBlah("([a-z](_?[a-z0-9])*)_dvs",
> boost::regex_constants::icase);
>
> I would like to make the inner parentheses non-marking, so I modified
> the regex as below:
>
> boost::regex rxBlah("([a-z](?:_?[a-z0-9])*)_dvs",
> boost::regex_constants::icase);
>
> However, this throws a bad_expression exception on compilation, with
> the message "Invalid preceding regular expression". I've tested the
> second regex in another regex tool (The Regex Coach,
> http://weitz.de/regex-coach/) and it was accepted, and performed as
> expected - when presented with the text "a_dvs", it showed one
> capture, the text "a".
>
> So - help, what have I done wrong?

The flags you're passing to the expression are wrong: passing icase on its
own is roughly the same as basic|icase, and POSIX basic expressions don't
support Perl-style features like non-marking parentheses, but:

boost::regex rxBlah("([a-z](?:_?[a-z0-9])*)_dvs",
boost::regex_constants::perl | boost::regex_constants::icase);

will do what you want.

BTW from 1.33.0 onwards boost::regex_constants::perl now has a value of 0
precisely to avoid this problem.

John.
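
For anyone who wants to try the pattern without a Boost install, here is a minimal sketch using the standard library's std::regex as a stand-in. This is an assumption for illustration and not Boost.Regex itself: ECMAScript plays the role that boost::regex_constants::perl plays in the reply above, and the helper names (pattern, first_capture, group_count, self_check) are made up for this example. The point it shows is the same: name the grammar explicitly next to icase, and the non-marking group leaves only one capture.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: builds the pattern once, naming the grammar
// explicitly alongside icase instead of passing icase on its own.
const std::regex& pattern() {
    static const std::regex rx("([a-z](?:_?[a-z0-9])*)_dvs",
                               std::regex_constants::ECMAScript |
                               std::regex_constants::icase);
    return rx;
}

// Hypothetical helper: returns the first capture if the whole of `text`
// matches the pattern, or an empty string if it does not match.
std::string first_capture(const std::string& text) {
    std::smatch m;
    if (std::regex_match(text, m, pattern())) {
        return m[1].str();
    }
    return std::string();
}

// Number of marking (capturing) groups in the compiled pattern.
unsigned group_count() {
    return pattern().mark_count();
}

// The inner (?:...) group is non-marking, so only the outer group counts.
void self_check() {
    assert(group_count() == 1);
}
```

With this sketch, first_capture("a_dvs") yields "a", which matches what The Regex Coach reported in the original question. With Boost.Regex before 1.33.0 the flags to use are the ones given in the reply above.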

Stuart Dootson

Dec 13, 2005, 2:18:04 PM
to boost...@lists.boost.org

Thanks, John - the change in 1.33 looks very sensible.

Stuart Dootson
