
named capchering groups (regex)


RM

Apr 8, 2020, 10:57:43 AM
Do C++ regexes have named capchering groups, and can I use them
with a sub_match object? I am having trouble translating some code
from PHP to C++ that uses PHP's preg_match_all function.

Melzzzzz

Apr 8, 2020, 11:12:00 AM
I haven't used C++ regex, because I was using regexes long before
they were introduced. What I would do is simply look up the
reference documentation on the internet.


--
press any key to continue or any other to quit...
I enjoy nothing as much as my status as a DISABLED PERSON -- Zli Zec
We are all witnesses: about 3 years of intensive propaganda is enough to drive a nation mad -- Zli Zec
In the Wild West there was actually not that much violence, precisely because
everyone was armed. -- Mladen Gogala

James Kuyper

Apr 8, 2020, 11:46:32 AM
Melzzzzz has already given you the main advice I would have given you.
I'll just point out that you'll find what you're looking for more easily
if you spell it "capturing".

Christian Gollwitzer

Apr 8, 2020, 11:49:27 AM
On 08.04.20 at 16:57, RM wrote:
> Do C++ regexes have named capchering groups, and can I use them
> with a sub_match object? I am having trouble translating some code
> from PHP to C++ that uses PHP's preg_match_all function.

PHP uses the PCRE library in order to handle regexps. So if you also
link to PCRE, the expressions will be compatible.
If this named capturing is the only problem, then you might get away
without these names. Just delete the names in the regexps and manually
count the groups. After all, named capturing does not add any
functionality, it is just a convenience for programmers such that it is
easier to refer to the subgroup, but you can always do so by the number
of the group.

Christian

Alf P. Steinbach

Apr 8, 2020, 12:46:03 PM
Just note that the still young regex sub-library is scheduled for
deprecation, if it isn't deprecated already in C++17, because it just
doesn't know how to deal with UTF-8.

I think that's weird since the default syntax is JavaScript's (IIRC).

But.


- Alf

Jorgen Grahn

Apr 8, 2020, 3:17:48 PM
On Wed, 2020-04-08, Alf P. Steinbach wrote:
> On 08.04.2020 16:57, RM wrote:
>> Do C++ regexes have named capchering groups, and can I use them
>> with a sub_match object? I am having trouble translating some code
>> from PHP to C++ that uses PHP's preg_match_all function.
>
> Just note that the still young regex sub-library is scheduled for
> deprecation, if it isn't deprecated already in C++17, because it just
> doesn't know how to deal with UTF-8.

Any reference for this? cppreference.com doesn't mention this; instead
the regex library seems to have grown somewhat in C++17.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Öö Tiib

Apr 8, 2020, 4:57:55 PM
On Wednesday, 8 April 2020 22:17:48 UTC+3, Jorgen Grahn wrote:
> On Wed, 2020-04-08, Alf P. Steinbach wrote:
> > On 08.04.2020 16:57, RM wrote:
> >> Do C++ regexes have named capchering groups, and can I use them
> >> with a sub_match object? I am having trouble translating some code
> >> from PHP to C++ that uses PHP's preg_match_all function.
> >
> > Just note that the still young regex sub-library is scheduled for
> > deprecation, if it isn't deprecated already in C++17, because it just
> > doesn't know how to deal with UTF-8.
>
> Any reference for this? cppreference.com doesn't mention this; instead
> the regex library seems to have grown somewhat in C++17.

Perhaps the point is to make C++ regex differ from PCRE.
To me, such things always weaken everything involved, and
add controversy and confusion to it.


Alf P. Steinbach

Apr 8, 2020, 6:07:11 PM
On 08.04.2020 21:17, Jorgen Grahn wrote:
> On Wed, 2020-04-08, Alf P. Steinbach wrote:
>> On 08.04.2020 16:57, RM wrote:
>>> Do C++ regexes have named capchering groups, and can I use them
>>> with a sub_match object? I am having trouble translating some code
>>> from PHP to C++ that uses PHP's preg_match_all function.
>>
>> Just note that the still young regex sub-library is scheduled for
>> deprecation, if it isn't deprecated already in C++17, because it just
>> doesn't know how to deal with UTF-8.
>
> Any reference for this? cppreference.com doesn't mention this; instead
> the regex library seems to have grown somewhat in C++17.

Not sure exactly where I picked that up, but Mr. Google now found a
February 15, 2020 comment by Tom Honermann,

“In Prague, during the SG16 discussion of P1844R1 - Enhancement of
regex, we had consensus to move forward with a proposal to deprecate
std::regex due to performance and ABI concerns.”

https://github.com/sg16-unicode/sg16/issues/57

There's no more than that and a link to the enhancement paper, but one
possible problem may be that e.g. regex-searching for three consecutive
non-space characters can find a single UTF-8 char, if the regex engine
is specified as or implemented as simple byte sequence searching.

Since non-ASCII UTF-8 chars consist entirely of bytes >= 128 and start
with a special pattern, I don't think erroneously finding things /within/
UTF-8 characters is a problem. But until such time as C++ support for
all kinds of character encodings is removed, there could be that problem
with other encodings, e.g. in particular my association circuit now pops
up Shift-JIS. So maybe they have also considered that, but it would be
nice to have some feature where you can say, “Hey, support only UTF-8!”. 😃

- Alf