Option ^ can't be put with hyphen, not documented?


David Wahlstedt

Sep 27, 2024, 10:41:39 AM
to PCRE2 discussion list
I just noticed that the ^ option (which unsets imnrsx) can't be used in constructions of the form
"options to turn on, hyphen, options to turn off".
For instance:

(?^a-U:a bc d), (?^-U), (?-^:abc), etc

are not allowed. That makes sense, since anything else would be a bit odd, but I can't find it mentioned in the man pages.

Best regards,

David


David Wahlstedt

Sep 27, 2024, 11:10:17 AM
to PCRE2 discussion list
I also noticed that ^ can't be combined with some other options in particular orders, like:

(?^x)a b
allowed, and turns on x, so it matches 'ab'
(?x^)a b
not allowed
(?x^:a b)
allowed, and turns on x, so it matches 'ab'
(?^x:a b)
not allowed

With U it is the opposite order:

(?^U: a b)
allowed
(?U^: a b)
not allowed

This is confusing!

Wouldn't it be better to allow all combinations, with and without hyphens, and treat ^ simply as a shortcut for the letters imnrsx? When ^ appears to the left of the hyphen, those letters would be interpreted as if they were on the right side, and vice versa when it appears on the right. When ^ appears in a group without a hyphen, the group would be treated as one with a hyphen and the letters on the right.

Best regards,

David

Philip Hazel

Sep 27, 2024, 11:39:11 AM
to David Wahlstedt, PCRE2 discussion list
It is documented in pcre2pattern: In the section "Internal option setting" there is this paragraph:

"If the first character following (? is a circumflex, it causes all of the above
options to be unset. Letters may follow the circumflex to cause some options to
be re-instated, but a hyphen may not appear."
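
To make the quoted rule concrete, here is a minimal Python sketch that checks only the placement of ^ and - inside a leading (?...) or (?...: group. The name caret_placement_ok is hypothetical and this is not how PCRE2 itself parses patterns; it also does not check whether the option letters are valid.

```python
import re

# Hypothetical helper, not part of PCRE2. It captures the text between "(?"
# and the first ")" or ":" so the caret rule can be checked on it.
OPTION_GROUP = re.compile(r"\(\?([^):]*)[):]")


def caret_placement_ok(pattern):
    """Return True if the leading option group obeys the documented caret rule."""
    m = OPTION_GROUP.match(pattern)
    if m is None:
        raise ValueError("pattern does not start with an option group")
    opts = m.group(1)
    if "^" not in opts:
        return True  # no circumflex, so the rule does not apply
    if not opts.startswith("^"):
        return False  # the circumflex must directly follow "(?"
    rest = opts[1:]
    # After a leading circumflex, neither a hyphen nor another circumflex may appear.
    return "-" not in rest and "^" not in rest
```

Under these assumptions, caret_placement_ok("(?^x)a b") is True, while "(?^a-U:a bc d)", "(?^-U)" and "(?-^:abc)" all give False.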

Regards,
Philip


--
You received this message because you are subscribed to the Google Groups "PCRE2 discussion list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pcre2-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pcre2-dev/c7d7557a-0b0e-4176-ba76-ac229e9d51a9n%40googlegroups.com.

Philip Hazel

Sep 27, 2024, 11:48:57 AM
to David Wahlstedt, PCRE2 discussion list
Which version of PCRE2 are you using? My tests disagree with you. As documented (see my previous message) the circumflex ^ is recognized only if it is the first character after (? and so if it appears anywhere else there is an error.

PCRE2 version 10.44 2024-06-07 (8-bit)
/(?^x)a b/

/(?x^:a b)/
Failed: error 111 at offset 3: unrecognized character after (? or (?-

/(?^x:a b)/

/(?^U: a b)/

/(?U^: a b)/
Failed: error 111 at offset 3: unrecognized character after (? or (?-
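
For the accepted forms, the effect on the option state can be modelled roughly as follows. This is a sketch only: it assumes ^ clears exactly the letters imnrsx (as stated at the top of this thread), that the letters after it are then switched on, and that any other option such as U is left as it was. The function name apply_caret_group is hypothetical, and two-letter options such as xx are not modelled.

```python
# Assumption taken from this thread: a leading ^ unsets exactly these letters.
RESET_BY_CARET = set("imnrsx")


def apply_caret_group(current, letters):
    """Model the option letters of (?^...): clear imnrsx, then set the listed letters.

    `current` is the set of option letters in force before the group and
    `letters` is the text between "(?" and the closing ")" or ":".
    """
    if not letters.startswith("^"):
        raise ValueError("this sketch only models groups that start with ^")
    rest = letters[1:]
    if "-" in rest or "^" in rest:
        raise ValueError("no hyphen or second ^ may follow the leading ^")
    # Options outside imnrsx survive; the letters after ^ are turned on.
    return (set(current) - RESET_BY_CARET) | set(rest)
```

With this model, apply_caret_group({"i", "m"}, "^x") gives {"x"}, and apply_caret_group({"i", "U"}, "^") gives {"U"}.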



Regards,
Philip



David Wahlstedt

Sep 27, 2024, 12:34:16 PM
to PCRE2 discussion list
Many thanks, now I understand! I hadn't realized that ^ must come first.

Best,

David