Regex not matching "not in range" for Unicode character

Petr Vasiliev

Dec 11, 2024, 6:28:28 AM
to git-for-windows
We use a regex to validate commit messages. Before validation, the commit message is reformatted and its line breaks are replaced with a non-ASCII Unicode character.
The Unicode character itself can be matched, but when it is placed inside a "not in range" token (a negated bracket expression such as [^☺]), the whole expression apparently stops matching on recent versions of Git for Windows.

The following simplified example matches with Git 2.43 but does not match with Git 2.47.0 or 2.47.1:

1. Create a file named "regex" with the following content (the ☺ character is typed with Alt+1):

#!/bin/sh

re="Something[^☺]+"
msg="Something wrong ☺"
if [[ $msg =~ $re ]]; then
    echo "match"
else
    echo "no match!"
fi

2. Launch git-bash.exe and change to the directory where the "regex" file is located
3. Run "sh regex" in bash

Any advice is appreciated. Maybe this is an issue with bash, or maybe we are not handling Unicode characters properly and need to escape them (even though this regex works on the older version)?
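One way to narrow this down might be to run the same match under several locales, since bracket expressions are interpreted according to the current locale's character encoding. The following is only a minimal sketch, not part of our actual validation: the helper name try_match is made up for illustration, and the locale names C, C.UTF-8 and en_US.UTF-8 are assumptions that may not all exist on a given installation.

```shell
#!/bin/bash
# Diagnostic sketch. try_match is a hypothetical helper, and the
# locale names in the loop below are assumptions.

# try_match LOCALE REGEX MESSAGE
# Starts a child bash with LC_ALL set to LOCALE so the locale is in effect
# from startup, then prints "match" or "no match!". Any non-zero status
# (including a regex that fails to compile) is reported as "no match!".
try_match() {
    if LC_ALL="$1" bash -c '[[ $2 =~ $1 ]]' _ "$2" "$3"; then
        echo "match"
    else
        echo "no match!"
    fi
}

re="Something[^☺]+"
msg="Something wrong ☺"

# Record which bash is actually being used.
bash --version | head -n 1

# Run the same regex and message under each candidate locale.
for loc in C C.UTF-8 en_US.UTF-8; do
    printf '%s: ' "$loc"
    try_match "$loc" "$re" "$msg"
done
```

If the result differs between locales, or between the two Git for Windows versions under the same locale, that would at least show whether the locale setup or the regex matching itself changed.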