[vim/vim] WIP: regex: \%G ignores diacrectic chars (PR #10444)

Christian Brabandt

May 18, 2022, 1:48:28 PM
to vim/vim, Subscribed

In #8026, a way to ignore diacritic characters was requested.

Here, let's make use of the \%G atom to ignore diacritics when a
simple character is found. Internally, this works by using the
equivalence class of each character (as if you had wrapped each
character in [[=<char>=]]).
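
To illustrate the idea of expanding every character into its equivalence class, here is a minimal Python sketch. It is not the code from this PR and not how Vim builds its classes: the helper names (strip_marks, build_equivalents, expand_literal) and the chosen code point range are assumptions made for the example, and the grouping relies on Unicode canonical decomposition rather than Vim's own tables.

```python
import re
import unicodedata


def strip_marks(ch):
    """Drop combining marks from ch using canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def build_equivalents(alphabet):
    """Group the characters of alphabet by their mark-free base character."""
    groups = {}
    for ch in alphabet:
        groups.setdefault(strip_marks(ch), set()).add(ch)
    return {base: "".join(sorted(chars)) for base, chars in groups.items()}


# Assumption for this sketch: only code points U+0020 through U+017F are
# considered when collecting the variants of a base character.
EQUIVALENTS = build_equivalents("".join(chr(cp) for cp in range(0x20, 0x180)))


def expand_literal(literal):
    """Turn a literal string into a regex matching any diacritic variant."""
    parts = []
    for ch in literal:
        variants = EQUIVALENTS.get(ch, ch)
        if len(variants) > 1:
            # Rough analogue of wrapping the character in [[=<char>=]].
            parts.append("[" + re.escape(variants) + "]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


pattern = expand_literal("resume")
found = re.fullmatch(pattern, "r\u00e9sum\u00e9") is not None
```

Only plain base characters are expanded here; a character that already carries a diacritic is matched literally, and the text being searched is assumed to use precomposed characters.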

The implementation is a bit messy, since I do not know the regexp code
very well. Using \%G also changes quite a bit about how pattern matching
works and how the size of the regexp is counted beforehand. It therefore
has to appear at the very beginning of the pattern (only after the engine
selection); it cannot take effect in the middle of the pattern, since it
significantly changes how atoms are added and how ordinary items are
matched.

This is currently work in progress, so there are no tests yet. I would
just like to get some feedback on whether the approach used here seems
okay.

Tests will be added later.


You can view, comment on, or merge this pull request online at:

  https://github.com/vim/vim/pull/10444

Commit Summary

  • fa3855e regex: \%G ignores diacrectic chars

File Changes

(5 files)

Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
Message ID: <vim/vim/pull/10444@github.com>

codecov[bot]

May 18, 2022, 2:08:19 PM
to vim/vim, Subscribed

Codecov Report

Merging #10444 (fa3855e) into master (28d032c) will decrease coverage by 0.07%.
The diff coverage is 31.03%.

@@            Coverage Diff             @@
##           master   #10444      +/-   ##
==========================================
- Coverage   81.69%   81.61%   -0.08%
==========================================
  Files         158      158
  Lines      184571   184759     +188
  Branches    41685    41772      +87
==========================================
+ Hits       150780   150795      +15
- Misses      21298    21472     +174
+ Partials    12493    12492       -1

Flag                  Coverage Δ
huge-clang-none       82.53% <31.03%> (-0.01%) ⬇️
linux                 82.53% <31.03%> (-0.01%) ⬇️
mingw-x64-HUGE        0.00% <0.00%> (?)
mingw-x64-HUGE-gui    78.02% <30.43%> (-0.01%) ⬇️
windows               76.80% <30.43%> (-1.23%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files        Coverage Δ
src/regexp.c          87.98% <25.00%> (-0.24%) ⬇️
src/regexp_bt.c       86.89% <25.00%> (-0.55%) ⬇️
src/regexp_nfa.c      89.63% <44.44%> (-0.18%) ⬇️
src/locale.c          75.43% <0.00%> (-1.35%) ⬇️
src/profiler.c        83.14% <0.00%> (-1.13%) ⬇️
src/os_win32.c        56.98% <0.00%> (-0.88%) ⬇️
src/if_python3.c      72.54% <0.00%> (-0.68%) ⬇️
src/os_mswin.c        49.47% <0.00%> (-0.64%) ⬇️
src/gui_dwrite.cpp    45.82% <0.00%> (-0.56%) ⬇️
src/gui_w32.c         34.88% <0.00%> (-0.34%) ⬇️
... and 47 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 28d032c...fa3855e.



Christian Brabandt

May 28, 2022, 5:40:09 PM
to vim/vim, Push

@chrisbra pushed 2 commits.

  • a3fd2de regexp: set ignore_diacritics flag only when \%G is at the start of the pattern
  • 1ccf440 regexp: add a test



Christian Brabandt

May 28, 2022, 5:43:51 PM
to vim/vim, Push

@chrisbra pushed 3 commits.

  • e4509b2 regex: \%G ignores diacrectic chars
  • 889616a regexp: set ignore_diacritics flag only when \%G is at the start of the pattern
  • 1806e4f regexp: add a test


Christian Brabandt

Nov 3, 2024, 3:34:47 AM
to vim/vim, Push

@chrisbra pushed 2 commits.

  • 5b632d0 cb: Enable Debugging builds, disable debugging mode for libvterm
  • 1150bc0 regex: \%G ignores diacrectic chars


Christian Brabandt

Nov 3, 2024, 12:57:02 PM
to vim/vim, Push

@chrisbra pushed 1 commit.

  • b1c139e regex: \%G ignores diacrectic chars


Christian Brabandt

Nov 3, 2024, 3:06:04 PM
to vim/vim, Subscribed

I think this is ready now. But the implementation is ugly, especially passing the \%G flag around. Not sure if this is really useful.



Christian Brabandt

Jun 13, 2026, 4:15:22 PM
to vim/vim, Subscribed
chrisbra left a comment (vim/vim#10444)

There is not enough interest in this, so closing.

