Feel free to modify it as you see fit. I included some details about the fzy algorithm, but not sure if they are needed.
https://github.com/vim/vim/pull/17988
(1 file)
—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
Thanks
—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
I got slightly confused about case sensitivity, since the sentence Case is ignored. was removed. So I checked how fzy handles case and it mentions it does case insensitive matches. But in reality, case is only ignored for the pattern, not for the haystack (I checked has_match() function in fuzzy.c and fzy) which got me worried, if I hit a language barrier. I always thought case insensitive matching means matches happens regardless if the search pattern or the haystack differs in case 😕
E.g. try this (:set ignorecase does not matter):
:echo matchfuzzy(['Vim', 'vimeo', 'Violin'], 'vim')
['Vim', 'vimeo']
:echo matchfuzzy(['Vim', 'vimeo', 'Violin'], 'Vim')
['Vim']
So it does seem that case of the pattern does matter.
So I made the following adjustments now:
11. Fuzzy matching *fuzzy-matching*
Fuzzy matching scores how well a string matches a pattern when the pattern
characters appear in order but not necessarily contiguously.
Example: >
Pattern: "vim"
Candidates: "vim" -> perfect
"vimeo" -> good (v i m)
"voice mail" -> weaker (v _ i _ _ _ m)
"vintage" -> no match (no "m")
<
If the search string has multiple words, each word is matched separately and
may appear in any order in the candidate. For example "get pat" matches
"GetPattern", "PatternGet", "getPattern", "patGetter", "getSomePattern",
"MatchpatternGet", etc.
The 'ignorecase' and 'smartcase' options do not apply, case is ignored if the
pattern is all lower case.
Vim's implementation is based on the algorithm from the fzy project:
https://github.com/jhawthorn/fzy
It uses dynamic programming to compute an optimal score for a given pattern
and candidate.
The algorithm works in two stages:
1. Forward pass
Scan the candidate left to right, tracking the best score for each
pattern position. Matches score higher when they occur at the start
of the candidate, the start of a word (space, underscore, dash,
camelCase), or directly after the previous match.
2. Backward pass
Start from the best-scoring end position and step back to find match
positions, ensuring the alignment is optimal.
Vim extends the original algorithm to support multibyte codepoints, allowing
correct matching for UTF-8 and other encodings.
Time complexity is O(pattern * candidate). Memory usage is proportional
to the same.
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.
—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
Looks good, thanks.
You can check the scores for the examples. I think when pattern is all lowercase the scores remain the same for both uppercase and lowercase matches. Uppercase match in a camelcase gets a score boost (https://github.com/jhawthorn/fzy/blob/master/ALGORITHM.md).
—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.
I got slightly confused about case sensitivity, since the sentence
Case is ignored.was removed. So I checked how fzy handles case and it mentions it does case insensitive matches. But in reality, case is only ignored for the pattern, not for the haystack (I checkedhas_match()function infuzzy.candfzy) which got me worried, if I hit a language barrier. I always thought case insensitive matching means matches happens regardless if the search pattern or the haystack differs in case 😕 So it does seem that case of the pattern does matter.
It seems treat it as 'smartcase' vs not real 'case ignored'
…
-- shane.xb.qian
Case is not ignored. Although the algorithm assigns the same score for uppercase and lowercase matches (when the pattern is lowercase), I modified the sorting criteria to prefer exact lowercase matches. This is to preserve the old behavior.
In the example, foo is preferred over Foo -- case is not ignored when sorting list items. While the scoring treats uppercase and lowercase equally when the pattern is lowercase, the sorting step favors lowercase for exact matches (see fuzzy_match_item_compare()). I removed the “Case is ignored” note because it’s not strictly true, and explaining all the nuances would be overkill. The behavior is intuitive and works as users would expect.
:echo matchfuzzypos(['Foo', 'foo'], 'fo')
[['foo', 'Foo'], [[0, 1], [0, 1]], [1895, 1895]]
@chrisbra Perhaps we can remove/modify the following line and let users figure it out--since the behavior is intuitive?
The 'ignorecase' and 'smartcase' options do not apply, case is ignored if the
pattern is all lower case.
Or, say "The 'ignorecase' and 'smartcase' options do not apply".
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.
:echo matchfuzzypos(['Foo', 'foo'], 'fo')
Try the above with a pattern Fo. You'll see that foo does not match and so does not get any score. That's what I mean with case is ignored if the pattern is all lower case.
—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
:echo matchfuzzypos(['Foo', 'foo'], 'fo')
Try the above with a pattern
Fo. You'll see thatfoodoes not match and so does not get any score. That's what I mean withcase is ignored if the pattern is all lower case.
There are three phases: matching, scoring, and sorting.
Case is ignored during matching and scoring only when the pattern is entirely lowercase. This explains why we see identical scores.
However, (in Vim) case in the haystack is not ignored during sorting, even if the pattern is all lowercase--hence the reversed order of ['Foo', 'foo'].
In contrast, the sorting function in the fzy repository ignores case even during sorting, whereas Vim does not.
—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.