[vim/vim] Thesaurus completion matches more than one line in thesaurus file (#4667)

runrin

Jul 13, 2019, 5:24:18 PM
to vim/vim, Subscribed

using the link provided in :h thesaurus i am encountering the error described when this issue was initially opened. i've chosen to open a new issue as the one linked is marked as enhancement, and i believe what i am encountering is unintended behavior.
[screenshot: notworking]

in order to diagnose the issue i created an extremely simple thesaurus file, and the completion seems to mostly work, but is still giving results that are on different lines:
[screenshot: working]

thinking there was a chance i was getting results from each line beginning with the prefix i was completing, regardless of whether it was part of a longer word, i edited my thesaurus file, but still got the same result:
[screenshot: stillno]

:h i_CTRL-X_CTRL-T states:

If a match is found in the thesaurus file, all the remaining words on the same line are included as matches, even though they don't complete the word.

:h thesaurus states:

Each line in the file should contain words with similar meaning, separated by non-keyword characters (white space is preferred). Maximum line length is 510 bytes.
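As a rough illustration of what "separated by non-keyword characters" means, here is a minimal Python sketch, not Vim's own code. It assumes the default Unix value of 'iskeyword' (`@,48-57,_,192-255`) and approximates it as ASCII letters, digits, underscore and the Latin-1 range 192-255; the function name `split_thesaurus_line` is made up for this example. Under that assumption, commas, semicolons and hyphens separate words just as white space does.

```python
import re

# Assumption: default 'iskeyword' on Unix ("@,48-57,_,192-255"), approximated
# as ASCII letters, digits, underscore and Latin-1 characters 192-255.
KEYWORD_RUN = re.compile(r"[A-Za-z0-9_\u00c0-\u00ff]+")


def split_thesaurus_line(line):
    """Return the runs of keyword characters found in one thesaurus line."""
    return KEYWORD_RUN.findall(line)


print(split_thesaurus_line("write compose, pen;draft"))
# ['write', 'compose', 'pen', 'draft']
print(split_thesaurus_line("well-known famous"))
# ['well', 'known', 'famous']
```

Note that in this model a hyphenated entry is split into two separate words.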

unless i am missing something, my thesaurus file is formatted correctly and i should only be getting matches on one line. if this is a simple formatting error on my part please let me know.

if that is the case then :h thesaurus does not provide enough information explaining how to properly format the thesaurus file. out of curiosity, if there are other "non-keyword characters" aside from white space, what are they?

Environment:
VIM - Vi IMproved 8.1 (2018 May 18, compiled Jun 15 2019 16:41:15) Included patches: 1-875, 878, 884, 948, 1046, 1365-1368, 1382, 1401
Debian GNU/Linux 10
Konsole


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub

Christian Brabandt

Jul 14, 2019, 8:05:40 AM
to vim/vim, Subscribed

please do not open the same issue twice. I am closing this as duplicate of #1611

Christian Brabandt

Jul 14, 2019, 8:05:40 AM
to vim/vim, Subscribed

Closed #4667.

Bram Moolenaar

Jul 14, 2019, 10:50:19 AM
to vim/vim, Subscribed

About the "write" vs "writer" match: I think it's correct that "writer" is included, as it is one of the possible words that one could insert. However, since it's a partial match the other words in this line should not be used, thus "creator" should be omitted.
However, it does work as documented, thus we would need to add some option to get the improved behavior. At the same time it would be good to support other file formats.
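The behaviour described above can be modelled with a short Python sketch. This is an illustration under stated assumptions, not Vim's implementation: `\w+` is a rough stand-in for Vim keyword characters, the thesaurus contents are hypothetical (not the file from the report), and the model includes every word on a matching line, which simplifies "all the remaining words on the same line". It shows how a prefix match on "writer" pulls in "creator" when completing "write".

```python
import re

WORD = re.compile(r"\w+")  # rough stand-in for Vim keyword characters


def loose_matches(base, lines):
    """Collect every word from each line that has any word starting with base."""
    found = []
    for line in lines:
        words = WORD.findall(line)
        if any(w.startswith(base) for w in words):
            for w in words:
                if w not in found:  # keep first occurrence, preserve order
                    found.append(w)
    return found


# Hypothetical thesaurus contents, chosen only to mirror the words in the thread.
THESAURUS = [
    "write compose pen",
    "author writer creator",
]

print(loose_matches("write", THESAURUS))
# ['write', 'compose', 'pen', 'author', 'writer', 'creator']
```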

runrin

Jul 14, 2019, 7:22:44 PM
to vim/vim, Subscribed

@brammool if the thesaurus intentionally includes all lines which contain any word that begins with our match, regardless of if it is an exact match, then that is my mistake. though if so, it is not clear from the help pages.

if that is intended behavior, then i believe this feature is flawed in a way that makes it unusable. when looking for synonyms for cat, it would not be correct to also include every word on lines containing, anywhere within it, a word beginning with those letters. words on the same lines as catholic and catharsis are not relevant matches for cat.

it would be a lot more useful to only include words on a line that begins with an exact match of our word.

also, any thesaurus will have duplicate words. if i want synonyms for hide, and am expecting results such as conceal, it wouldn't make sense to get results on the line beginning with the word skin, even though it may contain the word hide on its line. only matching words from lines beginning with our exact match will reduce the likelihood of such an occurrence, especially for words with less common uses (as in my example).

this could also make it easier to implement a feature in the future that would allow cycling through lines that begin with the same exact match. that would allow multiple lines for a word that can be used as different parts of speech or is a homonym.

@chrisbra as i said, the reason i opened a new issue was because #1611 is marked as enhancement, and from my understanding this is unintended behavior, not a request for a new feature. if you feel this issue should be closed, then the original issue should also be changed to no longer be marked as an enhancement.

if i am misunderstanding the documentation, and this is indeed intended behavior, please let me know why, rather than simply closing the issue. perhaps the help docs should be updated to clearly state how the feature works so that it doesn't continue to be reported as a bug.

Bram Moolenaar

Jul 17, 2019, 4:25:02 PM
to vim/vim, Subscribed

> @brammool if the thesaurus intentionally includes all lines which
> contain any word that begins with our match, regardless of if it is an
> exact match, then that is my mistake. though if so, it is not clear
> from the help pages.
>
> if that is intended behavior, then i believe this feature is flawed in
> a way that makes it unusable. when looking for synonyms for `cat`, it
> would not be correct to also include every word on lines containing,
> anywhere within it, a word beginning with those letters. words on the
> same lines as `catholic` and `catharsis` are not relevant matches for
> `cat`.
>
> it would be a lot more useful to only include words on a line that
> _begins_ with an _exact match_ of our word.
>
> also, any thesaurus will have duplicate words. if i want synonyms for
> `hide`, and am expecting results such as `conceal`, it wouldn't make
> sense to get results on the line beginning with the word `skin`, even
> though it may contain the word `hide` on its line. only matching words
> from lines beginning with our exact match will reduce the likelihood
> of such an occurrence, especially for words with less common uses (as
> in my example).
>
> this could also make it easier to implement a feature in the future
> that would allow cycling through lines that begin with the same exact
> match. that would allow multiple lines for a word that can be used as
> different parts of speech or is a homonym.
>
> @chrisbra as i said, the reason i opened a new issue was because #1611
> is marked as `enhancement`, and from my understanding this is
> unintended behavior, not a request for a new feature. if you feel this
> issue should be closed, then the original issue should also be changed
> to no longer be marked as an enhancement.
>
> if i am misunderstanding the documentation, and this is indeed
> intended behavior, please let me know why, rather than simply closing
> the issue. perhaps the help docs should be updated to clearly state
> how the feature works so that it doesn't continue to be reported as a
> bug.

The current implementation is aiming for:
- Finding as many matches as possible
- Working with a reasonably small thesaurus file, where duplicates are
avoided

Clearly you want something else: Only match with the first word in the
line and include the words after it in the list of matches. And if the
match is partial, only include the word itself, not what follows.
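A minimal Python sketch of the stricter rule just described, for illustration only: it is not code from Vim. The function name `strict_matches`, the use of `\w+` for word splitting and the thesaurus lines are all assumptions made for this example. Only the first word of each line is compared. An exact match contributes the whole line, and a partial match contributes only the first word itself.

```python
import re

WORD = re.compile(r"\w+")  # rough stand-in for Vim keyword characters


def strict_matches(base, lines):
    """Match base against the first word of each line only."""
    found = []
    for line in lines:
        words = WORD.findall(line)
        if not words or not words[0].startswith(base):
            continue
        # Exact match on the head word: take the whole line.
        # Partial match: take only the head word, not what follows.
        chosen = words if words[0] == base else words[:1]
        for w in chosen:
            if w not in found:
                found.append(w)
    return found


# Hypothetical thesaurus contents built from the words used in this thread.
THESAURUS = [
    "cat feline kitty",
    "catharsis release purging",
    "skin hide pelt",
    "hide conceal cover",
]

print(strict_matches("cat", THESAURUS))
# ['cat', 'feline', 'kitty', 'catharsis']
print(strict_matches("hide", THESAURUS))
# ['hide', 'conceal', 'cover']
```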

I think we can do this by keeping the one 'thesaurus' option but adding
flags for each file, something like:
set thesaurus=strict@/usr/share/thesaurus,loose@/usr/share/similarwords
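To make the proposed per-file flags concrete, here is a small Python sketch that splits such a value into (mode, path) pairs. The `strict@` and `loose@` prefixes are only the proposal above, not an existing Vim feature. The function name, the choice to treat unflagged entries as "loose", and ignoring backslash-escaped commas are assumptions of this sketch.

```python
def parse_thesaurus_option(value, default_mode="loose"):
    """Split a comma-separated option value into (mode, path) pairs.

    Assumption: entries without a recognised "mode@" prefix keep the
    default mode. Escaped commas in paths are not handled here.
    """
    entries = []
    for item in value.split(","):
        if not item:
            continue
        mode, sep, path = item.partition("@")
        if sep and mode in ("strict", "loose"):
            entries.append((mode, path))
        else:
            entries.append((default_mode, item))
    return entries


print(parse_thesaurus_option(
    "strict@/usr/share/thesaurus,loose@/usr/share/similarwords"))
# [('strict', '/usr/share/thesaurus'), ('loose', '/usr/share/similarwords')]
```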

--
How do you know when you have run out of invisible ink?

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

runrin

Jul 19, 2019, 12:26:54 PM
to vim/vim, Subscribed

thanks for the info. appreciate it. i was really just hoping to get a better understanding of how the feature works. i may look into hacking something in myself sometime in the future.

lacygoill

Nov 8, 2021, 2:29:26 PM
to vim/vim, Subscribed

@runrin: If you can upgrade Vim to get the patch 8.2.3520, then you might be interested in :help 'thesaurusfunc' and :help compl-thesaurusfunc.

