charidx() return value when the string length in bytes is passed as the index

Yegappan Lakshmanan

Jun 7, 2023, 2:03:45 AM
to vim_dev
Hi,

The byteidx() function returns the length of a string in bytes when the
specified character index is equal to the number of characters in the
string:

echo byteidx("abc", 3)
3

But the charidx() function returns -1 when the specified byte index is
equal to the number of bytes in the string:

echo charidx("abc", 3)
-1

When I implemented the support for charidx() in 8.2.2233, I didn't
handle this case properly. Should we change charidx() to return the
number of characters in the string in this case?

This will help in the LSP plugin where the language server specifies
the index after the last character in some cases (e.g. completion).
The LSP plugin currently checks for this case and then uses
strcharlen() to get the number of characters. This involves computing
the string length two times.
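
A minimal sketch of the kind of workaround described above, not the LSP
plugin's actual code. The function name ByteIdxToCharIdx is hypothetical,
and the sketch assumes a Vim that provides both charidx() and strcharlen():

```vim
" Hypothetical helper: convert a byte index to a character index.
" An index equal to the byte length of the string is treated as
" "just past the last character".
function! ByteIdxToCharIdx(str, byteidx) abort
  " First length computation: the byte length of the string.
  if a:byteidx == strlen(a:str)
    " Second length computation: the character count.
    return strcharlen(a:str)
  endif
  return charidx(a:str, a:byteidx)
endfunction
```

With this helper, ByteIdxToCharIdx("abc", 3) gives 3 instead of -1, at the
cost of the extra strlen() call on every conversion.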

Thanks,
Yegappan

Bram Moolenaar

Jun 7, 2023, 1:27:43 PM
to vim...@googlegroups.com, Yegappan Lakshmanan

Yegappan wrote:

> The byteidx() function returns the length of a string in bytes when the
> specified character index is equal to the number of characters in the
> string:
>
> echo byteidx("abc", 3)
> 3

When increasing the index we get:
echo byteidx("abc", 0) byteidx("abc", 1) byteidx("abc", 2) byteidx("abc", 3) byteidx("abc", 4)
0 1 2 3 -1

> But the charidx() function returns -1 when the specified byte index is
> equal to the number of bytes in the string:
>
> echo charidx("abc", 3)
> -1

echo charidx("abc", 0) charidx("abc", 1) charidx("abc", 2) charidx("abc", 3) charidx("abc", 4)
0 1 2 -1 -1

That is unexpected: for single-byte characters the byte index and
character index are the same, so charidx("abc", 3) should be 3.
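
One way to state the expectation is as a round trip: converting a
character index to a byte index with byteidx() and back with charidx()
should give the original index. A rough sketch of such a check follows;
the function name RoundTripMismatches is made up for illustration and
strcharlen() is assumed to be available:

```vim
" Hypothetical check: collect every character index i in
" 0 .. strcharlen(str) for which charidx(str, byteidx(str, i)) != i.
function! RoundTripMismatches(str) abort
  let l:bad = []
  for l:i in range(strcharlen(a:str) + 1)
    if charidx(a:str, byteidx(a:str, l:i)) != l:i
      call add(l:bad, l:i)
    endif
  endfor
  return l:bad
endfunction

echo RoundTripMismatches("abc")
```

With the results shown above this reports [3], the end-of-string index.
If the two functions were consistent, the list would be empty.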

> When I implemented the support for charidx() in 8.2.2233, I didn't
> handle this case properly. Should we change charidx() to return the
> number of characters in the string in this case?
>
> This will help in the LSP plugin where the language server specifies
> the index after the last character in some cases (e.g. completion).
> The LSP plugin currently checks for this case and then uses
> strcharlen() to get the number of characters. This involves computing
> the string length two times.

It would be good if we can make this consistent. There is a tiny
backwards compatibility problem, but does it matter? I can't think of a
situation where a plugin would rely on getting -1 instead of the number
of characters.

If we change this, your plugin would still need to handle using an older
Vim version. I suppose that's not much of a problem.
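
Handling both behaviors could look roughly like the sketch below. The
name CharIdxCompat is hypothetical, and strcharlen() is assumed to be
available in the older Vim as well:

```vim
" Hypothetical wrapper: gives the same result whether or not charidx()
" accepts the index just past the last byte.
function! CharIdxCompat(str, idx) abort
  let l:ci = charidx(a:str, a:idx)
  if l:ci == -1 && a:idx == strlen(a:str)
    " Older behavior: -1 for the end-of-string index, so fall back
    " to counting the characters.
    return strcharlen(a:str)
  endif
  return l:ci
endfunction
```

Calling charidx() first means the fallback only runs when charidx() has
already returned -1, so a Vim that handles the end index itself would not
pay for the extra length computations.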

--
hundred-and-one symptoms of being an internet addict:
126. You brag to all of your friends about your date Saturday night...but
you don't tell them it was only in a chat room.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// \\\
\\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///