Thanks for working on this. My main concern is that users will be
confused by seeing three functions:
	charidx({string}, {idx} [, {countcc} [, {utf16}]])
	charidx_addcc({string}, {idx})
	charidx_dropcc({string}, {idx})
Only when reading the details do we find out that the {idx} of
charidx() is a byte index, while for the other two it is a character
index. Changing the argument name to {byteidx} would help. We may have
to do that for other functions as well, for consistency.
Having the {countcc} argument for charidx() but separate function names
for the other two is confusing. "addcc" and "dropcc" can also be read
as an alternative to {countcc} (which is not really incorrect), yet
nothing hints that the {idx} argument is used differently.
Alternatively there could be a single function that does have the
{countcc} argument, with a name indicating that {idx} is a character
index:
	charidx_XXX({string}, {idx}, {countcc})
However, is this {countcc} argument really doing the same thing? The
help for charidx() says:
	When {countcc} is omitted or |FALSE|, then composing characters
	are not counted separately, their byte length is added to the
	preceding base character.
	When {countcc} is |TRUE|, then composing characters are
	counted as separate characters.
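
To make the quoted help text concrete, here is a rough Python model of
that behaviour. It is only a sketch, not Vim's implementation: the name
charidx_sketch is made up, UTF-8 encoding is assumed, "composing
character" is approximated with unicodedata.combining(), and edge cases
such as an index at the very end of the string are not modelled.

```python
import unicodedata


def charidx_sketch(s, byteidx, countcc=False):
    """Map a UTF-8 byte index in s to a character index (hypothetical model)."""
    if byteidx < 0:
        return -1
    pos = 0    # byte offset where the current code point starts
    idx = -1   # character index of the current code point
    for ch in s:
        # Assumption: a non-zero combining class stands in for "composing".
        composing = unicodedata.combining(ch) != 0
        # A composing character only gets its own index when countcc is set
        # (or when it has no base character before it).
        if countcc or not composing or idx < 0:
            idx += 1
        size = len(ch.encode("utf-8"))
        if byteidx < pos + size:
            return idx
        pos += size
    return -1  # byte index is past the end of the string


s = "a\u0301b"  # 'a', combining acute accent (2 bytes), 'b'
print(charidx_sketch(s, 3))        # byte 3 is 'b'; accent folded into 'a'
print(charidx_sketch(s, 3, True))  # accent counted as its own character
```

Note that the input is a byte index in both calls; only the meaning of
the result changes with {countcc}.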
We can't use exactly the same text for charidx_XXX(), since the index is
not in bytes. And when using a character index, we would have to mention
whether composing characters are counted separately. This gets
confusing: an argument {countcc} that means something different
depending on whether you look at the input or the result.
It's probably better to use two separate functions. I hope we find
better names though.
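
For the discussion, a small Python sketch of what two separate
conversions between character indexes could look like. This assumes the
two proposed functions convert between an index that ignores composing
characters and one that counts them separately; the message does not
spell that out, and the names index_add_cc and index_drop_cc are
placeholders, not proposals. Composing characters are again approximated
with unicodedata.combining().

```python
import unicodedata


def _is_composing(ch):
    # Assumption: a non-zero combining class stands in for "composing".
    return unicodedata.combining(ch) != 0


def index_add_cc(s, idx):
    """Base-character index -> index that counts composing characters."""
    base = -1
    for i, ch in enumerate(s):
        # A leading composing character has no base, so it counts on its own.
        if not _is_composing(ch) or i == 0:
            base += 1
        if base == idx:
            return i
    return -1  # no such base character


def index_drop_cc(s, idx):
    """Index that counts composing characters -> base-character index."""
    if idx < 0 or idx >= len(s):
        return -1
    base = -1
    for i, ch in enumerate(s):
        if not _is_composing(ch) or i == 0:
            base += 1
        if i == idx:
            return base
    return -1


s = "a\u0301b\u0301c"  # 'a' + accent, 'b' + accent, 'c'
print(index_add_cc(s, 2))   # position of 'c' when accents are counted
print(index_drop_cc(s, 4))  # and back again
```

In both directions the input and the result are character indexes, which
is the part the function names and help text need to make obvious.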
The help for the new functions should be extra clear, since it's easy to
misunderstand. We can discuss that on the PR.