SCI_GETCOLUMN/SCI_FINDCOLUMN does not take character representations into account


Robin Haberkorn

Mar 8, 2026, 5:47:07 PM
to scintilla...@googlegroups.com
Dear Scintilla hackers,

I notice that SCI_GETCOLUMN and SCI_FINDCOLUMN do not take the length of
character representations into account. A character with a representation is
always counted as one column, even if its representation is several characters
wide. Looking at Document::GetColumn(), it is clear that character
representations are currently not expanded.

Considering that these messages do take the display width of tabs into
account, IMHO it would make sense for them to take the width of
representations into account as well. Imagine a document with many character
representations and trying something like "put the cursor into the same
column, just 10 lines earlier".
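
To make the difference concrete, here is a minimal sketch of a column count that expands both tabs and representations. It is not Scintilla's Document::GetColumn(). The names DisplayColumn and Representations are hypothetical, the table maps single bytes only, and a representation is assumed to occupy exactly as many columns as it has characters.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Hypothetical stand-in for a representation table: maps one byte to the
// text drawn in its place (for example '\x01' drawn as "SOH").
using Representations = std::map<char, std::string>;

// 0-based display column of byte offset `pos` within `line`.
// Tabs advance to the next multiple of tabWidth; a byte that has a
// representation counts as the length of that representation.
// Assumes a single-byte encoding to keep the sketch short.
std::size_t DisplayColumn(std::string_view line, std::size_t pos,
                          std::size_t tabWidth, const Representations &reprs) {
    assert(tabWidth > 0);
    std::size_t column = 0;
    for (std::size_t i = 0; i < pos && i < line.size(); ++i) {
        const char ch = line[i];
        if (const auto it = reprs.find(ch); it != reprs.end()) {
            column += it->second.size();  // width of the representation
        } else if (ch == '\t') {
            column = (column / tabWidth + 1) * tabWidth;  // next tab stop
        } else {
            ++column;
        }
    }
    return column;
}
```

With an empty table this reduces to plain tab expansion, which is the behaviour described above for the existing messages.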

If Neil agrees that this is a bug/shortcoming, I could look into this and
prepare a patch.

Best regards,
Robin

Neil Hodgson

Mar 10, 2026, 5:02:31 PM
to scintilla...@googlegroups.com
Robin:

> I notice that SCI_GETCOLUMN and SCI_FINDCOLUMN do not take the length of
> character representations into account.

Yes. One of the roles of these APIs is to operate with other pieces of
software that use column numbers. Some tools, for example, report
warning locations in terms of line and column. These APIs allow
automatically selecting the warning location or displaying a column
number UI element that will match the diagnostics.

Representations are more of an appearance (or cosmetic) feature so do
not influence these APIs.

Neil

Robin Haberkorn

Mar 10, 2026, 5:59:51 PM
to scintilla...@googlegroups.com
On Tue Mar 10, 2026 at 22:02:14 GMT +01, Neil Hodgson wrote:
> Yes. One of the roles of these APIs is to operate with other pieces of
> software that use column numbers. Some tools, for example, report
> warning locations in terms of line and column. These APIs allow
> automatically selecting the warning location or displaying a column
> number UI element that will match the diagnostics.
>

I would suspect that most tools will count only characters if not bytes
since a) they don't know the tab size and b) they don't always know the document's
encoding.

But apparently, there is a lot of variation out there.
GCC expands tabs and counts Unicode glyphs:

hello.c: In function 'main':
hello.c:5:10: error: expected expression before ';' token
    5 |         +;
      |          ^

hello.c: In function 'main':
hello.c:5:15: error: expected expression before ';' token
    5 |         /*Ö*/+;
      |               ^

This would be compatible with SCI_FINDCOLUMN.
Clang apparently counts plain bytes:

hello.c:5:3: error: expected expression
    5 |         +;
      |          ^

hello.c:5:9: error: expected expression
    5 |         /*Ö*/+;
      |               ^
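
The two conventions can be reproduced with a short sketch. It assumes that each source line starts with a single tab and that the file is UTF-8 with a tab width of 8. Both assumptions are consistent with the reported columns but are not visible in the quoted output. ByteColumn and ExpandedColumn are hypothetical names, not functions of either compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// 1-based column counted in bytes: every byte, including a tab, is one column.
std::size_t ByteColumn(std::string_view line, std::size_t pos) {
    return std::min(pos, line.size()) + 1;
}

// 1-based column with tabs expanded to the next tab stop and each UTF-8
// code point counted as a single column.
std::size_t ExpandedColumn(std::string_view line, std::size_t pos,
                           std::size_t tabWidth = 8) {
    assert(tabWidth > 0);
    std::size_t column = 0;
    for (std::size_t i = 0; i < pos && i < line.size(); ++i) {
        const unsigned char ch = static_cast<unsigned char>(line[i]);
        if (ch == '\t') {
            column = (column / tabWidth + 1) * tabWidth;  // next tab stop
        } else if ((ch & 0xC0) != 0x80) {
            ++column;  // count lead bytes only, skip UTF-8 continuation bytes
        }
    }
    return column + 1;
}
```

For the line "\t/*Ö*/+;" the ';' sits at byte offset 8, so ByteColumn gives 9 and ExpandedColumn gives 15. For "\t+;" the ';' is at byte offset 2, giving 3 and 10. These match the four columns reported above.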

Anyway, I will just write my own versions of SCI_GETCOLUMN/SCI_FINDCOLUMN
then.
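
A hand-rolled counterpart to SCI_FINDCOLUMN that also counts representations could look roughly like the sketch below. This is an illustration only, not the code actually written for this purpose and not Scintilla's implementation. FindDisplayColumn and Representations are hypothetical names, the table covers single bytes only, and a representation is assumed to be as wide as its character count.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Hypothetical representation table: byte -> text drawn in its place.
using Representations = std::map<char, std::string>;

// Byte offset in `line` of the character that covers 0-based display
// column `target`. If the target falls inside a tab or a representation,
// the offset of that character is returned; past the end of the line,
// line.size() is returned.
std::size_t FindDisplayColumn(std::string_view line, std::size_t target,
                              std::size_t tabWidth,
                              const Representations &reprs) {
    assert(tabWidth > 0);
    std::size_t column = 0;
    std::size_t i = 0;
    while (i < line.size() && column < target) {
        const char ch = line[i];
        std::size_t next = column + 1;
        if (const auto it = reprs.find(ch); it != reprs.end()) {
            next = column + it->second.size();  // width of the representation
        } else if (ch == '\t') {
            next = (column / tabWidth + 1) * tabWidth;  // next tab stop
        }
        if (next > target) {
            return i;  // target lies inside this wide character
        }
        column = next;
        ++i;
    }
    return i;
}
```

With '\x01' drawn as "SOH", column 4 of the bytes 'a', '\x01', 'b', 'c' maps to offset 2 (the 'b'), whereas without the table it maps to offset 4, the end of the line.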

Regards,
Robin