Dmitrij D. Czarkoff
Oct 14, 2016, 7:11:51 AM
to te...@openbsd.org, mar...@openbsd.org, ni...@openbsd.org, schw...@openbsd.org, t...@openbsd.org
Hi!
I've noticed that in ksh's vi mode, ranged operations are performed
without regard to the cursor's position within a UTF-8 byte sequence.
E.g.:
1. type "echo тест | hexdump -C"
2. leave insert mode
3. "0", "2E", "dh", Enter
4. you end up with "те" and 0x82 (the last byte of the letter under the
   cursor).
This happens because Endword() moves the cursor to the whitespace after
the word and then decrements the cursor position by 1, so that it points
to the last byte of the last letter. del_range() then removes the bytes
between the cursor position and the preceding UTF-8 start byte, which
include the start byte of the letter under the cursor.
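To make the failure concrete, here is a minimal standalone sketch of a
byte-range delete that ignores UTF-8 boundaries. It is not the ksh code:
is_cont_byte() and naive_del_range() are hypothetical stand-ins, and the
indices in the note below are chosen only to reproduce the result
described above, not taken from a ksh trace.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* True for UTF-8 continuation bytes, i.e. bytes of the form 10xxxxxx. */
static int
is_cont_byte(unsigned char c)
{
	return (c & 0xc0) == 0x80;
}

/*
 * Hypothetical stand-in for a boundary-unaware range delete:
 * remove bytes [a, b) from buf (len bytes long), return the new length.
 */
static size_t
naive_del_range(char *buf, size_t len, size_t a, size_t b)
{
	assert(a <= b && b <= len);
	memmove(buf + a, buf + b, len - b);
	return len - (b - a);
}
```

"тест" is the eight bytes d1 82 d0 b5 d1 81 d1 82. Calling
naive_del_range(buf, 8, 4, 7), with b sitting on the final continuation
byte, leaves d1 82 d0 b5 82: "те" followed by an orphaned 0x82.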
My diff makes del_range() (and yank_range(), which operates in the same
manner) always skip back to the beginning of the UTF-8 sequence it is in.
This is more of a band-aid: the proper fix would be to make sure that
the cursor never rests on a continuation byte. But it is less invasive
and does not hurt code readability too much.
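The idea of snapping both ends of the range back to a start byte can be
sketched outside ksh as follows. This is an illustration under
assumptions, not the patched vi.c: is_u8cont(), u8_snap_start() and
safe_del_range() are hypothetical names, and the bounds checks
(pos > 0, pos < len) are additions that the diff below does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* True for UTF-8 continuation bytes, i.e. bytes of the form 10xxxxxx. */
static int
is_u8cont(unsigned char c)
{
	return (c & 0xc0) == 0x80;
}

/* Move pos back until it no longer points at a continuation byte. */
static size_t
u8_snap_start(const char *buf, size_t len, size_t pos)
{
	if (pos >= len)
		return pos;	/* end of line is already a boundary */
	while (pos > 0 && is_u8cont((unsigned char)buf[pos]))
		pos--;
	return pos;
}

/* Remove bytes [a, b) after aligning both ends to character starts. */
static size_t
safe_del_range(char *buf, size_t len, size_t a, size_t b)
{
	assert(a <= b && b <= len);
	a = u8_snap_start(buf, len, a);
	b = u8_snap_start(buf, len, b);
	memmove(buf + a, buf + b, len - b);
	return len - (b - a);
}
```

On the bytes of "тест" (d1 82 d0 b5 d1 81 d1 82), safe_del_range(buf, 8,
4, 7) moves b from 7 back to 6, removes only "с", and leaves the valid
sequence d1 82 d0 b5 d1 82 ("тет").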
Comments? OKs?
--
Dmitrij D. Czarkoff
Index: vi.c
===================================================================
RCS file: /var/cvs/src/bin/ksh/vi.c,v
retrieving revision 1.40
diff -u -p -r1.40 vi.c
--- vi.c 11 Oct 2016 19:52:54 -0000 1.40
+++ vi.c 14 Oct 2016 10:47:25 -0000
@@ -1323,6 +1323,10 @@ redo_insert(int count)
 static void
 yank_range(int a, int b)
 {
+	while (isu8cont((unsigned char)es->cbuf[a]))
+		a--;
+	while (isu8cont((unsigned char)es->cbuf[b]))
+		b--;
 	yanklen = b - a;
 	if (yanklen != 0)
 		memmove(ybuf, &es->cbuf[a], yanklen);
@@ -1493,6 +1497,10 @@ putbuf(const char *buf, int len, int rep
 static void
 del_range(int a, int b)
 {
+	while (isu8cont((unsigned char)es->cbuf[a]))
+		a--;
+	while (isu8cont((unsigned char)es->cbuf[b]))
+		b--;
 	if (es->linelen != b)
 		memmove(&es->cbuf[a], &es->cbuf[b], es->linelen - b);
 	es->linelen -= b - a;