[PATCH] Fix backwards search from multibyte character

Sung Pae

Dec 18, 2012, 7:06:55 AM
to vim...@googlegroups.com
Fix case where searching backwards from a multibyte character on the
same line results in a wrong cursor offset. Given the buffer:

0123❤

With the cursor on the 3-byte UTF-8 character ❤ (U+2764), calling the
command

:call search('.', 'b')

places the cursor on "1" instead of "3". This is due to erroneously
counting the length of the character as an extra offset, which is not
needed when searching backwards.
---
src/search.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/search.c b/src/search.c
index d7bfc43..8e058eb 100644
--- a/src/search.c
+++ b/src/search.c
@@ -572,7 +572,8 @@ searchit(win, buf, pos, dir, pat, count, options, pat_use, stop_lnum, tm)
         extra_col = 0;
 #ifdef FEAT_MBYTE
     /* Watch out for the "col" being MAXCOL - 2, used in a closed fold. */
-    else if (has_mbyte && pos->lnum >= 1 && pos->lnum <= buf->b_ml.ml_line_count
+    else if (dir != BACKWARD && has_mbyte
+            && pos->lnum >= 1 && pos->lnum <= buf->b_ml.ml_line_count
             && pos->col < MAXCOL - 2)
     {
         ptr = ml_get_buf(buf, pos->lnum, FALSE) + pos->col;
--
1.8.0.2
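
The offset arithmetic described in the patch above can be illustrated with a small stand-alone model. This is only a sketch, not Vim's code: utf8_len, last_match_before and backward_extra_col are hypothetical names, the acceptance rule "match column + extra_col <= cursor column" is an assumption about how the backward search filters candidates, and the patched case assumes the fall-through branch (not shown in the hunk) uses an offset of 1.

```c
#include <assert.h>
#include <stddef.h>

/* Byte length of the UTF-8 sequence that starts with lead byte c.
 * Hypothetical stand-in for what the patch calls via mb_ptr2len;
 * malformed lead bytes are treated as length 1. */
static int utf8_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Model of a backward search for '.' on a single line: return the byte
 * column of the last character start that satisfies
 * col + extra_col <= cursor_col, or -1 if there is none.
 * The acceptance rule is an assumption made for this sketch. */
static int last_match_before(const char *line, int cursor_col, int extra_col)
{
    int found = -1;
    int col = 0;

    assert(line != NULL && cursor_col >= 0 && extra_col >= 0);
    while (line[col] != '\0' && col + extra_col <= cursor_col) {
        found = col;
        col += utf8_len((unsigned char)line[col]);
    }
    return found;
}

/* Offset used for a backward search: the byte length of the character
 * under the cursor before the patch, and (assumed) 1 after it. */
static int backward_extra_col(const char *line, int cursor_col, int patched)
{
    return patched ? 1 : utf8_len((unsigned char)line[cursor_col]);
}
```

For the line "0123❤" with the cursor at byte column 4, the unpatched offset is 3, so the model accepts column 1 (the "1"); with an offset of 1 it accepts column 3 (the "3"), which matches the behaviour described in the commit message.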

Bram Moolenaar

Dec 18, 2012, 4:40:26 PM
to Sung Pae, vim...@googlegroups.com
Thanks for the patch!

If you can, it would be nice to have a test for this, so that it
doesn't break again when making other changes.

--
If Microsoft would build a car...
... You'd have to press the "Start" button to turn the engine off.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Sung Pae

Dec 18, 2012, 7:07:08 PM
to vim...@googlegroups.com, Sung Pae
On Tue, Dec 18, 2012 at 10:40:26PM +0100, Bram Moolenaar wrote:

> Thanks for the patch!
>
> If you can, it would be nice to have a test for this, so that it
> doesn't break again when making other changes.

Gladly! I added one test case to test44, since that deals with regexp
search of multi-byte characters.

Failing test transforms:

j 0123❤x

into

j 023❤

instead of

j 012❤

Cheers,
Sung Pae
---
src/testdir/test44.in | 4 ++++
src/testdir/test44.ok | 1 +
2 files changed, 5 insertions(+)

diff --git a/src/testdir/test44.in b/src/testdir/test44.in
index b8b8d4f..e486869 100644
--- a/src/testdir/test44.in
+++ b/src/testdir/test44.in
@@ -29,6 +29,9 @@ x/[\U1234abcd\u1234\uabcd]
x/\%d21879b
x/ [[=A=]]* [[=B=]]* [[=C=]]* [[=D=]]* [[=E=]]* [[=F=]]* [[=G=]]* [[=H=]]* [[=I=]]* [[=J=]]* [[=K=]]* [[=L=]]* [[=M=]]* [[=N=]]* [[=O=]]* [[=P=]]* [[=Q=]]* [[=R=]]* [[=S=]]* [[=T=]]* [[=U=]]* [[=V=]]* [[=W=]]* [[=X=]]* [[=Y=]]* [[=Z=]]*/e
x/ [[=a=]]* [[=b=]]* [[=c=]]* [[=d=]]* [[=e=]]* [[=f=]]* [[=g=]]* [[=h=]]* [[=i=]]* [[=j=]]* [[=k=]]* [[=l=]]* [[=m=]]* [[=n=]]* [[=o=]]* [[=p=]]* [[=q=]]* [[=r=]]* [[=s=]]* [[=t=]]* [[=u=]]* [[=v=]]* [[=w=]]* [[=x=]]* [[=y=]]* [[=z=]]*/e
+x:" Test backwards search from a multi-byte char
+/x
+x?.
x:?^1?,$w! test.out
:e! test.out
G:put =matchstr(\"אבגד\", \".\", 0, 2) " ב
@@ -57,3 +60,4 @@ f
g a啷bb
h AÀÁÂÃÄÅĀĂĄǍǞǠẢ BḂḆ CÇĆĈĊČ DĎĐḊḎḐ EÈÉÊËĒĔĖĘĚẺẼ FḞ GĜĞĠĢǤǦǴḠ HĤĦḢḦḨ IÌÍÎÏĨĪĬĮİǏỈ JĴ KĶǨḰḴ LĹĻĽĿŁḺ MḾṀ NÑŃŅŇṄṈ OÒÓÔÕÖØŌŎŐƠǑǪǬỎ PṔṖ Q RŔŖŘṘṞ SŚŜŞŠṠ TŢŤŦṪṮ UÙÚÛÜŨŪŬŮŰŲƯǓỦ VṼ WŴẀẂẄẆ XẊẌ YÝŶŸẎỲỶỸ ZŹŻŽƵẐẔ
i aàáâãäåāăąǎǟǡả bḃḇ cçćĉċč dďđḋḏḑ eèéêëēĕėęěẻẽ fḟ gĝğġģǥǧǵḡ hĥħḣḧḩẖ iìíîïĩīĭįǐỉ jĵǰ kķǩḱḵ lĺļľŀłḻ mḿṁ nñńņňʼnṅṉ oòóôõöøōŏőơǒǫǭỏ pṕṗ q rŕŗřṙṟ sśŝşšṡ tţťŧṫṯẗ uùúûüũūŭůűųưǔủ vṽ wŵẁẃẅẇẘ xẋẍ yýÿŷẏẙỳỷỹ zźżžƶẑẕ
+j 0123❤x
diff --git a/src/testdir/test44.ok b/src/testdir/test44.ok
index 2bd5bda..d98ac2e 100644
--- a/src/testdir/test44.ok
+++ b/src/testdir/test44.ok
@@ -16,6 +16,7 @@ f z
g abb
h AÀÁÂÃÄÅĀĂĄǍǞǠẢ BḂḆ CÇĆĈĊČ DĎĐḊḎḐ EÈÉÊËĒĔĖĘĚẺẼ FḞ GĜĞĠĢǤǦǴḠ HĤĦḢḦḨ IÌÍÎÏĨĪĬĮİǏỈ JĴ KĶǨḰḴ LĹĻĽĿŁḺ MḾṀ NÑŃŅŇṄṈ OÒÓÔÕÖØŌŎŐƠǑǪǬỎ PṔṖ Q RŔŖŘṘṞ SŚŜŞŠṠ TŢŤŦṪṮ UÙÚÛÜŨŪŬŮŰŲƯǓỦ VṼ WŴẀẂẄẆ XẊẌ YÝŶŸẎỲỶỸ ZŹŻŽƵẐ
i aàáâãäåāăąǎǟǡả bḃḇ cçćĉċč dďđḋḏḑ eèéêëēĕėęěẻẽ fḟ gĝğġģǥǧǵḡ hĥħḣḧḩẖ iìíîïĩīĭįǐỉ jĵǰ kķǩḱḵ lĺļľŀłḻ mḿṁ nñńņňʼnṅṉ oòóôõöøōŏőơǒǫǭỏ pṕṗ q rŕŗřṙṟ sśŝşšṡ tţťŧṫṯẗ uùúûüũūŭůűųưǔủ vṽ wŵẁẃẅẇẘ xẋẍ yýÿŷẏẙỳỷỹ zźżžƶẑ
+j 012❤
ב
בג
א
--
1.8.0.2
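
The two outcomes quoted in the message above can be reproduced with a rough model of the test's keystrokes (/x, x, ?., x) applied to the line "j 0123❤x". This is a sketch under assumptions, not the Vim test harness: all function names are hypothetical, the input is assumed to be valid UTF-8, the cursor is assumed to land on the last character after the trailing "x" is deleted, and the backward search is modelled as "last character whose column + offset <= cursor column".

```c
#include <assert.h>
#include <string.h>

/* Byte length of the UTF-8 sequence that starts with lead byte c
 * (hypothetical helper; malformed lead bytes count as length 1). */
static int utf8_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Delete, in place, the whole character starting at byte column col. */
static void delete_char(char *line, int col)
{
    int n = utf8_len((unsigned char)line[col]);

    memmove(line + col, line + col + n, strlen(line + col + n) + 1);
}

/* Byte column of the last character of a non-empty line. */
static int last_char_col(const char *line)
{
    int col = 0;
    int last = 0;

    while (line[col] != '\0') {
        last = col;
        col += utf8_len((unsigned char)line[col]);
    }
    return last;
}

/* Model of the keystrokes "/x", "x", "?.", "x" on one line.
 * use_char_len != 0 models the unpatched offset (length of the character
 * under the cursor); 0 models the assumed patched offset of 1. */
static void run_test_keys(char *line, int use_char_len)
{
    char *x = strchr(line, 'x');
    int cursor, extra_col, col, found = -1;

    assert(x != NULL);
    delete_char(line, (int)(x - line));      /* "/x" then "x" */
    cursor = last_char_col(line);            /* cursor ends on the last char */
    extra_col = use_char_len ? utf8_len((unsigned char)line[cursor]) : 1;

    /* "?.": last character start with col + extra_col <= cursor */
    for (col = 0; line[col] != '\0' && col + extra_col <= cursor;
         col += utf8_len((unsigned char)line[col]))
        found = col;

    if (found >= 0)
        delete_char(line, found);            /* final "x" */
}
```

Running the model with the character-length offset turns "j 0123❤x" into "j 023❤", and with an offset of 1 into "j 012❤", the same pair of results given for the failing and passing test.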

Bram Moolenaar

Dec 25, 2012, 7:56:45 PM
to Sung Pae, vim...@googlegroups.com

Sung Pae wrote:

> On Tue, Dec 18, 2012 at 10:40:26PM +0100, Bram Moolenaar wrote:
>
> > Thanks for the patch!
> >
> > If you can, it would be nice to have a test for this, so that it
> > doesn't break again when making other changes.
>
> Gladly! I added one test case to test44, since that deals with regexp
> search of multi-byte characters.
>
> Failing test transforms:
>
> j 0123❤x
>
> into
>
> j 023❤
>
> instead of
>
> j 012❤

Great, thanks.

--
Some of the well known MS-Windows errors:
EMEMORY Memory error caused by..., eh...
ELICENSE Your license has expired, give us more money!
EMOUSE Mouse moved, reinstall Windows
EILLEGAL Illegal error, you are not allowed to see this
EVIRUS Undetectable virus found