Patch 7.4.835

67 views
Skip to first unread message

Bram Moolenaar

unread,
Aug 25, 2015, 10:32:01 AM8/25/15
to vim...@googlegroups.com

Patch 7.4.835
Problem: Comparing utf-8 sequences does not handle different byte sizes
correctly.
Solution: Get the byte size of each character. (Dominique Pelle)
Files: src/misc2.c


*** ../vim-7.4.834/src/misc2.c 2015-07-17 13:22:43.153523709 +0200
--- src/misc2.c 2015-08-25 16:27:31.565982817 +0200
***************
*** 5058,5064 ****
char_u *s1;
char_u *s2;
{
! int i;
int prev1 = NUL;
int prev2 = NUL;

--- 5058,5064 ----
char_u *s1;
char_u *s2;
{
! int i, j;
int prev1 = NUL;
int prev2 = NUL;

***************
*** 5068,5086 ****
if (s1 == NULL || s2 == NULL)
return FALSE;

! if (STRLEN(s1) != STRLEN(s2))
! return FAIL;
!
! for (i = 0; s1[i] != NUL && s2[i] != NUL; i += MB_PTR2LEN(s1 + i))
{
int c1 = PTR2CHAR(s1 + i);
! int c2 = PTR2CHAR(s2 + i);

if ((p_fic ? MB_TOLOWER(c1) != MB_TOLOWER(c2) : c1 != c2)
&& (prev1 != '*' || prev2 != '*'))
return FAIL;
prev2 = prev1;
prev1 = c1;
}
return TRUE;
}
--- 5068,5086 ----
if (s1 == NULL || s2 == NULL)
return FALSE;

! for (i = 0, j = 0; s1[i] != NUL;)
{
int c1 = PTR2CHAR(s1 + i);
! int c2 = PTR2CHAR(s2 + j);

if ((p_fic ? MB_TOLOWER(c1) != MB_TOLOWER(c2) : c1 != c2)
&& (prev1 != '*' || prev2 != '*'))
return FAIL;
prev2 = prev1;
prev1 = c1;
+
+ i += MB_PTR2LEN(s1 + i);
+ j += MB_PTR2LEN(s2 + j);
}
return TRUE;
}
***************
*** 5814,5827 ****
const char *p, *q;
int maxlen;
{
! int i;
int c1, c2;
const char *s = NULL;

! for (i = 0; maxlen < 0 || i < maxlen; i += MB_PTR2LEN((char_u *)p + i))
{
c1 = PTR2CHAR((char_u *)p + i);
! c2 = PTR2CHAR((char_u *)q + i);

/* End of "p": check if "q" also ends or just has a slash. */
if (c1 == NUL)
--- 5814,5827 ----
const char *p, *q;
int maxlen;
{
! int i, j;
int c1, c2;
const char *s = NULL;

! for (i = 0, j = 0; maxlen < 0 || (i < maxlen && j < maxlen);)
{
c1 = PTR2CHAR((char_u *)p + i);
! c2 = PTR2CHAR((char_u *)q + j);

/* End of "p": check if "q" also ends or just has a slash. */
if (c1 == NUL)
***************
*** 5829,5834 ****
--- 5829,5835 ----
if (c2 == NUL) /* full match */
return 0;
s = q;
+ i = j;
break;
}

***************
*** 5854,5861 ****
return p_fic ? MB_TOUPPER(c1) - MB_TOUPPER(c2)
: c1 - c2; /* no match */
}
}
! if (s == NULL) /* "i" ran into "maxlen" */
return 0;

c1 = PTR2CHAR((char_u *)s + i);
--- 5855,5865 ----
return p_fic ? MB_TOUPPER(c1) - MB_TOUPPER(c2)
: c1 - c2; /* no match */
}
+
+ i += MB_PTR2LEN((char_u *)p + i);
+ j += MB_PTR2LEN((char_u *)q + j);
}
! if (s == NULL) /* "i" or "j" ran into "maxlen" */
return 0;

c1 = PTR2CHAR((char_u *)s + i);
*** ../vim-7.4.834/src/version.c 2015-08-25 16:19:01.587296525 +0200
--- src/version.c 2015-08-25 16:22:58.444828674 +0200
***************
*** 743,744 ****
--- 743,746 ----
{ /* Add new patch number below this line */
+ /**/
+ 835,
/**/

--
If an elephant is left tied to a parking meter, the parking fee has to be paid
just as it would for a vehicle.
[real standing law in Florida, United States of America]

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Christ van Willegen

unread,
Aug 25, 2015, 2:46:22 PM8/25/15
to vim...@googlegroups.com
Was the test for s2[j] != NUL left out intentionally, or does another
code path catch that one?

Christ van Willegen

Bram Moolenaar

unread,
Aug 25, 2015, 3:30:32 PM8/25/15
to Christ van Willegen, vim...@googlegroups.com
There is a test for s1[i] != NUL. If s2[j] then is NUL the characters
differ, c1 and c2 differ, and it gets to "return FAIL".

--
Biting someone with your natural teeth is "simple assault," while biting
someone with your false teeth is "aggravated assault."
[real standing law in Louisana, United States of America]

Dominique Pellé

unread,
Aug 25, 2015, 4:31:54 PM8/25/15
to vim_dev
I assume that you're talking about line misc2.c:5071:

!5071 for (i = 0, j = 0; s1[i] != NUL;)
5072 {
5073 int c1 = PTR2CHAR(s1 + i);
5074 int c2 = PTR2CHAR(s2 + j);
5075
5076 if ((p_fic ? MB_TOLOWER(c1) != MB_TOLOWER(c2) : c1 != c2)
5077 && (prev1 != '*' || prev2 != '*'))
5078 return FAIL;
5079 prev2 = prev1;
5080 prev1 = c1;
5081
5082 i += MB_PTR2LEN(s1 + i);
5083 j += MB_PTR2LEN(s2 + j);
5084 }
5085 return TRUE;

At first I thought that testing for s2[j] != NUL was useless
at line 5071, since if s2[j] is NUL, then the test at line 5076
would be false and so function would return at line 5078.

But I now see 2 reasons why that may not be true:

- if s2 ends with "**" then (prev1 != '*' || prev2 != '*')
at line 5077 will be false and the loop will access beyond
of string for s2! (bug!)

- or if the is s1[i] contains an invalid utf8 sequence
such as: 0xc0 0x80 for which PTR2CHAR(...) is 0.
and s2[j] is NUL, then c1 and c2 will be equal and
the loop will continue, hence also accessing beyond
end of string s2 (bug!).

So it's buggy :-(

It's also odd that function returns TRUE, FALSE or FAIL.
That was not introduced by patch 7.4.835.
The return FAIL should be return FALSE at line 5078.

How about following patch?

Regards
Dominique
fix-ff_wc_equal-7.4.837.patch

Bram Moolenaar

unread,
Aug 27, 2015, 4:25:34 PM8/27/15
to Dominique Pellé, vim_dev
Thanks!

--
Zen Microsystems: we're the om in .commmmmmmmm
Reply all
Reply to author
Forward
0 new messages