> Still wondering about the issue with different values for my empty
> string comparison, though. Seems like it must be a bug in either
> design or implementation of 'ignorecase'. I wonder whether with
> 'ignorecase' set the expression: 'abc' > '' returns 0 on 64-bit vim
> and 1 on 32-bit. . .
I guess it's time to post this to vim-dev as a bug, since we have managed to
deduce the conditions under which it is reproduced. I was able to reproduce it
on vim-7.3.198 (--with-features=huge --enable-perlinterp --enable-tclinterp
--enable-luainterp --enable-rubyinterp --enable-python3interp, revision
f0cc719cd129) and on vim-7.3.189 (USE='X acl bash-completion cscope gpm nls
perl python ruby vim-pager -debug -minimal') from the Gentoo repos on amd64.
Original message:
> On May 23, 8:12 am, hsitz <hes...@gmail.com> wrote:
> > > This is a reason why I never use `==', `!=', `>', `>=', `<', `<=' for
> > > comparing strings, only `is'/`isnot' (it looks better than `==#' and
> > > `!=#') and operators with either `?' or `#' at the end.
>
> Zyx -- I didn't even realize 'is'/'isnot' were defined for strings.
> However, it seems that they are equivalent to '==' and '!=' and not
> the matchcase operators you suggest. From the docs:
> "the original |List|. When using "is" without a |List| it is equivalent to
> using "equal", using "isnot" equivalent to using "not equal". Except that a
> different type means the values are different. "4 == '4'" is true, "4 is '4'"
> is false."
>
> E.g.,
>
> :set ignorecase
> :echo 'abc' is 'ABC' (output is 1)
> :echo 'abc' == 'ABC' (output is 1)
> :echo 'abc' ==# 'ABC' (output is 0)
>
> Your point about specifying matchcase or ignorecase expressly is a
> good one. I will be modifying my code to do that.
>
> -- Herb
>
> > Zyx -- Thanks very much, I think you're onto something.
> >
> > However on my machine the two expressions you give above both evaluate
> > to 1. What is the explanation for the difference?:
> > ------------------------------------------
> > :echo 'DONE' ># ''
> > 1
> >
> > :echo 'DONE' >? ''
> >
> > 1
> > ----------------------------------------
On Tue, May 24, 2011 at 18:14, Ivan Krasilnikov <inf...@gmail.com> wrote:
> I confirm the problem. Looks like there's a bug in UTF-8 handling in
> function mb_strnicmp() in mbyte.c, specifically in the following "if"
> which was introduced by patch 7.3.040:
>
>     /* Don't case-fold illegal bytes or truncated characters. */
>     if (utf_ptr2len(s1 + i) < l || utf_ptr2len(s2 + i) < l)
>         return -1;
>
> The check "utf_ptr2len(s2 + i) < l" is wrong.
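To see why an unconditional "return -1" produces the symptom Herb reported, here is a toy model in Python. It is not the code from mbyte.c: the function name broken_icmp() is made up, it handles ASCII only, and it assumes that the end of the shorter string is treated the same way as a truncated character. Under those assumptions both argument orders report "less than", so neither 'abc' > '' nor '' > 'abc' can be true.

```python
def broken_icmp(s1: bytes, s2: bytes) -> int:
    """Toy case-insensitive compare (ASCII only, hypothetical).

    Returns <0, 0 or >0 like strcmp(), except that it bails out with -1
    as soon as either side has no complete character at position i.
    That early exit is the flaw being illustrated.
    """
    for i in range(max(len(s1), len(s2))):
        c1 = s1[i:i + 1]
        c2 = s2[i:i + 1]
        if not c1 or not c2:
            # Assumption: "nothing left to read" is handled like a
            # truncated character, whichever string it happens in.
            return -1
        diff = ord(c1.lower()) - ord(c2.lower())
        if diff != 0:
            return diff
    return 0


# Both orders claim "first argument is smaller".
print(broken_icmp(b"abc", b""), broken_icmp(b"", b"abc"))
# Ordinary case-insensitive equality still works.
print(broken_icmp(b"abc", b"ABC"))
```

A comparison that is used for '>' and '<' needs cmp(a, b) and cmp(b, a) to have opposite signs; the early exit above breaks exactly that.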
$ python -c 'print " ".join(["0x%.2X" % n for n in range(65536) if
len(unichr(n).encode("utf8")) !=
len(unichr(n).lower().encode("utf8"))])'
0x130 0x23A 0x23E 0x1E9E 0x2126 0x212A 0x212B 0x2C62 0x2C64 0x2C6D 0x2C6E 0x2C6F
So I think the UTF-8 part of mb_strnicmp() needs to be completely rewritten.
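For anyone who wants to repeat the check with Python 3, where unichr() no longer exists, a rough port of the one-liner could look like this. The function name length_changing() is made up for the example. The list it prints will probably not match the one above exactly, because Python 3 ships newer Unicode data and str.lower() applies full case mappings, so treat the output as indicative only.

```python
def length_changing(limit=0x10000):
    """Code points below `limit` whose UTF-8 length changes under lower()."""
    found = []
    for n in range(limit):
        if 0xD800 <= n <= 0xDFFF:
            continue  # lone surrogates cannot be encoded as UTF-8
        ch = chr(n)
        if len(ch.encode("utf-8")) != len(ch.lower().encode("utf-8")):
            found.append(n)
    return found


print(" ".join("0x%.2X" % n for n in length_changing()))
```

The point is the same as before: a loop that advances both strings by the same byte count "l" cannot stay in step once two case-equivalent characters have different encoded lengths.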
Yes, and in Turkish (i.e. with ":lang ctype tr" and 'casemap' empty), I
and i (1 byte each) have as respective case-counterparts ı and İ (2
bytes each).
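The byte counts are easy to confirm in Python. This only measures the UTF-8 encodings of the four letters; it does not emulate ":lang ctype tr", and the pairing of the letters in the list is written out by hand as an illustration of the Turkish rules.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes `ch` occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# (upper, lower) pairs under Turkish casing rules, listed by hand.
turkish_pairs = [("I", "\u0131"), ("\u0130", "i")]

for upper, lower in turkish_pairs:
    print("U+%04X: %d byte(s)  <->  U+%04X: %d byte(s)"
          % (ord(upper), utf8_len(upper), ord(lower), utf8_len(lower)))
```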
Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
94. Now admit it... How many of you have made "modem noises" into
the phone just to see if it was possible? :-)
Hi, here's my patch for mbyte.c and a few testcases.
I've eliminated those return -1's by doing a bytewise comparison of
strings after the first corrupted character. This should make the
comparisons transitive at least.
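For readers without the attachment, the idea can be sketched in Python. This is not the patch itself: decode_one() and fold_compare() are hypothetical names, str.casefold() stands in for Vim's own folding, and the sketch only aims to show the fallback to a bytewise comparison once an undecodable sequence is met. Each string keeps its own index, so characters whose encoded length differs between cases do not throw the loop out of step.

```python
def decode_one(buf: bytes, pos: int):
    """Decode one UTF-8 character at buf[pos:].

    Returns (char, byte_length), or (None, 0) if the bytes at `pos`
    are illegal or truncated.
    """
    for n in range(1, 5):
        try:
            return buf[pos:pos + n].decode("utf-8"), n
        except UnicodeDecodeError:
            continue
    return None, 0


def fold_compare(s1: bytes, s2: bytes) -> int:
    """Case-insensitive compare of UTF-8 byte strings; returns -1, 0 or 1."""
    i = j = 0
    while i < len(s1) and j < len(s2):
        c1, n1 = decode_one(s1, i)
        c2, n2 = decode_one(s2, j)
        if c1 is None or c2 is None:
            # First corrupted character: compare the remaining bytes as-is.
            r1, r2 = s1[i:], s2[j:]
            return (r1 > r2) - (r1 < r2)
        f1, f2 = c1.casefold(), c2.casefold()
        if f1 != f2:
            return (f1 > f2) - (f1 < f2)
        i += n1
        j += n2
    # At least one string is exhausted; the one with bytes left is bigger.
    return (i < len(s1)) - (j < len(s2))


print(fold_compare(b"abc", b""), fold_compare(b"", b"abc"))
print(fold_compare("\u212a".encode("utf-8"), b"k"))
```

Swapping the arguments always flips the sign of the result here, which is the property the old "return -1" broke.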
> On Wed, May 25, 2011 at 14:09, Bram Moolenaar <Br...@moolenaar.net> wrote:
> > Yes, this code just returns -1, no matter if the first or second string
> > is bigger.
> >
> > Your other remark about difference in byte length of a character is
> > right, but it's not so easy to fix.  Can you suggest a patch?
> > Preferably with a test.
Thanks, I'll look into it soon.
--
hundred-and-one symptoms of being an internet addict:
113. You are asked about a bus schedule, you wonder if it is 16 or 32 bits.
/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
Had a bug in the patch - incorrectly checked for utf_ptr2char()'s
failure. Fixed patch and more tests in vimscript, suitable for
src/testdir/, are attached.