When 'enc' is utf-8, executable() may fail on a very long filename that is
longer than _MAX_PATH bytes in UTF-8 but shorter than _MAX_PATH characters in
UTF-16.
Here is an example on Japanese Windows:
C:\tmp>gvim -N -u NONE --cmd "set enc=utf-8" ああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああ.bat
:w
:echo glob('あ*.bat')
ああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああ.bat
:echo strlen(glob('あ*.bat'))
604 " longer than 260
:echo strchars(glob('あ*.bat'))
204 " shorter than 260
:echo executable(glob('あ*.bat'))
0 " 1 is expected.
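For reference, here is a minimal C sketch of why the two lengths above differ. It is only an illustration, not code from the patch: the helper names make_name() and utf16_units() are hypothetical, and the input is assumed to be valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Vim: count the UTF-16 code units
 * (WCHARs) needed for a NUL-terminated UTF-8 string.
 * Assumes the input is valid UTF-8. */
size_t utf16_units(const char *s)
{
    size_t units = 0;
    const unsigned char *p;

    for (p = (const unsigned char *)s; *p != 0; p++) {
        if ((*p & 0xC0) == 0x80)
            continue;                   /* continuation byte, already counted */
        units += (*p >= 0xF0) ? 2 : 1;  /* 4-byte lead byte => surrogate pair */
    }
    return units;
}

/* Hypothetical helper: fill buf with n copies of U+3042 followed by ".bat".
 * Returns 0 on success, -1 if buf is too small. */
int make_name(char *buf, size_t size, size_t n)
{
    static const char a[] = "\xE3\x81\x82";  /* UTF-8 encoding of U+3042 */
    size_t i;

    if (size < n * 3 + 5)
        return -1;
    for (i = 0; i < n; i++)
        memcpy(buf + i * 3, a, 3);
    memcpy(buf + n * 3, ".bat", 5);          /* copies the trailing NUL too */
    return 0;
}
```

With n = 200, strlen() on the result gives 604 (the byte count that strlen() reports in Vim) while utf16_units() gives 204, so only the UTF-8 form exceeds 260.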
Attached patch fixes the problem.
Regards,
Ken Takata
2016/2/2 Tue 7:11:32 UTC+9 Bram Moolenaar wrote:
> Ken Takata wrote:
>
> > When 'enc' is utf-8, executable() may fail on very long filename which is
> > longer than _MAX_PATH bytes in UTF-8 and shorter than _MAX_PATH character in
> > UTF-16.
> >
> > Attached patch fixes the problem.
>
> Thanks!
I have updated the patch for 8.1.0453:
* Fixed conflicts.
* Fixed a typo in a comment which was added in 8.1.0453.
Regards,
Ken Takata
Of course, the buffer size "_MAX_PATH * 3" takes those non-BMP characters
into account. A non-BMP character is stored as two WCHARs, which is 4 bytes
in UTF-16, and it is also 4 bytes when converted to UTF-8. So the buffer
size is correct; there is no need to multiply by 4.
Regards,
Ken Takata
And a WCHAR from U+0800 to U+FFFF is converted to a 3-byte UTF-8
sequence, so the size really does need to be multiplied by 3.
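The arithmetic can be checked with a small C sketch. This is only an illustration under stated assumptions, not the conversion code Vim uses: utf8_bytes_for_utf16() is a hypothetical name, and an unpaired surrogate is assumed to cost 3 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Vim: number of UTF-8 bytes needed to
 * encode n UTF-16 code units. An unpaired surrogate is assumed to take
 * 3 bytes, like any other code unit in U+0800..U+FFFF. */
size_t utf8_bytes_for_utf16(const uint16_t *w, size_t n)
{
    size_t bytes = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        uint16_t c = w[i];

        if (c < 0x80) {
            bytes += 1;                 /* U+0000..U+007F */
        } else if (c < 0x800) {
            bytes += 2;                 /* U+0080..U+07FF */
        } else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < n
                   && w[i + 1] >= 0xDC00 && w[i + 1] <= 0xDFFF) {
            bytes += 4;                 /* surrogate pair: 2 bytes per WCHAR */
            i++;                        /* skip the low surrogate */
        } else {
            bytes += 3;                 /* U+0800..U+FFFF: the worst case */
        }
    }
    return bytes;
}
```

No branch adds more than 3 bytes per WCHAR consumed, so a path of at most _MAX_PATH WCHARs fits in _MAX_PATH * 3 bytes.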
Regards,
Ken Takata