[win32][patch] executable() may fail on very long filename

Ken Takata

Feb 1, 2016, 8:29:25 AM
to vim_dev
Hi,

When 'enc' is utf-8, executable() may fail on a very long filename that is
longer than _MAX_PATH bytes in UTF-8 but shorter than _MAX_PATH characters in
UTF-16.
Here is an example on Japanese Windows:

C:\tmp>gvim -N -u NONE --cmd "set enc=utf-8" ああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああ.bat
:w
:echo glob('あ*.bat')
ああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああ.bat
:echo strlen(glob('あ*.bat'))
604 " longer than 260
:echo strchars(glob('あ*.bat'))
204 " shorter than 260
:echo executable(glob('あ*.bat'))
0 " 1 is expected.
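
The two counts above can be reproduced outside Vim. This is only a rough sketch in Python, not part of the patch; it assumes the filename is 200 copies of "あ" followed by ".bat" (which is what the 604/204 figures imply) and hard-codes MAX_PATH as 260, the value of _MAX_PATH on Windows.

```python
# Value of _MAX_PATH on Windows, hard-coded here for illustration.
MAX_PATH = 260

# Assumption: 200 copies of U+3042 plus ".bat", consistent with 604 and 204 above.
name = "あ" * 200 + ".bat"

# Bytes in UTF-8: this is what strlen() counts when 'enc' is utf-8.
utf8_bytes = len(name.encode("utf-8"))

# UTF-16 code units (WCHARs): this is the length the wide-character API sees.
utf16_units = len(name.encode("utf-16-le")) // 2

print(utf8_bytes, utf8_bytes > MAX_PATH)
print(utf16_units, utf16_units < MAX_PATH)
```

So a check against _MAX_PATH that counts UTF-8 bytes rejects a name that still fits when counted in WCHARs.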


Attached patch fixes the problem.

Regards,
Ken Takata

fix-mch_can_exe-with-long-filename.patch

Bram Moolenaar

Feb 1, 2016, 5:11:32 PM
to Ken Takata, vim_dev
Thanks!

--
hundred-and-one symptoms of being an internet addict:
100. The most exciting sporting events you noticed during summer 1996
was Netscape vs. Microsoft.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Ken Takata

Oct 7, 2018, 9:50:46 PM
to vim_dev
Hi,

2016/2/2 Tue 7:11:32 UTC+9 Bram Moolenaar wrote:
> Ken Takata wrote:
>
> > When 'enc' is utf-8, executable() may fail on a very long filename that is
> > longer than _MAX_PATH bytes in UTF-8 but shorter than _MAX_PATH characters in
> > UTF-16.
> > Here is an example on Japanese Windows:
> >
> > C:\tmp>gvim -N -u NONE --cmd "set enc=utf-8" ああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああ.bat
> > :w
> > :echo glob('あ*.bat')
> > ああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああああ.bat
> > :echo strlen(glob('あ*.bat'))
> > 604 " longer than 260
> > :echo strchars(glob('あ*.bat'))
> > 204 " shorter than 260
> > :echo executable(glob('あ*.bat'))
> > 0 " 1 is expected.
> >
> >
> > Attached patch fixes the problem.
>
> Thanks!

I have updated the patch for 8.1.0453:

* Fixed conflicts.
* Fixed a typo in a comment which was added in 8.1.0453.

Regards,
Ken Takata

fix-mch_can_exe-with-long-filename.patch

Tony Mechelynck

Oct 7, 2018, 10:32:39 PM
to vim_dev
On Mon, Oct 8, 2018 at 3:50 AM Ken Takata <ktakat...@gmail.com> wrote:
> [...]
> I have updated the patch for 8.1.0453:
>
> * Fixed conflicts.
> * Fixed a typo in a comment which was added in 8.1.0453.
>
> Regards,
> Ken Takata

In UTF-8, characters outside the BMP (i.e. characters in the range
U+10000 to U+10FFFD), including some "CJK Extension" characters in
plane 2, use 4 bytes each, not 3. However, in UTF-16le as used by
Windows, each of those non-BMP characters takes up 2 words (one high
surrogate and one low surrogate) instead of 1, so maybe (I don't know)
they might "count double" towards the allowed _MAX_PATH characters.

Best regards,
Tony.

Ken Takata

Oct 7, 2018, 10:40:07 PM
to vim_dev
Hi Tony,

Of course, the buffer size "_MAX_PATH * 3" takes those non-BMP characters
into account. A non-BMP character is stored as two WCHARs, which is 4 bytes
in UTF-16, and it is also 4 bytes when converted to UTF-8. That is well
within the 6 bytes allotted to two WCHARs, so the buffer size is correct.
No need to multiply by 4.
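
To make the non-BMP case concrete, here is a small Python sketch (not code from the patch). The helper name sizes() is made up for this illustration, and U+20000 is just one example of a plane 2 character.

```python
def sizes(ch):
    """Return (UTF-16 code units, UTF-8 bytes) for a single character."""
    units = len(ch.encode("utf-16-le")) // 2
    nbytes = len(ch.encode("utf-8"))
    return units, nbytes

# U+20000: a CJK Extension B character in plane 2, used only as a sample.
units, nbytes = sizes("\U00020000")

# A buffer sized at 3 bytes per WCHAR gives this character units * 3 bytes.
budget = units * 3

print(units, nbytes, budget)
assert nbytes <= budget  # the surrogate pair fits within its share of the buffer
```

The surrogate pair does count double on the UTF-16 side, but it also brings twice the byte budget, which is more than the 4 bytes it needs in UTF-8.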

Regards,
Ken Takata

Ken Takata

Oct 7, 2018, 10:45:56 PM
to vim_dev
Hi,

And a WCHAR from U+0800 to U+FFFF will be converted to a 3-byte UTF-8
sequence. So the size really does need to be multiplied by 3.
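
The factor of 3 can also be checked by brute force. The following Python sketch (again not part of the patch; worst_case() is a name invented here) walks every code point except the surrogates and records the largest number of UTF-8 bytes needed per UTF-16 code unit, together with the first code point that reaches it.

```python
def worst_case():
    """Return (max UTF-8 bytes per UTF-16 code unit, first code point reaching it)."""
    worst = 0.0
    first = None
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # lone surrogates cannot be encoded on their own
        ch = chr(cp)
        units = len(ch.encode("utf-16-le")) // 2
        nbytes = len(ch.encode("utf-8"))
        ratio = nbytes / units
        if ratio > worst:
            worst, first = ratio, cp
    return worst, first

worst, first = worst_case()
print(worst, hex(first))
```

Characters from U+0800 to U+FFFF set the worst case at 3 bytes per WCHAR, while non-BMP characters only need 2 bytes per WCHAR.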

Regards,
Ken Takata

ktakat...@gmail.com

Feb 19, 2019, 9:22:35 AM
to vim_dev
Hi,

I have updated the patch for the latest source code:

* Fixed conflicts.
* Use C++ style comments.

Regards,
Ken Takata

fix-mch_can_exe-with-long-filename.patch

ktakat...@gmail.com

Feb 20, 2019, 5:32:55 AM
to vim_dev
Hi,
I added a test for this and created a PR so that we can check if the test
passes on AppVeyor: https://github.com/vim/vim/pull/4015

Regards,
Ken Takata