Can't pass arguments into external command when enc=utf-8

mattn

Aug 5, 2013, 3:05:16 AM
to vim...@googlegroups.com
Hi Bram.

When enc=utf-8, Vim passes command arguments to external commands as UTF-8 strings on Windows. As a result, :grep doesn't work with multibyte strings. If the ACP (ANSI code page) is CP932 and you type ":!echo XXX" with XXX in UTF-8, Vim calls:

cmd /c (echo XXX)

If XXX is a 3-byte UTF-8 string, the Windows command prompt treats the argument as "XXX)".
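
A rough way to see the effect is to scan the UTF-8 bytes the way a CP932-aware parser might. The sketch below is only a simplified model and not cmd.exe's actual parser: it assumes that every CP932 lead byte (0x81-0x9F or 0xE0-0xFC) is paired with whatever byte follows it. The function names and the sample character standing in for XXX are hypothetical.

```python
from typing import List


def is_cp932_lead_byte(b: int) -> bool:
    """Return True if b falls in a CP932 lead-byte range."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def split_dbcs(data: bytes) -> List[bytes]:
    """Split bytes into units, pairing each lead byte with the next byte.

    Assumption: the parser does not validate the trail byte.
    """
    units = []
    i = 0
    while i < len(data):
        if is_cp932_lead_byte(data[i]) and i + 1 < len(data):
            units.append(data[i:i + 2])  # lead byte + following byte
            i += 2
        else:
            units.append(data[i:i + 1])  # single-byte unit
            i += 1
    return units


# "あ" is E3 81 82 in UTF-8, a 3-byte sequence standing in for XXX.
command = "(echo あ)".encode("utf-8")
units = split_dbcs(command)

# The third UTF-8 byte (0x82) is taken as a lead byte and pairs with ")".
print(units[-1])  # b'\x82)'
```

Under this model the closing parenthesis never appears as a unit of its own, which matches the "XXX)" symptom.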

https://gist.github.com/mattn/6153332

Please check.

Thanks.
- Yasuhiro Matsumoto

Tony Mechelynck

Aug 5, 2013, 6:05:39 AM
to vim...@googlegroups.com
1. What is 'encoding' set to if you start Vim with no vimrc?

vim -u NORC -N


2. When running normally, what is 'termencoding' set to? If it is the
empty string (which is the default) try using the code snippet from
http://vim.wikia.com/wiki/Working_with_Unicode and see if you get a
better result.


I'm assuming you're talking about Console Vim (vim.exe, if on Windows)
and not GUI Vim (aka gvim).


Best regards,
Tony.
--
Renning's Maxim:
Man is the highest animal. Man does the classifying.

mattn

Aug 5, 2013, 6:59:01 AM
to vim...@googlegroups.com
On Monday, August 5, 2013 7:05:39 PM UTC+9, Tony Mechelynck wrote:
<snip>

> 1. What is 'encoding' set to if you start Vim with no vimrc?
>
> vim -u NORC -N

cp932

> 2. When running normally, what is 'termencoding' set to? If it is the
> empty string (which is the default) try using the code snippet from
> http://vim.wikia.com/wiki/Working_with_Unicode and see if you get a
> better result.

No, the arguments must be in an encoding that the command prompt can handle. That must not depend on Vim's 'termencoding'.

> I'm assuming you're talking about Console Vim (vim.exe, if on Windows)
> and not GUI Vim (aka gvim).

I'm talking about both CUI and GUI. On ":!echo XXX", Vim runs cmd.exe /c (...) and passes the arguments. So even when the GUI is running, it should convert the argument strings into the console's encoding.
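
The conversion asked for here can be sketched as follows. This is an illustration in Python, not Vim's code and not the proposed patch: the helper name to_console_encoding is hypothetical, and CP932 is hard-coded as the console code page, whereas a real fix would have to query the active code page at run time.

```python
def to_console_encoding(arg_utf8: bytes, console_codepage: str = "cp932") -> bytes:
    """Re-encode a UTF-8 argument into the console's code page.

    Assumption: console_codepage names the code page cmd.exe expects.
    """
    text = arg_utf8.decode("utf-8")       # bytes as held when enc=utf-8
    return text.encode(console_codepage)  # bytes the console can parse


utf8_arg = "あ".encode("utf-8")      # E3 81 82
console_arg = to_console_encoding(utf8_arg)
print(console_arg)  # b'\x82\xa0'
```

Characters that have no CP932 representation raise UnicodeEncodeError in this sketch; how such characters should be handled is a separate decision.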

Thanks.

Bram Moolenaar

Aug 5, 2013, 3:34:33 PM
to mattn, vim...@googlegroups.com
Thanks. I wonder why nobody reported a problem with this before. Just
not so many users in this situation or are they not using multi-byte
characters with external commands?

:vimgrep works properly, right?

--
hundred-and-one symptoms of being an internet addict:
51. You put a pillow case over your laptop so your lover doesn't see it while
you are pretending to catch your breath.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

mattn

Aug 9, 2013, 8:21:59 AM
to vim...@googlegroups.com, mattn
On Tuesday, August 6, 2013 4:34:33 AM UTC+9, Bram Moolenaar wrote:
> Thanks. I wonder why nobody reported a problem with this before. Just
> not so many users in this situation or are they not using multi-byte
> characters with external commands?
>
> :vimgrep works properly, right?

Maybe it does. I don't always set enc=utf-8 on Windows.

Ken Takata

Aug 9, 2013, 8:57:31 PM
to vim...@googlegroups.com, mattn
Hi,

I also found that some functions in os_win32.c don't use Unicode APIs.

* fname_case()
I think this should be fixed. Currently the case of a filename is not
set properly.

* mch_get_user_name()
Fixed with fix-utf8-username.patch.

* mch_get_host_name()
Fixed with fix-utf8-hostname.patch.
Actually, I don't know whether Windows can use multibyte characters in a hostname.

* mch_nodetype()
I'm not sure this should be fixed. This doesn't seem to cause an obvious
problem.

* mch_system_piped()
Fixed with fix-utf8-msgloop.patch.

Regards,
Ken Takata

fix-utf8-username.patch
fix-utf8-hostname.patch
fix-utf8-msgloop.patch

Bram Moolenaar

Aug 10, 2013, 6:45:26 AM
to Ken Takata, vim...@googlegroups.com, mattn
Thanks! I'll add the patches to the todo list.


--
hundred-and-one symptoms of being an internet addict:
72. Somebody at IRC just mentioned a way to obtain full motion video without
a PC using a wireless protocol called NTSC, you wonder how you never
heard about it

Ken Takata

Aug 22, 2013, 8:38:24 AM
to vim...@googlegroups.com, mattn
Hi,

I found some more UTF-8 issues and I wrote additional patches to fix them.

2013/08/10 Sat 9:57:31 UTC+9 Ken Takata wrote:
> * fname_case()
> I think this should be fixed. Currently the case of a filename is not
> set properly.

Fixed with this patch:
https://bitbucket.org/k_takata/vim-win32-mq/src/96100afa35da7d0f5c0ac51ebad436083bf701a3/fix-utf8-fname_case.patch

> * mch_nodetype()
> I'm not sure this should be fixed. This doesn't seem to cause an obvious
> problem.

Now I think it is better to fix this. I also found another issue related
to mch_nodetype().
mch_nodetype() is intended to be called by readfile() in fileio.c, but
actually it is not called because a #if block for Windows is inside of
a '#ifdef UNIX' block!! Because of this, the 'opendevice' option doesn't
work when opening a device file.

Fixed with these patches:
https://bitbucket.org/k_takata/vim-win32-mq/src/5ac9bea8445125dddf03f6980bdb4f39ba1ce265/fix-device-check-on-Windows.patch
https://bitbucket.org/k_takata/vim-win32-mq/src/54bcdfe31710bfc3019fdf848b7bff60b498deea/fix-utf8-nodetype.patch

* mch_isFullName() and vim_stat() in os_mswin.c
Buffer size is not enough for UTF-8.
WinNT and later can use _MAX_PATH wide characters for a pathname, which
means that the maximum pathname is _MAX_PATH * 3 bytes when 'enc' is UTF-8,
but currently the size is _MAX_PATH bytes. (When a UTF-16 code unit is
converted to UTF-8, the size becomes 1 to 3 bytes. For a surrogate pair,
two UTF-16 code units are converted to a 4-byte UTF-8 character.)
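
The arithmetic can be checked with a short script. This is a sketch in Python rather than the patch itself: the helper name is hypothetical, MAX_PATH = 260 is assumed as the value of _MAX_PATH on Windows, and the four sample characters only illustrate the 1-, 2-, 3- and 4-byte cases without proving the bound.

```python
MAX_PATH = 260  # assumed value of _MAX_PATH, counted in UTF-16 code units


def utf8_bytes_per_utf16_unit(ch: str) -> float:
    """UTF-8 bytes needed per UTF-16 code unit for one character."""
    utf16_units = len(ch.encode("utf-16-le")) // 2
    utf8_bytes = len(ch.encode("utf-8"))
    return utf8_bytes / utf16_units


# ASCII, Latin-1, BMP (hiragana), and a non-BMP character (surrogate pair).
samples = ["A", "\u00e9", "\u3042", "\U0001F600"]
ratios = [utf8_bytes_per_utf16_unit(ch) for ch in samples]

# A surrogate pair is 4 UTF-8 bytes over 2 code units, so it stays under 3.
worst_case = max(ratios)
required_bytes = MAX_PATH * int(worst_case)
print(ratios, required_bytes)
```

The worst case among the samples is the BMP character at 3 bytes per code unit, which gives the _MAX_PATH * 3 figure (a terminating NUL is not counted here).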

Fixed with this patch:
https://bitbucket.org/k_takata/vim-win32-mq/src/98d42e4cde77d200ededbbd14e71f60f6705ff98/fix-utf8-buffer-length.patch

* mch_resolve_shortcut() in os_mswin.c
Wide APIs are not used.

Fixed with this patch:
https://bitbucket.org/k_takata/vim-win32-mq/src/98d42e4cde77d200ededbbd14e71f60f6705ff98/fix-utf8-mch_resolve_shortcut.patch

Regards,
Ken Takata

Bram Moolenaar

Aug 22, 2013, 11:23:12 AM
to Ken Takata, vim...@googlegroups.com, mattn
Thanks, I'll put them all in the todo list.

--
hundred-and-one symptoms of being an internet addict:
106. When told to "go to your room" you inform your parents that you
can't...because you were kicked out and banned.

Ken Takata

Aug 28, 2013, 8:35:22 AM
to vim...@googlegroups.com, mattn
Hi,

2013/08/22 Thu 21:38:24 UTC+9 Ken Takata wrote:

> * mch_resolve_shortcut() in os_mswin.c
> Wide APIs are not used.
>
> Fixed with this patch:
> https://bitbucket.org/k_takata/vim-win32-mq/src/98d42e4cde77d200ededbbd14e71f60f6705ff98/fix-utf8-mch_resolve_shortcut.patch

I have updated this patch. #if..#endifs were not properly indented.
https://bitbucket.org/k_takata/vim-win32-mq/src/c040acc974cc89478ec253c9daef56c09023218c/fix-utf8-mch_resolve_shortcut.patch

Regards,
Ken Takata

mattn

Dec 10, 2013, 1:29:44 AM
to vim...@googlegroups.com, mattn
Bram, I guess many users who use multi-byte characters are hoping for this patch.

Bram Moolenaar

Dec 11, 2013, 7:22:05 AM
to mattn, vim...@googlegroups.com

Yasuhiro Matsumoto wrote:

> Bram, I guess many users who use multi-byte characters are hoping for this patch.

Ken's patches have drifted down in the todo list.
I'll move them up a bit.

--
A day without sunshine is like, well, night.

Ken Takata

Dec 11, 2013, 7:39:11 AM
to vim...@googlegroups.com, mattn
Hi Bram,

2013/12/11 Wed 21:22:05 UTC+9 Bram Moolenaar wrote:
> Yasuhiro Matsumoto wrote:
>
> > Bram, I guess many users who use multi-byte characters are hoping for this patch.
>
> Ken's patches have drifted down in the todo list.
> I'll move them up a bit.

No, the patch he mentioned is not one of mine.
I think that mattn's patch is more important than mine:
https://groups.google.com/d/msg/vim_dev/UsAv0LIIEug/GoD3DLbTDusJ .

Regards,
Ken Takata

mattn

Dec 11, 2013, 8:07:18 AM
to vim...@googlegroups.com, mattn
On Wednesday, December 11, 2013 9:39:11 PM UTC+9, Ken Takata wrote:
> No, the patch he mentioned is not one of mine.
> I think that mattn's patch is more important than mine:
> https://groups.google.com/d/msg/vim_dev/UsAv0LIIEug/GoD3DLbTDusJ .

Yes, that's what I meant.

Recently, many users use gvim.exe with UTF-8 on Windows. But currently gvim.exe can't pass arguments to external commands correctly if encoding=utf-8.
