[vim/vim] encoding issue when using job_start (#6953)


Melandel

Sep 14, 2020, 12:30:46 PM
to vim/vim, Subscribed

Describe the bug
I am working on replacing a synchronous make with something asynchronous based on job_start with {'out_io': 'buffer'} and cgetbuffer. Everything works except that some characters come out garbled: accented characters such as é or à and, it seems, some space characters.

  • Given set makeprg=dir and set errorformat=%m, running make shows the following quickfix list:
    image
  • Using job_start and set errorformat=%m, running the following function:
function! JobStartExample()
	let cmd = 'dir'
	let s:job = job_start(
		\'cmd /C '.cmd,
		\{
			\'out_cb':   { chan,msg-> execute('echomsg "'.msg.'"',  1) },
			\'err_cb':   { chan,msg-> execute('echomsg "'.msg.'"',  1) },
			\'close_cb': { chan    -> execute('echomsg "'.chan.'"', 1) },
			\'exit_cb':  { job,status-> execute('echomsg "foo"', '') }
		\}
	\)
endfunction

shows the following quickfix list:
image

Environment:

  • gVim version :
VIM - Vi IMproved 8.2 (2019 Dec 12, compiled Jul  4 2020 22:02:18)

MS-Windows 64-bit GUI version with OLE support

Included patches: 1-1127

Compiled by appveyor@APPVYR-WIN

  • OS: Windows 10
  • Terminal: GUI (gVim)

Additional notes

  • chcp in cmd returns 65001
  • I have:
    • set encoding=utf-8
    • set fileencoding=
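
For reference, the setup described at the top of this report (job_start with 'out_io': 'buffer', followed by cgetbuffer) could look roughly like the sketch below. The function name AsyncMake and the buffer name async-make-output are made up for illustration, and the sketch requires a Vim with the +job feature. It does nothing about the encoding problem itself; it only shows where the job output ends up.

```vim
" Sketch only: AsyncMake and the buffer name are hypothetical.
function! AsyncMake(cmd) abort
  let bufname = 'async-make-output'
  let s:job = job_start(a:cmd, {
        \ 'out_io': 'buffer',
        \ 'out_name': bufname,
        \ 'out_msg': 0,
        \ 'close_cb': {ch -> execute('cgetbuffer ' . bufnr(bufname))},
        \ })
endfunction
```

Calling `:call AsyncMake('cmd /C dir')` should fill the quickfix list once the channel closes. The buffer is reused between runs, so a real implementation would probably want to empty it first.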


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.

Melandel

Sep 20, 2020, 2:43:44 PM
to vim/vim, Subscribed

Bump!

Bram Moolenaar

Sep 20, 2020, 3:25:35 PM
to vim/vim, Subscribed

I suspect "cmd" does not use utf-8. Thus the output is in another encoding. You may need to use iconv() to convert to utf-8, or have the command output utf-8 somehow (not sure that is possible, the Windows console hasn't supported that).

Christian Brabandt

Sep 20, 2020, 3:39:45 PM
to vim/vim, Subscribed

don't you need to run the chcp command also in your job_start() command?

Melandel

Sep 20, 2020, 5:28:52 PM
to vim/vim, Subscribed

@chrisbra I tried running chcp && before my command. Below, the output:
image
65001 is the correct chcp for utf-8

@brammool It reminds me that I had to tweak some code in fzf for a similar issue:

  function! s:enc_to_cp(str)
    if !has('iconv')
      return a:str
    endif
    if !exists('s:codepage')
			if &encoding == 'utf-8'
				let s:codepage = 65001
			else
      	let s:codepage = libcallnr('kernel32.dll', 'GetACP', 0)
			endif
    endif
    return iconv(a:str, &encoding, 'cp'.s:codepage)
  endfunction

The code that is oddly (too much) indented is the one I had to add to make fzf work without encoding issues. However, when I'm using that function:

function! JobStartExample(...)
	let cmd = 'dir'
	let g:a = ''
	let s:job = job_start(
		\'cmd /C '.cmd,
		\{
			\'callback': { chan,msg  -> execute('echomsg "[cb] '.escape(msg,'"\').'"',  1)                              },
			\'out_cb':   { chan,msg  -> execute('echomsg "'.iconv(escape(msg,'"\'), 'utf-8', 'cp65001').'"',  1)                                  },
			\'err_cb':   { chan,msg  -> execute('echohl Constant | echomsg "\'.escape(msg,'"\').'" | echohl Normal',  1)},
			\'close_cb': { chan      -> execute('echomsg "[close] '.chan.'"', 1)                                        },
			\'exit_cb':  { job,status-> execute('echomsg "[exit] '.status.'"', '')                                      }
		\}
	\)
endfunction

I still get an output with problematic characters:
image

Melandel

Sep 20, 2020, 6:14:15 PM
to vim/vim, Subscribed

What I don't understand is that :r !dir works fine in terms of encoding...

Christian Brabandt

Sep 21, 2020, 4:55:48 AM
to vim/vim, Subscribed

:r !dir does some additional processing, and I believe it uses the 'termencoding' setting.

It looks like the encoding conversion does not work correctly. Is cp65001 a correct encoding name for utf-8? Does it work if you use utf-8 instead?
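
A note on direction: iconv() takes its arguments as (string, from, to). If cp65001 is just another name for UTF-8, as the thread assumes, then converting from utf-8 to cp65001 cannot repair bytes that were produced in some other code page. The small sketch below illustrates this with latin1, which is only a stand-in here; the encoding that dir actually emits on a given machine is not established in this thread.

```vim
" 0xE9 is 'é' in latin1; as a lone byte it is not valid UTF-8.
let raw = "\xe9"

" Converting between identical encodings leaves the bytes untouched.
let same = iconv(raw, 'utf-8', 'utf-8')

" Converting from the encoding the bytes are really in yields valid UTF-8.
let fixed = iconv(raw, 'latin1', 'utf-8')
echomsg fixed
```

The latin1 to utf-8 case was picked because Vim can do it even without the +iconv feature; other code pages need iconv support.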

Melandel

Sep 21, 2020, 5:10:00 AM
to vim/vim, Subscribed

I am not sure I understand what you are asking. Where do you want me to put utf-8 instead of cp65001 ?

K.Takata

Sep 21, 2020, 5:24:55 AM
to vim/vim, Subscribed

I also confirmed that using chcp 65001 didn't change the output encoding of dir. Not sure why.

Melandel

Sep 21, 2020, 5:39:40 AM
to vim/vim, Subscribed

Closed #6953.

Melandel

Sep 21, 2020, 5:39:42 AM
to vim/vim, Subscribed

I still don't have a clue why that-which-should-have-worked did not work. I don't get it. It still stresses me out.

However, I found a way to make it work. Demonstration:

function! JobStartExample(...)
	let cmd = 'dir'
	let s:job = job_start(
		\'cmd /C '.cmd,
		\{
			\'callback': { chan,msg  -> execute('echomsg "[cb] '.escape(msg,'"\').'"',  1)                              },
			\'out_cb':   { chan,msg  -> execute('echomsg "'.escape(msg,'"\').'"',  1)                                   },
			\'err_cb':   { chan,msg  -> execute('echohl Constant | echomsg "\'.escape(msg,'"\').'" | echohl Normal',  1)},
			\'close_cb': { chan      -> execute('echomsg "[close] '.chan.'"', 1)                                        },
			\'exit_cb':  { job,status-> execute('echomsg "[exit] '.status.'"', '')                                      }
		\}
	\)
endfunction

image

Here's the source on stackoverflow, credit to mklement0.

It's a Windows configuration setting. In my experience chcp 65001 has always done the trick in the past, but this time it doesn't seem to.

In Windows 10 version 1903+, follow these steps:

  1. Run intl.cpl (which opens the regional settings in Control Panel)
  2. Go to Administrative tab
  3. Click on Change system locale
  4. Check the Beta: Use Unicode UTF-8 for worldwide language support box
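
To see whether the setting took effect, one could query the code pages from inside Vim. This is a sketch under assumptions: it reuses the libcallnr('kernel32.dll', 'GetACP', 0) call from the fzf snippet earlier in the thread and applies the same pattern to GetOEMCP, which is also a kernel32 function. Which of the two values a given console program follows is not settled here.

```vim
" Sketch: report the code pages Windows currently uses (Windows only).
if has('win32')
  let s:acp = libcallnr('kernel32.dll', 'GetACP', 0)
  let s:oemcp = libcallnr('kernel32.dll', 'GetOEMCP', 0)
  echomsg printf('ANSI code page: %d, OEM code page: %d', s:acp, s:oemcp)
endif
```

With the beta UTF-8 option enabled, both values would be expected to read 65001.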

image

Is it worth writing this down somewhere in the docs, so people spend less time tearing their hair out? I sure know I did.

I hope this will help others!

K.Takata

Sep 21, 2020, 6:06:01 AM
to vim/vim, Subscribed

Changing system locale to UTF-8 is still beta. So,

> Is this configuration point worth getting written somewhere in the doc,

I don't think so.

Bram Moolenaar

Sep 21, 2020, 11:20:32 AM
to vim/vim, Subscribed


> What I don't understand is that `:r !dir` works fine in terms of encoding...

There are the options 'termencoding', 'makeencoding' and 'fileencoding'.
But I don't think any of them apply to channels.

--
ARTHUR: Be quiet! I order you to shut up.
OLD WOMAN: Order, eh -- who does he think he is?
ARTHUR: I am your king!
OLD WOMAN: Well, I didn't vote for you.
"Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

ladayaroslav

Jun 18, 2023, 4:42:57 AM
to vim/vim, Subscribed

> I don't think so.

Well, I think it really should be mentioned --- encountered the same problem (wrapping psql using vim channels), and this is the only thing that helped in Win10.



Christian Brabandt

Jun 19, 2023, 3:40:26 AM
to vim/vim, Subscribed

Then maybe you want to contribute a doc patch to os_win32.txt?



K.Takata

Jun 19, 2023, 6:35:32 AM
to vim/vim, Subscribed

Normally, 'tenc' is set to the current code page.
Converting the output from 'tenc' to 'enc' by using iconv() should be enough.

function! JobStartExample(...)
  let cmd = 'dir'
  let g:a = ''
  let s:job = job_start(
	\ 'cmd /C '.cmd,
	\ {
	\ 'callback': { chan,msg  -> execute('echomsg "[cb] '.escape(msg,'"\').'"',  1)                              },
	\ 'out_cb':   { chan,msg  -> execute('echomsg "'.iconv(escape(msg,'"\'), &tenc, &enc).'"',  1)               },
	\ 'err_cb':   { chan,msg  -> execute('echohl Constant | echomsg "\'.escape(msg,'"\').'" | echohl Normal',  1)},
	\ 'close_cb': { chan      -> execute('echomsg "[close] '.chan.'"', 1)                                        },
	\ 'exit_cb':  { job,status-> execute('echomsg "[exit] '.status.'"', '')                                      }
	\ }
	\ )
endfunction

Using the UTF-8 system setting may "fix" some issues, but it can easily cause other issues.



ladayaroslav

Jun 19, 2023, 9:30:48 AM
to vim/vim, Subscribed

> then maybe you want to contribute a doc patch to os_win32.txt?

I didn't find a "Known bugs" section in there. ;)
But really, this actually smells like a gvim bug: why don't other programs need this?
It would be nice if someone re-tested that...



ladayaroslav

Jun 19, 2023, 9:35:05 AM
to vim/vim, Subscribed

> Normally, 'tenc' is set to the current code page.

But the docs claim: "For the Win32 GUI and console versions 'termencoding' is not used, because the Win32 system always passes Unicode characters."

BTW, it's not what I see after trying the above workaround:

set tenc?
termencoding=cp65001

> Converting the output from 'tenc' to 'enc' by using iconv() should be enough.

It should not be required (i.e. gvim should do this), IMNSHO.

> Using the UTF-8 system setting may "fix" some issues, but it can easily cause other issues.

Indeed.



Christian Brabandt

Jun 19, 2023, 9:59:21 AM
to vim/vim, Subscribed

@ladayaroslav let me repeat: If you have specific suggestions, we gladly take any doc enhancements, preferably in the form of a patch.



ladayaroslav

Jun 19, 2023, 10:03:20 AM
to vim/vim, Subscribed

> @ladayaroslav let me repeat: If you have specific suggestions, we gladly take any doc enhancements, preferably in the form of a patch.

Sigh. Let me repeat my specific suggestion: this ticket most probably describes a real gvim bug on Windows (note that "my" case involved no shell), and it had better not be documented but actually looked into and fixed.



Christian Brabandt

Jun 19, 2023, 10:09:07 AM
to vim/vim, Subscribed

> psql using vim channels

Are you sure it's not psql outputting in your default Windows locale?



K.Takata

Jun 19, 2023, 8:13:53 PM
to vim/vim, Subscribed

> But the docs claim: "For the Win32 GUI and console versions 'termencoding' is not used, because the Win32 system always passes Unicode characters."

This means Vim doesn't use 'tenc' when drawing the screen. Vim holds its internal data in the 'enc' encoding (default is utf-8) and converts it to utf-16 (not to 'tenc') when displaying it.
Even though 'tenc' is not used for display, it is still set to the current code page.

> It should not be required (i.e. gvim should do this), IMNSHO.

Adding an option into job_start() to convert the encoding might be an option.
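
Until such an option exists, a thin wrapper can play the same role. The sketch below is purely illustrative: JobStartWithEncoding and ShowLine are hypothetical names, not part of Vim, and the caller has to know (or guess) which encoding the program emits.

```vim
" Hypothetical helper, not a Vim API: converts each output line from
" a:from_enc to 'encoding' before passing it to a:Callback.
function! JobStartWithEncoding(cmd, from_enc, Callback) abort
  return job_start(a:cmd, {
        \ 'out_cb': {ch, msg -> a:Callback(iconv(msg, a:from_enc, &encoding))},
        \ })
endfunction

" Example consumer: show each converted line as a message.
function! ShowLine(msg) abort
  echomsg a:msg
endfunction
```

On Windows it could be tried with `:let job = JobStartWithEncoding('cmd /C dir', &termencoding, function('ShowLine'))`. Since iconv() returns an empty string when a conversion fails completely, a more careful version would fall back to the unconverted text in that case.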



Bram Moolenaar

Jun 22, 2023, 9:20:41 AM
to vim/vim, Subscribed

We cannot predict what encoding a program uses. It might be 'termencoding', if the text was meant to be displayed. But more and more programs use utf-8, since it makes things simpler.
I don't think there is a bug in Vim. One may just as well argue that the executed program uses the wrong encoding.
Since there is a way to make it work I think we can leave it at that. We can add something in the help, but it will require quite a lot of text to explain it properly, for something that few users will encounter.



ladayaroslav

Jun 22, 2023, 10:06:21 AM
to vim/vim, Subscribed

> are you sure it's not psql outputting in your default Windows locale?

Hmm... am I, indeed? Sorry, I haven't had time to test it, and I'm not sure when I will.
Perhaps someone else could try it, if it's urgent?

But Bram might be right in #6953 (comment): if a program really doesn't use Unicode (or does it incorrectly) on Windows, it's not a Vim problem.



ladayaroslav

Jun 22, 2023, 10:08:12 AM
to vim/vim, Subscribed

> but it will require quite a lot of text to explain it properly, for something that few users will encounter.

Why not make a reference to this issue in the docs, then?
You had no problem referencing (external!) https://github.com/lacygoill/wiki/blob/master/vim/vim9.md in vim9.txt, after all. ;)


