[vim/vim] UTF words copy/paste looks display wrong. (#8706)


Shane-XB-Qian

Aug 5, 2021, 2:44:41 AM
to vim/vim, Subscribed

Describe the bug
Text copied at the OS level, e.g. 啊 (a Chinese word), ends up in register + or * as \u554a and is displayed that way too.

To Reproduce
Detailed steps to reproduce the behavior:

  1. Copy e.g. 啊 with 'ctrl+c' at the OS level.
  2. vim --clean
  3. :reg +*
  4. or "+p, which displays \u554a
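
For readers unfamiliar with the notation, \u554a is the backslash-escaped spelling of the character 啊 (code point U+554A). The sketch below only illustrates that equivalence in Python; it makes no claim about which part of the clipboard path produces the escape, and the helper name `escape_non_ascii` is made up for this example.

```python
# Illustration only: "\u554a" and the character U+554A are the same code point.
ch = "\u554a"


def escape_non_ascii(text: str) -> str:
    """Return text with every non-ASCII character written as a backslash escape."""
    return text.encode("ascii", "backslashreplace").decode("ascii")


print(hex(ord(ch)))          # 0x554a
print(escape_non_ascii(ch))  # \u554a
```

So the register holds the right code point, just in an escaped ASCII spelling rather than as the character itself.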

Expected behavior
displayed as 啊

Environment (please complete the following information):

VIM - Vi IMproved 8.2 (2019 Dec 12, compiled Aug 5 2021 09:28:13)
Included patches: 1-3290
Compiled by shane@shanesmbxpro
Huge version with GTK3 GUI.

Ubuntu 20.04
xterm


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.

Christian Brabandt

Aug 5, 2021, 3:23:43 AM
to vim/vim, Subscribed

Is this in the terminal? Perhaps it's just the font that cannot display that character.

Shane-XB-Qian

Aug 5, 2021, 3:27:17 AM
to vim/vim, Subscribed

Perhaps it's just the font that cannot display that character.

It doesn't look like it: if I paste with 'ctrl-shift-v' (in Vim insert mode), it is displayed correctly as 啊.

Martin Tournoij

Aug 5, 2021, 5:38:36 AM
to vim/vim, Subscribed

I can't really reproduce this on my Linux system. I copy/pasted 啊 using the exact steps you mentioned, and it works well for me.

What are your locale settings? Maybe it's a locale issue somehow? You can usually find them with the locale command.

And just to be absolutely sure, you're using Vim inside xterm, and not gVim?

依云

Aug 5, 2021, 6:18:19 AM
to vim/vim, Subscribed

What locale are you using? Try running locale and check whether your locale is set up correctly. With LANG=C I can reproduce this issue.
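
When reading locale output it helps to remember the POSIX precedence for a category such as LC_CTYPE: LC_ALL wins if set, then the category's own variable, then LANG. A minimal sketch of that lookup, assuming the environment is passed in as a plain dict (the function name `effective_category` is hypothetical, and the final fallback to "C" is the common default rather than something every system guarantees):

```python
import os


def effective_category(env: dict, category: str = "LC_CTYPE") -> str:
    """Resolve a locale category in POSIX order: LC_ALL, the category, then LANG."""
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var, "")
        if value:  # unset and empty both count as "not specified"
            return value
    return "C"  # assumption: nothing set falls back to the "C" locale


if __name__ == "__main__":
    # Show what the current process would use for LC_CTYPE.
    print(effective_category(dict(os.environ)))
```

For example, `effective_category({"LANG": "en_US.UTF-8", "LC_CTYPE": "en_HK.UTF-8", "LC_ALL": ""})` returns "en_HK.UTF-8", because the empty LC_ALL is skipped and LC_CTYPE takes priority over LANG.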

Shane-XB-Qian

Aug 5, 2021, 6:19:30 AM
to vim/vim, Subscribed

Where are you copy/pasting it from? As in, which application? Maybe that affects things?

It doesn't look like it; I copied from e.g. 'mousepad', if you'd like to know.

What are your locale settings?

LANG=en_HK.UTF-8

// And of course, it is TUI Vim.

Martin Tournoij

Aug 5, 2021, 6:29:54 AM
to vim/vim, Subscribed

I can reproduce it with LANG=C, both when copying from Firefox and from Mousepad, but not with LANG=en_HK.UTF-8: that always seems to work correctly as far as I can see.

Are you sure your locale is set up correctly? Does the locale command show any errors?
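
One way to check whether a locale name is actually usable, beyond looking for errors from the locale command, is to ask the C library to switch to it. The sketch below does that through Python's standard `locale` module; the helper name `locale_available` is made up for this example, and the result depends on which locales are generated on the machine where it runs.

```python
import locale


def locale_available(name: str) -> bool:
    """Return True if the C library accepts `name` for LC_CTYPE."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # always restore the old setting


if __name__ == "__main__":
    for candidate in ("C", "en_US.UTF-8", "en_HK.UTF-8"):
        print(candidate, locale_available(candidate))
```

If a name is rejected here, programs that call setlocale() with it typically stay in the "C" locale, which matches the LANG=C case that reproduces the issue.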

Shane-XB-Qian

Aug 5, 2021, 7:04:16 AM
to vim/vim, Subscribed

I think my locale is correct, and no errors were shown.

LANG=en_HK.UTF-8
LANGUAGE=en_HK:en
LC_CTYPE="en_HK.UTF-8"
LC_NUMERIC="en_HK.UTF-8"
LC_TIME="en_HK.UTF-8"
LC_COLLATE="en_HK.UTF-8"
LC_MONETARY="en_HK.UTF-8"
LC_MESSAGES="en_HK.UTF-8"
LC_PAPER="en_HK.UTF-8"
LC_NAME="en_HK.UTF-8"
LC_ADDRESS="en_HK.UTF-8"
LC_TELEPHONE="en_HK.UTF-8"
LC_MEASUREMENT="en_HK.UTF-8"
LC_IDENTIFICATION="en_HK.UTF-8"
LC_ALL=

Tony Mechelynck

Aug 5, 2021, 7:32:34 AM
to vim/vim, Subscribed

What does Vim answer to

:language
:verbose set encoding?

See also https://vim.fandom.com/wiki/Working_with_Unicode
and remember that the right place to set 'encoding' is in your vimrc, before any editfile has been loaded.

Best regards,
Tony.

Shane-XB-Qian

Aug 5, 2021, 9:00:02 AM
to vim/vim, Subscribed

Tony, I ran with 'vim --clean', and 'encoding' is 'utf-8'.

Shane-XB-Qian

Aug 5, 2021, 9:01:41 AM
to vim/vim, Subscribed

When the locale is:

LANG=en_US.UTF-8
LANGUAGE=en_HK:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

// I cannot reproduce it if the locale is as above.
// + and * are both 啊.

Shane-XB-Qian

Aug 5, 2021, 9:04:59 AM
to vim/vim, Subscribed

However, after changing the locale to LANG=en_HK.UTF-8:

LANG=en_HK.UTF-8
LANGUAGE=en_HK:en
LC_CTYPE="en_HK.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_HK.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_HK.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

// Now + and *, or at least one of them, is \u554a.

Shane-XB-Qian

Aug 5, 2021, 9:14:54 AM
to vim/vim, Subscribed

So the territory would be the problem?!

Christian Brabandt

Aug 5, 2021, 9:29:31 AM
to vim/vim, Subscribed

So the only differences in territory are for LANG, LANGUAGE, LC_MESSAGES, LC_COLLATE and LC_CTYPE. Can you find out which one is responsible? Or just leave it at en_US.UTF-8.
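
The comparison between the two locale dumps posted above can also be done mechanically. The following is a minimal sketch that parses both dumps as pasted in this thread and lists the keys whose values differ; the names `parse_locale_dump` and `differing_keys` are made up for this example, and the quotes that locale prints around some values are stripped so that only the values are compared.

```python
WORKING = """
LANG=en_US.UTF-8
LANGUAGE=en_HK:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=
"""

FAILING = """
LANG=en_HK.UTF-8
LANGUAGE=en_HK:en
LC_CTYPE="en_HK.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_HK.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_HK.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=
"""


def parse_locale_dump(text: str) -> dict:
    """Parse `locale` output (KEY=value or KEY="value") into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        settings[key] = value.strip('"')
    return settings


def differing_keys(a: dict, b: dict) -> list:
    """Return the sorted keys whose values differ between the two dumps."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


print(differing_keys(parse_locale_dump(WORKING), parse_locale_dump(FAILING)))
# ['LANG', 'LC_COLLATE', 'LC_CTYPE', 'LC_MESSAGES']
```

Each of the listed variables can then be overridden one at a time to narrow down which one triggers the escaped text.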

Shane-XB-Qian

Aug 5, 2021, 10:14:17 AM
to vim/vim, Subscribed

I think it is LC_CTYPE; however, normally it should probably have the same value as LANG.

// Somehow the territory of LC_CTYPE looks like the problem.

chrisma

Aug 26, 2021, 5:19:03 AM
to vim/vim, Subscribed

I think it is LC_CTYPE; however, normally it should probably have the same value as LANG.

// Somehow the territory of LC_CTYPE looks like the problem.

Are you running Vim inside iTerm2? I just ran into this problem too, but when I open Vim in Terminal and type Chinese characters there is no problem, so it feels like this is most likely an iTerm2 issue 😢

chrisma

Aug 26, 2021, 5:21:40 AM
to vim/vim, Subscribed

Adding this to ~/.vimrc fixed it 囧rz

set fileencodings=utf-8,ucs-bom,gb18030,gbk,gb2312,cp936
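
For context, 'fileencodings' is a list of encodings that Vim tries in order when reading a file, so it concerns file contents rather than the clipboard registers discussed earlier in the thread. The idea of trying candidates in order can be sketched in Python as below. This is only an analogy, not Vim's implementation: the function name `decode_with_fallbacks` is made up, the codec names are Python's, and ucs-bom is left out because in Vim it is a byte-order-mark check rather than a codec.

```python
# Candidate encodings, in the order they should be tried (Python codec names).
CANDIDATES = ["utf-8", "gb18030", "gbk", "gb2312", "cp936"]


def decode_with_fallbacks(data: bytes, encodings=CANDIDATES):
    """Decode `data` with the first encoding that raises no error.

    Returns a (text, encoding_name) tuple.
    """
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("no candidate encoding could decode the data")


if __name__ == "__main__":
    gbk_bytes = "\u554a".encode("gbk")  # U+554A stored in a legacy encoding
    text, used = decode_with_fallbacks(gbk_bytes)
    print(used)  # gb18030
```

Order matters: a strict encoding such as utf-8 goes first because it rejects most non-UTF-8 input, while the permissive legacy encodings come later.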

Shane-XB-Qian

Jun 19, 2023, 2:41:33 AM
to vim/vim, Subscribed

Closed #8706 as completed.


Message ID: <vim/vim/issue/8706/issue_event/9564223258@github.com>
