Windows console: unable to read/write Russian text


Maxim Kim

Sep 28, 2009, 4:06:56 PM
to vim_use
Hi,

Can you tell me how I can set up Vim to handle Russian (UTF-8) text
in the Windows console?
A simple Russian phrase

Привет, меня зовут Максим.

is unreadable, though I can copy it and paste it here.

My settings are:
:set enc? tenc?
encoding=utf-8
termencoding=cp866

I thought it might be a font issue, but with <C-Z> I can paste the same
text into the shell and it looks good.
So this is not a font issue.
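
For context, a minimal Python sketch (not from the thread) of the conversion these two settings imply: with encoding=utf-8 Vim holds the text as UTF-8, and with termencoding=cp866 it has to convert that text to the console's code page before output. Python's codecs merely stand in for Vim's conversion here, the variable names are illustrative, and cp866 being the real console code page is an assumption.

```python
# Illustration only: Python codecs stand in for Vim's internal conversion.
text = "Привет, меня зовут Максим."

utf8_bytes = text.encode("utf-8")      # what Vim holds with encoding=utf-8
console_bytes = text.encode("cp866")   # what a cp866 console expects to receive

# Correct path: the bytes are converted to the console's code page first.
shown_ok = console_bytes.decode("cp866")

# Broken path: raw UTF-8 bytes reach a console that reads them as cp866.
shown_garbled = utf8_bytes.decode("cp866", errors="replace")

print(shown_ok)
print(shown_garbled)
```

If the second line resembles what console Vim shows, the conversion step is being skipped or is targeting the wrong code page; if not, the cause lies elsewhere (font, redraw).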

Thanks,
Maxim.

Tony Mechelynck

Sep 28, 2009, 6:15:30 PM
to vim...@googlegroups.com

I'm not sure.

- IIUC, typing

echo Здравствуйте мир!

at the cmd.exe prompt works OK (echoes the correct Russian greeting in
reply) and the result can be pasted (but not into Vim)?

- Does it make a difference if you leave 'termencoding' empty?

- Are you sure your console uses cp866? The email I'm replying to was in
koi8-r.

Until (or unless) this issue is resolved, you ought to be able to use
gvim (with the same settings), where anything can be stored in memory
(as a UTF-8 byte string) and displayed in the GUI in some well-chosen
font. I recommend _not_ using Lucida_Console, because its Cyrillic bold
glyphs are a tiny bit wider than its non-bold glyphs (or used to be
when I was on XP); Courier_New should be OK even if less "pretty".

A well-chosen 'fileencodings' setting and/or use of the ++enc modifier
(see ":help ++opt") in commands such as :e <filename>, :new <filename>,
:tabedit <filename>, etc. should let you edit files in any Cyrillic
encoding such as koi8-r, koi8-u, ISO-8859-5, Windows-1251 or cp866, as
well, of course, as in Unicode encodings such as UTF-8. It might not be
possible to get _automatic_ recognition of all Cyrillic encodings, but
you can always _tell_ gvim which 'fileencoding' to use, e.g. by means of

:e ++enc=cp866 russtext.txt
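
Why automatic recognition is hard can be sketched in Python (an illustration, not part of Vim): 8-bit Cyrillic encodings assign a character to nearly every byte, so the same file decodes without error under several of them and only one result is readable. The sample text and the list of candidate encodings are assumptions chosen to match the encodings named above.

```python
# Illustration only: the same bytes decoded under several Cyrillic encodings.
original = "Привет"
raw = original.encode("cp866")  # pretend this is a file saved in cp866

candidates = ["cp866", "cp1251", "koi8_r", "iso8859_5"]
results = {}
for name in candidates:
    # errors="replace" keeps the loop going if a byte is unassigned
    results[name] = raw.decode(name, errors="replace")

for name, decoded in results.items():
    marker = "readable" if decoded == original else "garbled"
    print(f"{name:10} {decoded!r} ({marker})")

# UTF-8 is different: arbitrary 8-bit text is usually invalid UTF-8,
# which is why a strict UTF-8 check can come first in a detection list.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("valid UTF-8:", valid_utf8)
```

Since no 8-bit decode fails outright, a detector cannot pick between them by errors alone, which is why naming the encoding explicitly with ++enc is the reliable route.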


Best regards,
Tony.
--
A new dramatist of the absurd
Has a voice that will shortly be heard.
I learn from my spies
He's about to devise
An unprintable three-letter word.

Maxim Kim

Sep 29, 2009, 12:37:28 AM
to vim_use
On 29 Sep, 02:15, Tony Mechelynck <antoine.mechely...@gmail.com>
wrote:
> On 28/09/09 22:06, Maxim Kim wrote:
>
> > My settings are:
> > :set enc? tenc?
> > encoding=utf-8
> > termencoding=cp866
>
> I'm not sure.
>
> - IIUC, typing
>
> echo Здравствуйте мир!
echo Здравствуй мир!
sounds better. :)
>
> at the cmd.exe prompt works OK (echoes the correct Russian greeting in
> reply) and the result can be pasted (but not into Vim)?
Yes.

> - Does it make a difference if you leave 'termencoding' empty?
No. I tried different termencodings (empty too) with no visible
effect.

> - Are you sure your console uses cp866? The email I'm replying to was in
> koi8-r.
I am not 100% sure it is cp866, but I use WinXP. Is there a way I can
check what encoding it uses? Btw, I do not set termencoding in my _vimrc,
so it is Vim itself that sets termencoding=cp866.

>
> Until (or unless) this issue is resolved, you ought to be able to use
> gvim (with the same settings), where anything can be stored in memory
> (as a UTF-8 bytestring) and displayed in the GUI in some well-chosen
> font.
I usually do use gVim, but sometimes console Vim comes in handy
(committing to svn, hg, etc.). I have no encoding problems with gVim,
only with console Vim.

Thanks,
Maxim.

Tony Mechelynck

Sep 29, 2009, 1:36:00 AM
to vim...@googlegroups.com
On 29/09/09 06:37, Maxim Kim wrote:
[...]

>> - Are you sure your console uses cp866? The email I'm replying to was in
>> koi8-r.
> I am not 100% sure it is cp866 but I use WinXP. Is there a way I can
> check what encoding it uses? Btw I do not setup termencoding in _vimrc
> so it is vim makes termencoding=cp866.
[...]

IIUC, the Vim default for 'termencoding' is the empty string. Maybe that
option is set elsewhere, maybe in the UTF-8-setting script that I
published at vim-online, or maybe in some other script. What does
Console Vim answer to

:verbose set enc? tenc?

immediately after startup (the way you normally start it, with vimrc and
all)?

To find out which console encoding your WinXP uses, start Vim as

vim -N -u NONE

(which loads neither your vimrc nor any global plugins), then, after
startup, ask

:set enc?

That should show you the "default encoding" used by the underlying terminal.


Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
143. You dream in pallettes or 256 colors.

Maxim Kim

Sep 29, 2009, 2:11:28 AM
to vim_use



On 29 Sep, 09:36, Tony Mechelynck <antoine.mechely...@gmail.com>
wrote:
> On 29/09/09 06:37, Maxim Kim wrote:
> [...]
>
> IIUC, the Vim default for 'termencoding' is the empty string. Maybe that
> option is set elsewhere, maybe in the UTF-8-setting script that I
> published at vim-online, or maybe in some other script. What does
> Console Vim answer to
>
> :verbose set enc? tenc?
>
> immediately after startup (the way you normally start it, with vimrc and
> all)?
encoding=utf-8
Last set from ~\_vimrc
termencoding=cp866
>
> To know what console encoding your WinXP uses, start Vim as
>
> vim -N -u NONE
>
> (which loads neither your vimrc nor any global plugins), then, after
> startup, ask
>
> :set enc?
>
> That should show you the "default encoding" used by the underlying terminal.
cp1251

PS
This is quite strange. Using my _vimrc (with set enc=utf-8), if I:
1. Change the font to Lucida Console.
2. Write some text -- everything is OK. I can see correct Russian text.
3. Change the font to the standard bitmap font -- everything is OK. I can
   still see the previously entered Russian text.
4. Write the same text again -- the previously entered text is OK, the
   new text is crap with a lot of triangles.
5. Change the font back to Lucida Console -- the text from step 2 is OK,
   the text from step 4 is still crap, but now with question marks and
   incorrect letters.
6. Press <C-L> -- all the entered text (steps 2 and 4) is correct.

So the only option I can see for now is using Lucida Console.

Tony Mechelynck

Sep 29, 2009, 2:57:20 AM
to vim...@googlegroups.com

Oho! Sounds like a missing screen redraw somewhere. You aren't using
'lazyredraw' by any chance? Also, what Vim version and patchlevel are
you using? (as shown on the second non-blank line of the ":intro"
screen, or as the first two lines -- starting "VIM - Vi Improved" and
"Included patches" respectively -- in the output of ":version")

Or rather -- Vim is probably not aware that the font has been changed
(see bottom paragraph before my sig below) so it doesn't redraw anything.

What happens if you hit Ctrl-L (in Normal mode) between steps 4 and 5?
My guess would be that the text from step 2 turns to crap, which might
indicate that your bitmapped font has wrong glyphs for your current
terminal encoding. I expect that the Russian would reappear after step
6, even where it had changed to crap at step 4½.

>
> So the only option I can see for now is using Lucida Console.

Is that so bad? (in Console Vim, not gvim)


The font in Console Vim is in any case a function of the terminal -- Vim
has no control over it: it can neither determine which font is in use nor
change it -- unless maybe by running the appropriate OS-dependent commands
as external programs, for instance via system().


Best regards,
Tony.
--
Albert Einstein, when asked to describe radio, replied: "You see, wire
telegraph is a kind of a very, very long cat. You pull his tail in New
York and his head is meowing in Los Angeles. Do you understand this?
And radio operates exactly the same way: you send signals here, they
receive them there. The only difference is that there is no cat."

Christian Brabandt

Sep 29, 2009, 3:21:49 AM
to vim...@googlegroups.com
On Tue, September 29, 2009 6:37 am, Maxim Kim wrote:

> I am not 100% sure it is cp866 but I use WinXP. Is there a way I can
> check what encoding it uses?

In the console, run chcp and it should display the active code page. Use
chcp <codepage> to switch to any other code page. I am not sure whether
the Windows console supports UTF-8, though.
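
The same information can also be read programmatically. Below is a hedged Python sketch that calls the Win32 functions GetACP, GetOEMCP, GetConsoleCP and GetConsoleOutputCP through ctypes; the helper name `query_code_pages` is made up for this example, and it simply returns None when not run on Windows. On a Russian Windows XP one would expect an ANSI code page of 1251 and an OEM/console code page of 866, which would line up with the cp1251 and cp866 values reported earlier, but that reading is an assumption, not something confirmed in this thread.

```python
import ctypes
import sys


def query_code_pages():
    """Return the active Windows code pages, or None on other platforms."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.windll.kernel32
    return {
        "ansi": kernel32.GetACP(),                        # system ANSI code page
        "oem": kernel32.GetOEMCP(),                       # system OEM code page
        "console_input": kernel32.GetConsoleCP(),         # console input code page
        "console_output": kernel32.GetConsoleOutputCP(),  # console output code page
    }


print(query_code_pages())
```

Comparing the "ansi" and "console_output" values shows whether the console and the rest of the system disagree about the 8-bit encoding.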

regards,
Christian
--
:wq!

Maxim Kim

Sep 29, 2009, 3:53:35 AM
to vim_use

On 29 Sep, 10:57, Tony Mechelynck <antoine.mechely...@gmail.com>
wrote:
> On 29/09/09 08:11, Maxim Kim wrote:
> > PS
> > This is quite strange. If I (using my _vimrc with set enc=utf-8)
> > 1. Change font to Lucida Console.
> > 2. Write some text -- everything is ok. I can see correct russian text.
> > 3. Change font to standard bitmap font -- everything is ok. I can see
> >    previously entered russian text.
> > 4. Write the same text -- previously entered text is ok, current is
> >    crap with a lot of triangles.
> > 5. Change font to Lucida Console -- text from 2. is ok, text from 4. is
> >    still crap but with questions and incorrect letters.
> > 6. Press <C-L> and all the entered text (2, 4) is correct.
>
> Oho! Sounds like a missing screen redraw somewhere. You aren't using
> 'lazyredraw' by any chance? Also, what Vim version and patchlevel are
> you using? (as shown on the second non-blank line of the ":intro"
> screen, or as the first two lines -- starting "VIM - Vi Improved" and
> "Included patches" respectively -- in the output of ":version")
Aha, I have 'set lazyredraw' in my _vimrc. (included patches 1-160)

> Or rather -- Vim is probably not aware that the font has been changed
> (see bottom paragraph before my sig below) so it doesn't redraw anything.
Actually at step 5 vim changes triangles to questions -- partial
redraw? :)

>
> What happens if you hit Ctrl-L (in Normal mode) between steps 4 and 5?
> My guess would be that the text from step 2 turns to crap,
Yep.

> which might indicate that your bitmapped font has wrong glyphs for your current
> terminal encoding.
Actually, at step 3 I can see Russian letters in that bitmapped(!)
font. Then I do <C-L> and they turn to crap, again in that bitmapped
font.

> I expect that the Russian would reappear after step
> 6, even where it had changed to crap at step 4½.
True.

>
> > So the only option I can see for now is using Lucida Console.
>
> Is that so bad? (in Console Vim, not gvim)
No, it isn't.
If I had known that Lucida Console works in console Vim, I wouldn't have
emailed here.

Thanks,
Maxim

Mikalai Chaly

Sep 29, 2009, 4:10:40 AM
to vim...@googlegroups.com
On Tue, Sep 29, 2009 at 10:53 AM, Maxim Kim <hab...@gmail.com> wrote:


> If I knew Lucida console is okay for vim console I wouldn't email
> here.


But that is just really strange: if it is possible to type Russian characters directly in the console with the bitmap font set, why does Vim behave differently?

Mikalai

Tony Mechelynck

Sep 29, 2009, 5:38:54 AM
to vim...@googlegroups.com

I don't know. Maybe setting the bitmapped font changes the terminal to a
different encoding, and Vim, if already running, would of course be
unaware of the change (it's only at startup that Vim "asks" the OS what
encoding it should use).

About the "triangles": IIRC, the glyph used in bitmapped fonts (such as
the PC ROM-BIOS cp437 font used in the text console before any other
font has been loaded) for the byte 0x7F is a kind of triangle, usually
with its two bottom corners slightly lopped off. However, in ASCII or in
most character sets based on ASCII, that byte is a "control character"
which is either represented as an "invalid character" (which could be
the Unicode "reverse-video question mark in a diamond" glyph, or the
"hollow triangle" glyph, or something meaning "unknown character" such
as ? or ¿) -- or even not displayed at all.

I should have gone to sleep yesterday evening (it's 11:37 my time now)
so you should expect a delay of several hours before I reply to the next
message in this thread.


Best regards,
Tony.
--
Everybody is somebody else's weirdo.
-- Dykstra
