Menu translation & wrong encoding in Windows 7


Mojca Miklavec

Dec 2, 2009, 4:19:14 AM
to vim...@vim.org
Hello,

I submitted the Slovenian Vim menu translation a while ago, while I
was still using XP, and the encoding worked properly back then. (I then
didn't use Windows for ages, and menu translations don't work with Aqua
Vim at all.)

Now I tried it on Windows 7 and discovered that the language is set to
Slovenian_Slovenia.1250. That might be a bit weird. The native
encoding used to be cp1250 indeed (at least in XP), but it could be
that Windows 7 now tries to use UTF-8 whenever possible ... but I don't
really know.

The problem is that I now get the translation
   Pomoé (eacute)
instead of
   Pomoč (ccaron)
for "Help". S and Z with caron (šž) are missing completely, so a
large number of menu entries come out wrong.

Some deeper insight: in cp1250 the characters Č and č are located at
C8 and E8. In ISO-8859-1 (or Unicode, for that matter) E with grave
sits at the same locations (maybe I need to check again once I'm on
Windows whether the character I see is egrave or eacute). The
characters Š/š and Ž/ž are at 8A/9A and 8E/9E. All four of those
positions hold no printable character in Latin1 and Unicode (they are
C1 control codes), which might be a reasonable explanation of the
behaviour I'm observing.
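
The byte positions mentioned above can be checked with a short Python
sketch. It assumes only Python's standard cp1250 and latin-1 codecs; the
helper names cp1250_seen_as_latin1 and describe are made up for this
illustration and have nothing to do with Vim's own code.

```python
import unicodedata


def cp1250_seen_as_latin1(text: str) -> str:
    """Encode text as cp1250, then misread the bytes as Latin-1."""
    return text.encode("cp1250").decode("latin-1")


def describe(ch: str) -> str:
    """Return the cp1250 byte of ch and what Latin-1 shows for that byte."""
    raw = ch.encode("cp1250")
    shown = raw.decode("latin-1")
    # C1 control codes have no Unicode name, so fall back to a label.
    name = unicodedata.name(shown, "unnamed control code")
    return f"{ch}: cp1250 byte 0x{raw[0]:02X} -> Latin-1 U+{ord(shown):04X} {name}"


if __name__ == "__main__":
    for ch in "ČčŠšŽž":
        print(describe(ch))
    print(cp1250_seen_as_latin1("Pomoč"))
```

Under these assumptions a cp1250 č (byte E8) misread as Latin-1 comes out
as e with grave rather than e with acute, while the bytes for š and ž fall
into the C1 control range, which has no printable glyph.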

I didn't try to modify any setting. I'm just using the defaults of
what gvim 7.2 for Windows provides. I tried to rename files
afterwards, but with zero success.

It might be a problem in either the source code or the script that
takes care of choosing the right menu encoding. The English version
doesn't suffer at all, and most other menu translations are in Latin1
anyway, so nobody using those translations should notice the problem
even if it is more widespread. The only languages apart from Slovenian
sharing the same encoding are Czech/Slovak/Polish.

Does anyone have a hint how to solve this problem? Does it work OK for
Czech/Slovak/Polish users on Windows 7?

Thanks a lot,
   Mojca

PS: I already had a related problem years ago. I read at
http://vimdoc.sourceforge.net/htmldoc/version7.html:

BUG FIXES *bug-fixes-7*
When $LANG is "sl" for slovenian, the slovak menu was used, since "slovak"
starts with "sl".

Could this be related, just in another way?
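
The quoted bug fix describes a prefix match picking the wrong language.
A minimal Python sketch of that failure mode follows; the candidate list
and both lookup functions are hypothetical and are not Vim's actual
matching code.

```python
# Hypothetical list of language names, in the order they would be tried.
LANGUAGE_NAMES = ["slovak", "slovenian", "polish", "czech"]

# Hypothetical explicit table from ISO 639-1 codes to language names.
CODE_TO_NAME = {"sk": "slovak", "sl": "slovenian", "pl": "polish", "cs": "czech"}


def first_prefix_match(lang, names=LANGUAGE_NAMES):
    """Naive lookup: return the first name that starts with lang."""
    for name in names:
        if name.startswith(lang):
            return name
    return None


def exact_code_match(lang):
    """Safer lookup: resolve the two-letter code through an explicit table."""
    return CODE_TO_NAME.get(lang)


print(first_prefix_match("sl"))  # slovak
print(exact_code_match("sl"))    # slovenian
```

The point of the sketch is only that "sl" is a prefix of both names, so
any prefix-based lookup depends on the order in which candidates are
tried.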

Mojca Miklavec

Dec 2, 2009, 5:04:12 AM
to vim...@vim.org
Hello,

I have now tested the Polish version (I switched the system language
to Polish) and it shows exactly the same problem (screenshot attached
- hopefully 31k is still within the allowed limits).

Mojca
gvim_encoding2.png

Mojca Miklavec

Dec 2, 2009, 6:18:30 AM
to vim...@vim.org
On Wed, Dec 2, 2009 at 10:19, Mojca Miklavec wrote:
> Hello,
>
> I have submitted Slovenian Vim menu translation a while ago while I
> was still using XP (and then didn't use Windows for ages; menu
> translations don't work with Aqua Vim at all) and the encoding worked properly.
>
> Now I tried it on Windows 7 and discovered that the language is set to
> Slovenian_Slovenia.1250. That might be a bit weird. The native
> encoding used to be cp1250 indeed (at least in XP), but it could be
> that Windows 7 now
> tries to use utf-8 whenever possible ... but I don't really know.

I'm sorry for flooding the mailing list. I have now figured out
that I should be able to change the language with
gvim --cmd "lang Slovenian_Slovenia.65001"
where 65001 is the Windows code page for UTF-8. This still means that
the "system-wide encoding for applications that are not UTF-8 aware" is
cp1250, though; only gvim is told to use the 65001 encoding. I had to
copy the file menu_sl_si.utf-8.vim to menu_slovenian_slovenia.65001.vim
to be able to set that encoding at all, but gvim nevertheless
seems to send cp1250-encoded data to Windows, while Windows interprets
that data as if it were proper UTF-8.
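
The mismatch described above (cp1250 bytes handed to a consumer that
expects UTF-8) can be reproduced outside Vim. The following Python sketch
only illustrates that kind of mismatch and makes no claim about what gvim
or Windows do internally; the CODEPAGE_TO_CODEC table and the helper
names are assumptions made for the example.

```python
# Windows code page numbers from this thread mapped to Python codec names.
CODEPAGE_TO_CODEC = {"1250": "cp1250", "65001": "utf-8"}


def codec_for_locale(locale_name: str) -> str:
    """Return the codec implied by a name like 'Slovenian_Slovenia.65001'."""
    _, _, codepage = locale_name.rpartition(".")
    return CODEPAGE_TO_CODEC[codepage]


def reinterpret(text: str, sent_as: str, read_as: str) -> str:
    """Encode text with one codec and decode the bytes with another."""
    return text.encode(sent_as).decode(read_as, errors="replace")


sender = codec_for_locale("Slovenian_Slovenia.1250")
receiver = codec_for_locale("Slovenian_Slovenia.65001")

# cp1250 bytes read as UTF-8: the byte 0xE8 is not a complete UTF-8 sequence,
# so the decoder substitutes the replacement character U+FFFD.
print(reinterpret("Pomoč", sender, receiver))
# The opposite mix-up: UTF-8 bytes read as cp1250 give two wrong letters.
print(reinterpret("Pomoč", receiver, sender))
```

Either direction corrupts the accented letter, which is why both sides
have to agree on one encoding for the menu text.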

This seems like a "buglet" buried deep down in the source code. I'm
surprised that no Czech/Slovak/Polish users have complained so far.

Mojca

PS: a possibly related problem that I had in the past (sorry, I don't
know where the official archive is):
http://markmail.org/message/5onmtz33tfohtvfz
Subject: vim + win + utf-8 => I'm lost
Date: Aug 4, 2005 5:23:05 pm

Tony Mechelynck

Dec 30, 2009, 11:36:48 PM
to vim...@googlegroups.com, Mojca Miklavec

Hello Mojca,

I'm sorry no one answered your posts in almost one month (or only by
breaking the thread).

To see which locale the OS passes to Vim, load it as

gvim -N -u NONE

and then type

:language

This will show you all parts of the locale: LC_MESSAGES for menus &
messages translations, LC_CTYPE for character encoding, LC_TIME for
timestamp format, and a bunch of others which I don't think Vim uses.
(These are Unix-like names but I remember from my years on XP that
they're also used in Vim for Windows.)

Windows codepage 1250 is a "Central European" encoding which is OK for
formerly Czechoslovak countries and for those formerly Yugoslav
countries which use the Latin alphabet; but of course Vim and the OS
must agree on which encoding is to be used to transmit keystrokes from
the keyboard to Vim and, in console Vim, from Vim to the screen. (gvim
displays its text on its own graphical screen without passing _text_ to
the OS for displaying.)

Yes, if 65001 is the Windows name for UTF-8, then you could indeed copy
the menu (and helpfiles) with names ending in .utf-8 or .utf-8.vim etc.
to files of the same name with the .utf-8 replaced by .65001 (Why can't
Windows use sl_SI.UTF-8 like Unix does? I suppose Bill Gate$$$ wants to
keep his users captive and prevent them from escaping to Unix/Linux. Oh
well...)

Beware though that any file in the $VIMRUNTIME tree can be modified or
erased without warning any time you upgrade Vim and/or its runtime
files, so when release 7.3 (or, who knows?, 8.0) gets published, be sure
to copy your files again. On Linux you could set up
$VIM/vimfiles/menu_slovenian_slovenia.65001.vim as a soft link to
../latest/menu_sl_si.utf-8.vim with $VIM/latest as a soft link to the
current $VIMRUNTIME but of course AFAIK soft links are foreign to
Windows so I suppose you'll have to resort to copying. Just be sure that
upgrading Vim doesn't remove your copies.

Or else, you could set 'encoding' to UTF-8 if it isn't already, before
loading or reloading the menus. Maybe the following (untested) could
work, at the very top of your vimrc, and regardless of the OS locale:

    set nocompatible
    if has('multi_byte')                " no use to try if Vim can't do it
        if &enc !~? '^u'                " already Unicode?
            if &tenc == ''              " don't clobber keyboard encoding
                let &tenc = &enc        " the "old" value
            endif
            set enc=utf-8
        endif
        set fencs=ucs-bom,utf-8,cp1250  " how to detect
                                        " existing files' encodings
        " or:
        "   set fencs=ucs-bom,utf-8,default,iso-8859-2
        " etc. (see :help 'fileencodings')
        if has('multi_lang')
            " I think the following commented-out line is unnecessary
            " runtime delmenu.vim
            if exists(':try') == 2
                try
                    lang messages sl_si.utf-8
                catch
                    silent! lang messages
                        \ Slovenian_Slovenia.65001
                endtry
            else
                silent! lang messages sl_si.utf-8
                if v:lang != 'sl_si.utf-8'
                    silent! lang messages
                        \ Slovenian_Slovenia.65001
                endif
            endif
        endif
    endif
    runtime vimrc_example.vim
    " add additional customizations here

(the vimrc_example.vim does, among others, ":filetype plugin indent on"
and ":syntax on", and the menus are sourced, in gvim, as a result of that.)


Best regards,
Tony.
--
Ambition is a poor excuse for not having sense enough to be lazy.
-- Charlie McCarthy

Mojca Miklavec

Jan 1, 2010, 8:46:24 AM
to Tony Mechelynck, vim...@googlegroups.com

No problem.

> To see which locale the OS passes to Vim, load it as
>
>        gvim -N -u NONE
>
> and then type
>
>        :language
>
> This will show you all parts of the locale: LC_MESSAGES for menus & messages
> translations, LC_CTYPE for character encoding, LC_TIME for timestamp format,
> and a bunch of others which I don't think Vim uses. (These are Unix-like
> names but I remember from my years on XP that they're also used in Vim for
> Windows.)

Current language:
"LC_COLLATE=Slovenian_Slovenia.1250;LC_CTYPE=C;LC_MONETARY=Slovenian_Slovenia.1250;LC_NUMERIC=C;LC_TIME=Slovenian_Slovenia.1250"

BUT!!!

The problem is that, contrary to probably every single UNIX-like
machine, the encoding here doesn't mean "this is the encoding used by
the system", but rather "this is the encoding that programs that don't
support UTF-8 should be using". At least that's true for Windows 7 and
most probably Vista as well. I don't know for sure about XP.
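
To compare such settings across machines, the :language output quoted
above can be split into its categories. This is a small Python sketch;
the function name parse_language_output is hypothetical, and the input
format is assumed to be exactly the NAME=value;NAME=value string shown in
this message.

```python
def parse_language_output(line: str) -> dict:
    """Split 'LC_X=value;LC_Y=value' into {category: (locale, codepage)}."""
    result = {}
    for item in line.strip().strip('"').split(";"):
        category, _, value = item.partition("=")
        locale, _, codepage = value.partition(".")
        # Values such as "C" carry no code page; store None for those.
        result[category] = (locale, codepage or None)
    return result


current = (
    "LC_COLLATE=Slovenian_Slovenia.1250;LC_CTYPE=C;"
    "LC_MONETARY=Slovenian_Slovenia.1250;LC_NUMERIC=C;"
    "LC_TIME=Slovenian_Slovenia.1250"
)
for category, (locale, codepage) in parse_language_output(current).items():
    print(f"{category:12} locale={locale} codepage={codepage}")
```

In this output LC_CTYPE and LC_NUMERIC are plain C, and code page 1250
appears only in the collate, monetary and time categories.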

> Windows codepage 1250 is a "Central European" encoding which is OK for
> formerly Czechoslovak countries and for those formerly Yugoslav countries
> which use the Latin alphabet; but of course Vim and the OS must agree on
> which encoding is to be used to transmit keystrokes from the keyboard to Vim
> and, in console Vim, from Vim to the screen. (gvim displays its text on its
> own graphical screen without passing _text_ to the OS for displaying.)
>
> Yes, if 65001 is the Windows name for UTF-8, then you could indeed copy the
> menu (and helpfiles) with names ending in .utf-8 or .utf-8.vim etc. to files
> of the same name with the .utf-8 replaced by .65001 (Why can't Windows use
> sl_SI.UTF-8 like Unix does? I suppose Bill Gate$$$ wants to keep his users
> captive and prevent them from escaping to Unix/Linux. Oh well...)

Every operating system has its own specifics. I wouldn't dare
to put it that way: Windows developers would complain that Unix and
Mac are weird, and vice versa. The Windows version simply needs some
special care, and so does the Mac version (which doesn't even allow
menu translations at the moment, which is really a pity).

My very honest opinion is that gvim works *BEST* on Windows. I know
that's a kind of paradox, but to my taste it really has the nicest
user interface (including automatic support for ctrl+c/x/v/s, very
smooth mouse interactions etc.).

> Beware though that any file in the $VIMRUNTIME tree can be modified or
> erased without warning any time you upgrade Vim and/or its runtime files, so
> when release 7.3 (or, who knows?, 8.0) gets published, be sure to copy your
> files again.

My intention is not to fix my own copy of Vim (I switched to a
different OS anyway), but to make it work by default for every
"Central European" (or Slovenian) user. This means that if I was to
make any change, I would request to fix it upstream, rather than
fixing it on my computer only.

> On Linux you could set up
> $VIM/vimfiles/menu_slovenian_slovenia.65001.vim as a soft link to
> ../latest/menu_sl_si.utf-8.vim with $VIM/latest as a soft link to the
> current $VIMRUNTIME but of course AFAIK soft links are foreign to Windows so
> I suppose you'll have to resort to copying.

There are plenty of examples in $VIMRUNTIME that don't require
copying, but rather
source <sfile>:p:h/menu_de_de.latin1.vim
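
Such a one-line stub could also be generated rather than written by hand.
The Python sketch below writes a file named after the Windows locale that
sources an existing UTF-8 menu file; the target directory and the helper
write_menu_stub are assumptions for illustration, and the file names are
the ones mentioned earlier in this thread.

```python
from pathlib import Path
import tempfile


def write_menu_stub(lang_dir: Path, locale_name: str, target: str) -> Path:
    """Create menu_<lowercased locale>.vim that sources an existing menu file."""
    stub = lang_dir / f"menu_{locale_name.lower()}.vim"
    # Inside Vim, <sfile>:p:h expands to the directory of the stub itself.
    stub.write_text(f"source <sfile>:p:h/{target}\n", encoding="ascii")
    return stub


# Demo in a throwaway directory instead of a real Vim runtime tree.
with tempfile.TemporaryDirectory() as tmp:
    path = write_menu_stub(Path(tmp), "Slovenian_Slovenia.65001",
                           "menu_sl_si.utf-8.vim")
    print(path.name)
    print(path.read_text(encoding="ascii"), end="")
```

Because the stub refers to its own directory, it has to sit next to the
menu file it sources.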

> Or else, you could set 'encoding' to UTF-8 if it isn't already, before
> loading or reloading the menus. Maybe the following (untested) could work,
> at the very top of your vimrc, and regardless of the OS locale:

I'll test it. My problem has been solved by playing with encoding
settings (a file that worked in XP stopped working in Windows 7). The
default settings work fine, but use cp1250 even though Windows 7
should be able to work with UTF-8 natively. In my opinion it would be
much better to use UTF-8 (Unicode) by default in newer Windows
versions.

Mojca
