Hi! I need to perform ANSI/UNICODE commands in my GVIM. I read many docs on the web about this and set the following:
"------------------------------------------
" .vimrc includes
"----------------------
:set ffs=dos,unix,mac " autosense Dos,Unix,Mac
:set fileencodings=ucs-bom,utf-8,latin1 " autosense coding
" (no fileencoding is set in .vimrc)
"------------------------------------------
" my menu includes
"----------------------
:set fileencoding=latin1<CR><Esc>:set ff=dos<CR>:w!<CR> " ANSI Dos
:set fileencoding=utf-8<CR>:w!<CR><Esc> " UNICODE
"-------------------------------------------
but they don't work correctly (I check the files with another editor)...
1) I open an ANSI file with GVim, I ask ":set fileencoding" and the file appears as "utf-8"
2) conversion from ANSI to UNICODE and back looks right ("converted" in the bottom status line), but refreshing the file in the other editor I see the same coding...
> but they don't work correctly (I check the files with
> another editor)...
> 1) I open an ANSI file with GVim, I ask ":set fileencoding"
> and the file appears as "utf-8"
What is the value of the 'encoding' option?
If 'encoding' is set to utf-8 and you don't set 'fenc', the file
will be opened with the same encoding as the one you set in 'enc',
so 'fenc' should become utf-8.
> 2) conversion from ANSI to UNICODE and back looks
> right ("converted" in the bottom status line), but refreshing
> the file in the other editor I see the same coding...
What do you mean by "the same coding"?
Is the coding always ANSI, or always UNICODE?
Before that, there is something that should be mentioned.
UTF-8 is a variable-length character encoding for Unicode. If your
original file is ANSI-encoded and contains only US-ASCII characters,
then after it is converted to UTF-8 (without BOM) the file should be
byte-for-byte the same (it looks as if nothing was converted). That is
because UTF-8 uses a single octet only for the 128 US-ASCII characters,
and those octets are the same as in the ANSI encoding.
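
To illustrate the point above outside of Vim, here is a minimal Python sketch (Python is used only for illustration, and the sample strings are made up, not taken from Paolo's files). It compares the bytes that latin1 and UTF-8 produce for the same text.

```python
# Sample strings are hypothetical; they are not taken from the thread.
ascii_text = "plain ASCII text\r\n"
accented_text = "caff\u00e8\r\n"  # ends with e-grave, which is above 0x7F

def same_bytes(text):
    """Return True if latin1 and UTF-8 produce identical bytes for text."""
    return text.encode("latin-1") == text.encode("utf-8")

print(same_bytes(ascii_text))     # ASCII only: the two encodings agree
print(same_bytes(accented_text))  # one accented letter: the bytes differ
```

So a "conversion" of an ASCII-only file leaves nothing for another editor to notice, unless a BOM is written.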
Paolo wrote:
> I need to perform ANSI/UNICODE commands in my GVIM.
The procedure is pretty baffling. Generally, by the time you have read the file, it is too late. I used the following code to convert several files a year ago.
I have the following in my vimrc, but I _think_ that this does not matter given the following procedure:
set encoding=utf-8
You would put the following in a file, say convert.vim, and edit it for the names of the files you want. You need to also specify the encoding for reading, and for writing.
" Convert specified files from cp1251 to utf-8.
let files = 'first.txt second.txt third.txt'
for f in split(files)
  exec 'edit ++enc=cp1251 '.f
  exec 'write ++enc=utf-8 '.f
endfor
After saving the above file, open it in Vim (do NOT open any other file), and enter the following to execute the code:
:so %
That "sources" the current file. You should have a COPY of the files you want to convert in the current directory. They will be overwritten.
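
For readers who want to see what such a conversion does to the bytes, here is a rough Python sketch of the same idea (not the Vim script above, and the helper name `reencode` is made up for illustration). The key point is the same: the source encoding has to be stated explicitly when reading, and the target encoding when writing.

```python
def reencode(data, source, target):
    """Decode raw bytes using the source encoding, then encode them
    again using the target encoding."""
    return data.decode(source).encode(target)

# Hypothetical sample: the Russian word "Privet" in Cyrillic letters,
# stored as cp1251 bytes (one byte per letter).
original = "\u041f\u0440\u0438\u0432\u0435\u0442".encode("cp1251")
converted = reencode(original, "cp1251", "utf-8")
print(len(original), len(converted))  # 6 bytes become 12 bytes
```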
Querying ":set", encoding doesn't appear - maybe because I didn't set it in .vimrc...
Querying ":set encoding", the answer is "latin1" (which is correct).
>> refreshing the file on the other editor
>> I see the same coding
> What do you mean by "the same coding"?
Please forget this point:
GVim is latin1, the other editor is ANSI.
GVim converts to utf-8: refreshing the other editor shows "utf-8 w/signature".
GVim converts to latin1: refreshing the other editor DID show utf-8 again...
I don't know why... now it works...
" also
:set fenc=ucs-2le<CR> " for Windows Unicode
(note that it converts to "UTF-8 with signature")...
also, note that :set fenc=latin9 and then :w! doesn't convert...
but anyway I have a sequence which works to Unicode and back to ANSI!
thx to everybody ;)
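
A side note on "with signature": assuming the other editor uses that term for a byte order mark (BOM) at the start of the file, the following small Python sketch shows what those leading bytes look like. The letter "A" is just a sample; for such characters UTF-16LE gives the same bytes as ucs-2le.

```python
import codecs

text = "A"
print(text.encode("utf-8"))      # b'A'
print(text.encode("utf-8-sig"))  # b'\xef\xbb\xbfA'
print(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))  # b'\xff\xfeA\x00'
```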
On Fri, Nov 6, 2009 at 4:31 PM, Paolo Baruffa <win...@people.it> wrote:
> to winterTTr:
> > What's the value for the option "encoding" ?
> querying ":set", encoding doesn't appear - maybe because I didn't set it
> in .vimrc...
> querying ":set encoding", answer is "latin1" (which is correct)
> >> refreshing the file on the other editor
> >> I see the same coding
Maybe I know the reason: it is the order of the entries in your
'fileencodings'.
When a file is read, Vim tries each encoding listed in 'fileencodings'
in turn until it finds one with which the file can be read without
errors. Vim then sets 'fileencoding' to the one it found.
You put "utf-8" before "latin1" in 'fileencodings', as below:
:set fileencodings=ucs-bom,utf-8,latin1 " autosense coding
So when Vim reads a file, it tries utf-8 before latin1, and that
succeeds whenever the file contains no byte sequence that is invalid
in UTF-8 (for example, a file with only ASCII characters).
So for such files 'fileencoding' is always set to utf-8.
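
The try-each-in-order behaviour described above can be imitated roughly in Python. This is only a sketch and not Vim's actual implementation: the function name `guess_fileencoding` is made up, and the "ucs-bom" step here checks only the UTF-8 and UTF-16 byte order marks.

```python
import codecs

def guess_fileencoding(data, candidates=("ucs-bom", "utf-8", "latin1")):
    """Return the first candidate that fits the raw bytes, loosely
    imitating how 'fileencodings' is walked. Not Vim's real code."""
    for name in candidates:
        if name == "ucs-bom":
            # Simplification: only UTF-8 and UTF-16 BOMs are recognised.
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8 (with BOM)"
            if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
                return "utf-16 (with BOM)"
            continue
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this encoding cannot read the bytes, try the next
        return name
    return None

print(guess_fileencoding(b"hello\r\n"))      # ASCII only: utf-8 wins
print(guess_fileencoding(b"caff\xe8\r\n"))   # 0xE8 is invalid UTF-8: latin1
```

An ASCII-only "ANSI" file is therefore reported as utf-8, which matches what Paolo saw in point 1.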
> Please forget this point:
> GVim is latin1, the other editor is ANSI.
> GVim converts to utf-8: refreshing the other editor shows "utf-8 w/signature".
> GVim converts to latin1: refreshing the other editor DID show utf-8 again...
> I don't know why... now it works...
> " also
> :set fenc=ucs-2le<CR> " for Windows Unicode
> (note that it converts to "UTF-8 with signature")...
> also, note that :set fenc=latin9 and then :w! doesn't convert...
> but anyway I have a sequence which works to Unicode and back to ANSI!
> thx to everybody ;)
If the Euro sign is 0x80 then the file is NOT in Latin1 aka ISO-8859-1 (there is no Euro sign in Latin1), and also not in Latin9 aka ISO-8859-15 (where the Euro sign is 0xA4), but it could be Windows-1252 (sometimes known as cp1252), where 0x80 is indeed the Euro sign. In Unicode the Euro sign is assigned to codepoint U+20AC, represented on disk in UTF-8 as 0xE2 0x82 0xAC.
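
The byte values quoted above can be checked outside of Vim. Here is a small Python sketch (Python is used only as a convenient way to look at the bytes):

```python
euro = "\u20ac"  # EURO SIGN

print(euro.encode("cp1252"))       # b'\x80'
print(euro.encode("iso-8859-15"))  # b'\xa4'
print(euro.encode("utf-8"))        # b'\xe2\x82\xac'

try:
    euro.encode("latin-1")
except UnicodeEncodeError:
    print("no Euro sign in ISO-8859-1")
```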
IIUC, Windows-1252 and Latin1 are identical except for 0x80 to 0x9F, which are (nonprinting) control characters in Latin1 and (mostly) printable characters in Windows-1252. Many Windows systems incorrectly call their cp1252 charset "Latin1".
To see the available characters in any 8-bit non-EBCDIC encoding, use (in a gvim with 'encoding' set to UTF-8)
:view ++enc=<encoding> alphabet.txt
on the attached file, replacing <encoding> by the charset's name. You'll see characters 0x20 to 0xFF arranged in order, in 14 lines of 16. Vim may say [Conversion error at line <number>], with the line number of the first line where it couldn't convert, but that just means there are bytes which should never happen in a file coded in the encoding you chose. A "?" (question mark) placeholder appears instead of these characters.
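
The attachment is not reproduced here. Assuming it simply holds the bytes 0x20 to 0xFF, 16 per line, a file like it could be regenerated with a short Python sketch such as the one below. The function name `make_alphabet` is made up, and the exact layout of the original attachment is a guess.

```python
def make_alphabet():
    """Return bytes 0x20..0xFF as 14 lines of 16 bytes each."""
    lines = []
    for start in range(0x20, 0x100, 16):
        lines.append(bytes(range(start, start + 16)))
    return b"\n".join(lines) + b"\n"

if __name__ == "__main__":
    # Written in binary mode so that no encoding is applied to the bytes.
    with open("alphabet.txt", "wb") as f:
        f.write(make_alphabet())
```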
Best regards,
Tony.