Setting encoding for a type of file

Marcio Gil

Nov 10, 2011, 2:06:50 PM
to vim_use
When I edit a DOS batch file (for example), I always need to type
':e ++enc=cp850'.

Can I add an autocommand to my _vimrc file for this? Example:

autocmd FileType dosbatch setlocal fileencoding=cp850
(doesn't work!)

Thanks,

Marcio.

Christian Brabandt

Nov 10, 2011, 3:09:25 PM
to vim_use
Hi Marcio!

Probably the FileType event occurs too late. Try something like this:

autocmd FileType dosbatch :e! ++enc=cp850

regards,
Christian

Tony Mechelynck

Nov 10, 2011, 3:16:33 PM
to vim...@googlegroups.com, Marcio Gil

The FileType autocommand event is too late for setting the
'fileencoding', because at that point the file has already been read.
For the same reason it isn't useful to set that option by means of a
modeline.

Try (untested)

au BufReadPre,BufNewFile *.bat,*.btm,*.sys setlocal fenc=cp850


Best regards,
Tony.
--
The notion of a "record" is an obsolete remnant of the days of the
80-column card.
-- Dennis M. Ritchie

Ben Fritz

Nov 10, 2011, 3:29:21 PM
to vim_use
This doesn't work because changing fileencoding doesn't have any
effect on the characters in the buffer. You actually know how to do
this already: just change your :setlocal command to an :e ++enc
command.

Here's why it fails. Without a specific encoding specified with ++enc,
Vim uses 'fileencodings' (note the 's' at the end) to determine which
encodings to try when reading a file. The first one in the list that
does not result in invalid data is used (roughly; there are some other
considerations, like a BOM for Unicode files). Vim then uses the
determined encoding to interpret the bytes contained in the file while
reading it in, setting the 'fileencoding' option to reflect the
choice. Using this encoding, Vim creates a buffer which is internally
stored in the encoding specified by the 'encoding' option, doing
conversions as necessary between 'fileencoding' and 'encoding'. Once
you have the buffer of text in 'encoding', you can change
'fileencoding' to anything you want with no effect on the buffer
content. The only further effect 'fileencoding' will have, is when
writing the file. When you finally write the file, Vim will do a
conversion from the characters in the buffer (stored as bytes according
to 'encoding') to bytes in your chosen 'fileencoding'.

What :e ++enc=cp850 does, is to tell Vim to short-circuit the encoding
detection, and just use cp850 when reading. Everything else stays the
same.
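
To see this on an already-open file, the commands below can be typed at Vim's command line (or sourced from a script). This is only a sketch for inspection; nothing here is specific to batch files, and cp850 is just the encoding from the question.

```vim
" Sketch only: inspect what Vim decided for the current buffer.
" The global list of encodings tried when reading a file:
set fileencodings?
" The encoding picked for this buffer (also used when writing it):
setlocal fileencoding?
" The encoding used internally for the buffer text:
set encoding?
" Re-read the current file as cp850, skipping detection.
" This fails with E37 if the buffer has unsaved changes;
" :edit! ++enc=cp850 discards them instead.
edit ++enc=cp850
```

After the re-read, `:setlocal fileencoding?` should report cp850, while 'fileencodings' is left untouched.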

While this should work:

autocmd FileType dosbatch e ++enc=cp850

I actually have something a bit more complex (I've removed some stuff
that is irrelevant to your immediate problem, in case some of this is
confusing as-is). I use a different method, by changing
'fileencodings' prior to loading the file, so that Vim automatically
detects my desired fileencoding:

" Don't detect utf-8 without a BOM by default, I don't use UTF-8 normally
" and any files in latin1 will detect as UTF. Detect cp1252 rather than
" latin1 so files are read in correctly. Fall back to latin1 if system does
" not support cp1252 for some reason.
exec 'set fileencodings=ucs-bom,'.s:windows_enc.',latin1'
if has('autocmd')
  augroup fenc_detect
    au!

    " batch files need to use the encoding of the cmd.exe prompt in Windows
    if has('win32') || has('win64')
      " get the cmd.exe encoding by asking for it
      let g:batcp = substitute(system('chcp'), '^\c\s*Active code page: \(\d\+\)\s*[^[:print:]]*$', 'cp\1', '')
      if g:batcp =~? '^cp\d\+$'
        autocmd BufReadPre *.bat exec 'set fileencodings='.g:batcp
        autocmd BufNewFile *.bat exec 'setlocal fileencoding='.g:batcp
      endif
    endif
    " restore default fileencodings after loading the files that use a special
    " value to force specific encodings
    exec 'autocmd BufReadPost *.bat set fileencodings=ucs-bom,'.s:windows_enc.',latin1'
  augroup END
endif

Ben Fritz

Nov 10, 2011, 3:30:34 PM
to vim_use


On Nov 10, 1:06 pm, Marcio Gil <marciom...@bol.com.br> wrote:
I forgot to mention, you could also use the AutoFenc plugin, and
specify the encoding in a comment within the file itself:

http://www.vim.org/scripts/script.php?script_id=2721

Marcio Gil

Nov 10, 2011, 4:26:35 PM
to vim_use
On Nov 10, 6:09 pm, Christian Brabandt <cbli...@256bit.org> wrote:
>
> autocmd FileType dosbatch :e! ++enc=cp850
>
Works, but turns syntax highlighting off.

On Nov 10, 6:16 pm, Tony Mechelynck <antoine.mechely...@gmail.com>
wrote:
>
> au BufReadPre,BufNewFile *.bat,*.btm,*.sys setlocal fenc=cp850
>
Doesn't work.

On Nov 10, 6:29 pm, Ben Fritz <fritzophre...@gmail.com> wrote:
>
> autocmd FileType dosbatch e ++enc=cp850
>
Same as Christian Brabandt's: works, but turns syntax highlighting off.
This works only for DOS batch files, other files are also opened in
cp850.

But the Cygwin Vim doesn't recognize the s:windows_enc variable, so I
will substitute 'cp850' for it.

This works for me:

exec 'autocmd BufReadPre *.bat set fileencodings=ucs-bom,cp850,latin1'

Thank you all.

Marcio.

Marcio Gil

Nov 10, 2011, 4:52:21 PM
to vim_use
On Nov 10, 7:26 pm, Marcio Gil <marciom...@bol.com.br> wrote:
>
> This works for me:
>
> exec 'autocmd BufReadPre *.bat set fileencodings=ucs-bom,cp850,latin1'
>

I put this in my _vimrc:

autocmd BufNewFile,BufReadPre *.bat,*.sys,*.cmd,*.prg,*.ch set

Tony Mechelynck

Nov 10, 2011, 8:13:47 PM
to vim...@googlegroups.com, Marcio Gil

This won't work if you edit a batch file and then some non-batch file
(*.c, *.htm, *.txt, whatever; even if you look at a Vim helpfile) in the
same Vim session. Since 'fileencodings' is a global-only option, it will
still be "ucs-bom,cp850,latin1" (where the latin1 part will never be
used, since it is after cp850 which is 8-bit and therefore cannot give a
"fail" signal), so Vim will treat that second file (if it has no BOM) as
if it were in cp850 which is probably not what you want.

Maybe

au BufNewFile,BufReadPre *
\ set fencs=ucs-bom,utf-8,latin1
au BufNewFile,BufReadPre
\ *.bat,*.sys,*.cmd,*.prg,*.ch
\ set fencs=ucs-bom,cp850

The autocommands will be run in the order they were defined, so that for
these 5 extensions the second one takes precedence. The first one should
be set to the defaults you want to use for all other files.
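
For illustration, the same two rules can be wrapped in an autocommand group so that sourcing the vimrc again does not stack duplicate copies of them. This is only a sketch: the group name fenc_by_ext is made up here, and the default list is simply the one from the example above, not a recommendation.

```vim
" Sketch only. 'fenc_by_ext' is a hypothetical group name.
augroup fenc_by_ext
  " Remove this group's earlier autocommands when the vimrc is sourced again.
  autocmd!
  " Default detection list for every file (adjust to taste).
  autocmd BufNewFile,BufReadPre * set fileencodings=ucs-bom,utf-8,latin1
  " Defined later, so it runs later and wins for these extensions.
  autocmd BufNewFile,BufReadPre *.bat,*.sys,*.cmd,*.prg,*.ch set fileencodings=ucs-bom,cp850
augroup END
```

After loading a file, `:set fileencodings?` and `:setlocal fileencoding?` show which list was active and which encoding was picked.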


Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
214. Your MCI "Circle of Friends" are all Hayes-compatible.

Ben Fritz

Nov 10, 2011, 10:52:30 PM
to vim_use


On Nov 10, 3:26 pm, Marcio Gil <marciom...@bol.com.br> wrote:
> On Nov 10, 6:09 pm, Christian Brabandt <cbli...@256bit.org> wrote:
>
> > autocmd FileType dosbatch :e! ++enc=cp850
>
> Works, but turns syntax highlighting off.
>
> On Nov 10, 6:16 pm, Tony Mechelynck <antoine.mechely...@gmail.com>
> wrote:
>
> > au BufReadPre,BufNewFile *.bat,*.btm,*.sys setlocal fenc=cp850
>
> Doesn't work.
>
> On Nov 10, 6:29 pm, Ben Fritz <fritzophre...@gmail.com> wrote:
>
> > autocmd FileType dosbatch e ++enc=cp850
>
> Same as Christian Brabandt's: works, but turns syntax highlighting off.
>
>

Oops, forgot the "nested" keyword. Try:
autocmd FileType dosbatch nested e ++enc=cp850

See :help autocmd-nested

> But the Cygwin Vim doesn't recognize the s:windows_enc variable, so I
> will substitute 'cp850' for it.

Oops, that's an artifact of my using the same .vimrc on Unix and
Windows. I do something like this, before the code snippet:

if has('unix')
let s:windows_enc = '8bit-cp1252'
else " windows
let s:windows_enc = 'cp1252'
endif

s:windows_enc doesn't exist by default. Setting it to cp850 means ALL
files will be detected with this encoding, if they don't have a BOM.

> This works for me:
>
> exec 'autocmd BufReadPre *.bat set fileencodings=ucs-bom,cp850,latin1'
>

As Tony says, this will set ALL files to cp850, unless they have a
BOM.

The point of my script snippet was:

1. For most files, use ucs-bom to use a Unicode encoding if the file
has a BOM, then try windows-1252, but if the system doesn't recognize
windows-1252, try latin1 (I actually have more autocmds to check for
characters specific to windows-1252 and use latin1 if not present).
2. For *.bat files only, override this to ONLY try the cmd.exe
encoding
3. Restore option (1) after loading dos files, since the option is
global and the fenc is already set appropriately
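
A minimal sketch of those three steps, using a fixed code page instead of asking cmd.exe. The assumptions are loud ones: cp850 as the batch-file encoding, 'ucs-bom,utf-8,latin1' as the everyday list, and the names s:default_fencs and bat_fenc, which are made up for this example.

```vim
" Sketch only. s:default_fencs and bat_fenc are hypothetical names;
" cp850 and the default list are assumptions, adjust for your system.
let s:default_fencs = 'ucs-bom,utf-8,latin1'

" Step 1: the detection list used for most files.
exec 'set fileencodings=' . s:default_fencs

augroup bat_fenc
  autocmd!
  " Step 2: for *.bat only, try nothing but cp850 when reading.
  autocmd BufReadPre *.bat set fileencodings=cp850
  " New batch files have nothing to detect, so set the buffer's encoding directly.
  autocmd BufNewFile *.bat setlocal fileencoding=cp850
  " Step 3: put the global list back once the file has been read.
  " exec expands s:default_fencs at definition time.
  exec 'autocmd BufReadPost *.bat set fileencodings=' . s:default_fencs
augroup END
```

Because the restore happens in BufReadPost, the next non-batch file opened in the same session is detected with the everyday list again.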

Tony Mechelynck

Nov 11, 2011, 4:40:18 AM
to vim...@googlegroups.com, Ben Fritz

You can also use the value 'Windows-1252' (the official name) which is
known by iconv and so should work both on Unix Vim statically linked
with +iconv and on Windows Vim dynamically linked with +iconv/dyn if
iconv.dll or libiconv.dll can be found.

Best regards,
Tony.
--
Mencken and Nathan's Fifteenth Law of The Average American:
The worst actress in the company is always the manager's wife.

Ben Fritz

Nov 11, 2011, 10:59:22 AM
to vim_use


On Nov 11, 3:40 am, Tony Mechelynck <antoine.mechely...@gmail.com>
wrote:
> On 11/11/11 04:52, Ben Fritz wrote:
>
> > Oops, that's an artifact of my using the same .vimrc on Unix and
> > Windows. I do something like this, before the code snippet:
>
> > if has('unix')
> >    let s:windows_enc = '8bit-cp1252'
> > else " windows
> >    let s:windows_enc = 'cp1252'
> > endif
>
> You can also use the value 'Windows-1252' (the official name) which is
> known by iconv and so should work both on Unix Vim statically linked
> with +iconv and on Windows Vim dynamically linked with +iconv/dyn if
> iconv.dll or libiconv.dll can be found.
>

Yeah, I tried that too when I got frustrated it wasn't working.
Neither my Windows PC nor the Solaris system I primarily work on
recognizes "windows-1252", and in fact the Solaris system doesn't
recognize any of "cp1252", "8bit-cp1252", "8bit-windows-1252",
"windows1252", or "8bit-windows1252" either. So I just stuck with this
version for now, and fall back to latin1 on the Solaris system.