E670: Mix of help file encodings within a language


Marco

Sep 25, 2012, 4:32:07 AM
to vim...@googlegroups.com
Hi,

I have a bunch of help files in my ~/.vim/doc directory. Most of
them are UTF-8, a few ASCII. While adding a new file and running
:helptags I get the following error:

E670: Mix of help file encodings within a language

I found http://article.gmane.org/gmane.editors.vim/80084, but the
described solution didn't work for me. The “tags” file stays empty.
If I remove the recently added file, it works fine. The newly added
file is UTF-8 encoded, but so are others.

What can I do to fix this?


Marco


Tony Mechelynck

Sep 25, 2012, 8:04:15 AM
to vim...@googlegroups.com, Marco
On 25/09/12 10:32, Marco wrote:
> Hi,
>
> I have a bunch of help files in my ~/.vim/doc directory. Most of
> them are UTF-8, a few ASCII. While adding a new file and running
> :helptags I get the following error:
>
> E670: Mix of help file encodings within a language
>
> I found http://article.gmane.org/gmane.editors.vim/80084, but the
> described solution didn't work for me. The “tags” file stays empty.
> If I remove the recently added file, it works fine. The newly added
> file is UTF-8 encoded, but so are others.
>
> What can I do to fix this?
>
>
> Marco
>
>
Does the new helpfile have a BOM? If it does, try removing it.

:e ~/.vim/doc/foobar.txt " replacing "foobar" by the filename
:verbose setlocal bomb?
bomb
:setlocal nobomb
:w
:helptags ~/.vim/doc
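
For a directory with many help files, the same BOM check can be done outside Vim. The following is only a rough sketch in Python, assuming the help files are the *.txt files directly inside the doc directory; the function name files_with_bom is made up for this illustration.

```python
import codecs
from pathlib import Path


def files_with_bom(doc_dir):
    """Return the names of *.txt files in doc_dir that start with a UTF-8 BOM."""
    hits = []
    for path in sorted(Path(doc_dir).expanduser().glob("*.txt")):
        with path.open("rb") as fh:
            # A UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
            if fh.read(3) == codecs.BOM_UTF8:
                hits.append(path.name)
    return hits


# Example call (the path is the one used in this thread):
#     print(files_with_bom("~/.vim/doc"))
```

Any file it lists is a candidate for ":setlocal nobomb" followed by ":w".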

If that doesn't work, check your 'encoding':

:verbose set encoding?

Mine is UTF-8, and it sees almost all helpfiles shipped with Vim as
UTF-8 (eval.txt is an exception). But if a file contains even one
byte sequence which is invalid for UTF-8, the file won't be seen as
UTF-8, even if the rest of it is OK.

:setl fenc?
fileencoding=latin1
8g8

This (see :help 8g8) will move the cursor to the next (if any) character
in the file which is invalid for UTF-8. (If the cursor is already on
such a character, it will not move, and neither will it if there is no
such invalid character in the file, but in the latter case you'll get a
beep.)
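
Outside Vim, a roughly comparable check is to try decoding the raw bytes as UTF-8 and report where decoding fails. This is a sketch only: the helper name first_invalid_utf8 is hypothetical, and Python's strict decoder may not agree with 8g8 in every corner case.

```python
def first_invalid_utf8(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that could not be decoded.
        return err.start
    return None


# Example call (hypothetical file name):
#     with open("foobar.txt", "rb") as fh:
#         print(first_invalid_utf8(fh.read()))
```

A result of None means the whole file decodes as UTF-8; a number is the offset of the first offending byte.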


Best regards,
Tony.
--
OMNIVERSAL AWARENESS?? Oh, YEH!! First you need four GALLONS of
JELL-O and a BIG WRENCH!! ... I think you drop th' WRENCH in the JELL-O
as if it was a FLAVOR, or an INGREDIENT ... or ... I ... um ...
WHERE'S the WASHING MACHINES?

Marco

Sep 25, 2012, 8:32:09 AM
to vim...@googlegroups.com
2012-09-25 Tony Mechelynck <antoine.m...@gmail.com>:

Hi Tony!

> Does the new helpfile have a BOM? If it does, try removing it.
>
> :e ~/.vim/doc/foobar.txt " replacing "foobar" by the filename
> :verbose setlocal bomb?

nobomb

> :setlocal nobomb
> :w
> :helptags ~/.vim/doc

E670: Mix of help file encodings within a language

> If that doesn't work, check your 'encoding':
>
> :verbose set encoding?

encoding=utf-8

> Mine is UTF-8, and it sees almost all helpfiles shipped with VIM as
> UTF-8 (eval.txt is an exception). But if a file contains as few as one
> character which is invalid for UTF-8, the file won't be seen as UTF-8
> even if the rest of it is OK
>
> :setl fenc?

fileencoding=utf-8

> 8g8

The cursor does not move. If I set fileencoding=latin1, then it
moves to a non-ascii character, but I guess that's expected.


Marco


Tony Mechelynck

Sep 25, 2012, 11:17:08 AM
to vim...@googlegroups.com, Marco
Well, is there a bomb on another helpfile?

:vimgrep /\%1l/ ~/.vim/doc/*.txt
:setl fenc? bomb? " watch for "utf-8" together with "bomb"
" or for anything other than utf-8 or latin1
:cn|setl fenc? bomb?
:cn|setl fenc? bomb?
:cn|setl fenc? bomb?
etc.

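You may of course use mappings to ease your typing (note <Bar> instead
of |, which would otherwise end the :map command, and the final <CR>):

:map <F2> :cn<Bar>setl fenc? bomb?<CR>
:map <S-F2> :cN<Bar>setl fenc? bomb?<CR>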

Best regards,
Tony.
--
Harvard Law:
Under the most rigorously controlled conditions of pressure,
temperature, volume, humidity, and other variables, the organism will
do as it damn well pleases.

Marco

Sep 25, 2012, 11:58:42 AM
to vim...@googlegroups.com
2012-09-25 Tony Mechelynck <antoine.m...@gmail.com>:

> Well, is there a bomb on another helpfile?
>
> :vimgrep /\%1l/ ~/.vim/doc/*.txt
> :setl fenc? bomb? " watch for "utf-8" together with "bomb"
> " or for anything other than utf-8 or latin1
> :cn|setl fenc? bomb?
> :cn|setl fenc? bomb?
> :cn|setl fenc? bomb?

Vim says they're all utf-8 and nobomb.

I did some further experiments. The result is:

The mentioned error is thrown when “one or more but not all” files
have one or more non-ASCII characters in the *first line*. If all
files have at least one non-ASCII character in the first line, it
works fine. Non-ASCII characters anywhere other than the first line
are not problematic. That seems weird. A bug or a feature?


Marco


Tony Mechelynck

Sep 25, 2012, 12:33:45 PM
to vim...@googlegroups.com, Marco
On 25/09/12 17:58, Marco wrote:
> 2012-09-25 Tony Mechelynck <antoine.m...@gmail.com>:
>
>> Well, is there a bomb on another helpfile?
>>
>> :vimgrep /\%1l/ ~/.vim/doc/*.txt
>> :setl fenc? bomb? " watch for "utf-8" together with "bomb"
>> " or for anything other than utf-8 or latin1
>> :cn|setl fenc? bomb?
>> :cn|setl fenc? bomb?
>> :cn|setl fenc? bomb?
>
> Vim says they're all utf-8 and nobomb.
>
> I did some further experiments. The result is:
>
> The mentioned error is thrown when “one or more but not all” files
> have one or more non-ASCII characters in the *first line*. If all
> files have at least one non-ASCII character in the first line, it
> works fine. Non-ASCII characters anywhere other than the first line
> are not problematic. That seems weird. A bug or a feature?
>
>
> Marco
>
>

I don't know; but the first line is what magically appears under ":help
local-additions" as if it were part of $VIMRUNTIME/help.txt

It should contain the filename and title, as in matchit.txt:

*matchit.txt* Extended "%" matching

and Vim changes stars to bars around the filename in the "Local
additions" list.
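
To make the stars-to-bars step concrete, here is a tiny Python sketch of the transformation as described above. It only illustrates the visible result, not how Vim implements it; the function name local_additions_entry is invented for the example.

```python
import re


def local_additions_entry(first_line: str) -> str:
    """Turn '*name.txt* title' into '|name.txt| title'; other lines pass through."""
    # Only a leading *tag* is rewritten; stars later in the line are left alone.
    return re.sub(r"^\*([^*\s]+)\*", r"|\1|", first_line, count=1)


# Example call:
#     local_additions_entry('*matchit.txt* Extended "%" matching')
```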

Since help.txt must all be in a single encoding (but US-ASCII, Latin1
and UTF-8, and many others, all have identical representations for
codepoints U+0000 to U+007F), it stands to reason that the first lines
of all "locally added" help files must be in a "compatible" encoding.
For instance on a z/OS EBCDIC system they would, I suppose, all be in
some EBCDIC encoding, and compatible with each other but not with ASCII.

Since the $VIMRUNTIME/doc/tags is also regenerated using Vim, the same
criterion applies to it, and this is how eval.txt and map.txt can have a
few Latin1 bytes above 0x7F (but not in the first line), while
options.txt, arabic.txt and hebrew.txt (and maybe others) are in UTF-8
with Greek, Arabic and Hebrew letters (respectively) in UTF-8
representation (but, again, not in the first line). I haven't managed
to display farsi.txt correctly; all I can say is that it seems to be in
some 8-bit encoding which is not Latin1 (I tried iso-8859-6 too, and
even ":view ++enc=farsi", but without success).


Best regards,
Tony.
--
Suddenly, Professor Leibowitz realizes he has come to the seminar
without his duck ...

Christian Brabandt

Sep 25, 2012, 12:34:45 PM
to vim...@googlegroups.com, vim...@googlegroups.com
Yes, Vim checks only the first line for each file.

This error can happen if the first file has no non-ASCII character
in its first line but any of the other files does. (In this case,
Vim expects every other file to have no multi-byte characters in
the first line either.)

This sounds like a bug (so forwarding to vim-dev).

Also, Vim currently only checks whether the first line contains
multi-byte characters or not; it doesn't care whether the encoding
of those multi-byte characters differs between files. I'm not sure
whether Vim should check that.
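
The check described above can be modelled in a few lines of Python. This is a sketch of the described behaviour only, not Vim's actual source: both function names are hypothetical, and "multi-byte" is approximated as "any byte above 0x7F in the first line".

```python
def first_line_is_multibyte(path) -> bool:
    """True if the first line of the file contains any byte above 0x7F."""
    with open(path, "rb") as fh:
        return any(byte > 0x7F for byte in fh.readline())


def mixed_first_lines(paths):
    """Return the paths whose first-line status differs from the first file's."""
    paths = list(paths)
    if not paths:
        return []
    # The first file sets the expectation; every later file must match it.
    expected = first_line_is_multibyte(paths[0])
    return [p for p in paths[1:] if first_line_is_multibyte(p) != expected]
```

Under this model, a non-empty result corresponds to the situation in which :helptags reports E670, and the listed files are the ones whose first line would need changing.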

regards,
Christian

Marco

Sep 25, 2012, 1:08:24 PM
to vim...@googlegroups.com
2012-09-25 Tony Mechelynck <antoine.m...@gmail.com>:

> I don't know; but the first line is what magically appears under ":help
> local-additions" as if it were part of $VIMRUNTIME/help.txt

This might be unrelated to my initial problem: most of my custom
files appear in the list, but some don't. After some testing I found
that a file whose name starts with the same string as another file's
gets ignored. However, I couldn't work out the rule that decides
whether a file is included in the local-additions list or ignored.

Some examples:

alpha-beta.txt
alpha-be.txt
alpha-gamma
alphabeta.txt
alphabe.txt
alphagamma.txt

running :helptags on these and checking the :h local-additions list,
only the following files show up

alpha-beta.txt
alpha-gamma
alphabeta.txt
alphagamma.txt

alpha-be.txt and alphabe.txt get ignored. Why?

> Since help.txt must all be in a single encoding (but US-ASCII, Latin1
> and UTF-8, and many others, all have identical representations for
> codepoints U+0000 to U+007F), it stands to reason that the first lines of
> all "locally added" help files must be in a "compatible" encoding. For
> instance on a zOS EBCDIC system they would, I suppose, be all in some
> EBCDIC encoding, and compatible with each other but not with ASCII.

All my files are either ASCII or UTF-8.


Marco

