Describe the bug
In Vim with French localisation, the localised comments in ~/.viminfo are not properly encoded.
To Reproduce
Detailed steps to reproduce the behavior:
Run $ LANG=fr_FR.UTF-8 vim --clean /tmp/foo.
Do some editing in that file and write it.
Run vim --clean ~/.viminfo.
Move the cursor to line 7.
Behold this trainwreck:
# 'encoding' dans lequel ce fichier a été écrit
*encoding=utf-8
Here are all the problematic lines:
# Ce fichier viminfo a été généré par Vim 8.2.
# Vous pouvez l'éditer, mais soyez prudent.
# 'encoding' dans lequel ce fichier a été écrit
# Dernières chaînes de substitution :
# Historique ligne de commande (chronologie décroissante) :
# Historique chaîne de recherche (chronologie décroissante) :
# Historique expression (chronologie décroissante) :
# Historique ligne de saisie (chronologie décroissante) :
# Historique Ligne de débogage (chronologie décroissante) :
# Liste de sauts (le plus récent en premier) :
# Historique des marques dans les fichiers (les plus récentes en premier) :
Expected behavior
The comments should be encoded properly to look like this:
# 'encoding' dans lequel ce fichier a été écrit
*encoding=utf-8
Environment (please complete the following information):
:echo &encoding prints utf-8.
Additional context
The maintainer of the French localisation seems unresponsive.
Does your 'viminfo' setting include the c flag?
If it doesn't, do you see a difference if you add
set vi+=c
in your vimrc?
Best regards,
Tony.
No, there is no difference. The current encoding and the one used to write ~/.viminfo are the same, utf-8, so there shouldn't be any difference anyway.
If you start Vim with --clean, it implies -i NONE, which means the viminfo file is neither read nor written. You should modify your first step.
Anyway, I have the same issue with LANG=fr_CA.UTF-8 on Kubuntu 20.10. If I open .viminfo and do :set fileencoding?, it prints latin1. If I open the file with :edit ++enc=utf-8 .viminfo, the text looks fine. I then wrote the file, opened it again, and the encoding was utf-8, as it should be.
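To illustrate the symptom outside of Vim, here is a minimal Python sketch, assuming the file on disk really holds UTF-8 bytes and only the reader's choice of encoding differs. It takes one of the comment lines quoted in this thread, encodes it as UTF-8, and decodes the same bytes once as latin1 and once as UTF-8. The variable names are only for illustration.

```python
# One of the localised viminfo comment lines quoted in this thread.
line = "# 'encoding' dans lequel ce fichier a été écrit"

# The bytes as they would sit on disk if the file was written as UTF-8.
raw = line.encode("utf-8")

# Reading those bytes as latin1 maps every byte to one character,
# so each two-byte accented letter becomes two unrelated characters.
misread = raw.decode("latin-1")

# Reading the same bytes as UTF-8 restores the original text.
reread = raw.decode("utf-8")

print(misread)
print(reread)
```

The bytes are identical in both cases; only the interpretation changes, which would be consistent with :edit ++enc=utf-8 making the text look fine again.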
I also tried creating a new viminfo file with the following steps and the encoding of the file is correctly set to utf-8.
mv .viminfo .viminfo_orig
vim -Nu DEFAULTS /tmp/foo
Edit and write that file. Quit.
vim -Nu DEFAULTS .viminfo
Encoding looks fine.
So how did the viminfo file end up with the latin1 encoding?
Related: why are all the files under src/po/ in ISO-8859-1 when Vim is supposed to use UTF-8 internally? This is rather confusing.
Just historic reasons. French fits perfectly in latin1 and the file encoding should be recognized automatically.
It saves a few bytes but that is hardly relevant.
Alright.
After converting ~/.viminfo to UTF-8 via ++enc=utf-8, the comments are no longer improperly encoded.
So I ran a little bisection experiment with a backup of my original ~/.viminfo until I found the culprit:
The ý is what appears to be forcing the whole buffer to be encoded in latin1, which presumably caused the issue.
Except I never pressed that key during that recording.
I am somewhat used to seeing those mysterious <80> (0x80) littering my recordings, but that 0xfd following a <80> is new to me.
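A rough sketch of why a single stray byte pair could flip the detected encoding of the whole file, assuming Vim tries the candidates in 'fileencodings' in order and keeps the first one that reads the file without a conversion error. This is not Vim's detection code: `guess_encoding` is a hypothetical helper, and the sample bytes only mimic the 0x80 0xfd pair described above.

```python
def guess_encoding(data, candidates=("utf-8", "latin-1")):
    """Return the first candidate encoding that decodes data without error."""
    for enc in candidates:
        try:
            data.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding fits")


# A localised comment line, stored as valid UTF-8.
clean = "# Ce fichier viminfo a été généré par Vim 8.2.\n".encode("utf-8")

# The same content followed by a hypothetical 0x80 0xfd pair.
# 0x80 cannot start a UTF-8 sequence, so UTF-8 decoding fails for the whole buffer.
dirty = clean + b"\x80\xfd\n"

print(guess_encoding(clean))
print(guess_encoding(dirty))
```

latin1 accepts every byte value, so it always succeeds as the last candidate, and the accented comments are then displayed through the wrong encoding even though they were written correctly.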
French fits perfectly in latin1
According to this Wikipedia page, it's missing œ, Œ, and Ÿ. The first two are common in French, in words like cœur, sœur, œuf, etc. I think we should switch to UTF-8 for French.
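A quick way to check this claim, sketched in Python under the assumption that Python's `latin-1` codec matches the ISO-8859-1 encoding used for the files under src/po/. The helper name `missing_from_latin1` is made up for this example.

```python
def missing_from_latin1(text):
    """Return the distinct characters of text that latin-1 cannot encode, in order."""
    missing = []
    for ch in text:
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# Words and letters mentioned above.
print(missing_from_latin1("cœur, sœur, œuf, Œ, Ÿ"))

# Accented letters taken from the viminfo comments in this thread.
print(missing_from_latin1("été, décroissante, chaîne"))
```

The first call reports the ligatures and Ÿ as unencodable, while the accented letters from the existing messages all fit, which would explain why the current translation has worked in latin1 so far.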
@ProgMetalSlug you beat me to it.
@brammool I am well aware of the risk, which is why I never edited it. I'm not sure why you bring this up.
There was no manual editing involved at all.