[vim/vim] MacRoman-encoded file opened as Latin1 (Issue #10992)

Lifepillar

Aug 27, 2022, 7:55:19 AM
to vim/vim, Subscribed

Steps to reproduce

  1. Open the attached file in Vim with vim --clean xyz.applescript. The file has been saved with Mac OS Roman encoding.
  2. Run :se fenc?

The output is fileencoding=latin1.

A workaround is to open the file with an explicit encoding (++enc=macroman). Other editors, however, detect the encoding correctly (I have tried with BBEdit), so I would expect Vim to be able to do the same.

Note that, due to the wrong detection, some characters are not displayed correctly. Compare the default behavior:
[screenshot: opened-as-latin1]
with :e ++enc=macroman xyz.applescript:
[screenshot: opened-as-macroman]

Expected behaviour

I expected the buffer's fileencoding to be set to macroman.

Version of Vim

9.0.0252

Environment

OS: macOS 12.5.1
Terminal: Apple Terminal.app
$TERM: xterm-256color
shell: ZSH 5.8.1

Logs and stack traces

No response


Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
Message ID: <vim/vim/issues/10992@github.com>

Lifepillar

Aug 27, 2022, 7:56:48 AM
to vim/vim, Subscribed

Oops, I haven't attached the file. Here it is:
xyz.applescript.gz


Message ID: <vim/vim/issues/10992/1229178627@github.com>

Bram Moolenaar

Aug 27, 2022, 12:25:48 PM
to vim/vim, Subscribed

MacRoman is an 8-bit encoding. Vim cannot guess the encoding just by looking at the text: it could be any 8-bit encoding. You can set 'fileencodings' to use "macroman" instead of latin1 as the default 8-bit encoding. But that is likely to cause more trouble than it solves, unless you edit a lot of MacRoman files.
Not sure why other editors would open the file as MacRoman. Either they guess or perhaps they use some metadata?
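
To illustrate why the bytes alone do not settle the question, here is a minimal Python sketch. It is not taken from Vim or from the attached file: the sample bytes and the helper name decode_both are made up for illustration. The same byte string decodes without error under both encodings and only the resulting characters differ.

```python
def decode_both(raw):
    """Hypothetical helper: decode the same bytes as MacRoman and as Latin-1."""
    return {
        "mac_roman": raw.decode("mac_roman"),
        "latin-1": raw.decode("latin-1"),
    }


# Made-up sample: 0x8E is "é" in MacRoman but a C1 control character in Latin-1;
# 0xC2 is "¬" in MacRoman but "Â" in Latin-1.
sample = b"caf\x8e \xc2"
results = decode_both(sample)

for name, text in results.items():
    # Neither decode raises, so a decoding error cannot be used to pick one.
    print(f"{name}: {text!r}")
```

Because every byte value maps to some character in both encodings, an editor has to fall back on a configured default or on content heuristics.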


Message ID: <vim/vim/issues/10992/1229222317@github.com>

Lifepillar

Aug 27, 2022, 4:23:20 PM
to vim/vim, Subscribed

Closed #10992 as completed.


Message ID: <vim/vim/issue/10992/issue_event/7272501040@github.com>

Lifepillar

Aug 27, 2022, 4:23:20 PM
to vim/vim, Subscribed

Ah, I see. Yes, BBEdit uses some heuristics based on the text content. No metadata, AFAICS. For instance, if each byte in the file can be interpreted as ASCII, it sets the encoding to ASCII. But now that you have asked me, I have double-checked BBEdit's settings, and, in fact, it uses Mac OS Roman as a fallback for when the encoding can't be guessed. If I change the fallback to ISO Latin 1, it opens the file as ISO Latin 1. So, it's not as smart as I had thought, and no smarter than Vim.
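
The behavior described above (an ASCII check, then a configurable fallback) can be sketched roughly as follows. This is only an illustration and not BBEdit's or Vim's actual logic: the function name guess_encoding is hypothetical, and the strict UTF-8 check in the middle is an added assumption that is not mentioned in the thread.

```python
def guess_encoding(raw, fallback="mac_roman"):
    """Hypothetical sketch: content heuristics first, then a configured fallback."""
    # If every byte is below 0x80, the content is plain ASCII.
    if all(b < 0x80 for b in raw):
        return "ascii"
    # Assumption (not from the thread): try strict UTF-8 before falling back.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Nothing in the bytes distinguishes one 8-bit encoding from another,
        # so the answer here is whatever fallback the user configured.
        return fallback


print(guess_encoding(b"plain text"))
print(guess_encoding(b"caf\x8e"))
print(guess_encoding(b"caf\x8e", fallback="latin-1"))
```

Changing the fallback argument changes the result for the same input, which matches the observation that switching the editor's fallback to ISO Latin 1 makes it open the file as ISO Latin 1.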


Message ID: <vim/vim/issues/10992/1229259922@github.com>
