Opening BOMless UTF8 files


Filipe

Apr 1, 2008, 10:34:36 AM
to PyScripter
Hi,

When opening a file encoded as UTF8, but without a byte order mark,
PyScripter interprets it as an ANSI file and scrambles the
characters that have no representation in ANSI. Without a BOM, I
believe it might be harder to decide whether the file is encoded
as UTF8 or ANSI (I'm not sure how other editors deal with this), but
I'm finding it a bit annoying.

Cheers,
Filipe

PyScripter

Apr 1, 2008, 11:05:37 AM
to PyScripter
For Python files you have to follow PEP 263 and include an encoding
comment (http://www.python.org/dev/peps/pep-0263/). For other files
there is an IDE option (Tools, Options, IDE Options) to detect utf8 when
you open a file.
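
For anyone wondering how an editor can tell UTF-8 from ANSI without a
BOM: a common heuristic is to try a strict UTF-8 decode and fall back
to the ANSI code page if it fails. The sketch below only illustrates
that idea and is not a description of how PyScripter does it; the
function name guess_encoding and the cp1252 fallback are assumptions
made for the example.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Guess a text encoding from raw bytes (hypothetical helper).

    Assumption: anything that is not valid UTF-8 is in the ANSI code
    page given by `fallback`.
    """
    # A UTF-8 BOM settles the question immediately.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return fallback
    return "utf-8"
```

Pure ASCII text is reported as utf-8, which is harmless because ASCII
decodes identically either way. ANSI text that happens to form valid
UTF-8 sequences would be misdetected, but that is rare in practice.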

Filipe

Apr 1, 2008, 11:09:40 AM
to PyScripter
Just found out that when I add the following line at the top of each
Python file, PyScripter uses the intended encoding:
# -*- coding: utf-8 -*-

I'm working with a lot of files though, I'd rather have some other
solution... :\

--
Filipe

Filipe

Apr 1, 2008, 11:14:55 AM
to PyScripter
Oh, ok, thanks. I thought that option only applied to new files (and I
wasn't getting it to work on existing python files).

Cheers,
Filipe

Rodolfo

Apr 1, 2008, 7:24:06 PM
to PyScripter
Well, in my humble opinion, since we're working with Python, and you
can modify the files you're working with, it's not hard to get the
proper encoding header on a batch of files...

Just write a script that opens each of your files and writes
"# -*- coding: utf-8 -*-" at the top, and you're done.
But still, setting up PyScripter to detect utf-8 files will be the
easiest way, involving no changes to your code ;-)

However, I advise you to always write the encoding on the first (or
second) line of your files from now on.
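
A minimal sketch of such a batch script, assuming the files are
BOM-less UTF-8 and live under one directory. The names
add_coding_header and process_tree are made up for this example; the
regular expression is the one PEP 263 gives for recognising an
existing declaration. Back up your files before running anything like
this.

```python
import codecs
import re
from pathlib import Path

# Pattern from PEP 263 for an encoding declaration on line 1 or 2.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
HEADER = b"# -*- coding: utf-8 -*-"


def add_coding_header(source: bytes) -> bytes:
    """Return source with a utf-8 coding comment added if it has none."""
    # Files with a BOM are already read as UTF-8; leave them alone.
    if source.startswith(codecs.BOM_UTF8):
        return source
    lines = source.splitlines(keepends=True)
    # Only the first two lines may carry the declaration.
    if any(CODING_RE.match(line) for line in lines[:2]):
        return source
    # Reuse the file's own line-ending style.
    newline = b"\r\n" if lines and lines[0].endswith(b"\r\n") else b"\n"
    header = HEADER + newline
    # Keep a shebang on line 1 and put the declaration on line 2.
    if lines and lines[0].startswith(b"#!"):
        first = lines[0]
        if not first.endswith((b"\n", b"\r")):
            first += newline
        return first + header + b"".join(lines[1:])
    return header + source


def process_tree(root: str) -> None:
    """Add the header to every .py file below root that lacks one."""
    for path in Path(root).rglob("*.py"):
        original = path.read_bytes()
        updated = add_coding_header(original)
        if updated != original:
            path.write_bytes(updated)
```

Calling process_tree("path/to/project") would then rewrite only the
files that need the comment. Working on bytes avoids having to guess
the encoding just to insert one ASCII line.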

Sincerely yours,

Rodolfo