Opening a file and encoding


Vincent MAILLE

Sep 19, 2021, 1:08:15 PM
to PyScripter
Hi,

Has the way a file is opened changed? In my code, I open a file with a plain open() call. When I run the code with PyScripter 4, I get this message:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 389: invalid continuation byte...

This does not happen with version 3.6.3. Do you know why?

Thanks,

Vincent


PyScripter

Sep 19, 2021, 1:13:54 PM
to PyScripter
Files without an encoding comment such as:

# -*- coding: utf-8 -*-

are now assumed to be UTF-8 encoded. Python 3 makes the same assumption, whereas Python 2 assumed the Windows text encoding. So if your file contains, say, French characters, you need to save it as UTF-8.

Vincent MAILLE

Sep 19, 2021, 1:35:17 PM
to PyScripter
I have figured out where the problem comes from.

With PyScripter 3.6.3:
>>> import locale
>>> locale.getpreferredencoding()
'cp1252'

With PyScripter 4.0:
>>> import locale
>>> locale.getpreferredencoding()
'UTF-8'

Can I change the configuration?
Vincent

PyScripter

Sep 19, 2021, 1:55:09 PM
to PyScripter
Are you using the same version of Python in both cases above?

It is not PyScripter but Python 3.x that assumes files with no BOM or file encoding comment are UTF-8 encoded. So if your file contained strings with French characters, it would not work well in Python 3.x (say, if you ran the file from the command prompt) unless it was UTF-8 encoded. If you really want to avoid UTF-8 encoding (not sure why), you can still add this as the first line of your files:
 # -*- coding: cp1252 -*-
which is still recognized by both Python and PyScripter.
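
As a rough illustration of how the encoding comment is picked up, here is a minimal sketch that uses the standard library's tokenize.detect_encoding on in-memory source bytes. The helper name source_encoding and the sample byte strings are made up for this example; note that this only concerns how the source file itself is decoded.

```python
import io
import tokenize


def source_encoding(source_bytes: bytes) -> str:
    """Return the encoding Python's tokenizer detects for the given source bytes.

    Hypothetical helper for illustration; it wraps tokenize.detect_encoding.
    """
    encoding, _first_lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# A file with an explicit encoding comment on the first line.
print(source_encoding(b"# -*- coding: cp1252 -*-\nprint('x')\n"))  # cp1252

# A file with no BOM and no encoding comment falls back to UTF-8.
print(source_encoding(b"print('x')\n"))  # utf-8

# A file starting with a UTF-8 BOM.
print(source_encoding(b"\xef\xbb\xbfprint('x')\n"))  # utf-8-sig
```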

Vincent MAILLE

Sep 19, 2021, 5:26:36 PM
to PyScripter
Yes, it is the same version: Python 3.7.6.

I forced the function getpreferredencoding() to return 'cp1252' and wrote this simple code:
=====================================================
 # -*- coding: cp1252 -*-

import locale

print(locale.getpreferredencoding())
f = open('truc.txt', 'r')
print(f)
f.close()
===================================================
I get:
cp1252
<_io.TextIOWrapper name='truc.txt' mode='r' encoding='UTF-8'>

But that is not what is explained here: https://docs.python.org/fr/3.7/library/functions.html?highlight=open#open. I don't understand.

Vincent

PyScripter

Sep 19, 2021, 6:18:23 PM
to PyScripter
I am sorry, I thought the question was about source file encodings.

In PyScripter v4, the way PyScripter gets the output from the interpreter was modified and is now much faster and more reliable (it uses standard output redirection). However, for this to work reliably, the interpreter now runs in UTF-8 mode (see "1. Command line and environment" in the Python 3.9.7 documentation for what this means).
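
To see what UTF-8 mode changes, here is a minimal sketch that starts a child interpreter with the -X utf8 option and reports what it sees. The probe helper is hypothetical, and this is not how PyScripter itself launches the interpreter; it only reproduces the mode from a plain script.

```python
import subprocess
import sys

# Snippet run in the child: print the UTF-8 mode flag and the preferred encoding.
PROBE = (
    "import locale, sys; "
    "print(sys.flags.utf8_mode, locale.getpreferredencoding(False))"
)


def probe(utf8_mode: bool):
    """Run a child interpreter with UTF-8 mode on or off (hypothetical helper).

    Returns (utf8_mode flag, preferred encoding name) as seen by the child.
    """
    flag = "utf8=1" if utf8_mode else "utf8=0"
    result = subprocess.run(
        [sys.executable, "-X", flag, "-c", PROBE],
        capture_output=True, text=True, check=True,
    )
    mode, encoding = result.stdout.split()
    return int(mode), encoding


# With UTF-8 mode on, the preferred encoding is UTF-8 whatever the locale says.
print(probe(True))
# With UTF-8 mode off, the result depends on the system locale (for example cp1252).
print(probe(False))
```

The exact spelling of the encoding name ('UTF-8' or 'utf-8') can differ between Python versions.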

You have to explicitly specify the encoding in the open() call if the file is not UTF-8 encoded. I am afraid this is not currently configurable.
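
A minimal sketch of what that looks like in practice, assuming the file was saved as cp1252. The file name truc.txt is taken from the example above; the sample text and the temporary directory are only there to make the sketch self-contained.

```python
import os
import tempfile

text = "déjà vu"  # sample text; 'é' is the single byte 0xe9 in cp1252

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "truc.txt")

    # Create a cp1252-encoded file to stand in for the existing one.
    with open(path, "w", encoding="cp1252") as f:
        f.write(text)

    # Passing the encoding explicitly works whatever the interpreter's default is.
    with open(path, "r", encoding="cp1252") as f:
        decoded = f.read()

    # Decoding the same bytes as UTF-8 fails, as in the original error message.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        utf8_error = None
    except UnicodeDecodeError as exc:
        utf8_error = exc

print(decoded)
print(type(utf8_error).__name__)
```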

This issue was also discussed here.

PyScripter

Sep 20, 2021, 6:08:24 PM
to PyScripter
By the way, Vincent, it would be nice if you could help a bit with the French translation, which has fallen a bit behind.

PyScripter

Sep 20, 2021, 6:18:31 PM
to PyScripter