UnicodeDecodeError

Thomas Balstrøm

Apr 22, 2021, 4:01:10 PM

to PyScripter

Hi,

A Python 3.6 script in PyScripter 3.6.4 executes fine.

But in PyScripter 4.0.0, I get this error:

File "C:\Users\tb\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone\lib\codecs.py", line 322, in decode

(result, consumed) = self._buffer_decode(data, self.errors, final)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 2527: invalid continuation byte

Apparently I've triggered a wrong decoding, but what should I look for, and where?

Please advise.

Kind regards,

Thomas

PyScripter

Apr 22, 2021, 4:09:39 PM

to PyScripter

Could you please post the full traceback? I would like to see where the call to _buffer_decode is coming from and if possible what data contains.

You could also place a breakpoint at line 322 of codecs.py and debug.

Thomas Balstrøm

Apr 22, 2021, 4:20:26 PM

to PyScripter

Sorry, the full traceback is here:

Traceback (most recent call last):

File "C:\Users\tb\OneDrive - Københavns Universitet\Courses\Python\Scripts36\CoolDictionarySolutionCruiseShipsProject1_TBPaths.py", line 32, in <module>

for line in f:

PyScripter

Apr 22, 2021, 4:59:03 PM

to PyScripter

I think this is caused by the fact that PyScripter 4.0.0 runs Python with the -X utf8 flag.

See "1. Command line and environment" in the Python 3.9.4 documentation for the implications.

It therefore assumes by default that the file you are reading (I assume you are reading a file) is UTF-8 encoded. Apparently the file contains bytes above 127 that are not valid UTF-8. So you need to specify an encoding when you open the file if it is not UTF-8.

such as:

with open('unicode.txt', encoding='cp437') as f:
    for line in f:
        print(repr(line))
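To illustrate the failure and the fix, here is a minimal sketch. The sample bytes are an assumption for illustration, not the contents of the actual file. It also shows that the same byte 0xe6 maps to different characters under different single-byte codecs, so it is worth checking that the decoded text looks right and not only that the error has gone away.

```python
# Hypothetical sample: a short byte string containing 0xe6 (not the real file).
raw = b"bl\xe6k"

# Decoding as UTF-8 fails: 0xe6 starts a multi-byte sequence, but the next
# byte ("k") is not a valid continuation byte.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Single-byte codecs accept every byte, but map 0xe6 to different characters.
decoded = {name: raw.decode(name) for name in ("cp437", "cp1252", "latin-1")}

print(utf8_error)
for name, text in decoded.items():
    print(name, repr(text))
```

If the output under one codec contains the characters you expect (for example Danish letters), that codec is probably the right one for the file.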

You can also use the surrogateescape error handler:

with open(fname, 'r', encoding="ascii", errors="surrogateescape") as f:
    data = f.read()
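As a rough sketch of what surrogateescape does: bytes that cannot be decoded are kept as lone surrogate code points, and they turn back into the original bytes if you write the text out with the same encoding and error handler. The file names and sample bytes below are made up for the example.

```python
import os
import tempfile

# Hypothetical sample data: ASCII text plus two bytes that are not valid ASCII.
raw = b"caf\xe6 \xff"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    copy_path = os.path.join(tmp, "copy.txt")

    with open(path, "wb") as f:
        f.write(raw)

    # Undecodable bytes become surrogates (U+DC80..U+DCFF) instead of raising.
    with open(path, "r", encoding="ascii", errors="surrogateescape", newline="") as f:
        data = f.read()

    # Writing with the same settings restores the original bytes.
    with open(copy_path, "w", encoding="ascii", errors="surrogateescape", newline="") as f:
        f.write(data)

    with open(copy_path, "rb") as f:
        restored = f.read()

print(repr(data))
print(restored == raw)
```

This is mainly useful when you only need to pass the data through unchanged. The escaped characters are not readable text, so if you need the actual letters you still have to open the file with the correct encoding.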

Please note that in recent versions of Python, UTF-8 is the default system encoding.
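If you want to check what a given interpreter is actually using, something like the following may help. It is only a sketch: the function name is made up, and sys.flags.utf8_mode exists only in Python 3.7 and later, hence the getattr fallback.

```python
import locale
import sys


def describe_text_defaults():
    """Collect the settings that influence open() when no encoding is given."""
    return {
        # True when Python runs in UTF-8 mode (-X utf8 or PYTHONUTF8=1).
        "utf8_mode": bool(getattr(sys.flags, "utf8_mode", 0)),
        # The encoding open() falls back to when none is specified.
        "preferred_encoding": locale.getpreferredencoding(False),
        # The default encoding of str.encode() and bytes.decode().
        "default_encoding": sys.getdefaultencoding(),
    }


info = describe_text_defaults()
for key, value in info.items():
    print(key, "=", value)
```

Running this once from each PyScripter version should show whether the two setups differ in the encoding that open() picks by default.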

Thomas Balstrøm

Apr 22, 2021, 5:47:08 PM

to PyScripter

Thank you very much. The encoding='cp437' worked. So, apparently my file isn't UTF-8.

Best regards, Thomas
