UnicodeDecodeError


Thomas Balstrøm

Apr 22, 2021, 4:01:10 PM
to PyScripter
Hi,

A Python 3.6 script in PyScripter 3.6.4 executes fine.
But in PyScripter 4.0.0, I get this error:

File "C:\Users\tb\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 2527: invalid continuation byte

Apparently I've triggered a wrong decoding, but what should I look for, and where?

Please advise.

Kind regards,
Thomas

PyScripter

Apr 22, 2021, 4:09:39 PM
to PyScripter
Could you please post the full traceback? I would like to see where the call to _buffer_decode is coming from and, if possible, what the data variable contains.
You could also place a breakpoint at line 322 of codecs.py and debug.

Thomas Balstrøm

Apr 22, 2021, 4:20:26 PM
to PyScripter
Sorry, the full traceback is here:
Traceback (most recent call last):
  File "C:\Users\tb\OneDrive - Københavns Universitet\Courses\Python\Scripts36\CoolDictionarySolutionCruiseShipsProject1_TBPaths.py", line 32, in <module>
    for line in f:

PyScripter

Apr 22, 2021, 4:59:03 PM
to PyScripter
I think this is caused by the fact that PyScripter 4.0.0 runs Python with the -X utf8 flag.

Python therefore assumes by default that the file you are reading (I assume you are reading a file) is UTF-8 encoded. Apparently the file contains bytes above 127 (non-ASCII characters) that are not valid UTF-8. So if the file is not UTF-8, you need to specify its encoding when you open it, for example:

with open('unicode.txt', encoding='cp437') as f:
    for line in f:
        print(repr(line))
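A quick way to choose the encoding is to decode a few sample bytes with each candidate codec and compare the results. The sketch below is only an illustration: the sample bytes are made up (they include the 0xe6 byte from the traceback), and try_decodings is a hypothetical helper name. Note that cp437 maps every byte value to some character, so it never raises, but that does not prove the text comes out right. In cp1252 and latin-1 the byte 0xe6 is "æ", which may be the more plausible reading for a Danish file.

```python
# Made-up sample bytes for illustration; 0xe6 is the byte from the traceback.
SAMPLE = b"Balstr\xf8m, K\xe6rg\xe5rd"


def try_decodings(data, codecs=("utf-8", "cp1252", "latin-1", "cp437")):
    """Return {codec name: decoded text, or an error note if decoding fails}."""
    results = {}
    for name in codecs:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError as exc:
            results[name] = "<error: {}>".format(exc.reason)
    return results


if __name__ == "__main__":
    for name, text in try_decodings(SAMPLE).items():
        # ascii() keeps the output printable on any console encoding.
        print("{:8} {}".format(name, ascii(text)))
```

If one codec gives readable text for the lines you know, that is the one to pass as encoding= to open().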

You can also use the surrogateescape error handler:

with open(fname, 'r', encoding="ascii", errors="surrogateescape") as f:
    data = f.read()
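To illustrate what surrogateescape does, here is a minimal sketch using made-up sample bytes (not the actual file contents). Each undecodable byte becomes a lone surrogate code point in the range U+DC80 to U+DCFF, and encoding with the same handler restores the original bytes. The resulting string cannot be printed or written as UTF-8 directly, so this is more of a way to carry the bytes through unchanged than a way to get readable text.

```python
# Made-up sample bytes for illustration; 0xe6 and 0xe5 are not valid ASCII.
raw = b"K\xe6rg\xe5rd"

# Undecodable bytes are mapped to lone surrogates: 0xe6 becomes U+DCE6.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same error handler restores the original bytes exactly.
restored = text.encode("ascii", errors="surrogateescape")

if __name__ == "__main__":
    # ascii() avoids an encoding error when printing the lone surrogates.
    print(ascii(text))
    print(restored == raw)
```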

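Please note that Python 3 always uses UTF-8 as its default string encoding (sys.getdefaultencoding()). Without -X utf8, however, open() falls back to the locale's preferred encoding when no encoding is given.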
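If you want to check what a given interpreter session actually uses, something along these lines should work. It is a sketch, and describe_default_encoding is a hypothetical helper name. sys.flags.utf8_mode only exists from Python 3.7 onwards, hence the getattr fallback.

```python
import locale
import sys


def describe_default_encoding():
    """Collect the settings that decide how open() decodes text by default."""
    return {
        # 1 when Python was started with -X utf8 or PYTHONUTF8=1 (Python 3.7+).
        "utf8_mode": getattr(sys.flags, "utf8_mode", 0),
        # Encoding open() uses when no encoding= argument is given.
        "open_default": locale.getpreferredencoding(False),
        # Default for str.encode() and bytes.decode(); 'utf-8' on Python 3.
        "str_default": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for key, value in describe_default_encoding().items():
        print("{}: {}".format(key, value))
```

Running this once inside PyScripter 3.6.4 and once inside 4.0.0 should show whether the two versions really differ in the default that open() picks up.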

Thomas Balstrøm

Apr 22, 2021, 5:47:08 PM
to PyScripter
Thank you very much. The encoding='cp437' worked. So, apparently, my file isn't UTF-8.
Best regards, Thomas
