ascii decoding error


C.S.

Jun 14, 2011, 5:45:28 AM
to psychopy-users
Hello,

My experiment breaks during the randomised presentation of words as
soon as there is a word with a non ascii letter like "ä" and I get
error warnings like:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position
1: ordinal not in range(128)

What can I do to fix this?

Greetings,
Carolin

Henrik Singmann

Jun 14, 2011, 6:14:48 AM
to psychop...@googlegroups.com
Hi Carolin,

I don't know how you get the words with ä into your program but I expect that you read them in via an external file (this is how I do it).
Then, you need to do the following.
At the beginning of your .py file import codecs:

import codecs

Then, don't open the file via open() but via codecs.open() and set the appropriate encoding, e.g.:

fileWithWords = codecs.open(fileN, 'r', encoding='utf-8')

(To check and change the encoding of your file you can use, for example, Notepad++. It runs on Windows and is free!)

fileWithWords now works like a usual file object but you have to use unicode to read it, e.g.:

    tmpList = []
    for line in fileWithWords:
        tmpList.append(unicode(line.strip()))

unicode() will also be needed if you want to write these words to a file (but you should not need it to present them with TextStim).
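
For readers on a current Python, the approach above can be sketched as follows. This is only an illustration under a few assumptions: it uses io.open (which takes the same encoding argument as codecs.open), the helper name read_words and the file name words.txt are made up for the demo, and the file is assumed to really be saved as UTF-8.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_words(path, encoding="utf-8"):
    """Return the stripped, non-empty lines of a text file as unicode strings."""
    # read_words is a hypothetical helper, not part of PsychoPy.
    with io.open(path, "r", encoding=encoding) as word_file:
        return [line.strip() for line in word_file if line.strip()]


# Demo: write a small UTF-8 word list to a temporary file, then read it back.
demo_path = os.path.join(tempfile.mkdtemp(), "words.txt")
with io.open(demo_path, "w", encoding="utf-8") as out:
    out.write(u"Wörter\nBär\nHaus\n")

words = read_words(demo_path)
print(words)
```

If the file was saved in a different encoding (for example the Windows "ANSI" code page), the encoding argument has to name that encoding instead, otherwise decoding fails or produces the wrong characters.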

Also have a look at the textStimuli.py example that comes with PsychoPy for an example of 'hardcoded' non-ascii characters.

Perhaps you are also using the builder and not the coder and this does not help at all. But your question is so unspecific that you can not expect to get more specific answers (at least from me).

Best,

Henrik

C.S.

Jun 14, 2011, 9:11:53 AM
to psychopy-users

Hi Henrik,

thank you for your answer. Sorry for being so unspecific. I supposed
it was a general problem.

I use the coder (but mostly code generated by the builder) and the
file starts with # -*- coding: utf-8 -*-, so I thought it should be
able to decode my external .csv file.

I haven't used open() but the usual:

    #set up handler to look after randomisation of trials etc
    trials_2=data.TrialHandler(nReps=1, method=u'random',
        extraInfo=expInfo,
        trialList=data.importTrialList(u'LDT Wörter.csv'))
    thisTrial_2=trials_2.trialList[0]

and then later:

    words=visual.TextStim(win=win, ori=0,
        text=WortART,
        ...)

Is it possible that there is a problem with the file encoding itself?
It works with neither ANSI nor Unicode nor UTF-8 encoding of the
file, but the error messages are different.
E.g. with UTF-8 encoding of the csv file it says:

  File "E:\Carolin\Dissertation\Exp. 2\Vortest\Teil2.py", line 60, in <module>
    exec(paramName+'=thisTrial_2.'+paramName)
  File "<string>", line 1
    WortART=thisTrial_2.WortART
    ^
SyntaxError: invalid syntax

Greetings,
Carolin

Jonathan Peirce

Jun 14, 2011, 10:54:15 AM
to psychop...@googlegroups.com
Hi Carolin,

Yes, more info is useful, especially error tracebacks as you've used
here. The encoding line at the top of the script only declares what
characters can be used within the script file - it doesn't alter the
conversion of characters from other auxiliary files.

It looks to me like you're trying to include non-ascii characters, not
just as strings for your stimuli but as *names* for parameters. That
definitely won't work, because the names get converted into variable
names and python variable names can't include non-ascii. That is, the
top row of parameter names should not include non-ascii. A possible
workaround for this, if you really need unicode param names, is to set

But I've just tested and it looks like unicode isn't supported by either
the python or matplotlib importers for csv files. I'm sure I can hack
something around that issue, but it might take a little while.

WORKAROUND: Can you save your file as an xlsx file instead? e.g. using
OpenOffice? The importer for xlsx DOES support unicode.

Jon


--
Dr. Jonathan Peirce
Nottingham Visual Neuroscience

http://www.peirce.org.uk/



C.S.

Jun 14, 2011, 11:51:40 AM
to psychopy-users
Hello Jon,

I've tried it with xlsx files - thank you for that! - , and though it
doesn't work with my own version of MS Excel, it does on my laptop
with another version (I really have no idea why, but I am glad I can
use these files anyway).

As for the characters in the parameter names, I didn't use any
non-ASCII characters there, and when I had an ANSI-encoded csv file at
first, the experiment would start, but only continue until it reached
a trial with a word containing a non-ASCII character.

But even if I don't have a clue why this didn't work, with the xlsx
files I can run my experiment.

Thank you very much for your support and for providing a very useful
free experimental software!
Carolin

Jonathan Peirce

Jun 14, 2011, 12:06:58 PM
to psychop...@googlegroups.com

OK, and I now have a fix (I think) for the csv files too, if you want to
apply a little hack to PsychoPy:

In psychopy/data.py, around line 620 you'll find something that says::

    #if it looks like a list, convert it
    if type(val)==numpy.string_ and val.startswith('[') and val.endswith(']'):
        exec('val=%s' %val)

replace those lines with::

    #if it looks like a list, convert it
    if type(val)==numpy.string_ and val.startswith('[') and val.endswith(']'):
        exec('val=%s' %unicode(val.decode('utf8')))
    elif type(val)==numpy.string_:#if it looks like a string read it as utf8
        val=unicode(val.decode('utf-8'))

and you should be rocking!
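
The same idea can be tried out on its own, outside PsychoPy. The sketch below is not the code in psychopy/data.py: decode_cell is a hypothetical helper, it is written for Python 3 (where undecoded text arrives as bytes), and it swaps exec for ast.literal_eval, which only accepts plain literals such as lists of numbers or quoted strings.

```python
import ast


def decode_cell(val, encoding="utf-8"):
    """Decode one imported cell and turn list-like text such as '[1, 2]' into a list."""
    # Hypothetical stand-in for the patched lines; assumes the cell is UTF-8 encoded.
    if isinstance(val, bytes):
        val = val.decode(encoding)
    if isinstance(val, str) and val.startswith("[") and val.endswith("]"):
        # literal_eval raises ValueError or SyntaxError if the text is not a plain literal.
        return ast.literal_eval(val)
    return val


print(decode_cell(b"[1, 2, 3]"))
print(decode_cell("Wörter".encode("utf-8")))
```

Like the hack itself, this only works if the csv file really is UTF-8; a cell saved in another encoding will still raise a UnicodeDecodeError at the decode step.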

Jon

C.S.

Jun 16, 2011, 8:17:44 AM
to psychopy-users
Hi Jon,

I tried what you were suggesting, but now I get another error message:

  File "E:\Carolin\Dissertation\Exp. 2\PsychoPy\Dateien für Exp. 2\NurMatheTest_max 20 sec.py", line 42, in <module>
    trialList=data.importTrialList('Instruktion_M_ST.csv'))
  File "C:\Program Files\PsychoPy2\lib\site-packages\psychopy-1.64.00-py2.6.egg\psychopy\data.py", line 615, in importTrialList
    val=unicode(val.decode('utf-8'))
  File "C:\Program Files\PsychoPy2\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf6 in position 5: invalid start byte

Before I changed the source file, in addition to the problem with
exec(paramName+'=thisTrial_2.'+paramName) that I referred to above, a
problem with the saving function appeared:

  File "E:\Carolin\Dissertation\Exp. 2\Vortest\Gesamt Vortest-neu.py", line 126, in <module>
    dataOut=['n','all_mean','all_std', 'all_raw'])
  File "C:\Program Files\PsychoPy2\lib\site-packages\psychopy-1.64.00-py2.6.egg\psychopy\data.py", line 535, in saveAsExcel
    ws.cell(_getExcelCellName(col=colN,row=stimN+1)).value = unicode(self.trialList[stimN][heading])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 5: ordinal not in range(128)

The same thing happened when I was using xlsx files. Do you think this
might have something to do with the way the files are encoded on
Windows/MS Office? With files written with the Mac editor I had no
problems with the same experiment script.
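
One way to test that suspicion is to look at the byte from the tracebacks directly. The snippet below is a sketch under assumptions: the bytes in raw are a made-up example (not the content of the real file), and cp1252 is assumed as the Windows "ANSI" code page of a German system.

```python
# Hypothetical cell content; 0xf6 is the byte reported in the tracebacks.
raw = b"W\xf6rter"

# A lone 0xf6 is not valid UTF-8, so this decode fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# In the Windows-1252 code page the same byte is the letter "ö".
as_cp1252 = raw.decode("cp1252")

# Re-encoding gives bytes that a UTF-8 reader accepts.
as_utf8_bytes = as_cp1252.encode("utf-8")

print(utf8_ok, as_cp1252, as_utf8_bytes)
```

If the real file behaves like this, it was saved in the Windows code page rather than UTF-8, and either re-saving it as UTF-8 or decoding it with the matching encoding should avoid the error.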

Carolin

Jonathan Peirce

Jun 16, 2011, 10:47:33 AM
to psychop...@googlegroups.com
To be honest, I don't know the answer to this either. I wonder if the
character you're using is also outside the UTF-8 character set. Maybe
you could send (off-list if you like?) the file that you're trying to
read in, and also a description of the character you're needing?


C.S.

Jun 20, 2011, 7:08:12 AM
to psychopy-users
Hello Jon,

I think I found out why my files didn't work: the csv files I wrote
with the editor or Excel had invisible non-ASCII characters at the
beginning of the first line; they were only visible when opening the
files with a hex editor.
I guess this was also connected to the saving problem, although I
don't understand how.
My xlsx files were read correctly when I changed the regional settings
of my system from German to English (USA), and with these files I
could also save the data as xlsx.
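
Invisible characters at the very start of a file are often a byte order mark (BOM), which some Windows editors write when saving as UTF-8. Assuming that is what the hex editor showed, the sketch below shows how a BOM ends up glued to the first column name and how the "utf-8-sig" codec removes it. The file content and the column name corrAns are made up for the example.

```python
import codecs
import csv
import io

# Hypothetical file content: a UTF-8 BOM, a header row, and one trial.
raw = codecs.BOM_UTF8 + u"WortART,corrAns\nBär,left\n".encode("utf-8")

# Check for the BOM without a hex editor.
has_bom = raw.startswith(codecs.BOM_UTF8)

# Plain "utf-8" keeps the BOM as an invisible character in the first header name.
plain_header = next(csv.reader(io.StringIO(raw.decode("utf-8"))))

# "utf-8-sig" drops a leading BOM if there is one.
clean_rows = list(csv.DictReader(io.StringIO(raw.decode("utf-8-sig"))))

print(has_bom, repr(plain_header[0]), clean_rows[0]["WortART"])
```

A header name that starts with an invisible character is not a valid variable name, which may be why exec(paramName+'=thisTrial_2.'+paramName) reported a SyntaxError pointing at the very first character of the line.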

Caro