Python console in osx : problem with encoding


Pierre-Henri

Sep 5, 2013, 8:40:02 AM
to spyd...@googlegroups.com
Hi,

My computer is running OS X 10.8.4, localized in French (LANG = fr_FR.UTF-8).

In a Python (2.7) console (not within Spyder) launched in a terminal, I get:
>>> s = 'é' ; s; len(s)
'\xc3\xa9'
2

>>> s = u'é'; s; len(s)
u'\xe9'
1

I get the same result in an IPython console within Spyder.

But in a Python console within Spyder, I get this instead:
>>> s = 'é' ; s; len(s)
'\xc3\xa9'
2

>>> s = u'é'; s; len(s)
'\xc3\xa9'
2

I suspected the LANG environment variable was the cause: within Spyder (2.2.1, DMG), os.environ reports en_US for LANG. But when I launch a MacPorts-installed Spyder 2.2.2 from the terminal, where os.environ shows fr_FR.UTF-8 for LANG, I get the same result (strange to me) as the last one quoted above, so that is not the explanation.

So what could explain this difference?

Thanks,

Pierre-Henri

stone...@gmail.com

Sep 8, 2013, 5:32:11 AM
to spyd...@googlegroups.com
Hi Pierre-Henri

You are seeing an inconsistency between the behaviour of 'python2.7' and 'spyder python2.7'.

'\xe9' is the code of "é" in 'latin1'.
'\xc3\xa9' is the code of "é" in 'utf-8'.
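
To make the two codes above concrete, here is a minimal sketch that encodes the same character both ways. The variable names are only illustrative, and the character is written as an escape so that the result does not depend on the console's own encoding.

```python
# -*- coding: utf-8 -*-
# The character "é" written as an escape (code point U+00E9), so this
# snippet behaves the same whatever encoding the console uses.
e_acute = u'\xe9'

utf8_bytes = e_acute.encode('utf-8')      # two bytes: 0xC3 0xA9
latin1_bytes = e_acute.encode('latin-1')  # one byte: 0xE9

print(repr(utf8_bytes), len(utf8_bytes))
print(repr(latin1_bytes), len(latin1_bytes))
print(len(e_acute))  # one character, however it is encoded
```

So a length of 2 for "é" means the value being measured is the UTF-8 byte sequence rather than the single character.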

Try this, so we can see whether a 'latin1' or 'cp1252' is indeed showing up somewhere in your config.

>>> import locale; import sys; print(locale.getdefaultlocale(), locale.getpreferredencoding(), sys.getdefaultencoding())
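
A slightly more verbose variant of the same check is sketched below. The helper name encoding_report is hypothetical, and the extra fields (stdin/stdout encoding, LC_ALL, filesystem encoding) are additions that may help when comparing consoles; they were not part of the request above.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encoding-related settings visible to this interpreter.

    Hypothetical helper, not part of Spyder or the standard library.
    """
    return {
        'LANG': os.environ.get('LANG'),
        'LC_ALL': os.environ.get('LC_ALL'),
        # False: report the current setting without calling setlocale()
        'preferred_encoding': locale.getpreferredencoding(False),
        'default_encoding': sys.getdefaultencoding(),
        'filesystem_encoding': sys.getfilesystemencoding(),
        # Some embedded consoles replace the standard streams with objects
        # that lack an "encoding" attribute, hence getattr with a default.
        'stdin_encoding': getattr(sys.stdin, 'encoding', None),
        'stdout_encoding': getattr(sys.stdout, 'encoding', None),
    }


for name, value in sorted(encoding_report().items()):
    print('%-20s %r' % (name, value))
```

Running it once in each console (terminal Python, Spyder's Python console, IPython) gives output that can be compared line by line.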

Pierre-Henri Jondot

Sep 8, 2013, 6:19:42 AM
to spyd...@googlegroups.com
On Sep 8, 2013, at 11:32, stone...@gmail.com wrote:

> Hi Pierre-Henri
>
> You see an inconsistency between 'python2.7' and 'spyder python2.7' behaviour.
>
> '\xe9' is the code of "é" in 'latin1'.
> '\xc3\xa9' is the code of "é" in unicode 'utf-8'

Yes. My understanding (I might well be wrong) is that when Python displays unicode strings in interactive mode (but not with print, where accents are shown), it writes accented characters using the latin1 encoding, but that does not tell us much about the internal representation of the string.
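
One way to check this understanding: the escape shown for a unicode string is the character's Unicode code point, which for "é" happens to coincide with its Latin-1 byte value. The sketch below uses the unicode_escape codec to reproduce the escaped form, and adds the euro sign as a counter-example with no Latin-1 byte at all. The variable names are illustrative.

```python
# Characters written as escapes so the console encoding plays no role.
e_acute = u'\xe9'   # "é", code point U+00E9
euro = u'\u20ac'    # euro sign, code point U+20AC

# The escaped form mirrors the code point, not an encoding of the string.
print(hex(ord(e_acute)))
print(e_acute.encode('unicode_escape'))
print(euro.encode('unicode_escape'))

# The euro sign cannot be encoded in Latin-1, yet it still has an escape.
try:
    euro.encode('latin-1')
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False
print(euro_fits_latin1)
```

So the '\xe9' in the interactive display describes the character itself rather than a latin1-encoded copy of it.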


> Try this , so we can see if you have indeed a 'latin1' or 'cp1252' showing up somewhere in your config.
>
> >>>import locale;import sys;print(locale.getdefaultlocale(),locale.getpreferredencoding(),sys.getdefaultencoding())


Done. It is mostly a repeat of my remarks about os.environ:

So, in a Python shell in the terminal, I get:
(('fr_FR', 'UTF-8'), 'UTF-8', 'ascii')

In the Python shell within Spyder installed from the DMG (spyder.app):
(('en_US', 'ISO8859-1'), '', 'ISO8859-1')

(Same output in an IPython console, still running spyder.app.)

That might well be the explanation, except that IPython, despite using the same (wrong?) environment, correctly handles the string I type, where I suppose it receives UTF-8 strings. Furthermore, in a MacPorts-installed version of Spyder, run from the terminal (so with a correct environment), I get instead:
(('fr_FR', 'UTF-8'), 'UTF-8', 'UTF-8')

And the behavior there is the same as with spyder.app.

It might be related, or not at all, to the bug Pierre Raybaut wrote about on the WinPython list:
"The fact that Unicode characters tend to be rendered badly when the script has been executed in the current console is a known bug."

By the way, the bug (if I may call it that) only happens when the strings are entered in interactive mode, and only within a Python console, not IPython.
There is no such problem within a module: suppose I import a module with the following function:
def my_string():
    return u'é'

and, in interactive mode, I define:
>>> s = my_string()

then the length of s is 1, as expected, and not the 2 I get when I write s = u'é' instead.
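
The thread does not establish what the Spyder console does with typed input, so the following is only an illustration of how the same two bytes can yield a length of 1 or 2. It assumes the console hands Python the UTF-8 bytes for "é"; the variable names are illustrative.

```python
# The two bytes a UTF-8 terminal sends for "é" (assumption: the console
# passes them through unchanged).
raw = b'\xc3\xa9'

as_utf8 = raw.decode('utf-8')      # correct codec: one character
as_latin1 = raw.decode('latin-1')  # wrong codec: each byte becomes a character

print(len(raw), len(as_utf8), len(as_latin1))
print(as_utf8 == u'\xe9')
```

Under the same assumption, a possible workaround in an affected Python 2 console is to type the string as bytes and decode it explicitly, for example s = 'é'.decode('utf-8').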

Note that I do get the same confusing results with Spyder running on Ubuntu, and since Windows 7 and 8 now use UTF-8, I bet (but can't test) that it is the same with those versions of Windows.

Pierre-Henri Jondot



stone...@gmail.com

Sep 8, 2013, 8:14:11 AM
to spyd...@googlegroups.com
Hi,

Did you try the Spyder IPython Qt console (still under 2.7)?
On a PC, it seems to work a little more "as expected", so maybe on a Mac too.

*** experiment ********
In [1]: s = 'é' ; t=u'é' ; print (s);print (len(s));print (t);print (len(t))
é
2
é
1

In [2]
*** end of experiment ********

Pierre-Henri Jondot

Sep 8, 2013, 8:33:31 AM
to spyd...@googlegroups.com
Hi,

I don't know what you mean by Spyder IPython Qt console, but if you mean an IPython shell from within Spyder, then yes, I confirm it works as expected when entering unicode strings in interactive mode, as I wrote in my previous message. (No problem either when I run IPython from a terminal.)

Pierre-Henri

stone...@gmail.com

Sep 8, 2013, 11:15:27 AM
to spyd...@googlegroups.com
Hi,

Yes, I meant "IPython shell from within Spyder".

Your MacPorts installation of Spyder looks perfect: all 'utf-8'.

Your initial results seem a natural consequence of the mixture of 'ascii', 'ISO8859-1' and '' in the other installations.

stone...@gmail.com

Sep 11, 2013, 12:21:00 PM
to spyd...@googlegroups.com
Side question: did you try the "Anaconda" installation on Mac? Would you recommend it?