
UnicodeEncodeError during repr()


gb345

Apr 18, 2010, 9:46:46 PM


I'm getting a UnicodeEncodeError during a call to repr:

Traceback (most recent call last):
  File "bug.py", line 142, in <module>
    element = parser.parse(INPUT)
  File "bug.py", line 136, in parse
    ps = Parser.Parse(open(filename,'r').read(), 1)
  File "bug.py", line 97, in end_item
    r = repr(CURRENT_ENTRY)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u3003' in position 0: ordinal not in range(128)

This is what CURRENT_ENTRY.__repr__ looks like:

    def __repr__(self):
        k = SEP.join(self.k)
        r = SEP.join(self.r)
        s = SEP.join(self.s)
        ret = u'\t'.join((k, r, s))
        print type(ret)  # prints "<type 'unicode'>", as expected
        return ret

If I "inline" this CURRENT_ENTRY.__repr__ code so that the call to
repr(CURRENT_ENTRY) can be bypassed altogether, then the error
disappears.

Therefore, it is clear from the above that the problem, whatever
it is, occurs during the execution of the repr() built-in *after*
it gets the value returned by CURRENT_ENTRY.__repr__. It is also
clear that repr is trying to encode something using the ascii
codec, but I don't understand why it needs to encode anything.

Do I need to do something special to get repr to work strictly
with unicode?

Or should __repr__ *always* return bytes rather than unicode? What
about __str__ ? If both of these are supposed to return bytes,
then what method should I use to define the unicode representation
for instances of a class?

Thanks!

Gabe

Martin v. Loewis

Apr 19, 2010, 2:52:26 AM
> Do I need to do something special to get repr to work strictly
> with unicode?

Yes, you need to switch to Python 3 :-)

> Or should __repr__ *always* return bytes rather than unicode?

In Python 2.x: yes.

> What about __str__ ?

Likewise.

> If both of these are supposed to return bytes,
> then what method should I use to define the unicode representation
> for instances of a class?

__unicode__.
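
A minimal sketch of that split, assuming a hypothetical Entry class that stands in for the type of CURRENT_ENTRY (the constructor, the field contents and the SEP value are made up for illustration). __unicode__ carries the full text, while __repr__ returns an ASCII-only native string with everything else escaped. It is written so that it also runs on 3.x, where __unicode__ is just an ordinary method:

```python
SEP = u'|'  # hypothetical separator; the original value is not shown


class Entry(object):
    """Hypothetical stand-in for the class behind CURRENT_ENTRY."""

    def __init__(self, k, r, s):
        self.k, self.r, self.s = k, r, s

    def __unicode__(self):
        # Full text representation; unicode(entry) calls this on Python 2.x.
        return u'\t'.join(SEP.join(part) for part in (self.k, self.r, self.s))

    def __repr__(self):
        # Escape everything outside printable ASCII, then return a native
        # str: a byte string on 2.x, a text string on 3.x.
        escaped = self.__unicode__().encode('unicode_escape')
        return str(escaped.decode('ascii'))


entry = Entry([u'\u3003'], [u'a', u'b'], [u'c'])
text = entry.__unicode__()   # contains the real U+3003 character
shown = repr(entry)          # ASCII only, so no UnicodeEncodeError
```

Escaping is only one possible choice here; encoding to UTF-8 inside __repr__ would also yield bytes on 2.x, at the cost of a repr that is not pure ASCII.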

HTH,
Martin

gb345

Apr 19, 2010, 1:08:19 PM

>In Python 2.x: yes.

>> What about __str__ ?

>Likewise.

>__unicode__.

Thanks!

Dave Angel

Apr 19, 2010, 10:41:25 PM
to gb345, pytho...@python.org
More precisely, __str__() and __repr__() return str objects. A str is
an 8-bit byte string on Python 2.x and a Unicode string on 3.x. If you
need unicode on 2.x, use __unicode__().
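
To make the failure concrete: when __repr__ returns a unicode object on 2.x, repr() converts it to a byte string with the default codec, which is ASCII unless the installation has changed it. The sketch below performs that encoding step explicitly and hits the same kind of error. The sample string is made up, apart from the U+3003 character taken from the traceback:

```python
# Sample value; it starts with the character named in the traceback.
ret = u'\u3003\tsome\ttext'

try:
    # Roughly the conversion repr() applies to a unicode result on 2.x.
    ret.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc  # 'ascii' codec can't encode the character at position 0

# Encoding explicitly with a codec that covers the character succeeds.
as_bytes = ret.encode('utf-8')
```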

DaveA
