UnicodeEncodeError: SOLVED


Walter Hurry

Oct 9, 2013, 10:41:53 AM
Many thanks to those prepared to forgive my transgression in the
'Goodbye' thread. I mentioned there that I was puzzled by a
UnicodeEncodeError, and said I would raise it as a separate thread.

However, via this link, I was able to resolve the issue myself:

http://stackoverflow.com/questions/3224268/python-unicode-encode-error

Nevertheless, thanks again for the kind words.

Steven D'Aprano

Oct 9, 2013, 9:47:52 PM
On Wed, 09 Oct 2013 14:41:53 +0000, Walter Hurry wrote:

> Many thanks to those prepared to forgive my transgression in the
> 'Goodbye' thread. I mentioned there that I was puzzled by a
> UnicodeEncodeError, and said I would raise it as a separate thread.
>
> However, via this link, I was able to resolve the issue myself:
>
> http://stackoverflow.com/questions/3224268/python-unicode-encode-error

I don't know what problem you had or what your solution was, but the
above link doesn't solve the problem. It just throws away data until the
problem no longer appears, never mind whether that changes the semantics
of the XML data.

Instead of throwing away data, the right solution is likely to be to stop
trying to deal with XML yourself and use a proper UTF-8 compliant XML
library.
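
As a rough illustration of the XML-library route, here is a minimal sketch using the standard library's xml.etree.ElementTree. The sample document and the element names are made up for the example; they are not taken from the original problem.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: UTF-8 bytes that declare their own encoding.
xml_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?><note><to>Zoë</to></note>'
).encode("utf-8")

# Handing the parser bytes lets it decode according to the XML declaration,
# so no manual encode/decode step is needed.
root = ET.fromstring(xml_bytes)
recipient = root.find("to").text

# Serialising back to UTF-8 bytes keeps the non-ASCII character intact.
round_tripped = ET.tostring(root, encoding="utf-8")
```

The point of the sketch is that the text comes back as a normal Unicode string and nothing has to be discarded along the way.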

Or if you can't do that, at least open and read the XML file using UTF-8
in the first place. In Python 3, you can pass an encoding to open (for
example, encoding='utf-8'). In Python 2, you can use codecs.open instead
of the built-in open.
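
A minimal sketch of reading a file as UTF-8 in Python 3, assuming the file really is UTF-8. The file name and contents are invented for the example, and the file is written first only so the snippet is self-contained.

```python
import os
import tempfile

# Hypothetical content containing non-ASCII characters.
text = "caf\u00e9 costs \u20ac5"
path = os.path.join(tempfile.mkdtemp(), "sample.xml")

# Write the file with an explicit encoding rather than the platform default.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read it back the same way; the result is an ordinary str.
# In Python 2, codecs.open(path, "r", encoding="utf-8") plays the same role.
with open(path, "r", encoding="utf-8") as f:
    loaded = f.read()
```

Being explicit about the encoding on both sides avoids depending on whatever the platform default happens to be.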


--
Steven

Walter Hurry

Oct 10, 2013, 5:10:17 PM
On Thu, 10 Oct 2013 01:47:52 +0000, Steven D'Aprano wrote:

> On Wed, 09 Oct 2013 14:41:53 +0000, Walter Hurry wrote:
>
>> Many thanks to those prepared to forgive my transgression in the
>> 'Goodbye' thread. I mentioned there that I was puzzled by a
>> UnicodeEncodeError, and said I would raise it as a separate thread.
>>
>> However, via this link, I was able to resolve the issue myself:
>>
>> http://stackoverflow.com/questions/3224268/python-unicode-encode-error
>
> I don't know what problem you had or what your solution was, but the
> above link doesn't solve the problem. It just throws away data until the
> problem no longer appears, never mind whether that changes the semantics
> of the XML data.
>
> Instead of throwing away data, the right solution is likely to be to stop
> trying to deal with XML yourself and use a proper UTF-8 compliant XML
> library.
>
> Or if you can't do that, at least open and read the XML file using UTF-8
> in the first place. In Python 3, you can pass an encoding to open (for
> example, encoding='utf-8'). In Python 2, you can use codecs.open instead
> of the built-in open.

All true, but in *this* case, simply discarding the offending character
was sufficient. Thanks anyway.
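
For anyone finding this thread later, a minimal sketch of what discarding a character at encode time can look like, next to two alternatives that lose less. The sample string is made up, and this is not necessarily the exact code from the linked answer.

```python
# Hypothetical text with one character that ASCII cannot represent.
text = "price: \u20ac5"

# Encoding strictly to ASCII is what raises UnicodeEncodeError.
try:
    text.encode("ascii")
    raised = False
except UnicodeEncodeError:
    raised = True

# errors="ignore" silently drops the offending character.
dropped = text.encode("ascii", errors="ignore")

# errors="replace" substitutes "?" so the loss is at least visible.
replaced = text.encode("ascii", errors="replace")

# errors="xmlcharrefreplace" keeps the information as an XML character
# reference, which matters if the output is XML or HTML.
escaped = text.encode("ascii", errors="xmlcharrefreplace")
```

Here dropped is b"price: 5", replaced is b"price: ?5", and escaped is b"price: &#8364;5".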
