
If you can retrieve the data from the database and look at it in a hex
editor, and it has the single byte 0xe9 there, then the problem is the data
stored in the db (that is the latin-1 encoding). If you retrieve it that way
and the byte sequence is 0xc3 0xa9, then the data in the database is correct
utf-8, and somewhere in the MySQL Python bindings or SQLAlchemy the data is
going through a series of decode and encode operations that result in it
being encoded as latin-1 instead of utf-8.
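
To make the two byte patterns concrete, here is a minimal sketch. It assumes
the character in question is "é" (inferred from the byte values above), and
hex_bytes is just a hypothetical helper for printing:

```python
# Assumption: the offending character is U+00E9 ("é").
text = u"\u00e9"

latin1_bytes = text.encode("latin-1")  # one byte
utf8_bytes = text.encode("utf-8")      # two bytes


def hex_bytes(data):
    """Format a byte string the way a hex editor would show it."""
    return " ".join("0x%02x" % b for b in bytearray(data))


print(hex_bytes(latin1_bytes))  # 0xe9
print(hex_bytes(utf8_bytes))    # 0xc3 0xa9
```

Whichever of the two outputs matches what the hex editor shows tells you how
the stored data is encoded.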
-Toshio
The problem seems to be the MySQLdb connector: it has problems when
working with unicode. Setting it explicitly to use utf8 helps keep it
from converting to ascii, and then you need to force MySQLdb not to
use its internal unicode functions (they have problems, leaks,
exceptions and whatnot). It will then return a byte string, which
gives Genshi trouble since it is not a regular string, so you'll need
to convert it to unicode (something Genshi can work with) some other
way. SQLAlchemy can do the work for you if you add the option
convert_unicode.
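
As a rough illustration of what that conversion step amounts to (this is
not SQLAlchemy's code, only a sketch of the idea; decode_row and the sample
row are hypothetical):

```python
def decode_row(row, encoding="utf-8"):
    """Return a copy of row with every byte string decoded to text.

    Hypothetical helper: it mimics the effect of asking the ORM layer
    to hand back unicode instead of the driver's raw byte strings.
    """
    return tuple(
        col.decode(encoding) if isinstance(col, bytes) else col
        for col in row
    )


# Assumed sample data: what a driver returning raw utf-8 bytes might yield.
raw_row = (1, b"caf\xc3\xa9")
row = decode_row(raw_row)
```

After this, row[1] is a unicode string that a template engine can render
without guessing the encoding.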
It was a blast drilling through the documentation, mailing list,
tracebacks and code to find this. Bottom line: use another DB with a
proper DBAPI driver, or use another driver for MySQL.
Regards,
Carlos Daniel Ruvalcaba Valenzuela
> I'm not sure what I did wrong.
>
>
> Is the default template from sprox not <meta content="text/html; charset=UTF-8"/> by chance?
>
>
> Any recommendations on how I can resolve my UnicodeDecodeError?
>
>
> Thanks.
>
> --
> Bob Tanner <tan...@real-time.com> | Phone : 952-943-8700
> http://www.real-time.com, Linux, OSX, VMware | Fax : 952-943-8500
> Key fingerprint = F785 DDFC CF94 7CE8 AA87 3A9D 3895 26F1 0DDB E378
Listen, I have been on this same problem. It's not Genshi or anything, it's Python itself. Just find site.py (python/Lib/site.py) and look for
def setencoding():
    """Set the string encoding used by the Unicode implementation. The
    default is 'ascii', but if you're willing to experiment, you can
    change this."""
    encoding = "ascii" # Default value set by _PyUnicode_Init()
See the encoding = "ascii" line? Change it to encoding = "utf-8" and the whole problem is gone.
Yes, the solution is that EASY.
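
For context, that default is the codec Python 2 falls back to whenever it
implicitly converts a byte string to unicode, which is where the
UnicodeDecodeError comes from. A small sketch of the effect, assuming the
data is utf-8 encoded (the sample bytes are made up):

```python
import sys

# Assumed sample: utf-8 bytes for "café".
data = b"caf\xc3\xa9"

# Shows which codec the interpreter uses for implicit conversions.
print(sys.getdefaultencoding())

# Decoding with the ascii codec fails on any byte above 0x7f.
try:
    data.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Naming the codec explicitly works regardless of the default.
text = data.decode("utf-8")
```

Decoding explicitly like this gives the same result without touching
site.py, at the cost of doing it at every place the bytes come in.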