Euro sign raises UnicodeEncodeError


Christian Schmidt

Jan 25, 2007, 1:10:46 PM
to Django users
Hi,

I have a problem writing non-ASCII characters to a MySQL table.

I get a UnicodeEncodeError when a user enters the Euro sign (€) in a
newforms text field. The exception is raised when Django tries to write
to the MySQL database.

Do I have to encode (or decode) the string before I put it into the
database, or do I have to change DEFAULT_CHARSET?

This is the Traceback:

UnicodeEncodeError at /mail/schreiben/2/
'latin-1' codec can't encode character u'\u20ac' in position 0: ordinal not in range(256)
Request Method: POST
Request URL: http://127.0.0.1:8000/mail/schreiben/2/
Exception Type: UnicodeEncodeError
Exception Value: 'latin-1' codec can't encode character u'\u20ac' in position 0: ordinal not in range(256)
Exception Location: /usr/lib/python2.4/site-packages/MySQLdb/connections.py in unicode_literal, line 179

/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/core/handlers/base.py in get_response
/home/christian/sail/saildj/../saildj/mailing/views.py in writemail
/home/christian/sail/saildj/../saildj/mailing/models.py in senden
/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/db/models/base.py in save
/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/db/backends/util.py in execute
/usr/lib/python2.4/site-packages/Django-0.95-py2.4.egg/django/db/backends/mysql/base.py in execute
/usr/lib/python2.4/site-packages/MySQLdb/cursors.py in execute
/usr/lib/python2.4/site-packages/MySQLdb/connections.py in literal
/usr/lib/python2.4/site-packages/MySQLdb/connections.py in unicode_literal

The POST data from the user was the Euro sign:

POST:

Variable   Value
betreff    'testmail'
thread_id  ''
text       '\xe2\x82\xac'
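
For reference, the three bytes shown for `text` are the UTF-8 encoding of the Euro sign. A minimal sketch of that check follows; it is written for Python 3 rather than the Python 2.4 used in this thread, and the variable names are only illustrative.

```python
# Python 3 sketch; the thread itself ran on Python 2.4.
post_bytes = b"\xe2\x82\xac"  # raw value of the `text` field as posted

# Decoding as UTF-8 turns the three bytes into one character.
decoded = post_bytes.decode("utf-8")

# Format the character's code point in the usual U+XXXX notation.
code_point = "U+%04X" % ord(decoded)

print(len(post_bytes), len(decoded), code_point)
```

So the browser sent valid UTF-8; the failure happens later, when the decoded character is encoded again for the database.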

I hope you can help me...

Thanks,

Christian.

ak

Jan 26, 2007, 12:37:13 AM
to Django users
Same thing with national characters ...
DEFAULT_CHARSET = 'utf-8'
I think this is a bug somewhere in newforms

Christian Schmidt

Jan 26, 2007, 1:12:54 AM
to Django users
On 26 Jan., 06:37, "ak" <a...@khalikov.ru> wrote:
> Same thing with national characters ...
> DEFAULT_CHARSET = 'utf-8'

This doesn't solve the problem. The error message still appears with the
same traceback.
Any other ideas?

oggie rob

Jan 26, 2007, 1:40:55 AM
to Django users
Essentially ak pointed out the error. The codec expects an 8-bit
encoding, and the Euro sign's code point does not fit in 8 bits (i.e.
U+20AC is greater than 0xFF). Just keep poking around (particularly from
the shell and by looking at the db) and you'll probably find it.

-rob
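
The size argument above can be tried from the shell. This is a small sketch in Python 3 syntax (the thread used Python 2.4, where the literal would be `u'\u20ac'`); the variable names are made up for illustration.

```python
euro = "\u20ac"

# latin-1 maps code points 0..255 directly onto single bytes,
# so anything above 0xFF has no latin-1 representation.
fits_in_one_byte = ord(euro) <= 0xFF

# A character inside that range encodes without trouble.
e_acute_bytes = "\u00e9".encode("latin-1")

# The Euro sign does not, and the codec reports why.
try:
    euro.encode("latin-1")
    failure = None
except UnicodeEncodeError as exc:
    failure = exc.reason

print(fits_in_one_byte, e_acute_bytes, failure)
```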

Ivan Sagalaev

Jan 26, 2007, 3:24:55 AM
to django...@googlegroups.com
Christian Schmidt wrote:
> Do I have to encode (or decode) the string before I put it into the
> database

First, it depends on the database backend. As far as I know, MySQL wants
byte strings (whereas psycopg2, for example, lives happily with unicode
strings). If you use newforms then your data is in unicode, and you
should encode it before putting it into the db. If you don't do this
explicitly, Python will use whatever default encoding is set for this in
your environment. It's latin-1 in your case, and latin-1 can't encode
"€", which is why you get an error.

The next question is which charset to use for encoding. This depends on
the setup of your database. If it's configured to store data in utf-8,
which can encode all unicode characters (meaning in practice just all
characters), then you just encode your unicode strings into utf-8 and all
will work just fine. If the database is configured in some 'old school'
way using a legacy charset, then things get more complicated (can this
charset also store "€"? can this database accept utf-8 even if it
doesn't recognize it as such? do you need sorting? will anyone else use
this database with other client software?).
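
Whether a given charset can store "€" can be checked directly in Python. The sketch below uses Python 3 syntax and a hypothetical helper, `can_encode`, which is not part of Django or MySQLdb. It only tests Python's own codecs, so it says nothing about how a particular MySQL server is configured.

```python
def can_encode(text, charset):
    """Hypothetical helper: True if `text` is representable in `charset`."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True

# Compare a few charsets that commonly show up in legacy setups.
charsets = ("latin-1", "iso8859-15", "cp1252", "utf-8")
results = dict((name, can_encode("\u20ac", name)) for name in charsets)

for name in charsets:
    print(name, results[name])
```

Of these four, only latin-1 has no slot for the Euro sign.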

To summarize: your storage (a database) and your input/output (the web)
really should use utf-8 to avoid problems with "strange" characters. If
you deal internally with unicode (which newforms produces for you) then
for now you should explicitly encode from it to utf-8 until Django
starts doing it automatically.
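
A minimal sketch of that explicit encoding step, assuming the database really is set up for utf-8. It is written for Python 3; in Python 2 the equivalent is calling `.encode('utf-8')` on a `unicode` object. The helpers `to_db_bytes` and `from_db_bytes` are hypothetical names, not Django or MySQLdb functions.

```python
def to_db_bytes(value, charset="utf-8"):
    """Hypothetical helper: encode text for a backend that wants byte strings."""
    if isinstance(value, str):
        return value.encode(charset)
    return value  # already bytes, or not text at all: pass through unchanged

def from_db_bytes(value, charset="utf-8"):
    """Hypothetical helper: decode bytes read back from the database."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value

# Round trip of a string containing the Euro sign.
stored = to_db_bytes("Preis: 5 \u20ac")
restored = from_db_bytes(stored)
print(stored, restored)
```

Doing the conversion in one place, right at the database boundary, keeps the rest of the code working with unicode only.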

ak

Jan 26, 2007, 4:25:31 AM
to Django users
Guys, take a look here: http://code.djangoproject.com/ticket/3370
It seems to be the solution, but it needs to be tested in all environments.

Michael Radziej

Jan 26, 2007, 5:16:52 AM
to django...@googlegroups.com
Hi,

We have a bit of chaos here ... Tickets #3370, #1356 and probably #952
are all about this problem, all are accepted, and #3370 and #1356
have very similar patches. I ask everybody to continue the discussion on
django-developers ("unicode issues in multiple tickets"), and I ask
the authors of these three tickets to work together to find out how
to proceed.

http://groups.google.com/group/django-developers/browse_thread/thread/4b71be8257d42faf

Michael

--
noris network AG - Deutschherrnstraße 15-19 - D-90429 Nürnberg -
Tel +49-911-9352-0 - Fax +49-911-9352-100

http://www.noris.de - The IT-Outsourcing Company
