UnicodeEncodeError


jean polo

Sep 29, 2010, 12:59:06 PM
to Django users
Hi.
I get a 'UnicodeEncodeError' if I upload a file (ImageField) with non-ASCII
characters in its name in my application (django-1.2.1).

I added:

export LANG='en_US.UTF-8'
export LC_ALL='en_US.UTF-8'

in my /etc/apache2/envvars as stated here:
http://docs.djangoproject.com/en/dev/howto/deployment/modpython/#if-you-get-a-unicodeencodeerror

but I still have the same error (after restarting apache).
Any hint much appreciated.

cheers,
_y

ps:

Traceback (most recent call last):
[snip]
 File "/usr/languages/python/2.6/lib/python2.6/genericpath.py", line 18, in exists
   st = os.stat(path)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 53: ordinal not in range(128)

werefr0g

Sep 29, 2010, 1:05:28 PM
to django...@googlegroups.com
Hi,

You should check that your file is actually utf-8 encoded and add the following right after the shebang:
# -*- coding: utf-8 -*-

Steve Holden

Sep 29, 2010, 1:39:24 PM
to django...@googlegroups.com
It sounds to me as though the image is being transmitted with the wrong
MIME type. Image files are binary data, but something in your
application is treating it as a string.

regards
Steve


--
DjangoCon US 2010 September 7-9 http://djangocon.us/

jean polo

Sep 29, 2010, 1:41:58 PM
to Django users
sorry if I was not clear but I don't get you.
this happens only in the admin when uploading a file (say
'file_é.jpg') for an object.
saving the modifications gives the UnicodeEncodeError..

Modifying the same object (still in the admin) and uploading a
standard image file (whose name contains only ASCII chars) gives no
errors.

cheers,
_y



jean polo

Sep 29, 2010, 2:01:20 PM
to Django users
hi Steve

do you have any advice on where to look for the cause of this?

I have a basic 'Bien' class and a *very basic* 'Image' class (with a
ForeignKey to Bien).
BienAdmin has an ImageInline and that's all.

I am a bit confused..

cheers,
_y




Petite Abeille

Sep 29, 2010, 2:06:24 PM
to django...@googlegroups.com

On Sep 29, 2010, at 8:01 PM, jean polo wrote:

> I have a basic 'Bien' class and a *very basic* 'Image' class (with a
> ForeignKey to Bien).
> BienAdmin has a ImageInline and that's all.

Not even tangentially related, but...

"Do people in non-English-speaking countries code in English?"
http://programmers.stackexchange.com/questions/1483/do-people-in-non-english-speaking-countries-code-in-english
http://www.reddit.com/r/programming/comments/dg5jo/do_people_in_nonenglishspeaking_countries_code_in/

Steve Holden

Sep 29, 2010, 2:15:25 PM
to django...@googlegroups.com
It might be helpful to provide rather more of the traceback information.

Also, check your database encoding. Somehow you are requiring Django to
convert a Unicode string into an ASCII string.

regards
Steve

jean polo

Sep 29, 2010, 2:30:05 PM
to Django users
ok,
here is the Traceback (and my DB collation is utf8_general_ci):

Traceback (most recent call last):

 File "/usr/local/alwaysdata/python/django/1.2.1/django/core/handlers/base.py", line 100, in get_response
   response = callback(request, *callback_args, **callback_kwargs)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/contrib/admin/options.py", line 239, in wrapper
   return self.admin_site.admin_view(view)(*args, **kwargs)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/utils/decorators.py", line 76, in _wrapped_view
   response = view_func(request, *args, **kwargs)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/views/decorators/cache.py", line 69, in _wrapped_view_func
   response = view_func(request, *args, **kwargs)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/contrib/admin/sites.py", line 190, in inner
   return view(request, *args, **kwargs)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/utils/decorators.py", line 21, in _wrapper
   return decorator(bound_func)(*args, **kwargs)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/utils/decorators.py", line 76, in _wrapped_view
   response = view_func(request, *args, **kwargs)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/utils/decorators.py", line 17, in bound_func
   return func(self, *args2, **kwargs2)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/db/transaction.py", line 299, in _commit_on_success
   res = func(*args, **kw)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/contrib/admin/options.py", line 798, in add_view
   self.save_formset(request, form, formset, change=False)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/contrib/admin/options.py", line 603, in save_formset
   formset.save()

 File "/usr/local/alwaysdata/python/django/1.2.1/django/forms/models.py", line 487, in save
   return self.save_existing_objects(commit) + self.save_new_objects(commit)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/forms/models.py", line 625, in save_new_objects
   self.new_objects.append(self.save_new(form, commit=commit))

 File "/usr/local/alwaysdata/python/django/1.2.1/django/forms/models.py", line 737, in save_new
   obj.save()

 File "/usr/local/alwaysdata/python/django/1.2.1/django/db/models/base.py", line 435, in save
   self.save_base(using=using, force_insert=force_insert, force_update=force_update)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/db/models/base.py", line 518, in save_base
   for f in meta.local_fields if not isinstance(f, AutoField)]

 File "/usr/local/alwaysdata/python/django/1.2.1/django/db/models/fields/files.py", line 255, in pre_save
   file.save(file.name, file, save=False)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/db/models/fields/files.py", line 92, in save
   self.name = self.storage.save(name, content)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/core/files/storage.py", line 47, in save
   name = self.get_available_name(name)

 File "/usr/local/alwaysdata/python/django/1.2.1/django/core/files/storage.py", line 73, in get_available_name
   while self.exists(name):

 File "/usr/local/alwaysdata/python/django/1.2.1/django/core/files/storage.py", line 196, in exists
   return os.path.exists(self.path(name))

 File "/usr/languages/python/2.6/lib/python2.6/genericpath.py", line 18, in exists
   st = os.stat(path)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 53: ordinal not in range(128)


<WSGIRequest
...
META:{'CONTENT_LENGTH': '176634',
'CONTENT_TYPE': 'multipart/form-data; boundary=---------------------------21724139663430',
...
'GATEWAY_INTERFACE': 'CGI/1.1',
'HTTP_ACCEPT': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'HTTP_ACCEPT_CHARSET': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'HTTP_ACCEPT_ENCODING': 'gzip,deflate',
'HTTP_ACCEPT_LANGUAGE': 'fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3',
'HTTP_CONNECTION': 'close',
...
'HTTP_USER_AGENT': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10',
...
'REDIRECT_STATUS': '200',
...
'REQUEST_METHOD': 'POST',
'REQUEST_URI': '/admin/biens/bien/add/',
'SCRIPT_FILENAME': '/.../django.fcgi',
'SCRIPT_NAME': u'',
...
'SERVER_ADMIN': '[no address given]',
...
'SERVER_PORT': '80',
'SERVER_PROTOCOL': 'HTTP/1.1',
'SERVER_SIGNATURE': '',
'SERVER_SOFTWARE': 'Apache/2.2',
'TZ': '/etc/localtime',
'wsgi.errors': <flup.server.fcgi_base.OutputStream object at 0x3309590>,
'wsgi.input': <flup.server.fcgi_base.InputStream object at 0x33093d0>,
'wsgi.multiprocess': False,
'wsgi.multithread': True,
'wsgi.run_once': False,
'wsgi.url_scheme': 'http',
'wsgi.version': (1, 0)}>

thanks,
_y

werefr0g

Sep 29, 2010, 3:42:56 PM
to django...@googlegroups.com
Isn't the filename a string?

They may not be the solution, but I think you should at least try them, since they are very quick to apply. Because the traceback involves the os module, I thought the string must get encoded by the os module before Django handles it.

I found that the steps I described earlier, together with explicitly setting the encoding with the "codecs" module when working with text files, got rid of the problems I encountered with unicode. The fact is that assumptions about the encoding are made at different levels while the code executes, and using utf-8 doesn't make your application work with unicode objects at all levels. Whenever those problems arise, one of these three solutions has always fixed them in my very short Python existence.

Regards

jean polo

Sep 29, 2010, 4:26:50 PM
to Django users
On Sep 29, 9:42 pm, werefr0g <weref...@yahoo.fr> wrote:
>   Isn't the filename a string?
>
> It may not be the solution but I think you should try them at least
> since they are very quick to apply. As I saw something implying os
> module I thought that before Django handles the string, it must encode
> by os module.

thanks for your help werefr0g but this happens in a normal admin page
(i.e. not a customized one).
I do have # -*- coding: utf-8 -*- in all my files though...

> I found that using previously written steps and explicitly set encoding
> with "codecs" module while working with text files got rid of my
> problems encontered with unicode. The fact is that at different levels,
> there are assumptions on the encoding used while executing it and using
> utf-8 doesn't make you application work with unicode objects at all
> levels. When those problem arise, one of the three solutions always end
> it in my very short python existence.
>
> Regards

what do you mean ? which three solutions ?
(sorry if I miss the obvious here)

cheers,
_y

werefr0g

Sep 29, 2010, 4:38:23 PM
to django...@googlegroups.com
Jean,

Sorry, the three points are:

- the # -*- coding: utf-8 -*- line
- checking that the module's file is actually utf-8 encoded
- using the codecs module for file-like read/write operations.
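
The third point could be sketched as follows. This is only a minimal illustration, written for Python 3 syntax (codecs.open also exists in the Python 2.6 used in this thread); write_text, read_text and the file name notes.txt are made up for the example. Note that it deals with file contents, not with file names.

```python
import codecs
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    # Encode explicitly instead of relying on the interpreter default.
    with codecs.open(path, "w", encoding=encoding) as handle:
        handle.write(text)


def read_text(path, encoding="utf-8"):
    # Decode with the same encoding that was used for writing.
    with codecs.open(path, "r", encoding=encoding) as handle:
        return handle.read()


# Hypothetical file in a temporary directory, used only for this demo.
demo_path = os.path.join(tempfile.mkdtemp(), "notes.txt")
write_text(demo_path, u"fran\xe7ais")
round_trip = read_text(demo_path)
```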

jean polo

Sep 29, 2010, 5:14:44 PM
to Django users
Well, not sure I have the skills to check these (except the first
one).

As all this happens in the admin code, I have no clue about how (or
where) to look/change something.
Any file whose name contains an accent (yeah, French customer: éèê or ç)
triggers the error.

Sorry but I'm still new to django/python.

cheers,
_y

werefr0g

Sep 29, 2010, 5:24:13 PM
to django...@googlegroups.com
Can you send me your file? I'll check its encoding. Also, which OS are you using?

jean polo

Sep 29, 2010, 6:06:41 PM
to Django users
On Sep 29, 11:24 pm, werefr0g <weref...@yahoo.fr> wrote:
> Can you send me your file? I'll check its encoding.

done

> Also, which OS are you using?

I use ubuntu 9.10 (but problem is the same with osx or windows)


Karen Tracey

Sep 30, 2010, 8:20:58 AM
to django...@googlegroups.com
Some servers do not have the necessary language files to allow successfully setting the locale to one that supports utf-8 encoding. See the very last sentence here: http://code.djangoproject.com/wiki/ExpectedTestFailures

You should be able to experiment with setting these variables in a shell session and passing unicode strings containing non-ASCII characters to file system routines like stat. If it works in a shell, then likely you've got the necessary language support installed, and the problem then is that the Apache configuration for some reason is not taking effect. If you cannot get it to work in a shell either, then likely you are missing a language pack that would allow successfully setting locale in this way.
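
The shell experiment described above could be sketched as follows. This is only an illustration, written for Python 3 syntax (the same calls exist on the Python 2.6 used in this thread, where a failure would show up as the UnicodeEncodeError from the traceback); can_stat_non_ascii and the file name are hypothetical.

```python
import os
import sys
import tempfile


def can_stat_non_ascii(directory):
    # Create, then stat, a file whose name contains a non-ASCII character.
    path = os.path.join(directory, u"file_\xe9.jpg")
    try:
        with open(path, "wb"):
            pass
        os.stat(path)
        return True
    except (UnicodeError, OSError):
        # The name could not be encoded, or the file system refused it.
        return False


print("file system encoding: %s" % sys.getfilesystemencoding())
result = can_stat_non_ascii(tempfile.mkdtemp())
print("stat on a non-ASCII name works: %s" % result)
```

Running it once with LANG and LC_ALL exported and once without them should show whether the locale settings make a difference on a given machine.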
 
Karen
--
http://tracey.org/kmt/

jean polo

Sep 30, 2010, 12:47:28 PM
to Django users
ok, thanks to everybody for the help but unfortunately nothing works
for my issue.
(except Karen's suggestion, which solves it on one of my local machines but not on
the other, which runs the same Linux system... weird..).

I guess I'll ask my client to rename their files without any special
char....
not the best solution but a good habit anyway =)
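
For reference, the renaming could also be done in code rather than by the client. The following is a minimal sketch under assumptions: ascii_filename is a hypothetical helper, written for Python 3 syntax, and nothing in this thread confirms it was used. In Django it could be called from an upload_to callable on the ImageField.

```python
import unicodedata


def ascii_filename(name):
    # Split accented letters into base letter + combining mark (NFKD),
    # then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


cleaned = ascii_filename(u"file_\xe9.jpg")
```

Characters that have no ASCII base letter are simply dropped by this approach, so it suits accented Latin names better than other scripts.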

cheers,
_y



werefr0g

Sep 29, 2010, 5:32:22 PM
to django...@googlegroups.com
Sorry, sorry... I asked Jean to send me the actual file so I could check its encoding. I also asked for his OS, to see which editors are available and how they let you "transcode" the text.

Karen Tracey

Oct 1, 2010, 12:11:24 AM
to django...@googlegroups.com
On Wed, Sep 29, 2010 at 5:32 PM, werefr0g <were...@yahoo.fr> wrote:
> Sorry, sorry... I asked Jean to send me the actual file so I could check its encoding. I also asked for his OS, to see which editors are available and how they let you "transcode" the text.

Note that the contents of the file are irrelevant to this problem: it is the file's name that is causing the problem. The last bit of the traceback is:

 File "/usr/languages/python/2.6/lib/python2.6/genericpath.py", line 18, in exists
   st = os.stat(path)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 53: ordinal not in range(128)

"path" here includes the file name; no file data is involved. Django is passing a unicode string "path" to the os.stat() function. On many operating systems, Python must actually pass a bytestring, not unicode, to the underlying OS routine that implements "stat". Therefore Python must convert the unicode string to a bytestring using some encoding. The encoding it uses is whatever is returned by sys.getfilesystemencoding(): http://docs.python.org/library/sys.html#sys.getfilesystemencoding. As noted in that documentation, on Unix the encoding will be:

... the user’s preference according to the result of nl_langinfo(CODESET), or None if the nl_langinfo(CODESET) failed.

That's a pretty obscure description but it boils down to the encoding associated with the currently set locale. (And on some systems successfully setting a locale with an encoding like utf-8 requires installing some "extra" language packs.) So the key to fixing this problem is to ensure the locale of the running server is successfully set to one with an encoding such as utf-8, which supports (can encode) the full range of unicode values. Unfortunately the details of setting locales differ from machine to machine, so it is hard to give specific instructions here.
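
The conversion step described above can be imitated by hand. This is a rough sketch, not what Python does internally line for line: encode_path and the upload path are hypothetical, and the code is written so that it also runs on Python 3.

```python
def encode_path(path, encoding):
    # Mimic the conversion described above: text path -> byte string.
    return path.encode(encoding)


path = u"/media/images/gar\xe7on.jpg"  # hypothetical upload path

utf8_bytes = encode_path(path, "utf-8")  # utf-8 can encode any character

try:
    encode_path(path, "ascii")
    ascii_failed = False
except UnicodeEncodeError as error:
    # Same error class as in the traceback; error.start is the position
    # of the first character that cannot be encoded.
    ascii_failed = True
    failing_position = error.start
```

The position reported in the traceback (53) plays the same role as failing_position here: it is the index of the accented character within the full path.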

Karen
--
http://tracey.org/kmt/

Klaas van Schelven

Oct 1, 2010, 9:09:19 AM
to Django users
Hi all,

I just ran into the same problem. Locally it doesn't occur, but it
does on the server.

I share Karen's analysis that the variable path of type unicode cannot
be encoded into ascii.
However, sys.getfilesystemencoding is also "UTF-8", so I don't see why
os.stat would try to encode using ascii anyway.
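
One way to narrow this down is to read the values from inside the running server process rather than from an interactive shell, since the two may start with different environments (that this is the cause here is an assumption; the thread does not confirm it). A minimal sketch, with encoding_report as a hypothetical helper that could be called from a view or logged at startup:

```python
import locale
import os
import sys


def encoding_report():
    # Collect the values that decide how file names get encoded.
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "default_encoding": sys.getdefaultencoding(),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(),
    }


report = encoding_report()
for key in sorted(report):
    print("%s = %r" % (key, report[key]))
```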

Klaas

Benedict Verheyen

Oct 1, 2010, 9:42:32 AM
to django...@googlegroups.com

I had a similar problem in a python script. Not necessary to rename files :)
The problem is that when you print a debug statement or write to a file, you
need to specify the correct encoding.
I'll add a snippet of the logger class that i'm using so you'll have an idea.
Also, play with the shell and see if you can reproduce and solve the problem there.

# message, to_screen, logfile and isunicode() are defined elsewhere in the logger class
try:
    # If the message is unicode, convert it to a byte string
    screen_message = message
    file_message = message

    if isunicode(message):
        screen_message = message.encode("cp850")
        file_message = (message + '\n').encode("utf-8")

    # Print to screen
    if to_screen == 1:
        print screen_message

    # Write to file
    f = open(logfile, 'a')
    f.write(file_message)
    f.close()
except Exception, e:
    print "Logmessage: exception %s" % str(e)

I seem to remember that on my Windows machine, the code page in the command screen (cp850)
is different from the code page used in a file (latin1).

Anyway, as you can see, before I print a message to the cmd screen, I encode it.
The same happens when I save the message in the logfile, though there I encode it to a
different code page.
You'll have to do the same with other strings that get printed.
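
A variation on the snippet above, as a sketch only: it is written for Python 3 syntax, encode_for is a hypothetical helper, and the "replace" error handler is an addition that the original snippet does not use. It shows what happens when the target code page lacks a character.

```python
def encode_for(message, encoding):
    # "replace" substitutes "?" for characters the target code page
    # lacks, instead of raising UnicodeEncodeError.
    return message.encode(encoding, "replace")


message = u"t\xe9l\xe9chargement \u2713"  # the check mark is not in cp850
screen_bytes = encode_for(message, "cp850")
file_bytes = encode_for(message + u"\n", "utf-8")
```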

This works for me

Regards,
Benedict

roboter

Oct 4, 2010, 11:44:41 PM
to Django users
import sys
ret = sys.getdefaultencoding()
# If ret == 'ascii', modify the setencoding() function in /python/site.py
# and set encoding = "utf-8".

