Unicode Error - model_regress

Tarun Pasrija

Jun 11, 2009, 11:01:32 AM
to Django developers
In the regression suite, I ran the model_regress tests and they fail
with the following error:

Database - MySQL
Platform - Windows

BrokenUnicodeMethod.objects.all()
Expected:
[<BrokenUnicodeMethod: [Bad Unicode data]>]
Got:
[<BrokenUnicodeMethod: Názov: Jerry>]

On Linux this test case runs fine.

The following code breaks in the test case:-

class BrokenUnicodeMethod(models.Model):
    name = models.CharField(max_length=7)
    def __unicode__(self):
        return 'Názov: %s' % self.name

# Models with broken unicode methods should still have a printable repr
>>> b = BrokenUnicodeMethod(name="Jerry")
>>> b.save()
>>> BrokenUnicodeMethod.objects.all()
[<BrokenUnicodeMethod: [Bad Unicode data]>]

The test case puts non-ASCII data into a plain string type, which
should raise an exception in __repr__ in models/base.py.

On Linux the exception is raised and the test passes, but on Windows
the test case fails with the above output. Is anyone else seeing the
same problem with Unicode, or am I missing something here?

Thanks in advance for the help.

Karen Tracey

Jun 11, 2009, 12:03:11 PM
to django-d...@googlegroups.com
You've apparently got a Python on Windows that is configured with a default encoding other than ascii.  Do you perhaps have a sitecustomize.py file in your Python's Lib directory that calls sys.setdefaultencoding with 'latin1' or 'iso8859-1'?  That would lead to the result you are seeing; I don't know if there is any other way to change the Python default encoding.
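
A quick way to check, as a minimal sketch: ask the interpreter which default encoding it is using. The variable name default_encoding is just for illustration.

```python
import sys

# On an unmodified Python 2 install this prints 'ascii'. If a sitecustomize.py
# override is in effect, it prints whatever encoding that file selected.
# (Python 3 always reports 'utf-8' and has no setdefaultencoding at all.)
default_encoding = sys.getdefaultencoding()
print(default_encoding)
```

If this prints anything other than ascii under Python 2, something on the install has changed the default.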

Technically that test case could be changed to not fail for this setup -- what it is really trying to test is that repr of a model with a broken unicode method does not throw an unhelpful exception. However, I'm not sure changing the test case to not care about the specifics of what is printed is a good idea.
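
To illustrate the behaviour being tested, here is a simplified stand-in written without Django. It is not the actual code in models/base.py; the plain class, the text() method and the way the failure is simulated are all assumptions made for the sketch. The idea is only that repr catches the Unicode error and substitutes a placeholder.

```python
class BrokenUnicodeMethod(object):
    """Hypothetical stand-in for the model; no Django required."""

    def __init__(self, name):
        self.name = name

    def text(self):
        # Simulates the broken __unicode__ method: the UTF-8 bytes of
        # u'N\u00e1zov: ' are forced through the ascii codec, which fails.
        prefix = u'N\u00e1zov: '.encode('utf-8').decode('ascii')
        return prefix + self.name

    def __repr__(self):
        # Fall back to a placeholder instead of letting the error escape.
        try:
            label = self.text()
        except (UnicodeEncodeError, UnicodeDecodeError):
            label = '[Bad Unicode data]'
        return '<%s: %s>' % (self.__class__.__name__, label)


result = repr(BrokenUnicodeMethod('Jerry'))
print(result)
```

With an ascii default the decode fails and the placeholder appears, which is what the test expects to see.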

Running Python with a default encoding of something other than ascii is likely to hide errors.  In this case you can see that this has happened: instead of getting an exception (which is what triggers insertion of 'Bad Unicode data' into the object's repr), incorrect data was produced.  Názov is not right; it should be Názov.  The incorrect data is the result of Python assuming an incorrect encoding (ascii is not correct either, but it has the benefit of failing hard so that you know it was the wrong choice).
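
The difference between failing hard and silently producing wrong data can be reproduced outside Django. This is a small sketch, assuming the source file stores the string as UTF-8 bytes; the variable names are made up for the example.

```python
# The UTF-8 bytes for the string the test's source file contains.
raw = u'N\u00e1zov: Jerry'.encode('utf-8')

# ascii cannot decode the two-byte sequence for the accented letter,
# so the mistake is reported immediately.
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# latin-1 accepts every byte value, so it "succeeds" but turns the
# two UTF-8 bytes into two separate, wrong characters.
wrong = raw.decode('latin-1')

# Only the codec the bytes were written with gives back the real text.
right = raw.decode('utf-8')

print(ascii_failed)
print(wrong == u'N\u00c3\u00a1zov: Jerry')
print(right == u'N\u00e1zov: Jerry')
```

All three lines print True: ascii raises, latin-1 yields the garbled text from the failing test output, and utf-8 recovers the intended string.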

So I'd recommend changing your Python install to not have this default encoding override, particularly if you are writing code that is intended to be portable to other machines.

Karen