The unicode branch, [1], is now at a point where it is essentially
feature-complete and could do with a bit of heavy testing from the wider
community.
So if you have some applications that work against Django's current
trunk and would like to try them out on the unicode branch, I'd
appreciate your efforts. The porting effort should be very minimal
(almost zero, in many cases).
For code that is only meant to work with ASCII data, there are probably
no changes required at all. For code that is meant to work with all
kinds of input (essentially, arbitrary strings), there are a few quick
porting steps required.
See [2] for the short list (5 steps, maximum!) of changes you might need
to make. For more detailed information, have a read through the
unicode.txt document in the docs/ directory of the branch.
Any bugs you find should be filed in Trac. Put "[unicode]" at the start
of the summary title so that I can search for them later. No need to put
any special keywords or anything like that in (the "version" field
should be set to "other branch", if you remember).
A couple of things to watch out for when you're testing:
(A) Strings that seem to mysteriously disappear, but when you
examine the source, you see something like
"<django.utils.functional.__proxy__ object at 0x2aaaaf87a750>".
These shouldn't be too common and will mostly be restricted to
places like the admin interface that do introspection.
(B) Translations that happen too early. If you have translations
available and use your app in a language that is different from
the LANGUAGE_CODE setting, watch out for any strings that are
translated into LANGUAGE_CODE, instead of your current locale.
This is a sign that ugettext() is being used somewhere that
ugettext_lazy() should be used.
(C) If you're using Python 2.3, look for strings that don't make
much sense when printed. That is a sign that a bytestring is
being used where a unicode string was needed (not your fault;
it's an oversight in Django). Python 2.3 has some
"interesting" (I could use nastier words) behaviour when it
tries to interpolate non-string objects into unicode strings (it
doesn't call the __unicode__ method!!) and we have to work
around them explicitly. I think I've got most of them, but I'll
bet I have overlooked some.
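Point (B) can be illustrated with a toy model, written in modern Python 3 and without Django. Everything here (CATALOG, activate(), translate(), LazyText) is a hypothetical stand-in and not Django's implementation; it only shows why a string translated at import time ends up in the LANGUAGE_CODE language, while a lazy one follows the locale that is active when it is rendered.

```python
# Toy model of eager vs. lazy translation. None of this is Django code:
# CATALOG, activate(), translate() and LazyText are hypothetical stand-ins.
CATALOG = {"en": {"Dienste": "Services"}}
_active = "de"  # plays the role of LANGUAGE_CODE until a locale is activated


def activate(language):
    """Switch the active locale, as a request for another language would."""
    global _active
    _active = language


def translate(message):
    """Eager: look the message up in whatever language is active right now."""
    return CATALOG.get(_active, {}).get(message, message)


class LazyText:
    """Lazy: remember the message and translate only when rendered."""

    def __init__(self, message):
        self.message = message

    def __str__(self):
        return translate(self.message)


# Module-level code runs at import time, before any request sets a locale.
eager_label = translate("Dienste")
lazy_label = LazyText("Dienste")

activate("en")  # a request arrives from an English-speaking user

print(eager_label)      # still in the import-time language
print(str(lazy_label))  # translated into the currently active locale
```

If a label on a page stays in the default language while everything around it follows the user's locale, the eager pattern above is the likely cause.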
Most bugs that people are finding at the moment fit into one of these
categories and they are very easy to fix once we find them. I've tried
to nail most of them in advance, but you can probably imagine how
exciting it is to read every line of source code and try to find all the
strings that are in a precise form that need changing. My attention may
have drifted from time to time.
Have realistic expectations about this branch, too. It is meant to be as
close to 100% backwards-compatible as we can make it. So, for example,
usernames still have to use normal ASCII alphabetic characters, etc.
Similarly, the slugify filter still behaves as it did before. At some
point it will be extended to handle a _few_ more non-ASCII characters,
but it's never going to be a full transliteration function. They are the
two big items I expect people would otherwise try to extend beyond what
is intended. There may be others and I'm sure we'll discover what they
are as the questions pop up.
[1] http://code.djangoproject.com/wiki/UnicodeBranch
[2] http://code.djangoproject.com/wiki/UnicodeBranch#PortingApplicationsTheQuickChecklist
Regards,
Malcolm
I really welcome this branch and thank you all for the effort.
Before I report what follows as a bug, I'd like to ask whether this
branch should let me use non-ASCII letters in tests with test.client.
I tried something like self.client.get(url, dict(name=u'F\xf2')) and
got back an error from urllib:
File "/misc/src/django/branches/unicode/django/test/client.py", line
196, in get
r = {
File "/usr/lib/python2.4/urllib.py", line 1162, in urlencode
v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf2' in
position 1: ordinal not in range(128)
sandro
*:-)
Switched my site today to the branch. Works like a charm (translations,
admin, multilingual content).
Excellent stuff. That's a bug. Python's urllib.quote_plus doesn't handle
unicode characters (with some reasonably good reasons) and calling str()
on anything is not such a hot idea any longer. So Django has its own
django.utils.html.urlquote_plus() function that we can use as a
replacement. Not sure how I overlooked that one, but I'll fix it
shortly.
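The general idea behind such a replacement can be sketched as follows. This is written for modern Python 3 (not the Python 2.4 in the traceback), and quote_plus_utf8 is a hypothetical helper, not Django's urlquote_plus(): the text is encoded to UTF-8 bytes first, and only the bytes are percent-quoted.

```python
from urllib.parse import quote_plus


def quote_plus_utf8(value, encoding="utf-8"):
    """Percent-quote text by encoding it to bytes first (hypothetical helper)."""
    if isinstance(value, str):
        value = value.encode(encoding)
    return quote_plus(value)


# The value from the failing test: 'F' followed by U+00F2.
print(quote_plus_utf8("F\xf2"))
```

The point is that the encoding step is explicit, so nothing falls back to an implicit ASCII conversion.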
Regards,
Malcolm
This should be fixed in [5338].
Regards,
Malcolm
well... you already know it works like a charm!
grazie!
*:-)
Thank you so much for this branch!
> Similarly, the slugify filter still behaves as it did before. At some
> point it will be extended to handle a _few_ more non-ASCII characters,
> but it's never going to be a full transliteration function. They are the
> two big items I expect people would otherwise try to extend beyond what
> is intended. There may be others and I'm sure we'll discover what they
> are as the questions pop up.
Why don't we use the slughifi function:
http://amisphere.com/contrib/python-django/ ?
I already use it to replace the django one and it's very useful (at
least for french titles).
David
If the author wanted to contribute that under a new-BSD license, we
could use something like that, certainly (at least as the Python
replacement; the Javascript enhancement should probably be smaller so
that we don't have to ship so much data around, but that's a minor
issue). There's been a lot of work put into that table of mappings,
which is the bit we can really use.
If you are the author, or you know the author and think they want to
submit it for inclusion, please open a ticket in Trac. We really only
need the table of mappings.
Regards,
Malcolm
I've just emailed the authors to see if they would relicense the file
to include it inside django.
I'll update as soon as I have their replies.
Thanks. We only need the mapping table. We'd want to rewrite most of the
function for stylistic, consistency and correctness reasons anyway.
Regards,
Malcolm
A short disclaimer: I'm currently trying the unicode branch with the autoescape patch and a
couple of other patches, so my problems might really be my own problems,
but I don't expect it.
First, I found that I have a problem with commit 5255 together with the test
client. It breaks loading the modules, probably due to recursive imports.
- management activates translation
- this loads all apps
- One of my apps loads the test Client (I use a different testing
framework that uses the django test client)
- test client loads contrib.session
- the model meta class starts translation in contribute_to_class
- this loads all apps --> doesn't work
I moved the import statement in my app into the function --> works.
I suggest changing the test client so that it imports other models
only inside a function and not at module import time.
-*-
Second, I have a map of view tags, verbose names for these and how to build
the url (it was born before the regex reverser). This map uses gettext_lazy
for the verbose names, which is used later with the % operator. This fails
because
In [44]: "%s" % gettext_lazy("Dienste")
Out[44]: '<django.utils.functional.__proxy__ object at 0xb70dacac>'
With proper unicode objects, though, it works:
In [45]: u"%s" % ugettext_lazy("Dienste")
Out[45]: u'Services'
(It really requires both that the pattern is unicode and that ugettext_lazy is
used and not gettext_lazy)
I'm now working to work around this, but it's a lot of replacements from
"gettext_lazy" --> "ugettext_lazy" and also to promote all the patterns to
unicode.
I wonder, can this be changed so that it works the old way, too? This seems
to be related with commit 5239:
@@ -32,6 +32,8 @@ def lazy(func, *resultclasses):
self.__dispatch[resultclass] = {}
for (k, v) in resultclass.__dict__.items():
setattr(self, k, self.__promise__(resultclass, k, v))
+ if unicode in resultclasses:
+ setattr(self, '__unicode__', self.__unicode_cast)
def __promise__(self, klass, funcname, func):
# Builds a wrapper around some magic method and registers that magic
@@ -47,6 +49,9 @@ def lazy(func, *resultclasses):
self.__dispatch[klass][funcname] = func
return __wrapper__
+ def __unicode_cast(self):
+ return self.__func(*self.__args, **self.__kw)
+
def __wrapper__(*args, **kw):
# Creates the proxy object, instead of the actual value.
return __proxy__(args, kw)
this makes unicode() work for the proxies, but not str(). I tried to add a
similar hook for str(), but I failed (and I really don't understand how all
the various parts play together here ...)
That's for now, I'm still trying to get over this before I can start more
serious testing.
So long,
Michael
I'd rather try and fix the root problem first, since having to order
your code in a particular way to avoid import problems is fragile.
Certainly needs to be looked at, though. Will do.
>
> -*-
>
> Second, I have a map of view tags, verbose names for these and how to build
> the url (it was born before the regex reverser). This map uses gettext_lazy
> for the verbose names, which is used later with the % operator. This fails
> because
>
> In [44]: "%s" % gettext_lazy("Dienste")
> Out[44]: '<django.utils.functional.__proxy__ object at 0xb70dacac>'
This is an issue in lazy() that is very hard to fix, because __str__ is
used for so many things in Python. I'm not going to call it a bug; it's
just unbelievably annoying.
>
> With proper unicode objects, though, it works:
>
> In [45]: u"%s" % ugettext_lazy("Dienste")
> Out[45]: u'Services'
>
> (It really requires both that the pattern is unicode and that ugettext_lazy is
> used and not gettext_lazy)
>
> I'm now working to work around this, but it's a lot of replacements from
> "gettext_lazy" --> "ugettext_lazy" and also to promote all the patterns to
> unicode.
I'll have a look at this. Commit 5239 is absolutely required. I think
we may need to make a Promise-variant for translations only so that we
can make __str__ work properly for them, too (we can't do it in general,
because there are non-translation-related places where lazy() is used
and I don't want to break __str__ for them).
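The symptom itself is easy to reproduce outside Django. The sketch below is modern Python 3, where there is only one text type, so it cannot show the str/unicode split; it only shows the fallback that produces the "<... object at 0x...>" output when a lazy wrapper has no string hook. BareProxy, TextProxy and lookup() are hypothetical names, not the classes in django.utils.functional.

```python
class BareProxy:
    """Lazy wrapper with no string hook: formatting uses the default repr."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args


class TextProxy(BareProxy):
    """Same wrapper, plus a __str__ that evaluates the wrapped call."""

    def __str__(self):
        return str(self._func(*self._args))


def lookup(message):
    """Stand-in for a translation function."""
    return {"Dienste": "Services"}.get(message, message)


bare = "%s" % BareProxy(lookup, "Dienste")
text = "%s" % TextProxy(lookup, "Dienste")

print(bare)  # the default object representation, with a memory address
print(text)  # the evaluated result
```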
I don't feel too bad about people having to move gettext_lazy to
ugettext_lazy (it's the 21st century, global search and replace has
existed for 30 years), but the promotion to unicode strings can take a
few minutes, agreed.
Regards,
Malcolm
Fixed at the source of the problem (django.db.models.options) in [5345].
At least, I'm pretty sure that will fix it. Let me know if the problem
persists (and why, because then it's not as you describe).
>
> -*-
>
> Second, I have a map of view tags, verbose names for these and how to build
> the url (it was born before the regex reverser). This map uses gettext_lazy
> for the verbose names, which is used later with the % operator. This fails
> because
>
> In [44]: "%s" % gettext_lazy("Dienste")
> Out[44]: '<django.utils.functional.__proxy__ object at 0xb70dacac>'
>
> With proper unicode objects, though, it works:
>
> In [45]: u"%s" % ugettext_lazy("Dienste")
> Out[45]: u'Services'
>
> (It really requires both that the pattern is unicode and that ugettext_lazy is
> used and not gettext_lazy)
>
> I'm now working to work around this, but it's a lot of replacements from
> "gettext_lazy" --> "ugettext_lazy" and also to promote all the patterns to
> unicode.
Fixed in [5344]. '%s' % gettext_lazy('Dienste') will do what you expect
now.
Regards,
Malcolm
I get the following error when saving a form in the admin interface:
Traceback (most recent call last):
File "F:\python25\Lib\site-packages\django-svn\unicode\django\core\handlers\base.py" in get_response
77. response = callback(request, *callback_args, **callback_kwargs)
File "F:\python25\Lib\site-packages\django-svn\unicode\django\contrib\admin\views\decorators.py" in _checklogin
55. return view_func(request, *args, **kwargs)
File "F:\python25\Lib\site-packages\django-svn\unicode\django\views\decorators\cache.py" in _wrapped_view_func
39. response = view_func(request, *args, **kwargs)
File "F:\python25\Lib\site-packages\django-svn\unicode\django\contrib\admin\views\main.py" in add_stage
258. LogEntry.objects.log_action(request.user.id, ContentType.objects.get_for_model(model).id, pk_value, force_unicode(new_object), ADDITION)
File "F:\python25\Lib\site-packages\django-svn\unicode\django\utils\encoding.py" in force_unicode
32. s = unicode(s)
UnicodeDecodeError at /admin/books/author/add/
'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
Not sure if that is only my n00b skillz kicking in, but it seems to be a
direct result of the string returned here:
from django.db import models
# ...
class Author(models.Model):
    salutation = models.CharField(maxlength=10)
    first_name = models.CharField(maxlength=30)
    last_name = models.CharField(maxlength=40)
    email = models.EmailField()
    headshot = models.ImageField(upload_to='/tmp')
    class Admin:
        pass

    def __unicode__(self):
        return self.first_name # this string is what I am talking about
I am using PostgreSQL in UTF-8, so I thought non-ASCII input
would pass through nicely in the admin interface. But it didn't.
The item is still saved into the database and is viewable in the
admin interface.
After some testing: the error occurs only when the field returned from
__unicode__ (self.first_name in this case) contains non-ASCII text. If
all the other fields contain non-ASCII characters but the returned one
does not, everything works normally.
Putting "return unicode(self.first_name)" there doesn't help either.
Please have a look at this. Should I repost it as a ticket?
Regards,
Alan
This suggests that self.first_name hasn't been converted to a unicode
string for some reason and is still a sequence of UTF-8 bytes. That
shouldn't be happening.
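To make the distinction concrete, here is a small sketch in modern Python 3, where the bytes/text split is explicit. The to_text() helper is hypothetical and only loosely modelled on the idea of force_unicode(); the sample name is made up. UTF-8 bytes that are decoded with the ASCII codec fail exactly like the traceback above, while decoding them with the right codec gives proper text.

```python
def to_text(value, encoding="utf-8"):
    """Return text, decoding byte strings explicitly (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


# UTF-8 bytes, as a value might arrive if it was never converted to text.
raw = "M\xe4kinen".encode("utf-8")

try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    # Any byte >= 0x80 is outside the ASCII range.
    ascii_ok = False

print(ascii_ok)
print(to_text(raw))
```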
>
> Not sure if that is only my n00b skillz kicking on, it should be a
> direct result of the return string as here:
>
> from django.db import models
> # ...
> class Author(models.Model):
>     salutation = models.CharField(maxlength=10)
>     first_name = models.CharField(maxlength=30)
>     last_name = models.CharField(maxlength=40)
>     email = models.EmailField()
>     headshot = models.ImageField(upload_to='/tmp')
>     class Admin:
>         pass
>
>     def __unicode__(self):
>         return self.first_name # this string is what I am talking about
>
> I am using postgresSQL in utf-8 so I thought some non-ascii input
> would pass through nicely in the admin interface. But it didn't.
> The item would still be saved into the database and viewable in the
> admin interface.
This is certainly a bit odd and it should be working. I've hammered on
the admin interface quite a bit, saving and loading all kinds of weird
data and so have some other testers. Your model looks like it should be
perfect, too.
I'll have a look at this today when I can make some time.
Regards,
Malcolm
michal@lentilka app $./manage.py test staticpages
Creating test database...
Creating table auth_message
Creating table auth_group
Creating table auth_user
Creating table auth_permission
Creating table django_content_type
Creating table django_session
Creating table django_site
Creating table django_admin_log
Creating table staticpages_staticpage
Creating table news_subscriber
Creating table news_new
Creating table news_tag
Creating table partners_partneruser
Creating table parameters_parameter
Creating table pressreleases_pressrelease
Traceback (most recent call last):
File "./manage.py", line 11, in ?
execute_manager(settings)
File
"/usr/local/lib/python2.4/site-packages/django/core/management.py", line
1678, in execute_manager
execute_from_command_line(action_mapping, argv)
File
"/usr/local/lib/python2.4/site-packages/django/core/management.py", line
1592, in execute_from_command_line
action_mapping[action](args[1:], int(options.verbosity))
File
"/usr/local/lib/python2.4/site-packages/django/core/management.py", line
1309, in test
failures = test_runner(app_list, verbosity)
File "/usr/local/lib/python2.4/site-packages/django/test/simple.py",
line 84, in run_tests
create_test_db(verbosity)
File "/usr/local/lib/python2.4/site-packages/django/test/utils.py",
line 118, in create_test_db
management.syncdb(verbosity, interactive=False)
File
"/usr/local/lib/python2.4/site-packages/django/core/management.py", line
537, in syncdb
_emit_post_sync_signal(created_models, verbosity, interactive)
File
"/usr/local/lib/python2.4/site-packages/django/core/management.py", line
464, in _emit_post_sync_signal
verbosity=verbosity, interactive=interactive)
File
"/usr/local/lib/python2.4/site-packages/django/dispatch/dispatcher.py",
line 358, in send
sender=sender,
File
"/usr/local/lib/python2.4/site-packages/django/dispatch/robustapply.py",
line 47, in robustApply
return receiver(*arguments, **named)
File
"/usr/local/lib/python2.4/site-packages/django/contrib/auth/management.py",
line 26, in create_permissions
ctype = ContentType.objects.get_for_model(klass)
File
"/usr/local/lib/python2.4/site-packages/django/contrib/contenttypes/models.py",
line 20, in get_for_model
model=key[1], defaults={'name': smart_unicode(opts.verbose_name_raw)})
File
"/usr/local/lib/python2.4/site-packages/django/db/models/options.py",
line 105, in verbose_name_raw
raw = unicode(self.verbose_name)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7:
ordinal not in range(128)
Regards
Michal
I guess there is something between the model class and database
adapters which adds those strange bytes.
No, there's nothing like that. These are just standard ASCII decoding
errors that Python reports. A byte starting with 'C' in the high nibble
(e.g. Michal's 0xc3) is the start of a two-byte UTF-8 sequence, and
something like your 0xe4 is the first byte of a three-byte UTF-8 sequence
(or possibly some non-UTF-8 bytes altogether, in both cases).
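For anyone who wants to check a byte from their own traceback, the lead-byte rule can be sketched like this (modern Python 3; utf8_sequence_length is a hypothetical helper name). It looks only at the bit pattern of a single byte, so it cannot tell whether the data really is UTF-8; as noted above, 0xe4 on its own is also a perfectly good Latin-1 character.

```python
def utf8_sequence_length(lead_byte):
    """Length of the UTF-8 sequence a lead byte announces; 0 for a continuation byte."""
    if lead_byte < 0x80:          # 0xxxxxxx: plain ASCII
        return 1
    if lead_byte >> 6 == 0b10:    # 10xxxxxx: continuation byte, not a lead byte
        return 0
    if lead_byte >> 5 == 0b110:   # 110xxxxx: two-byte sequence
        return 2
    if lead_byte >> 4 == 0b1110:  # 1110xxxx: three-byte sequence
        return 3
    if lead_byte >> 3 == 0b11110: # 11110xxx: four-byte sequence
        return 4
    raise ValueError("not a valid UTF-8 byte: %#x" % lead_byte)


print(utf8_sequence_length(0xC3))        # the byte from the verbose_name traceback
print(utf8_sequence_length(0xE4))        # the byte from the admin traceback
print("\xe1".encode("utf-8"))            # U+00E1 as UTF-8 starts with 0xc3
print(bytes([0xE4]).decode("latin-1"))   # the same 0xe4 read as Latin-1
```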
Regards,
Malcolm
I'm having issues with the Unicode Branch and mod_python (the
development server is working fine). This is what's coming from
mod_python:
Phase: 'PythonHandler'
Handler: 'django.core.handlers.modpython'
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/mod_python/importer.py", line 1537, in HandlerDispatch
    default=default_handler, arg=req, silent=hlist.silent)
  File "/usr/lib/python2.4/site-packages/mod_python/importer.py", line 1229, in _process_target
    result = _execute_target(config, req, object, arg)
  File "/usr/lib/python2.4/site-packages/mod_python/importer.py", line 1128, in _execute_target
    result = object(arg)
  File "/usr/lib/python2.4/site-packages/django/core/handlers/modpython.py", line 177, in handler
    return ModPythonHandler()(req)
  File "/usr/lib/python2.4/site-packages/django/core/handlers/modpython.py", line 163, in __call__
    req.headers_out[key] = value
TypeError: table values must be strings
The checkout of the Unicode Branch is from yesterday evening (cannot
remember the revision number) and I'm using Apache 2.0.59, Python
2.4, MySQL-python-1.2.2 and MySQL 5.0.38. Is this a bug or am I doing
something wrong? Unfortunately, I wasn't able to figure it out.
Regards,
A.
Hmm... Sorry, but I couldn't locate this problem. Before my last update
(to revision 5371) the tests failed too, but not so early (they failed
when the fixtures were loaded into the DB). So I think there is some
problem in the Django unicode branch.
I plan to:
1) update the unicode branch to the latest revision
2) run each of my tests separately
3) locate the unicode problems
But now (after the update), I can't run any of the tests.
When you say "this problem", which problem do you mean? The one you
reported in your first email? Because in the email fragment you quoted
from me, there is no "problem" reported.
I'm very happy to help fix these errors people are seeing; they are all
small things and easy to nail with a good explanation, but you have to
help me help you: what is failing? Remember that there are about four
different sub-threads going on under this topic, so replying to
the right email so that the right replies thread together is going to be
useful, too.
If this is the problem you reported in your first email, the simplest
thing you can do to help is work out which model's verbose name is
causing problems. From reading your email, I suspect you have a UTF-8
string being used as a verbose_name somewhere (which is perfectly fine)
and it uses some codepoints outside the ASCII range. I've just finished
eating dinner and am about to try and test that theory, because my gut
feeling is that will cause a traceback, just from reading the code.
Assuming it's what I think it is, this will be fixed in about 30
minutes.
Regards,
Malcolm
Aah.. I didn't think to check that (that's why other people are helping
with the tests .. thanks). That's easy enough to fix.
Regards,
Malcolm
Sorry for the confusion, Malcolm.
My note was about the latest error (i.e. I have a problem running
the tests due to the verbose_name error).
I've just had dinner too, so I will try to find what is wrong in my
application... :)
Once again, sorry for my obscure latest report and my English.
Michal
No worries. :-)
I think I've fixed this problem (non-ASCII bytestrings for verbose_name)
in [5372], which I've just committed.
Regards,
Malcolm
I have just rewritten all my strings like:
verbose_name='něco'
fields = (
    (None, {'fields': ('title', 'slug', 'annotation', 'content',)}),
    ('Hiearchie', {'fields': ('parent', 'order')}),
    ('Pokročilé nastavení', {'fields': ('short_title', 'template_name',
        'person', 'info_box', 'show_menu')}),
    ('Omezení přístupu na stránku', {'fields':
        ('registration_required', 'groups')}),
)
order = models.IntegerField("Pořadí", help_text="Pořadí stránky v "
    "rámci sourozenců, tj. stránek které mají stejného rodiče.")
to:
verbose_name=u'něco'
fields = (
    (None, {'fields': ('title', 'slug', 'annotation', 'content',)}),
    (u'Hiearchie', {'fields': ('parent', 'order')}),
    (u'Pokročilé nastavení', {'fields': ('short_title', 'template_name',
        'person', 'info_box', 'show_menu')}),
    (u'Omezení přístupu na stránku', {'fields':
        ('registration_required', 'groups')}),
)
order = models.IntegerField(u"Pořadí", help_text=u"Pořadí stránky v "
    u"rámci sourozenců, tj. stránek které mají stejného rodiče.")
I also updated my unicode branch to revision [5372] and now I get
another error message:
michal@lentilka app $./manage.py test
Creating test database...
"/usr/local/lib/python2.4/site-packages/django/db/models/manager.py",
line 76, in get_or_create
return self.get_query_set().get_or_create(**kwargs)
File
"/usr/local/lib/python2.4/site-packages/django/db/models/query.py", line
280, in get_or_create
obj.save()
File
"/usr/local/lib/python2.4/site-packages/django/db/models/base.py", line
246, in save
','.join(placeholders)), db_values)
File
"/usr/local/lib/python2.4/site-packages/django/db/backends/postgresql/base.py",
line 54, in execute
return self.cursor.execute(smart_str(sql, self.charset),
self.format_params(params))
File
"/usr/local/lib/python2.4/site-packages/django/db/backends/postgresql/base.py",
line 51, in format_params
return tuple([smart_str(p, self.charset, True) for p in params])
File
"/usr/local/lib/python2.4/site-packages/django/utils/encoding.py", line
55, in smart_str
return s.encode(encoding, errors)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in
position 7: ordinal not in range(128)
I tried running all the tests (./manage.py test) and also each app by
specifying its name (i.e. ./manage.py test staticpages). Everything ends
with the error above.
Regards
Michal
That sounds like your database client encoding is set to ASCII for some
reason, which isn't something Django is going to be able to handle.
Have a look in django/db/backends/postgresql/base.py, line 97, where it
says
cursor.execute("SHOW client_encoding")
encoding = ENCODING_MAP[cursor.fetchone()[0]]
and print out the value of encoding (maybe even assign cursor.fetchone()
to a temporary variable and print that out, too). That will at least
confirm that the problem is where we think it is.
If the client encoding is not set to something that can handle non-ASCII
characters, there is no hope of putting non-ASCII chars in there in the
first place. I'll have to look up how you configure that (maybe you can
do it in the meantime), but it's not something Django should be touching, I
suspect.
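As a rough illustration of why the client encoding matters, here is a sketch in modern Python 3. The ENCODING_MAP below is a made-up subset for illustration and may not match the real mapping in the PostgreSQL backend, and can_store() is a hypothetical helper. It only shows that text containing non-ASCII characters cannot be encoded at all once the codec resolves to ASCII.

```python
# Hypothetical mapping from PostgreSQL encoding names to Python codec names;
# the real ENCODING_MAP in the branch may differ.
ENCODING_MAP = {
    "SQL_ASCII": "ascii",
    "UNICODE": "utf-8",
    "UTF8": "utf-8",
    "LATIN1": "latin-1",
}


def can_store(text, client_encoding):
    """True if text can be encoded with the codec for this client encoding."""
    codec = ENCODING_MAP[client_encoding]
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


sample = "Polo\u017eka \xe1"  # contains U+017E and U+00E1
for name in ("SQL_ASCII", "LATIN1", "UNICODE"):
    print(name, can_store(sample, name))
```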
I'm working on isnotvalid's problem right at the moment; I'll come back
to this once I've got that fixed (which will be soon).
Regards,
Malcolm
You are right, the problem is in the database.
It seems like the test database is created in SQL_ASCII encoding. I
looked into psql terminal and found:
List of databases
Name | Owner | Encoding
-----------------+------------+-----------
gr4unicode | pgsql | UNICODE
test_gr4unicode | gr4unicode | SQL_ASCII
DB gr4unicode was created by me, manually:
CREATE DATABASE gr4unicode WITH ENCODING 'UNICODE';
Database test_gr4unicode was created dynamically by calling ./manage.py test
I don't know how to tell the test framework to create the database with
the UNICODE charset... :(
> I'm working on isnotvalid's problem right at the moment; I'll come back
> to this once I've got that fixed (which will be soon).
Don't be in a hurry because of my problems! :) Primarily I would like to
help you with unicode branch testing...
I have attached the output "dump" of the latest call of ./manage.py test
(I am printing the fetchone and encoding variables).
Regards
Michal
That was a little bit of a tricky example. The problem only occurs when
a file upload field is included in the form, which is why I'd never seen
it before.
Should be fixed in [5373].
Regards,
Malcolm
Aaah! :-(
I've been fighting this problem a bit when testing with MySQL, too,
because my system creates the databases in LATIN1 if I don't tell it
anything special and so the test database can't hold the full unicode
range of characters. It creates PostgreSQL databases in UTF-8 on my end,
though, so I've never seen it with that database.
Okay... time to fix that problem then. Probably need to introduce a
setting for tests only for database encoding. I should have done that
when I first saw the problem instead of trying to dodge around it.
I hate it when being lazy doesn't work. :-(
I'll put this one on my list. Nice debugging job. Thanks.
Regards,
Malcolm
Ah! Yes I've stepped on this one too (wrong collation in my case). I
don't know if it could be worked around currently... Looks like we need
a way to specify db creation parameters.
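One possible shape for such creation parameters, sketched in modern Python 3 for the PostgreSQL case only. The charset argument stands for a hypothetical test-only setting, and quote_name() here is a simplified stand-in, not the backend's own function. Nothing like this exists in the branch at the time of writing.

```python
def quote_name(name):
    """Double-quote an identifier, PostgreSQL style (simplified stand-in)."""
    return '"%s"' % name.replace('"', '""')


def create_test_db_sql(db_name, charset=None):
    """Build a CREATE DATABASE statement, optionally with an explicit encoding.

    charset is assumed to come from a trusted settings file, so it is
    interpolated without further escaping.
    """
    sql = "CREATE DATABASE %s" % quote_name(db_name)
    if charset:
        sql += " WITH ENCODING '%s'" % charset
    return sql


print(create_test_db_sql("test_gr4unicode", charset="UNICODE"))
print(create_test_db_sql("test_gr4unicode"))
```

Leaving the charset unset keeps today's behaviour, where the server default (or template1) decides the encoding.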
It was my pleasure :)
Regards,
Michal
I can't replicate this problem, but I can take a guess at what is going
on. In [5377] I've checked in what is probably a fix for the problem.
Could you try it and see if it changes things for you?
If you still get the traceback, try modifying the source just before
that last line in the exception traceback
(django/core/handlers/modpython.py) and print out what "key" and "value"
are. I am guessing they have a type of unicode, but they should still be
ASCII characters, because you can't put anything else into HTTP headers.
So if for some reason there are non-ASCII characters in there, we need
to work out where they are coming from.
However, I suspect [5377] is going to fix the main problem by coercing
both "key" and "value" to string types.
Regards,
Malcolm
> I can't replicate this problem, but I can take a guess at what is
> going
> on. In [5377] I've checked in what is probably a fix for the problem.
> Could you try it and see if it changes things for you?
Looks good so far. I'll report if the error pops up again.
Thank you!
A.
I also find smart_str() really handy for cases where data is
leaving Python.
Regards,
itsnotvalid
Hello again,
I temporarily patched Django source code (django/test/utils.py, lines 96
and 107) to:
cursor.execute("CREATE DATABASE %s WITH ENCODING 'UNICODE'" %
backend.quote_name(TEST_DATABASE_NAME))
So now I can run my tests. Here is some experience I gained while
debugging (my advice is aimed mainly at other testers; I am
developing an application in the Czech language, in UTF-8 encoding):
* check *all* your strings (I have a lot of strings like 'něco' or
'%s-123' % var; most of them I had to rewrite as u'něco' and u'%s-123'
% var); check them in all possible places (models, views, tests,
settings, custom tags, ...)
* if you use Client in tests (django.test.client), make sure that you
decode the content with the smart_unicode function. For example:
response = self.submitHelper('www.example.com')
self.failUnlessEqual(response.status_code, 200)
self.failUnless(smart_unicode(response.content).find(u'nějaký rětězec') != -1)
* make sure that the data you post via client.post is correctly
encoded. For example:
post_data = {
'item1': u"První položka",
'item2': u"Druhá položka",
'item3': u"Třetí položka"
}
response = self.client.post('/url/', post_data)
Hope this helps somebody.
Regards
Michal
FWIW, as a workaround, in MySQL's my.cnf, you can set:
character_set_database = 'utf8'
In postgres, new databases are created from the template1 system
database; new databases will have whatever encoding that database has.
(template0 is the pristine DB shipped with postgres and should never
be changed, but you should feel free to change template1 as is
useful).
Do we need such a setting, or do we really need to *copy* the database
encoding so that tests run against exactly the same encoding as the
application database (if it's possible to use anything other than utf8...)?
That would prevent people from running wonderful tests on a well-configured
db when the real db is still "SQL_ASCII" just because template1 was shipped
that way!
sandro
*:-)