On Tue, 2007-06-26 at 01:30 -0700, jedie wrote:
> I have a model class like this:
> ----------------------------------------------------------------
> class PagesInternal(models.Model):
>     name = models.CharField(primary_key=True, maxlength=150)
>     ...
> ----------------------------------------------------------------
> And my names (the primary keys) contain underscores, like this:
> "page_admin.edit_page"
> I used no ID for the primary key, because I "address" the entries
> by their names.
[...]
> UnicodeDecodeError at /_admin/PyLucid/pagesinternal/
> page_admin.edit_page/
> 'utf8' codec can't decode byte 0xad in position 4: unexpected code
> byte
> ----------------------------------------------------------------
> But i think this is not a real UnicodeDecodeError... It's a problem
> with the quote()/unquote() routines in
> django.contrib.admin.views.main.py
> The string before unquote() is..: page_admin.edit_page
> The string after unquote() is...: page\ufffdmin.edit_page
> In a local test, it seems to work fine:
> ----------------------------------------------------------------
> from django.contrib.admin.views.main import quote, unquote
> TEST_STRING = "page_admin.edit_page"
> q = quote(TEST_STRING)
> print "quote():", q
> print "unquote():", unquote(q)
> print
> print "unquote()2:", unquote(TEST_STRING)
> ----------------------------------------------------------------
> output:
> ----------------------------------------------------------------
> quote(): page_5Fadmin.edit_5Fpage
> unquote(): page_admin.edit_page
> unquote()2: page\ufffdmin.edit_page
> ----------------------------------------------------------------
Except that isn't what your test produces. When I ran that test program
against the Unicode branch, I got the same result as against trunk:
unquote(TEST_STRING) returns 'page\xadmin.edit_page'. When Python then
tries to interpret that as a UTF-8 string, it hits the illegal byte
\xad and raises the UnicodeDecodeError you reported. You are seeing
\ufffd in your output because of some terminal or Python shell setting
you have in effect (\ufffd is the Unicode replacement character, which
stands in for input that could not be decoded). Have a look at the
result of repr(unquote(TEST_STRING)) to see the real data being passed
around by Django.
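To illustrate the difference between the raw byte and the \ufffd you
saw, here is a small sketch. It is written in Python 3 syntax (an
assumption; the test program above uses Python 2 print statements), and
the names raw, fail_pos and replaced are made up for the example. It
does not touch Django at all; it only decodes the byte string that
unquote() returned.

```python
# The byte string that unquote(TEST_STRING) returned in the test above.
raw = b"page\xadmin.edit_page"

# A strict UTF-8 decode fails on the lone 0xad byte.
fail_pos = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    fail_pos = exc.start
    print("strict decode failed at position", fail_pos)

# A lenient decode substitutes U+FFFD for the byte that cannot be decoded,
# which is one way a \ufffd can end up in displayed output.
replaced = raw.decode("utf-8", errors="replace")
print(repr(replaced))
```

The failing position matches the "position 4" in the traceback, which
is consistent with \xad being the real data and \ufffd being a display
substitution.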
> But, in my case, the real input for unquote() is not the quoted one
> like "page_5Fadmin.edit_5Fpage"!
> It is the non-quoted one like: "page_admin.edit_page"
So the real question is why is unquote() being called on that string,
since unquote() should only ever be called on strings that have been run
through quote() previously.
Since the crash is inside change_stage() in the same file, try to
work out why the wrong string is being passed in there. This should
be just pieces of input captured from the URL (via admin/urls.py), so
this suggests that something is creating the wrong URL (and I thought we
would have noticed that previously: it's why quote() and unquote() were
written in the first place).
It might also be possible that you have just lucked onto the right piece
of data that demonstrates the problem. The unquote() function is
resilient to bad input and it's only by accident that the first two
characters of the word "admin" happen to be valid hex digits, so "_ad"
can be treated as an escape sequence.
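The accident can be reproduced with a simplified stand-in for the two
functions. This is a sketch only, not Django's source: sketch_quote()
and sketch_unquote() are hypothetical names, the set of escaped
characters is an assumption (only the underscore case is confirmed by
the output above), and the code is Python 3.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def sketch_quote(s, special="_/:"):
    """Escape each special character as '_' plus two hex digits.

    The contents of `special` are an assumption for this sketch.
    """
    return "".join("_%02X" % ord(c) if c in special else c for c in s)


def sketch_unquote(s):
    """Reverse sketch_quote(); leave anything that is not '_XX' untouched."""
    head, *rest = s.split("_")
    out = [head]
    for item in rest:
        code = item[:2]
        if len(code) == 2 and set(code) <= HEX_DIGITS:
            # Looks like an escape sequence: decode the two hex digits.
            out.append(chr(int(code, 16)) + item[2:])
        else:
            # Not a valid escape: put the underscore back unchanged.
            out.append("_" + item)
    return "".join(out)


name = "page_admin.edit_page"
quoted = sketch_quote(name)
print(repr(quoted))
print(repr(sketch_unquote(quoted)))  # round trip restores the name
print(repr(sketch_unquote(name)))    # unquoted input: "_ad" is decoded
```

In the last call "_ad" is read as the escape for character 0xad,
while "_page" is left alone because "pa" is not valid hex. A name such
as "page_zadmin" would have passed through unquote() unharmed, which is
why this particular key exposes the problem.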
So there might be a bug here, but a bit more poking around is required.
Try to figure out why an unquoted URL fragment is being passed to admin
in the first place (well, first check that the object_id being passed to
change_stage() really is unquoted already and work backwards from there
-- why is it?).
Regards,
Malcolm
--
How many of you believe in telekinesis? Raise my hand...
http://www.pointy-stick.com/blog/