Developer help appreciated: unicode() not returning SafeString correctly - possibly a bug?

Margie Roginski

Oct 14, 2009, 6:25:13 PM
to Django users
I am seeing some odd behavior related to the
django.utils.safestring.SafeString class. What I see is that if my
render function returns a SafeString, the "safeness" of it is lost and
its tags end up getting escaped. I've looked at this in detail in pdb
and I think the issue is in force_unicode(), but I don't have the full
answer. I'm hoping a developer who knows this code can give me some
ideas.

I have a widget whose render() method returns a SafeString, i.e.:

    def render(self, name, value, attrs=None):
        # note it's a string, not unicode - this is important
        mystr = "some string"
        return mark_safe(mystr)  # returns a SafeString


If at this point I start stepping through the code, I
find that this SafeString value gets returned without any modification
by my render() function, then by as_widget(), and then by
__unicode__(). At those points in the code, if I munge the code a bit to
create a temporary variable and then look at the type of that temporary
variable, the value being returned is still a <class
'django.utils.safestring.SafeString'>, as you'd expect.

Eventually I end up in the force_unicode() function at code that looks
like this:

    if hasattr(s, '__unicode__'):
        s = unicode(s)

The call to unicode(s) has resulted in my render function getting
called, and as far as I can tell, unicode(s) should simply return the
value that my render function returned. I would expect the resulting
's' to be a SafeString. However, if I look at the type of s after
unicode(s) has been called, its type is now <type 'unicode'> rather
than <class 'django.utils.safestring.SafeString'>.

This seems to result in the mark_safe() that I did in my render
function not having the intended effect.

I have analyzed this code to death, and I absolutely cannot figure out
why the value being returned by my render function would be changing
from SafeString to unicode.

Note that if in my render() function I have a unicode string rather
than a normal string, I don't see this issue. I can obviously work
around this, but would like to know if this seems like a bug that I
should post. Perhaps it is even a Python bug? That's hard to
believe, but I guess it's possible.

Margie

Karen Tracey

Oct 15, 2009, 1:21:18 AM
to django...@googlegroups.com
On Wed, Oct 14, 2009 at 6:25 PM, Margie Roginski <margier...@yahoo.com> wrote:

> Eventually I end up in the force_unicode() function at code that looks
> like this:
>
>     if hasattr(s, '__unicode__'):
>         s = unicode(s)
>
> The call to unicode(s) has resulted in my render function getting
> called, and as far as I can tell, unicode(s) should simply return the
> value that my render function returned.  I would expect the resulting
> 's' to be a SafeString.  However, if I look at the type of s after
> unicode(s) has been called, its type is now <type 'unicode'> rather
> than <class 'django.utils.safestring.SafeString'>.


Python requires, without exception, that unicode(x) return a unicode object.  If your implementation of __unicode__ does not return unicode, Python attempts to coerce the result to unicode.  SafeString inherits from str, and str can be coerced to unicode, so that is what Python does, and the result is a plain unicode object with no safeness attached.  (If __unicode__ returns something that is neither unicode nor coercible to unicode, an exception is raised.)
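
A rough way to picture that rule, written as a Python 3 emulation because unicode() itself only exists in Python 2. The names py2_unicode, SafeBytes, and Widget are hypothetical stand-ins (SafeBytes plays the role of SafeString, and bytes the role of the Python 2 str type); this is only a sketch of the coercion described above, not Django's or CPython's actual code:

```python
class SafeBytes(bytes):
    """Hypothetical stand-in for SafeString (a Python 2 str subclass)."""


class Widget:
    """Hypothetical object whose __unicode__ returns a byte-string subclass."""

    def __unicode__(self):
        return SafeBytes(b"<b>some string</b>")


def py2_unicode(obj):
    """Emulate the Python 2 rule: the result must be a text object."""
    result = obj.__unicode__()
    if isinstance(result, str):
        # Already text (possibly a subclass): handed back unchanged.
        return result
    # A byte string is decoded into a brand-new plain text object, so
    # the subclass (and with it the "safe" marker) is lost. Python 2
    # used its default encoding here, which is normally ASCII.
    return result.decode("ascii")


raw = Widget().__unicode__()
coerced = py2_unicode(Widget())
print(type(raw).__name__)      # SafeBytes
print(type(coerced).__name__)  # str
```

The value leaving __unicode__ is still the "safe" subclass; it is the decoding step that produces a plain text object.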
 
> This seems to result in the mark_safe() that I did in my render
> function not having the intended effect.
>
> I have analyzed this code to death, and I absolutely cannot figure out
> why the value being returned by my render function would be changing
> from SafeString to unicode.


Because Python forces the return value from a unicode() call to be of type unicode, so it coerces the SafeString to unicode.

> Note that if in my render() function I have a unicode string rather
> than a normal string, I don't see this issue.  I can obviously work
> around this, but would like to know if this seems like a bug that I
> should post.  Perhaps it is a python bug even?  That's hard to
> believe, but I guess it's possible.


If you call mark_safe on a unicode object, you get SafeUnicode instead of SafeString.  Since SafeUnicode inherits from unicode, no conversion is needed and the safeness property is maintained through the calls.  Why do you want to be returning strings instead of unicode from your widget's render()?
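
The same pass-through can still be observed in Python 3, where str() plays the role that unicode() plays here. In this sketch SafeText and Widget are hypothetical names (SafeText stands in for SafeUnicode), and it relies on CPython handing back the __str__ result unchanged when it is already an instance of the text type:

```python
class SafeText(str):
    """Hypothetical stand-in for SafeUnicode (a text-type subclass)."""


class Widget:
    """Hypothetical object whose __str__ returns a text subclass."""

    def __str__(self):
        return SafeText("<b>some string</b>")


result = str(Widget())
# No conversion is needed, so the subclass survives the call.
print(type(result).__name__)  # SafeText
```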

Karen

Margie Roginski

Oct 15, 2009, 2:11:44 PM
to Django users
Ah - ok, I see! My knowledge of the unicode area is lacking, so I
hadn't actually realized we were overriding a built-in by defining
__unicode__. Completely obvious now that you point it out of course.

I don't need to be returning a string instead of unicode. I just
inadvertently ended up doing that because my widget, in certain cases,
does not need to provide an input and thus was not calling its super()
method. I'm used to using strings rather than unicode, so the resulting
widget was just rendering some HTML for display only, and my render()
method was returning it as a string rather than as unicode. In all
other cases where I've done this sort of thing, I think I at some point
ended up concatenating my HTML onto the return value of the widget's
super() method, or concatenating it with the return of
render_to_string(), and both of those have the effect of turning it
into unicode, so I just never noticed the problem.

So possibly a silly question, but is it considered best practice to
code all strings as unicode as you write them, or is it better to
convert to unicode at the end? I.e.:

    def render(self, name, value, attrs=None):
        rendered = u'<div> blah blah blah </div>'
        rendered += u'<div> blah blah blah </div>'
        return mark_safe(rendered)

vs:

    def render(self, name, value, attrs=None):
        rendered = '<div> blah blah blah </div>'
        rendered += '<div> blah blah blah </div>'
        return mark_safe(unicode(rendered))

Or maybe it doesn't matter at all, just curious if there is some
benefit to one versus the other.
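
One practical difference that may be worth keeping in mind, sketched here in Python 3 terms: in Python 2, unicode(rendered) on a byte string decodes it with the default encoding (normally ASCII), so the convert-at-the-end style fails as soon as the markup contains a non-ASCII byte, while text built from u'' literals needs no decoding at all. The helper name to_text is hypothetical and only mimics that decode step:

```python
def to_text(byte_string):
    """Hypothetical helper mimicking Python 2's unicode(byte_string)."""
    return byte_string.decode("ascii")


# Pure-ASCII markup decodes without trouble.
ascii_html = b"<div> blah blah blah </div>"
print(to_text(ascii_html))

# Markup containing a non-ASCII character cannot be decoded as ASCII.
non_ascii_html = "<div> caf\u00e9 </div>".encode("utf-8")
try:
    to_text(non_ascii_html)
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)  # True
```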

I do recognize that what I'm doing above is ugly and that the HTML
should go into the template, and eventually I will move it there.
Sometimes it's just easier to live in Python when debugging and not
have the added complexity of the template.

Ok, thanks for your response, that clarified a lot.

Margie



