ValidationError output of SafeData

98 views
Skip to first unread message

Andrew Pinkham

unread,
Feb 8, 2015, 1:33:08 PM2/8/15
to django-d...@googlegroups.com
Hi,
I mentioned on the Django User Mailing List that the interaction of ValidationError, mark_safe, and ugettext was not what I expected.

https://groups.google.com/d/topic/django-users/Soik095MTaI/discussion

Succinctly, the two versions of clean_slug below behave differently, when they should not.

# results in <code> being printed for user
def clean_slug(self):
new_slug = self.cleaned_data['slug'].lower()
if new_slug == 'invalid_value':
raise ValidationError(
# _ is ugettext
mark_safe(_('SlugField may not be '
'"%(slug_value)s" '
'for URL reasons.')),
params={
'slug_value':
mark_safe('<code>invalid_value</code>')})
return new_slug

# results in <code> being valid HTML elements
def clean_slug(self):
new_slug = self.cleaned_data['slug'].lower()
if new_slug == 'invalid_value':
raise ValidationError(
# _ is ugettext
mark_safe(_('SlugField may not be '
'"<code>invalid_value</code>" '
'for URL reasons.')))
return new_slug

The problem behavior is caused by the way ValidationError outputs its text; ValidationError behaves differently depending on whether params are used, which seems undesirable. We can demonstrate this in the Django shell (1.7.4).

>>> from django.core.exceptions import ValidationError
>>> from django.utils.safestring import mark_safe, SafeData
>>> from django.utils.six import text_type
>>> isinstance(mark_safe('hello'), SafeData)
True

When the ValidationError message is SafeData, the output of the ValidationError is also SafeData.

>>> ve = ValidationError('this is static')
>>> isinstance(list(ve)[0], text_type)
True
>>> ve = ValidationError(mark_safe('this is safe'))
>>> isinstance(list(ve)[0], SafeData)
True

... unless we use the params feature of the exception, as the documentation tells developers to.

>>> ve = ValidationError(mark_safe('%(r)s'), params={'r': 'replaced'})
>>> isinstance(list(ve)[0], SafeData)
False
>>> ve = ValidationError(mark_safe('%(r)s'), params={'r': mark_safe('replaced')})
>>> isinstance(list(ve)[0], SafeData)
False

The difference in behavior depending on whether params is passed should not happen.

I can see how in the first instance above---where the message is SafeData but the params values aren't---it is desirable to return text instead of SafeData from a security standpoint. However, in the second case, when both the message and the params are SafeData, it seems counter-intuitive to return text.

| message | params | output |
|---------|--------|--------|
| text | none | text |
| safe | none | safe |
| text | text | text |
| safe | text | text |
| text | safe | text |
| safe | safe | safe | <-- the problem: currently outputs text

The issues stems from ValidationError.__iter__(). In `add_error` of form validation, ValidationErrors are added on with the following:

# django/forms/forms.py: line 353
self._errors[field].extend(error_list)

This code results in a call to ValidationError.__iter__(),

# django/core/exceptions.py: line 151
def __iter__(self):
if hasattr(self, 'error_dict'):
for field, errors in self.error_dict.items():
yield field, list(ValidationError(errors))
else:
for error in self.error_list:
message = error.message
if error.params:
message %= error.params
yield force_text(message)

Turns out, __mod__ on SafeText is not defined, meaning the code `message %= error.params` always transforms SafeText to text.

>>> isinstance(mark_safe('hello %(r)s'), SafeData)
True
>>> isinstance(mark_safe('hello %(r)s') % {'r': 'boo'}, SafeData)
False
>>> isinstance(mark_safe('hello %(r)s') % {'r': mark_safe('boo')}, SafeData)
False

I can thus think of two ways to solve this:

1. Modify ValidationError.__iter__() to account for SafeText with params.
2. add __mod__() to SafeText

Given that ugettext (or rather, do_translate() in django/utils/translation/trans_real.py) takes the first approach, I've created a branch in my fork of Django to do just that.

https://github.com/jambonrose/django/commit/61612f38aa68efcb5038f51305359b25485e2f48

My djangocore-box isn't working quite the way it should (postgres tests will not run on master), but all the tests appear to run correctly on runtests2.7-sqlite and runtests3.3-sqlite.

With that said: while I think there is a problem in the difference in behavior, I'm not sure my solution is the best. More simply: is the following behavior table actually what we want?

| message | params | output |
|---------|--------|--------|
| text | none | text |
| safe | none | safe |
| text | text | text |
| safe | text | text | <-- this in particular makes me wonder
| text | safe | text |
| safe | safe | safe |

If this is what we want, what next step should I be taking for this? Should I be filing a bug report, or issuing a PR directly?

If the addition of __mod__ to SafeText is a better change, I'm happy to code that too.

Andrew

Andrew Pinkham

unread,
Apr 13, 2015, 6:56:13 PM4/13/15
to django-d...@googlegroups.com
I'm noting for posterity of this thread that I've opened a ticket and issued a pull-request.

Ticket: https://code.djangoproject.com/ticket/24639
PR: https://github.com/django/django/pull/4497

Reply all
Reply to author
Forward
0 new messages