proposal: helper functions for validation

6 views
Skip to first unread message

Amit Upadhyay

unread,
Sep 8, 2007, 6:01:39 PM9/8/07
to django-d...@googlegroups.com
Python unittest TestCase objects have a lot of helper functions like assert_(), failUnless(), assertEqual(), assertNotEqual() and so on[1]. If we had a similar set of helper functions, possibly with more pythonic names, newform validation functions could be written a little more succinctly.

    def clean_password2(self):
        if self.clean_data['password1'] != self.clean_data['password2']:
            raise ValidationError(u'Please make sure your passwords match.')
        return self.clean_data['password2']

can become

    def clean_password2(self):
        assert_equal(self.clean_data['password1'], self.clean_data['password2', "Please make sure your passwords match."
        return self.clean_data['password2']

or
    def clean_code(self):
        try:
            ConfirmationCode(code=self.clean_data['code'])
        except ConfirmationCode.DoesNotExist :
            raise ValidationError("Code does not exist")
        return self.clean_data["code"]

becomes:

    def clean_code(self):
        assert_not_raises(ConfrimationCode.DoesNotExist , "Code does not exist", ConfirmationCode.objects.get, code=self.clean_data['code'])
        return self.clean_data["code"]

or
    def clean_username(self):
        try: User.objects.get(username=self.clean_data["username"])
        except User.DoesNotExist: pass
        else: raise ValidationError("Username already taken")
        return self.clean_data["username"]

becomes

    def clean_username(self):
        assert_raises(User.DoesNotExist, "Username already taken", User.objects.get, username=self.clean_data ["username"])
        return self.clean_data["username"]

Message would be compulsory in these helper functions, and they will raise ValidationError instead of AssertionError.

[1]: http://docs.python.org/lib/testcase-objects.html
--
Amit Upadhyay
Vakow! www.vakow.com
+91-9820-295-512

Russell Keith-Magee

unread,
Sep 9, 2007, 3:58:05 AM9/9/07
to django-d...@googlegroups.com
On 9/9/07, Amit Upadhyay <upad...@gmail.com> wrote:
> Python unittest TestCase objects have a lot of helper functions like
> assert_(), failUnless(), assertEqual(), assertNotEqual() and so on[1]. If we
> had a similar set of helper functions, possibly with more pythonic names,
> newform validation functions could be written a little more succinctly.
>
> def clean_password2(self):
> if self.clean_data['password1'] != self.clean_data['password2']:
> raise ValidationError(u'Please make sure your passwords match.')
> return self.clean_data['password2']
>
> can become
>
> def clean_password2(self):
> assert_equal(self.clean_data['password1'],
> self.clean_data['password2', "Please make sure your passwords match."
> return self.clean_data['password2']

I think a series of helpers for newforms validation is a reasonable
idea. I have a few comments about the exact form of your suggestions:

- I'm not wild about the deviation from the established camelCase
naming convention. The tests are called assertEqual, assertRaises etc
in TestCase; although Django generally uses underscores rather than
camelCase, I think there would be more value in maintaining
similarity.

- I'm also inclined to continue the similarities, and make the
assertion functions members on BaseForm. This would allow a
significant simplification of your proposal, as the data locations
(i.e., self.cleaned_data) can be implied.

So - as a result of those suggestions, the helpers would look more like:

def clean_password2(self):
self.assertEqual('password1','password2','Please make sure...')
return self.clean_data['password2']

Thoughts?

Yours,
Russ Magee %-)

Malcolm Tredinnick

unread,
Sep 9, 2007, 5:13:54 AM9/9/07
to django-d...@googlegroups.com
On Sun, 2007-09-09 at 15:58 +0800, Russell Keith-Magee wrote:
[...]

> - I'm also inclined to continue the similarities, and make the
> assertion functions members on BaseForm. This would allow a
> significant simplification of your proposal, as the data locations
> (i.e., self.cleaned_data) can be implied.

I'm not sure I understand this. Are you proposing putting some methods
into newforms.forms.BaseForm that only exist for supporting the test
framework? I'm usually not in favour of that type of change because it's
leaking stuff from the test framework, which is ideally entirely
independent of core functionality, into the core.

Willing to admit I don't actually understand what you're proposing,
though.

Malcolm

--
Success always occurs in private and failure in full view.
http://www.pointy-stick.com/blog/

Amit Upadhyay

unread,
Sep 9, 2007, 9:45:28 AM9/9/07
to django-d...@googlegroups.com
On 9/9/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:

On Sun, 2007-09-09 at 15:58 +0800, Russell Keith-Magee wrote:
[...]
> - I'm also inclined to continue the similarities, and make the
> assertion functions members on BaseForm. This would allow a
> significant simplification of your proposal, as the data locations
> (i.e., self.cleaned_data) can be implied.

I'm not sure I understand this. Are you proposing putting some methods
into newforms.forms.BaseForm that only exist for supporting the test
framework? I'm usually not in favour of that type of change because it's
leaking stuff from the test framework, which is ideally entirely
independent of core functionality, into the core.

It is for new forms validation and not for test.

I like the methods being part of BaseForm. We might add a different signature or methods for automatic data location thing, where a name will be queried upon in self.clean_data, as not all assert equals will be based on field, and some might require further processing before comparison.

Drawback if most of the times we are not using that feature, each validation statement will have to contain a superfluous self. in the beginning. Whereas helper functions can all be imported in the name space. Its only minor syntactic sugar so to say, that we are after with this.

Russell Keith-Magee

unread,
Sep 9, 2007, 9:46:52 AM9/9/07
to django-d...@googlegroups.com
On 9/9/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
>
> On Sun, 2007-09-09 at 15:58 +0800, Russell Keith-Magee wrote:
> [...]
> > - I'm also inclined to continue the similarities, and make the
> > assertion functions members on BaseForm. This would allow a
> > significant simplification of your proposal, as the data locations
> > (i.e., self.cleaned_data) can be implied.
>
> I'm not sure I understand this. Are you proposing putting some methods
> into newforms.forms.BaseForm that only exist for supporting the test
> framework? I'm usually not in favour of that type of change because it's
> leaking stuff from the test framework, which is ideally entirely
> independent of core functionality, into the core.

No - this has nothing to do with the test framework. It's a set of
helper functions to support writing clean_XXX methods.

The connection with the test system is purely conceptual. The
functional requirements of form validation are quite similar to those
during a test: assert that two values are/are not equal, assert that
an exception is/isn't raised when a query is executed, etc. However,
whereas the test system checks arbitrary test values and raises an
AssertionError, the validation system needs to check the contents of
cleaned_data and raise a ValidationError.

The reason for putting the methods in BaseForm is that they have very
little use outside of a clean_XXX method. The reason for calling them
assertEquals rather than assert_equals is that the parallels with the
test system are quite strong.

Yours,
Russ Magee %-)

Malcolm Tredinnick

unread,
Sep 9, 2007, 10:16:37 AM9/9/07
to django-d...@googlegroups.com
On Sun, 2007-09-09 at 21:46 +0800, Russell Keith-Magee wrote:
> On 9/9/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> >
> > On Sun, 2007-09-09 at 15:58 +0800, Russell Keith-Magee wrote:
> > [...]
> > > - I'm also inclined to continue the similarities, and make the
> > > assertion functions members on BaseForm. This would allow a
> > > significant simplification of your proposal, as the data locations
> > > (i.e., self.cleaned_data) can be implied.
> >
> > I'm not sure I understand this. Are you proposing putting some methods
> > into newforms.forms.BaseForm that only exist for supporting the test
> > framework? I'm usually not in favour of that type of change because it's
> > leaking stuff from the test framework, which is ideally entirely
> > independent of core functionality, into the core.
>
> No - this has nothing to do with the test framework. It's a set of
> helper functions to support writing clean_XXX methods.

Okay. Then I'm mostly in agreement, except that I have two small
problems with your "for consistency arguments":

(1) They shouldn't be called "assertXYZ". Assertions are typically
things that should never, ever fail. It's a programming error if they
do. Passwords failing to match are not assertion failures. That's just a
normal error checking path.

It's correct to use "assert" in tests, since it's a failure if they
aren't true. But I wouldn't want to encourage using that word in normal
code that is expected to fail from time to time outside of the program's
control.

(2) We should use Django's normal naming scheme (underscores, etc),
rather than camel case, because it's core code, so the author is not
going to be thinking about how the unittest module names things. The
fact that the unittest framework uses camel case is because it was
modelled on JUnit. For functions that are useful in core, we should
follow our standard naming practice. I feel that if we don't, people are
continually going to be wondering whether the function is "normal" or
has "that funky punctuation like the test code".

Consistenty is important. However, I think you've focused on being
consistent with the wrong piece of code; they will be used in normal
code, so they should look like normal code, not stuff that has to
conform to unittests slightly different naming standards.

Once you move away from using "assert", confusion with the unittest
names will fade and the methods will be more self-explanatory. Maybe
call them validate_equal(), validate_not_equal(), etc.

Regards,
Malcolm

--
Quantum mechanics: the dreams stuff is made of.
http://www.pointy-stick.com/blog/

du...@habmalnefrage.de

unread,
Sep 9, 2007, 10:27:50 AM9/9/07
to django-d...@googlegroups.com
Hi,

since your assert-function won't be like assert in python or UnitTests, you mix semantics nobody knows everywhere else.

Please remember that assert Statements have special meanings depending on __DEBUG__
http://docs.python.org/ref/assert.html
http://pyunit.sourceforge.net/pyunit.html#TESTCONDS

I personally found the normal way (without assertions) until now could be read very well.

Regards,
Dirk
--
Der GMX SmartSurfer hilft bis zu 70% Ihrer Onlinekosten zu sparen!
Ideal für Modem und ISDN: http://www.gmx.net/de/go/smartsurfer

Russell Keith-Magee

unread,
Sep 9, 2007, 10:35:20 AM9/9/07
to django-d...@googlegroups.com
On 9/9/07, Amit Upadhyay <upad...@gmail.com> wrote:
> Drawback if most of the times we are not using that feature, each validation
> statement will have to contain a superfluous self. in the beginning. Whereas
> helper functions can all be imported in the name space. Its only minor
> syntactic sugar so to say, that we are after with this.

Well, if you don't like calling self.XXX, then maybe Python isn't the
language for you :-)

Seriously, I don't see the problem. The validation functions will be
fairly tightly bound to a particular form, and to that form's
cleaned_data. 'self.' is the way Python binds a method to an instance.
The only way you could avoid calling 'self.validate_XXX' is to provide
the form (or pieces of the form) as arguments to a generic function:
i.e., validate(self, 'field1','field2'), or
validate(self.cleaned_data['field1'], self.cleaned_data['field2']),
both of which I find much less elegant.

Yours,
Russ Magee %-)

Russell Keith-Magee

unread,
Sep 9, 2007, 10:35:28 AM9/9/07
to django-d...@googlegroups.com
On 9/9/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
>
> On Sun, 2007-09-09 at 21:46 +0800, Russell Keith-Magee wrote:
> > No - this has nothing to do with the test framework. It's a set of
> > helper functions to support writing clean_XXX methods.
>
> Okay. Then I'm mostly in agreement, except that I have two small
> problems with your "for consistency arguments":
>
> (1) They shouldn't be called "assertXYZ". Assertions are typically
> things that should never, ever fail. It's a programming error if they
> do. Passwords failing to match are not assertion failures. That's just a
> normal error checking path.
...

> (2) We should use Django's normal naming scheme (underscores, etc),
...

> Once you move away from using "assert", confusion with the unittest
> names will fade and the methods will be more self-explanatory. Maybe
> call them validate_equal(), validate_not_equal(), etc.

Both valid points. Ok, sign me up for self.validate_equal() on BaseForm.

Yours,
Russ Magee %-)

Honza Král

unread,
Sep 9, 2007, 2:27:37 PM9/9/07
to django-d...@googlegroups.com


Does anybody else think that this might be interesting as a mix-in class?

Something my form COULD inherit along with BaseForm to gain validation
shortcuts, but that could also be used in models for example, so that
when model validation comes up, we wouldn't duplicate code.

The other reason for that is that I like Forms for their simplicity,
this would add many methods that won't be used for simple forms (in my
case, that's most of the forms I use) - but I agree that this is just
my own personal feeling. ;)

just a thought...

btw. +1 on validate_equal VS assertEqual ;)

>
> Yours,
> Russ Magee %-)
>
> >
>


--
Honza Král
E-Mail: Honza...@gmail.com
ICQ#: 107471613
Phone: +420 606 678585

Nis Jørgensen

unread,
Sep 10, 2007, 5:08:10 AM9/10/07
to django-d...@googlegroups.com
Russell Keith-Magee skrev:
I don't like the transformation of strings into field values. How would
you compare a field value to a string?

I guess this means we have to keep referring to self.cleaned_data (and
not clean_data - see http://code.djangoproject.com/changeset/5231)

Apart from that, I am +1 on this (with the name changes suggested by
Malcolm).

--
Nis Jørgensen

Amit Upadhyay

unread,
Sep 10, 2007, 4:48:46 PM9/10/07
to django-d...@googlegroups.com

Exactly my reason for not making it a method of BaseForm. If things were simple and all validations were such that transfomation of strings into field value was enough, we would have gotten rid of one or more " self.clean_data" per statement for one extra "self.". Let me talk in code:

If for some legacy reason, password is supposed to be case in sensitive, we would want to do:

def clean_password2(self):
    self.validate_equal(self.clean_data["password1"].lower(), self.clean_data["password2"].lower(), "Please make sure..")
    return self.clean_data["password2"]

and automatic translation for string to field would not allow it.

I would much rather do:

def clean_password2(self):
    d = self.clean_data.get
    self.validate_equal(d("password1").lower(), d("password2").lower(), "Please make sure..")
    return d("password2")

and if we are doing such tests all the time, the "self." is just not buying as anything. And we can make it:

import * from newforms.validation_helpers

def clean_password2(self):
    d = self.clean_data.get
    validate_equal(d("password1").lower(), d("password2").lower(), "Please make sure..")
    return d("password2")

There is no other reason to refer to self but to automatically translate strings to field value.

Russell Keith-Magee

unread,
Sep 10, 2007, 8:33:24 PM9/10/07
to django-d...@googlegroups.com
On 9/11/07, Amit Upadhyay <upad...@gmail.com> wrote:
> On 9/10/07, Nis Jørgensen <n...@superlativ.dk> wrote:
>
> > I don't like the transformation of strings into field values. How would
> > you compare a field value to a string?
>
> Exactly my reason for not making it a method of BaseForm. If things were
> simple and all validations were such that transfomation of strings into
> field value was enough, we would have gotten rid of one or more "
> self.clean_data" per statement for one extra "self.". Let me talk in code:
>
> If for some legacy reason, password is supposed to be case in sensitive, we
> would want to do:
>
> def clean_password2(self):
> self.validate_equal(self.clean_data["password1"].lower(),
> self.clean_data["password2"].lower(), "Please make sure..")
> return self.clean_data["password2"]

So, essentially you are advocating replacing:

if self.clean_data['password2'].lower() != self.clean_data['password1'].lower():
raise ValidationError("Please make sure"...")

with:

self.validate_equal(self.clean_data["password1"].lower(),
self.clean_data["password2"].lower(),
"Please make sure..")

I'm afraid you're going to have a hard time convincing me that this is
a good idea. I'm willing to entertain the idea of validation helpers
if they genuinely add clarity - but not if they are just a verbose
replacement for something that Python can already express quite
cleanly.

Yours,
Russ Magee %-)

Amit Upadhyay

unread,
Sep 11, 2007, 6:00:49 AM9/11/07
to django-d...@googlegroups.com
On 9/11/07, Russell Keith-Magee <freakb...@gmail.com> wrote:
I'm afraid you're going to have a hard time convincing me that this is
a good idea. I'm willing to entertain the idea of validation helpers
if they genuinely add clarity - but not if they are just a verbose
replacement for something that Python can already express quite
cleanly.

The point is python is clear was very clear to the developers who wrote unittest module too, they still went ahead with such helpers. I have given many examples where clarity is obvious, validation based on presence or absence of certain exception for example, code gets cut down from 4 lines to 1.

--
Amit Upadhyay
+91-9820-295-512

Russell Keith-Magee

unread,
Sep 11, 2007, 7:32:20 AM9/11/07
to django-d...@googlegroups.com
On 9/11/07, Amit Upadhyay <upad...@gmail.com> wrote:
> On 9/11/07, Russell Keith-Magee <freakb...@gmail.com> wrote:
> > I'm afraid you're going to have a hard time convincing me that this is
> > a good idea. I'm willing to entertain the idea of validation helpers
> > if they genuinely add clarity - but not if they are just a verbose
> > replacement for something that Python can already express quite
> > cleanly.
>
> The point is python is clear was very clear to the developers who wrote
> unittest module too, they still went ahead with such helpers.

Well... no. assertEquals exists in Python unittests because
assertEquals exists in JUnit, and unittest emulates the JUnit API.
This API is, in turn, based on xUnit, which was derived from SUnit,
the original Smalltalk implementation written by Kent Beck.

The motivation in Python was to implement an xUnit API, not design a
specifically Pythonic test API.

> I have given
> many examples where clarity is obvious, validation based on presence or
> absence of certain exception for example, code gets cut down from 4 lines to
> 1.

If this is the case, then show us a complex example. The simple cases
you have provided so far don't convince me.

Yours,
Russ Magee %-)

Amit Upadhyay

unread,
Sep 11, 2007, 10:39:43 AM9/11/07
to django-d...@googlegroups.com
On 9/11/07, Russell Keith-Magee <freakb...@gmail.com> wrote:
Well... no. assertEquals exists in Python unittests because
assertEquals exists in JUnit, and unittest emulates the JUnit API.
This API is, in turn, based on xUnit, which was derived from SUnit,
the original Smalltalk implementation written by Kent Beck.

The motivation in Python was to implement an xUnit API, not design a
specifically Pythonic test API.

But they still get used.

assert_(variable, msg)

is more concise than

if not variable: raise AssertionError(msg)

Saves us 20 chars.

[sure python has language support for assertion and you could as well write: "assert variable, msg", but this can not be done for validation, and above conciseness holds].

Before:

if variable1 == variable2: raise ValidationError(msg)

after

validate_equal(variable1, variable2, msg)

12 few chars to type.

Before:

try: User.objects.get(username=variable)
except User.DoesNotExist: pass
else: raise ValidationError(msg)

after:
validate_exception(User.DoesNotExist, msg, User.objects.get, username=variable)

27 chars saved.

Things like validate_[not_]almost_equal, validate_date_within_range, validate_date_before, validate_date_after will have to be implemented by the users, but if we had such a library with dozens of special purpose functions to assist validation, writing form validation would become all the more sweeter.

Marty Alchin

unread,
Sep 11, 2007, 11:18:54 AM9/11/07
to django-d...@googlegroups.com
I'll chime in just for a few cents here.

First, I think it's worth noting that most Python programmers find
little value in how many characters they have to type. Sure, Python is
generally less verbose than many other languages, but only to the
extent of making it more readable. So kep readability in mind when
you're suggesting changes, not just how many characters are "saved".

That said, I'd also recommend that if you're going to draw any
parallels with unittest assertions, you should probably do it
reliably. Your validate_exception() example was completely unreadable
to me, because it works quite differently than the one in unittest.

On the whole though, I have to agree with Russell on this. The
unittest module was written to fall in line with implementations in
other languages, probably so that documentation could be reusable.
It's not terribly Pythonic, but people use it because it works well.
It's also now well-entrenched, so changing it would cause more
problems than it would solve. It's certainly not a banner-worthy
Python module.

Also of note is the fact that people generally don't write and
maintain tests very often. Good tests can be reused for multiple
iterations of development. In fact, good testing practices would
specifically avoid changing tests unelss it's absolutely necessary
(major feature change), so people deal with the strangness of unittest
far less often than they deal with other parts of their code. Given
the goal of readability, we should focus on making the more-often-used
tasks more readable, not simply shorter.

-Gul

P.S. This is no way intended to mean I have any opinion on the
readability of the proposal in question. I'm just bringing up some
things to keep in mind when discussing.

Amit Upadhyay

unread,
Sep 11, 2007, 12:47:27 PM9/11/07
to django-d...@googlegroups.com
On 9/11/07, Marty Alchin <gulo...@gamemusic.org> wrote:
That said, I'd also recommend that if you're going to draw any
parallels with unittest assertions, you should probably do it
reliably. Your validate_exception() example was completely unreadable
to me, because it works quite differently than the one in unittest.

I agree with you on readability, and my proposal about function signature is not final, and probably can do well with further inputs.

About validate_exception, this is the signature of the corresponding method in unittest:

assertRaises( exception, callable, *args, **kw)

they have removed message, but since args and kw will be passed to callable, these are the only two possible options for adding message:

validate_raises(exception, msg, callable, *args, **kw) or validate_raises(msg, exception, callable, *args, **kw). I picked the former, and do not understand how it has become completely unreadable.

It's not terribly Pythonic, but people use it because it works well.

This is my point, a set of convenience function, well thought out, will be used by python programmer no matter even if it is as bad as looking like coming from java. The challenge is probably to figure out function names and signatures that are intuitive as best as we can.

Support of server side form validation is a strong point for django, and a rich libraries of such functions would only make things better. Batteries included thing. I have created a wiki page[1] with proposed function names and signatures, please go through it, and see if they make sense, and might be useful for django.

[1]: http://code.djangoproject.com/wiki/ValidationHelperFunctionsProposal

--
Amit Upadhyay
http://www.amitu.com/blog/
+91-9820-295-512

Marty Alchin

unread,
Sep 11, 2007, 2:18:23 PM9/11/07
to django-d...@googlegroups.com
On 9/11/07, Amit Upadhyay <upad...@gmail.com> wrote:
About validate_exception, this is the signature of the corresponding method in unittest:

assertRaises( exception, callable, *args, **kw)

they have removed message, but since args and kw will be passed to callable, these are the only two possible options for adding message:

validate_raises(exception, msg, callable, *args, **kw) or validate_raises(msg, exception, callable, *args, **kw). I picked the former, and do not understand how it has become completely unreadable.

I suppose that's my fault. Every time I've seen assertRaises used, it was used with a curried function, rather than supplying arguments directly to it. I hadn't looked at the signature of it myself, so I was unaware that you were in fact following convention on that particular point.

It's not terribly Pythonic, but people use it because it works well.

This is my point, a set of convenience function, well thought out, will be used by python programmer no matter even if it is as bad as looking like coming from java. The challenge is probably to figure out function names and signatures that are intuitive as best as we can.

But when I say it "works well", I mean that it has a variety of other features that are useful to Python programmers, and those work well enough to not be reinvented. Doing something that looks terrible simply because "it works" or "people will use it anyway" is definitely not the Python Way. Personally, I think you're making a mountain of a mole hill here. There's potential value, sure, but I just don't think it's worth it in the end, after you consider the readability concerns. It just doesn't seem to be enough benefit to make it worth the hassles.

Support of server side form validation is a strong point for django, and a rich libraries of such functions would only make things better. Batteries included thing. I have created a wiki page[1] with proposed function names and signatures, please go through it, and see if they make sense, and might be useful for django.

[1]: http://code.djangoproject.com/wiki/ValidationHelperFunctionsProposal

What I'd recommend is that you make a mix-in class that you can use with your forms, and validate them using your helper functions. Once you get them worked out as well as you'd like, put it up on djangosnippets, and see what kind of response you get. I could be wrong, but I don't expect you'll get quite the "OMG This saved my life!" response you might (or might not) be expecting.

-Gul
Reply all
Reply to author
Forward
0 new messages