ModelForm unique validation is not done right IMHO.

473 views
Skip to first unread message

Todor Velichkov

unread,
Oct 7, 2017, 4:42:45 AM10/7/17
to Django developers (Contributions to Django itself)
I will best explain what I mean with an example.

Lets say we have a simple model called Book.

class Book(models.Model):
    user
= models.ForeignKey(User)
    name
= models.CharField(max_length=255)

   
class Meta:
        unique_together
= ('user', 'name')

so `Books` are linked to users, and we have added an unique constrain because we don't want a user to have the same book twice.

Now lets say we want to allow users to create/update books, when we do this we would only need their book.name, we already knew who they are in the system.
So a simple ModelForm would look like this:

class BookForm(forms.ModelForm):
   
class Meta:
         model
= Book
         fields
= ('name',)

Now since i've omitted the `user` field and I can't have a book w/o a user, its my responsibility to populate it. Thus when I create new form i will always a pass an instance with a user.
But before we do this, lets manually create 1 book for this user.

user = User.objects.get(pk=1)
book1
= Book.objects.create(user=user, name='Harry Potter')

#now lets build our form
book2
= Book(user=user)
form
= BookForm(data={'name': 'Harry Potter'}, instance=book2)

if form.is_valid(): #yes the form will be valid
    book2
= form.save() #IntegrityError will be raised with UNIQUE constraint failed

Now this happens because `ModelForm.validate_unique` collects a set of fields to pass as an exclude fields into `instance.validate_unique`.
Please, can someone explain me why is this done in such a manner, and what will be the correct way approach this problem.

From my point of view when I define a ModelForm I imagine that I have to only explain what the user needs to see when the form gets rendered.
But when the instance is being validated, I would always want to validate this instance as a whole (with the fields that I've omitted).



Florian Apolloner

unread,
Oct 7, 2017, 6:55:24 AM10/7/17
to Django developers (Contributions to Django itself)
I think the current behaviour is correct. Django shouldn't make any assumptions about fields not in the form since it doesn't know what will happen… In that sense I disagree with the statement:


> From my point of view when I define a ModelForm I imagine that I have to only explain what the user needs to see when the form gets rendered.

To get what you want use something along the lines of: https://gist.github.com/apollo13/ef72035c5f864174ad3c968dec60d88a

Cheers,
Florian

Todor Velichkov

unread,
Oct 7, 2017, 11:07:31 AM10/7/17
to Django developers (Contributions to Django itself)
Thank you for the replay Florian, I will think about what you said that Django shouldn't make any assumptions about fields outside the form, but my intuition tells me that there are more staff which we could be retrieved from the ModelForm's.

Some of my initial thoughts:

If the ModelForm doesn't care about anything outside the defined fields, then this is just a regular Form, the only difference is that it can use Models as a short-cut to be defined. (ModelFormFactory?)
If we agree on this (I'm perfectly fine to agree that this could be the scope of a ModelForm), then I would probably not expect to see a method like `form.save()`, just force me to use `form.cleaned_data` to build my instance, make it more clear that the ModelForm is more like a FormFactory class instead of throwing me options like `commit=False`. I believe this could save a lot of headache to people. If there is no guarantee that an instance cannot be saved after the form has been validated, then do not give me an option to shoot myself into the foot.

However, why I could expect the ModelForm to be something more...

This is a ModelForm, yes it may sit into the `forms` namespace, but still its half `Model` and a half `Form`, pretending to be a form-only class is pity for something that do so much heavy-lifting to remove all the tedious tasks from us (love it for this). Use all the metadata we have into the model to validate the instance. My only concern here actually is not that we didn't know what will happen (I think we know that), but how to show the user that the developer got messed up, i.e. how to render an error for a field that is not rendered to the user? Should this error be propagated to a NON-FIELD error or something else? I don't know yet, maybe no, maybe this field should not be validated and lead to a server error. But, however, in the scope of the problem shown here (a multi-field-level unique constrain) this is not an issue, because this is not a field-level error, this is a multi-field-unique constrains, so maybe the only thing that needs to be changes, is to stop respecting the excluded fields when validating multi-field level unique constrains? I feel its perfectly fine to ask for a complete model instance at this stage of the validation process (lots of methods got passed from the beginning - form.clean() and instance.clean()).

I would love to see more opinions on the topic. Thank you.

Florian Apolloner

unread,
Oct 8, 2017, 12:36:58 PM10/8/17
to Django developers (Contributions to Django itself)


On Saturday, October 7, 2017 at 5:07:31 PM UTC+2, Todor Velichkov wrote:
I believe this could save a lot of headache to people. If there is no guarantee that an instance cannot be saved after the form has been validated, then do not give me an option to shoot myself into the foot.

Even if the whole instance were validated there is __never__ and can __never__ be any guarantee that the instance can actually be saved.


My only concern here actually is not that we didn't know what will happen (I think we know that), but how to show the user that the developer got messed up, i.e. how to render an error for a field that is not rendered to the user? Should this error be propagated to a NON-FIELD error or something else?

Showing an error message which the user cannot correct because the field is not in the form is horrible UX.

I feel its perfectly fine to ask for a complete model instance at this stage of the validation process (lots of methods got passed from the beginning - form.clean() and instance.clean()).

I strongly disagree, especially for values which are set programmatically after validation (with the use of save(commit=False)) depending on values the user chose. Ie when the user chooses a category for a bug report and the code then automatically assigns another user as owner of that reuqest/ticket.

Cheers,
Florian

Todor Velichkov

unread,
Oct 9, 2017, 2:52:53 AM10/9/17
to Django developers (Contributions to Django itself)
Settings values programmatically is a cumulative operation most of the time, however when its not and things depend on each other (like your example), then even the docs suggests than one can use the form.clean method. If there is some other dependency outside form.cleaned_data I would prefer to use dependency injection in order to get this data and do the job done. I'm sorry I just see commit=False as an anti-pattern, because the validation needs to be repeated after that (as your example in the first post).

Showing an error message which the user cannot correct is a horrible UX indeed, but still its a UX, and some people may find it as a better alternative to a `500 server error page`, where there is no UX at all. Even a generic message like 'sorry we messed up' would be useful, because the user data that will be preserved into the form. However, in the example shown here, this is not even the case, there is something that the user can do to prevent the error.

Finally, yes, there may never be a 100% guarantee that an instance can actually be saved, but this would be outside of Django's scope? In terms of a model.save() things are deterministic and we should do doing our best to ensure that the operation is valid for execution, right? In the example I'm showing is not something outside of Django's knowledge to make it unexpected/unpreventable.

Thank you,
Todor.

Florian Apolloner

unread,
Oct 9, 2017, 3:52:50 AM10/9/17
to Django developers (Contributions to Django itself)
Hi,


On Monday, October 9, 2017 at 8:52:53 AM UTC+2, Todor Velichkov wrote:
Settings values programmatically is a cumulative operation most of the time, however when its not and things depend on each other (like your example), then even the docs suggests than one can use the form.clean method.

It does? Can you give me a link to that please?

If there is some other dependency outside form.cleaned_data I would prefer to use dependency injection in order to get this data and do the job done. I'm sorry I just see commit=False as an anti-pattern, because the validation needs to be repeated after that (as your example in the first post).

Repeated is a bit overreaching, it also checks new fields…

Showing an error message which the user cannot correct is a horrible UX indeed, but still its a UX, and some people may find it as a better alternative to a `500 server error page`, where there is no UX at all. Even a generic message like 'sorry we messed up' would be useful, because the user data that will be preserved into the form. However, in the example shown here, this is not even the case, there is something that the user can do to prevent the error.

The generic error message is something that I supply in some forms of mine where race conditions can happen due to high concurrency. This is why I guard form.save, catch database errors and then use form.add_error to add a generic error message asking for a retry. In the example shown, the user can do something about the error, this is correct, but the default error message would be rather confusing since it would cover a field which is not part of the form.

That said, form.save(commit=False) is there and will stay, even if it is just for backwards compatibility. Same goes for the partial validation of the instance, this is just something to many people rely on and changing is not really an option.

One could add a new flag to the Meta object ala validate_full_model, but for that to work you will have to tell us a sensible UX approach first. I am opposed to adding something which we agree is horrible just because the alternative (like I've shown) requires a few additional lines of code.

Cheers,
Florian

Todor Velichkov

unread,
Oct 9, 2017, 10:56:32 PM10/9/17
to Django developers (Contributions to Django itself)
It does? Can you give me a link to that please?

Pard me, it does not explicitly say to set values programmatically, but that this is the place to go when fields depend on each other, and since clean is a multi-step process which does not include only field validation, but settings values too, it kind of makes sense.


The form subclass’s clean() method can perform validation that requires access to multiple form fields. This is where you might put in checks such as “if field A is supplied, field B must contain a valid email address”. This method can return a completely different dictionary if it wishes, which will be used as the cleaned_data.


Suppose we add another requirement to our contact form: if the cc_myself field is True, the subject must contain the word "help". We are performing validation on more than one field at a time, so the form’s clean() method is a good spot to do this.

3) Model.clean()

This method should be used to provide custom model validation, and to modify attributes on your model if desired. For instance, you could use it to automatically provide a value for a field, or to do validation that requires access to more than a single field.

Please, don't get me wrong, I'm far away from the idea of deprecating `commit=False`. I just personally try to void it and trying to explain my arguments.

The generic error message is something that I supply in some forms of mine where race conditions can happen due to high concurrency. This is why I guard form.save, catch database errors and then use form.add_error to add a generic error message asking for a retry.

Yes, that's definitely the place when something outside the validation process can happen, but you don't need `commit=False` here ;)

One alternative approach to validate the instance w/o `commit=False` right now would be to change the form like so:

class BookForm(forms.ModelForm):

   
class Meta:
        model
= Book

        fields
= ('name', 'user')
        widgets
= {'user': forms.HiddenInput()}

   
def __init__(self, *args, **kwargs):
       
super(BookForm, self).__init__(*args, **kwargs)
       
self.fields['user'].disabled = True

#in the view
form
= BookForm(data={'name': 'Harry Potter'}, initial={'user': user})

But if we compare this with the form in the first post, we are just Fixing/Patching it, i.e. fighting with it in order to make it work for us.

However, there is one more little difference here, which I want to point out. Image we didn't had the unique constrain, and we just wanted to hide the user field from the form, and we do it this way (instead of the one with the `instance` kwarg as in the first post). Doing this and adding the unique_constrain later on would yield to no more core changes (except changing the error msg if too ambiguous). While using the `commit=False` approach would require further code changes and it would be multiplied by the number of views the form is being used, and by the number of forms where the field has been omitted (i.e. commit=False has been used). Its a slight difference, but can lead to a big wins.

About the error message, to be honest I don't know, gonna keep thinking about it, one thing that came to mind is to strip the missing fields, i.e. instead of "Book with this User and Name already exists." to become: "Book with this Name already exists." but it really depends on the context. The one thing that I know for sure is that personally I would definitely prefer an ambiguous message, rather than a 500 server error.

Protik

unread,
Sep 23, 2018, 9:57:15 AM9/23/18
to Django developers (Contributions to Django itself)
Hi  Todor

I am experiencing the same problem. Can you please post the possible solution?

Todor Velichkov

unread,
Sep 23, 2018, 11:41:55 AM9/23/18
to Django developers (Contributions to Django itself)
You can use the `disabled` attribute on form fields with a combination of HiddenInput

Using the Book example from the first comment it will look like this:
 
class BookForm(forms.ModelForm):
   
class Meta:
        model
= Book

        fields
= ('user', 'name')
       
   
def __init__(self, *args, **kwargs):
        user
= kwargs.pop('user')
       
super(BookForm, self).__init__(*args, **kwargs)
       
self.fields['user'].widget = forms.HiddenInput()
       
self.fields['user'].initial = user
       
self.fields['user'].disabled = True


First we hide the the user field because we dont want the user to see it, and at the same time keeping it inside form fields we wont prevent the unique_together validation.
Second - we disable the field and programmatically set initial value to be used during form validation

Protik

unread,
Sep 23, 2018, 2:25:41 PM9/23/18
to Django developers (Contributions to Django itself)
Hi, Todor

I have tested this solution and It looks like it works only when you don't disable the field (i.e the last line in the BookForm's `__init__()` method. My views looks like this:

def book_add(request):
user = get_object_or_404(User, id=1)

if request.method == 'POST':

f = BookForm(request.POST, user=user)
if f.is_valid():
book = f.save(commit=False)
book.user = user
book.save()
messages.add_message(request, messages.INFO, 'book added.')
return redirect('book_add')
else:
f = BookForm(user=user)

return render(request, 'blog/book_add.html', {'f': f})


def post_update(request, post_pk):
user = get_object_or_404(User, id=1)
book = get_object_or_404(Book, pk=post_pk)
if request.method == 'POST':
f = BookForm(request.POST, instance=book, user=user)
if f.is_valid():
post = f.save(commit=False)
post.user = user
post.save()
messages.add_message(request, messages.INFO, 'Post added.')
return redirect('post_update', post.pk)
else:
f = BookForm(instance=book, user=user)

return render(request, 'blog/book_update.html', {'f': f})

The code for models and modelform is exactly same as yours.

Am I doing something wrong?
Message has been deleted

Todor Velichkov

unread,
Sep 23, 2018, 4:29:31 PM9/23/18
to Django developers (Contributions to Django itself)
First thought: What is your Django version? The `disabled` attribute was added in Django 1.9.
However by looking at your code (w/o testing anything) after `form.is_valid()` you should only call `form.save()`, no need to do `commit=False` and manually assigning user, you have already done that in the form constructor (by settings initial value + disabled attr).
Message has been deleted
Message has been deleted

Protik

unread,
Sep 24, 2018, 1:34:35 AM9/24/18
to Django developers (Contributions to Django itself)
I am using Django 1.11. Further, adding a new book with user field disabled results in the following error:

Selection_046.png



I have attached the code to reproduce the error.
django_project.tar.gz

ludovic coues

unread,
Sep 24, 2018, 2:09:58 AM9/24/18
to django-d...@googlegroups.com
First, that's a discussion for the Django user list, not the Django developers one.

I would add a clean_book method on the form and use it to check if the user already have a book with that name. For that specific problems, that's the cleanest solution in my opinion. Simply do a Book.objects.filter(user=form.user, name=form.cleaned_data["name"]). exists() and if you get a hit, raise a Validation error. The error will be associated to the book field.

If someone want to write a patch for adding full validation of a ModelForm, I wish them good luck. You could try a save commit=False but you will only get an integrity error and a string coming from the database. Which can be translated. You could check for your unique_together with an extra request. Then the form will raise a Validation error saying there is already a book with that user and that name. Confusing the user.

If you think a ModelForm giving guarantee that the save method will work make sense, the best way to proceed is a third party module. Doing so let you do quick release for development and testing, user will be able try the module and see if it solves their problems. In my opinion, something generic won't help when validating unique_together relationship when one of the fields is not exposed to the user. But I could be wrong.

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/44b8461e-2f29-4ac4-82f4-705b52c81560%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Protik

unread,
Sep 24, 2018, 2:36:19 AM9/24/18
to Django developers (Contributions to Django itself)
Book.objects.filter(user=form.user, name=form.cleaned_data["name"]). exists()

This will only work for creating records, not updating.

But what's the issue with the current solution posted by Todor. If you look at the model form it looks like this:

class BookForm(forms.ModelForm):
    class Meta:
         model = Book
         fields = ('user', 'name')

    def __init__(self, *args, **kwargs):
        user = kwargs.pop('user')
        super(BookForm, self).__init__(*args, **kwargs)
        self.fields['user'].widget = forms.HiddenInput()
        self.fields['user'].initial = user
        self.fields['user'].disabled = True

Logically, It should work. But it is not, so far I have checked this code with django version 1.11 and 2.1. 

What do you think where the problem lies?

Todor Velichkov

unread,
Sep 24, 2018, 4:13:00 AM9/24/18
to Django developers (Contributions to Django itself)
Protik, just use `self.fields['user'].initial = user.id` instead of just `user`. Just tested it and it work fine.

Todor Velichkov

unread,
Sep 24, 2018, 4:51:49 AM9/24/18
to Django developers (Contributions to Django itself)
Then the form will raise a Validation error saying there is already a book with that user and that name. Confusing the user.
What error message are you gonna put instead? Book with that name already exists? This is even more confusing, because its actually allowed to have more books with that name.
What about Hidden field errors? Aren't they confusing?

If you think a ModelForm giving guarantee that the save method will work make sense, the best way to proceed is a third party module.
Yes, lets keep pretending that ModelForms are just Forms, without Model in their name. Server errors are much better than "Confusing" form errors.


Simply do a Book.objects.filter(user=form.user, name=form.cleaned_data["name"]). exists()
 Why should I repeat something already implemented? More over, how are you gonna keep this up to date with model unique constrains ? So every time I touch unique_together I should go around all the forms and double check if this wont break anything ?

Todor Velichkov

unread,
Sep 24, 2018, 5:04:39 AM9/24/18
to Django developers (Contributions to Django itself)
One more thing I forgot to add

I would add a clean_book method on the form and use it to check if the user already have a book with that name.

You actually can't to this, because this just a simplified example where we have two fields only, what if we have a 3rd one and two of the fields are visible for the user? Then you will have to use Form.clean, where you will basically rewrite the unique check which Django decides to skip.

Protik

unread,
Sep 24, 2018, 8:30:01 AM9/24/18
to Django developers (Contributions to Django itself)
Yes, it is working now.

Thank you very much.
Reply all
Reply to author
Forward
0 new messages