Guardrails around lack of model validation on blank=True fields

243 views
Skip to first unread message

Jacob Walls

unread,
Jul 25, 2021, 9:12:13 AM7/25/21
to Django developers (Contributions to Django itself)

Hi group,

We have several accepted tickets regarding developer headaches when a blank=True field, which skips model validation by design[1], contains inappropriate empty values (e.g. None for a not-nullable field or the empty string where empty strings aren't allowed). I want to see what the community thinks is reasonable here and move them to a resolution or potentially wontfix them.

22224 — CharField (None from an application/json encoded payload isn’t cleaned to empty string)

27697 — JSONField (None isn’t cleaned to “{}”)

20205 — PositiveIntegerField (empty string unexpectedly passes model validation if blank=True. In this case the user was opting-in to model validation.)

***

For 22224—CharField—we could take Simon’s proposal to last-minute cast None to empty string in get_db_prep_value() if the field’s empty_strings_allowed is True and null=False.

For 27696—JSONField—similarly, we could cast from None to "{}" if null=False.

For 20205—PositiveIntegerField, and by extension, any field where empty_strings_allowed is False—we might run model validation if we have an empty string to ensure we fail at the model level instead of the db, although this is not the usual practice for blank=True fields. See PR.

The backwards compatibility concerns seem limited, because users are presumably working around these edge cases today, since invalid data is not being saved to the database. But there is the question of how much effort to expend here. I’m willing to see these through if we have a consensus around the best way to proceed.

All best,

Jacob


[1] blank=True and null=False and injecting-data-before-save being an idiom we don’t want to remove, see: https://code.djangoproject.com/ticket/22224#comment:7

Carlton Gibson

unread,
Jul 27, 2021, 6:08:31 AM7/27/21
to Django developers (Contributions to Django itself)
Hi Jacob. 

Thanks for the follow-up. Good hustle! 😉

I saw your comments on Trac last week, and was preparing a response. 
The trouble with these 8, 7, and 5 year old tickets is that they're tricky… 

Let me put the take-home first, I think that in-themselves we should close all 
three of these tickets as wontfix. 

* The issues arise due to the `blank=True` usage, which is as expected, but
  folks need to be using the right forms fields, or implementing `clean` on their
  model, or various `clean_<field>` methods on their forms to ensure that sane
  values are provided, or … — there are lots of options here. 
* Silently adjusting the value the ORM receives on the basis of `blank` would
  lead to more issues than it would resolve. 
* If users want to implement custom fields that automatically map given values,
  then the API is there to do this. **BUT** that belongs properly to the form
  layer: they'd be better off correcting the issue there. In any case, it is not
  something we should do by default. 
* `blank` is under-documented. We should expand those docs to spell-out that
  validation is skipped once it's in play, and so the extra steps that a user
  needs to take.

Take the simplest example, ignoring `blank`, and validation: 

```
from django.db import models

class A(models.Model):
    b = models.CharField(max_length=100)

a = A(b=None)
a.save()  # IntegrityError: NOT NULL constraint failed: app_a.b
```

That **is** the behaviour we want. I set bad values on the model instance. I
try to save. The DB error is raised. 

Let's call `full_clean()` there, with no other changes: 

```
a = A(b=None)
a.full_clean()  # ValidationError: This field cannot be null.
```

All good. We can call `full_clean()` prior to `save()` to run the field
validation.

The issue comes adding `blank=True`, which causes the field to be excluded from
`full_clean()`: 

```
class A(models.Model):
    b = models.CharField(max_length=100, blank=True)

a = A(b=None)
a.full_clean()  # This now passes. 
a.save()  # IntegrityError: NOT NULL constraint failed: app_a.b
```

The issues report this as surprising. But it's precisely what setting
`blank=True` does: exclude the field from validation. It **is** the expected
result. 

Given that this issue comes up, we should do a better job of documenting it. 


What I don't think we can/should do is to silently map to a correct value. Here
we're still at the ORM layer. We positively **want** the IntegrityError when we
pass bad values and try to save. As tricky as these issues are, how much harder
would they be if you instantiate a model with `None` but what's actually passed
to the DB is `''`: 

```
a = A(b=None)
```

— that way lies madness. (If I've opted into this mapping with a custom field, that's a different topic: I'm presumably expecting it then.)

**Rather**, the issue here lies at the user's form layer. The ORM is protecting
me by raising the IntegrityError if I give it bad data, but my job is to make
sure my form (or if I'm using DRF serialiser) outputs the correct data types
for my field. 

For #20205, the form for the nullable PositiveIntegerfield should give me
`None`, not `''`. (This was noted on the ticket.) 

Similarly, for #22224 the form for a CharField should give me an empty string
`''`, rather than `None`.

#27697 is more complex. By the time we reach the model, we're (again) expecting
#the form to provided us a sensible empty value (`{}`) — but there are related
#questions about how we render the initial and so on. Those, though, belong at
#the form/widget level: it's still not the ORM's business. 

Summary there: the ORM is protecting you from gaps in your form layer. We like
that. 

Can we do **something** at the model layer? — Yes, we can implement `clean()`
to do the right thing for `blank=True` fields. Or we can have fields do
something clever in `get_db_prep_value()`. 

Of those two, I much prefer the first: 

* It's the long-standing one-way to do things. 
* When I'm writing my model, my clean method is right in front of me, in the
  same place as my field definitions. (I don't need to go digging into Field
  internals to see how/what mapping is applied.)

Expanding the `blank` docs[0] in the model field reference to point to the
`Model.clean()` docs[1], and maybe adding an example for a `blank=True` field
in the latter, would probably help folks to see what's missing. 🤔


But beyond that, I don't think a change is in order. So I'd close as wontfix. 
I think from your comments you more or less already think that? 

Ref your PR[2], I don't think adding the `and (f.empty_strings_allowed or
raw_value != '')` special case in `django/db/models/base.py` is the right way
to go. 

For one, any of the other `empty_values` except `''` will still exhibit the
issue (e.g. the test case you added fails with `{}`). 

But more, `BaseModelForm._get_validation_exclusions()` will already add the
relevant fields to the `exclude` list, so assuming normal usage the
ValidationError won't be raised anyway. (We're back to the form layer.)


There's some good stuff in the discussion on #20205, but I think it's relevant
to _Advanced techniques with custom model fields_, rather than a change we can
realistically make.

I hope that all makes sense. What do you think? 

Kind Regards,

Carlton




Jacob Walls

unread,
Jul 27, 2021, 8:22:50 AM7/27/21
to Django developers (Contributions to Django itself)
This makes good sense, Carlton, and I didn't mean to be rushing your reply!

"That way lies madness" is entirely correct. Just wanted to map the terrain after I had so confidently commented on one Trac ticket about wontfix-ing it, since on it there were various opinions about possible courses of action. I think closing the behavior tickets is the right course to follow here.

All best,
Jacob

Carlton Gibson

unread,
Jul 27, 2021, 8:45:31 AM7/27/21
to Django developers (Contributions to Django itself)
OK, super. 

Would you then fancy looking at that docs change? It does come up. It _is_ surprising that `blank` has this effect, and I'm not convinced it says clearly anywhere how you should approach it…

Thanks for your efforts digging into this! 👍
C.

Raffaele Salmaso

unread,
Jul 27, 2021, 5:46:33 PM7/27/21
to django-d...@googlegroups.com
Hi all,
related to these tickets I want to remind this old ticket https://code.djangoproject.com/ticket/10244 which needs a decision.

Thanks!

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/58616ad3-54b6-4b21-b252-96040680fa04n%40googlegroups.com.


--

Carlton Gibson

unread,
Aug 1, 2021, 4:58:15 AM8/1/21
to Django developers (Contributions to Django itself)
Hi Raffaele. Looking at #10244, it looks as if it needs someone to pick up Aymeric's suggestion. Would that be you? 😀
(Since you favour the option 2 there, I'd try that — it's probably OK with a suitable release note, as Aymeric said.) 
C. 

Reply all
Reply to author
Forward
0 new messages