unique_together does not work as expected with nullable fields


Rich Rauenzahn

2016/04/28 17:59:22
To: Django developers (Contributions to Django itself)

I just got bitten by this today, finding a duplicate row where I didn't expect one.  I haven't been able to find an existing Django bug.

It's a popular topic on Stack Overflow.


This is apparently expected (and standardized) behavior in SQL: ('A', 'B', NULL) is considered distinct from another ('A', 'B', NULL), because NULL is never equal to another NULL.
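
The behavior described above can be reproduced in a few lines. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in database (an assumption; the thread does not name a backend) and reuses the `my_table`, `id_A`, `id_B`, `id_C` names from the workaround quoted in this post. The `try_insert` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "id_A TEXT NOT NULL, id_B TEXT NOT NULL, id_C TEXT, "
    "UNIQUE (id_A, id_B, id_C))"
)


def try_insert(row):
    """Return True if the row was stored, False if the UNIQUE constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO my_table VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(("A", "B", "C")),   # first non-NULL row: stored
    try_insert(("A", "B", "C")),   # exact duplicate: rejected
    try_insert(("A", "B", None)),  # first row with NULL: stored
    try_insert(("A", "B", None)),  # same values with NULL again: also stored
]
row_count = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
```

Here `results` ends up as `[True, False, True, True]`: only the fully non-NULL duplicate is rejected.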

There is a workaround at the SQL level of ... 

CREATE UNIQUE INDEX ab_c_null_idx ON my_table (id_A, id_B) WHERE id_C IS NULL;

I'm wondering if this ought to at least be addressed in a runtime warning, or at least documentation in unique_together -- and I'm hoping that perhaps a Django level workaround could be devised to explicitly ask for unique indexes accommodating null values.

For myself, I'm writing a unittest to fail if any of my unique_together's have a nullable field and using a specific value as my "null" value for now.
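
A check of that kind might look roughly like the sketch below. The function name is hypothetical; it only relies on `model._meta.unique_together` (a tuple of field-name tuples) and `model._meta.get_field(name).null`, and it is not the test actually written for this project.

```python
def nullable_unique_together_fields(model):
    """Return (constraint, field_name) pairs for nullable fields in unique_together.

    An empty list means no unique_together constraint on the model
    includes a field declared with null=True.
    """
    offenders = []
    for constraint in model._meta.unique_together:
        for name in constraint:
            # Field.null is True when the column accepts NULL.
            if model._meta.get_field(name).null:
                offenders.append((tuple(constraint), name))
    return offenders
```

In a test case, one could loop over `django.apps.apps.get_models()` and assert that the returned list is empty for every model.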

Thoughts?  Has this come up before?


Florian Apolloner

2016/04/29 3:51:31
To: Django developers (Contributions to Django itself)
Hi,


On Thursday, April 28, 2016 at 11:59:22 PM UTC+2, Rich Rauenzahn wrote:
This is apparently expected (and standardized) behavior in SQL: ('A', 'B', NULL) is considered distinct from another ('A', 'B', NULL), because NULL is never equal to another NULL.

Yes, though the standard goes even further: every comparison with NULL evaluates to unknown (NULL) rather than true, so in that sense NULL is never equal to NULL but also never unequal to NULL.
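
A quick way to see this, sketched with Python's built-in sqlite3 module (an assumed stand-in for whatever database is in use):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both comparisons come back as NULL (None in Python);
# only IS NULL gives a definite answer.
eq, ne, is_null = conn.execute(
    "SELECT NULL = NULL, NULL <> NULL, NULL IS NULL"
).fetchone()

# A WHERE clause keeps only rows whose condition is true,
# so a NULL result filters the row out.
matches = conn.execute(
    "SELECT COUNT(*) FROM (SELECT 1) WHERE NULL = NULL"
).fetchone()[0]
```

`eq` and `ne` are both `None`, `is_null` is `1`, and `matches` is `0`.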
 
There is a workaround at the SQL level of ... 

CREATE UNIQUE INDEX ab_c_null_idx ON my_table (id_A, id_B) WHERE id_C IS NULL;

This only creates an index on two columns though, you might still be interested in indexing the cases where A, B, C are not null…
 
I'm wondering if this ought to at least be addressed in a runtime warning,

Runtime warnings are the worst imo; if anything, this should be a system check.
 
or at least documentation in unique_together -- and I'm hoping that perhaps a Django level workaround could be devised to explicitly ask for unique indexes accommodating null values.

I am not against a note in the docs, but I find the fact that nulls are not "unique" and can exist in an index more than once very useful (fwiw ordering by a column with nulls can also be interesting across databases). I'd be interested to hear about your use case -- the "general" use case is usually that you have an optional column but want to ensure it is unique as soon as it is filled…

Cheers,
Florian

Aymeric Augustin

2016/04/29 4:03:30
To: django-d...@googlegroups.com
Hello,

In SQL, defining a unique index on a nullable column will only enforce uniqueness of non-null values. This behavior seems more useful than allowing exactly one null value.

Your example adds two more columns to the index. Other than that, it’s the exact same situation.

In my opinion, the conclusion here is “users of Django still need to know a little bit about SQL”. I don’t think this warrants making code changes. We could add something to the documentation.

-- 
Aymeric.


Anssi Kääriäinen

2016/04/29 5:16:45
To: django-d...@googlegroups.com
If you really, really want a unique index that allows just a single
null value, you might want to try a unique index on (a, b, c) where c is
not null, and another unique index on (a, b) where c is null. That might
give the results you are looking for, though I haven't tested this.
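
A rough way to try this two-index idea, sketched with Python's built-in sqlite3 module (an assumed stand-in; other databases may differ in partial-index support and syntax). The table and column names follow the earlier `my_table` example, `ab_c_null_idx` comes from the first post, and `abc_not_null_idx` and `try_insert` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id_A TEXT NOT NULL, id_B TEXT NOT NULL, id_C TEXT);
    -- Covers rows where id_C has a value: (id_A, id_B, id_C) must be unique.
    CREATE UNIQUE INDEX abc_not_null_idx ON my_table (id_A, id_B, id_C)
        WHERE id_C IS NOT NULL;
    -- Covers rows where id_C is NULL: at most one such row per (id_A, id_B).
    CREATE UNIQUE INDEX ab_c_null_idx ON my_table (id_A, id_B)
        WHERE id_C IS NULL;
""")


def try_insert(row):
    """Return True if the row was stored, False if a unique index rejected it."""
    try:
        with conn:
            conn.execute("INSERT INTO my_table VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(("A", "B", "C")),   # stored
    try_insert(("A", "B", "C")),   # rejected by abc_not_null_idx
    try_insert(("A", "B", "D")),   # stored: different id_C
    try_insert(("A", "B", None)),  # stored: first NULL row for (A, B)
    try_insert(("A", "B", None)),  # rejected by ab_c_null_idx
    try_insert(("A", "X", None)),  # stored: different (id_A, id_B)
]
```

With SQLite, `results` comes out as `[True, False, True, True, False, True]`, which is the "only one NULL per (a, b)" behavior asked for.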

SQL's nulls are weird, but in this case the feature is actually very
useful, especially for single column indexes. Unfortunately this
causes an abstraction leak, but we'll just have to live with that.

- Anssi

Rich Rauenzahn

2016/04/29 13:52:15
To: Django developers (Contributions to Django itself)


On Friday, April 29, 2016 at 12:51:31 AM UTC-7, Florian Apolloner wrote:

I am not against a note in the docs, but I find the fact that nulls are not "unique" and can exist in an index more than once very useful (fwiw ordering by a column with nulls can also be interesting across databases). I'd be interested to hear about your use case -- the "general" use case is usually that you have an optional column but want to ensure it is unique as soon as it is filled…


Let's see if I can explain my use case without having to explain my whole domain ... 

I have a Model with a boolean field, "BOO".  When "BOO" is False, another field "VAL" should have a meaningful value, otherwise NULL.  VAL is the only nullable field.  (And yes, the boolean is actually superfluous, "VAL" is sufficient for the logic.)

I want the PK of the Model to always be unique combined with BOO=False and VAL (and a couple of other non-nullables).  But I also don't want duplicate values of PK,BOO=True,VAL=null, which I am currently getting.  Put another way, only one row should have PK,BOO=True for each PK, but I can have many rows with PK,BOO=False.

I see now that I need to provide a sentinel value -- BOO=True,VAL=<sentinel>, or manually create additional unique indexes.

Since it is conceivable for Django to create the right indexes to handle the null case, it would be nice to somehow be able to explicitly ask for what I want expressed in Django.  (unique_together obviously can't change its current default behavior.)

Is that helpful?

Rich Rauenzahn

2016/04/29 13:56:54
To: Django developers (Contributions to Django itself)


On Friday, April 29, 2016 at 2:16:45 AM UTC-7, Anssi Kääriäinen wrote:
If you really, really want a unique index that allows just a single
null value, you might want to try a unique index on (a, b, c) where c is
not null, and another unique index on (a, b) where c is null. That might
give the results you are looking for, though I haven't tested this.

What I'm suggesting is a way to express that index within Django, similar to unique_together (and perhaps a warning in the docs, given how frequently it comes up on Stack Overflow).

I see now that since multi column indexes are an extension of a single column index, it makes sense -- you'd never want a single column index to only have one null value.

Aymeric Augustin

2016/04/29 14:00:37
To: django-d...@googlegroups.com
Hi Rich,

On 29 Apr 2016, at 19:52, Rich Rauenzahn <rrau...@gmail.com> wrote:

I see now that I need to provide a sentinel value -- BOO=True,VAL=<sentinel>, or manually create additional unique indexes.

Indeed, you should write a migration with a RunSQL operation that creates a unique index on boo where boo = true. Then you can have only one row with boo = True.
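
The effect of such an index can be sketched outside Django with Python's built-in sqlite3 module (an assumed stand-in database). The `item` table, the index name, and the `try_insert` helper are hypothetical; in a Django project the `CREATE UNIQUE INDEX` statement would be what goes inside the `migrations.RunSQL` operation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, boo BOOLEAN NOT NULL, val TEXT);
    -- Only rows with boo = 1 enter the index, so at most one such row can exist.
    CREATE UNIQUE INDEX only_one_boo_true_idx ON item (boo) WHERE boo = 1;
""")


def try_insert(boo, val):
    """Return True if the row was stored, False if the unique index rejected it."""
    try:
        with conn:
            conn.execute("INSERT INTO item (boo, val) VALUES (?, ?)", (boo, val))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(True, None),   # first boo = True row: stored
    try_insert(True, None),   # second boo = True row: rejected
    try_insert(False, "x"),   # boo = False rows are not indexed: stored
    try_insert(False, "x"),   # duplicates with boo = False: stored
]
```

`results` is `[True, False, True, True]`. To allow one boo = True row per group rather than one in the whole table, the grouping column would presumably be listed in the index instead of `boo`.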

-- 
Aymeric

Rich Rauenzahn

2016/05/02 12:10:25
To: Django developers (Contributions to Django itself)
It sounds like my request should probably be appended to this ticket, then:  https://code.djangoproject.com/ticket/11964 (Add the ability to use database-level CHECK CONSTRAINTS)
