get_or_create gave IntegrityError: (1062, "Duplicate entry) error sometimes

1,988 views
Skip to first unread message

Zheng Li

unread,
Oct 12, 2012, 9:47:15 PM10/12/12
to django...@googlegroups.com
class AB(models.Model):
    a = models.ForeignKey(A)
    b = models.ForeignKey(B)
    c = models.IntegerField(default=0)
    d = models.FloatField(default=0)
    e = models.IntegerField(default=0)
    f = models.FloatField(default=0)
    class Meta:
        unique_together = (('a', 'b'),)

I have a class like above.
when I call get_or_create, sometimes IntegrityError: (1062, "Duplicate entry) error happens.
a, _ = AB.objects.get_or_create(a=a, b=b)

I really have googled and worked on it for a while, but still nothing.
anyone can help?
Thanks in advance.

Wei Wei

unread,
Oct 12, 2012, 10:06:49 PM10/12/12
to django...@googlegroups.com
I am new to Django. But does the error message literally mean you have duplicated records against your unique_together constriction?

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django...@googlegroups.com.
To unsubscribe from this group, send email to django-users...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.

Zheng Li

unread,
Oct 15, 2012, 8:50:30 AM10/15/12
to django...@googlegroups.com
get_or_create shouldn't give duplicate entry error
anyone help.

Reinout van Rees

unread,
Oct 15, 2012, 8:59:29 AM10/15/12
to django...@googlegroups.com
On 15-10-12 14:50, Zheng Li wrote:
> get_or_create shouldn't give duplicate entry error

Perhaps you have existing data that violates your unique-together
constraint? Data that existed before that constraint was in place?

Check it directly in your database by searching for that a=a, b=b entry.


Reinout

--
Reinout van Rees http://reinout.vanrees.org/
rei...@vanrees.org http://www.nelen-schuurmans.nl/
"If you're not sure what to do, make something. -- Paul Graham"

Tom Evans

unread,
Oct 15, 2012, 9:43:06 AM10/15/12
to django...@googlegroups.com
On Mon, Oct 15, 2012 at 1:50 PM, Zheng Li <dlli...@gmail.com> wrote:
> get_or_create shouldn't give duplicate entry error
> anyone help.
>

This is incorrect, and a common misconception: get_or_create() is not
atomic. If this happens, you will get duplicate key errors:

Thread A calls get_or_create(a='a', b='b')
Thread B calls get_or_create(a='a', b='b')
Thread A get_or_create looks for entry in DB, not found
Thread B get_or_create looks for entry in DB, not found
Thread B creates entry in DB
Thread A get_or_create tries to create entry in DB

Where I say 'thread', you could also read 'process'. If you are
capable of serving multiple requests simultaneously, this can happen,
even if you aren't using threads.

If this is a problem for you, use a transaction aware DB instead of
MySQL. It won't solve the issue, but at least you will be able to
unroll the transaction.

Cheers

Tom

Karen Tracey

unread,
Oct 15, 2012, 10:28:41 AM10/15/12
to django...@googlegroups.com
On Mon, Oct 15, 2012 at 9:43 AM, Tom Evans <teva...@googlemail.com> wrote:
On Mon, Oct 15, 2012 at 1:50 PM, Zheng Li <dlli...@gmail.com> wrote:
> get_or_create shouldn't give duplicate entry error
> anyone help.
>

This is incorrect, and a common misconception: get_or_create() is not
atomic. If this happens, you will get duplicate key errors:

No, get_or_create IS intended to be atomic.
 

Thread A calls get_or_create(a='a', b='b')
Thread B calls get_or_create(a='a', b='b')
Thread A get_or_create looks for entry in DB, not found
Thread B get_or_create looks for entry in DB, not found
Thread B creates entry in DB
Thread A get_or_create tries to create entry in DB

Internally, get_or_create code in thread A will now attempt to GET the object created by thread B. It will not automatically reflect the error to the caller. See:

https://github.com/django/django/blob/stable/1.4.x/django/db/models/query.py#L430
 

Where I say 'thread', you could also read 'process'. If you are
capable of serving multiple requests simultaneously, this can happen,
even if you aren't using threads.

If this is a problem for you, use a transaction aware DB instead of
MySQL. It won't solve the issue, but at least you will be able to
unroll the transaction.

The issue with MySQL here is when you ARE using its transactional DB engine, InnoDB, AND using its (default) "repeatable read" transaction isolation level. In that case, the DB will refuse to return the object created by thread B to thread A if thread A has already "read" the non-existence of such a row. See:

https://code.djangoproject.com/ticket/13906

This particular issue can be fixed by changing the transaction isolation level to "read committed". Although I have heard various people say this can cause "other problems" with MySQL apps, I've never seen a concrete example of one, so switching to "read committed" would be my preferred way to use get_or_created if I had to use MySQL/InnoDB.

Karen
--
http://tracey.org/kmt/

Tom Evans

unread,
Oct 15, 2012, 11:35:46 AM10/15/12
to django...@googlegroups.com
On Mon, Oct 15, 2012 at 3:28 PM, Karen Tracey <kmtr...@gmail.com> wrote:
> On Mon, Oct 15, 2012 at 9:43 AM, Tom Evans <teva...@googlemail.com> wrote:
>>
>> On Mon, Oct 15, 2012 at 1:50 PM, Zheng Li <dlli...@gmail.com> wrote:
>> > get_or_create shouldn't give duplicate entry error
>> > anyone help.
>> >
>>
>> This is incorrect, and a common misconception: get_or_create() is not
>> atomic. If this happens, you will get duplicate key errors:
>
>
> No, get_or_create IS intended to be atomic.
>

I can be intended to be whatever it wants, but since there is no
atomic database operation to get or create a tuple, there is no
guarantee of atomicity.

>>
>>
>> Thread A calls get_or_create(a='a', b='b')
>> Thread B calls get_or_create(a='a', b='b')
>> Thread A get_or_create looks for entry in DB, not found
>> Thread B get_or_create looks for entry in DB, not found
>> Thread B creates entry in DB
>> Thread A get_or_create tries to create entry in DB
>
>
> Internally, get_or_create code in thread A will now attempt to GET the
> object created by thread B. It will not automatically reflect the error to
> the caller. See:
>
> https://github.com/django/django/blob/stable/1.4.x/django/db/models/query.py#L430

I regularly - once every few months - see get_or_create raise
IntegrityError due to duplicate key on MySQL 5.0/MyISAM/Django 1.3. So
whilst that is the intention of the code, it isn't what happens. I'm
happy to blame MySQL.

Cheers

Tom

Zheng Li

unread,
Oct 16, 2012, 12:32:29 AM10/16/12
to django...@googlegroups.com
Isn't it because MyISAM does not support transaction?

BTW
I am using django 1.3.1, mysql 5.5, innodb
mysql> SELECT @@GLOBAL.tx_isolation, @@tx_isolation;
+-----------------------+-----------------+
| @@GLOBAL.tx_isolation | @@tx_isolation |
+-----------------------+-----------------+
| REPEATABLE-READ | REPEATABLE-READ |
+-----------------------+-----------------+

another question,
does "read committed" affects performance?
Reply all
Reply to author
Forward
0 new messages