[Django] #18557: get_or_create() causes a race condition with MySQL

1,564 views
Skip to first unread message

Cal Leeming [Simplicity Media Ltd]

unread,
Jul 16, 2012, 7:00:49 AM7/16/12
to django-d...@googlegroups.com
Hi guys,

Okay so, this has been marked as wontfix in its current approach.

The problem exists purely because of the way MySQL transactions and indexes work - if you create a row that matches a unique index, it won't show up as a row until you commit (which is correct), but if you try and insert another row that matches this same unique index it will say the key already exists (this is to prevent unique constraint race conditions within transactions).

This really isn't a bug in the code, but more that developers need to understand how the database works, and take the appropriate actions (in this case, it's that you need to commit() the transaction to guarantee safe usage of get_or_create()).

As aaugustin brought up, we can't automatically commit as this would break transaction control, and there is little point in adding a 'commit=True' argument onto get_or_create() because you could just do this yourself.

Therefore, I'm proposing a documentation update that highlights and explains this in a clear and simple way - rather than any sort of code change. Would this be an appropriate way forward?

Cal

---------- Forwarded message ----------
From: Django <nor...@djangoproject.com>
Date: Thu, Jul 12, 2012 at 10:25 PM
Subject: Re: [Django] #18557: get_or_create() causes a race condition with MySQL
To:
Cc: django-...@googlegroups.com


#18557: get_or_create() causes a race condition with MySQL
-------------------------------------+-------------------------------------
     Reporter:  foxwhisper           |                    Owner:  nobody
         Type:  Uncategorized        |                   Status:  closed
    Component:  Database layer       |                  Version:  1.4
  (models, ORM)                      |               Resolution:  wontfix
     Severity:  Normal               |             Triage Stage:
     Keywords:                       |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by aaugustin):

 * status:  reopened => closed
 * resolution:   => wontfix


Comment:

 Replying to [ticket:18557 foxwhisper]:
 > Is there any chance this patch would make it into the core?
 [[BR]]

 To be honest, as is, there is no chance for this code to make it into
 Django:
 - it isn't a patch, just a chunk of code,
 - it's specific to MySQL but it appears that it should go into
 django.db.models,
 - there's no explanation of why you're using this technique and why it
 works, and it isn't strikingly obvious either,
 - it messes transaction control,
 - no tests, no docs,
 - the code itself very far from Django's coding standards (`print`, `raise
 Exception`, etc.)

 Regarding transaction control, I see you've put the responsibility on the
 developer. It's hand-vawing, and I don't think we can put the issue aside
 like this. Transaction control is one of the more complicated parts of
 Django and I don't expect more than 0.1% of the framework's users to
 understand it. (I don't understand it completely myself.)

 I acknowledge that the problem exists, but this specific piece of code
 doesn't look like an appropriate solution, at least not in its current
 state.

 The way to move forward on a ticket that received a wontfix by a core
 developer is to bring up the issue on django-developers. Thanks!

--
Ticket URL: <https://code.djangoproject.com/ticket/18557#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Anssi Kääriäinen

unread,
Jul 16, 2012, 2:28:46 PM7/16/12
to Django developers
On 16 heinä, 14:00, "Cal Leeming [Simplicity Media Ltd]"
<cal.leem...@simplicitymedialtd.co.uk> wrote:
> Hi guys,
>
> Okay so, this has been marked as wontfix in its current approach.
>
> The problem exists purely because of the way MySQL transactions and indexes
> work - if you create a row that matches a unique index, it won't show up as
> a row until you commit (which is correct), but if you try and insert
> another row that matches this same unique index it will say the key already
> exists (this is to prevent unique constraint race conditions within
> transactions).
>
> This really isn't a bug in the code, but more that developers need to
> understand how the database works, and take the appropriate actions (in
> this case, it's that you need to commit() the transaction to guarantee safe
> usage of get_or_create()).
>
> As aaugustin brought up, we can't automatically commit as this would break
> transaction control, and there is little point in adding a 'commit=True'
> argument onto get_or_create() because you could just do this yourself.
>
> Therefore, I'm proposing a documentation update that highlights and
> explains this in a clear and simple way - rather than any sort of code
> change. Would this be an appropriate way forward?

I have tried different ways of solving this issue. To me it seems that
unless one does a commit in get_or_create or changes the default
isolation level used by MySQL then there is no cure for this issue.
And this issue seems to pop out regularly. I don't the above mentioned
approaches as good solutions.

I would like to see the possibility to use different isolation levels
if the user chooses so. This would be a good solution in my opinion:
use READ COMMITTED isolation level if you are having problems with
get_or_create. Allowing use of different isolation levels has other
advantages, too. The use of custom isolation level should come with a
big warning of "you should know what you are doing, or prepare for
trouble".

So, lets document the current behavior for now, and check if we can do
something to the isolation level issue later on.

- Anssi

Cal Leeming [Simplicity Media Ltd]

unread,
Jul 16, 2012, 4:18:09 PM7/16/12
to django-d...@googlegroups.com
Okay - anyone else want to throw their thoughts at this?

Also - messing with the isolation levels on MySQL is really not a great idea.. it can cause all sorts of unexpected behavior.

Cal


 - Anssi

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-d...@googlegroups.com.
To unsubscribe from this group, send email to django-develop...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.


Ian Kelly

unread,
Jul 16, 2012, 4:43:31 PM7/16/12
to django-d...@googlegroups.com
On Mon, Jul 16, 2012 at 2:18 PM, Cal Leeming [Simplicity Media Ltd]
<cal.l...@simplicitymedialtd.co.uk> wrote:
> Okay - anyone else want to throw their thoughts at this?
>
> Also - messing with the isolation levels on MySQL is really not a great
> idea.. it can cause all sorts of unexpected behavior.

Just a thought -- I don't have MySQL handy to test on, but what if the
savepoint occurred before the first get instead of before the create?
Would rolling back the savepoint then release the row lock and allow
the second get to read the row?

If this works but performance is an issue, then perhaps we could
dynamically order the operations according to the known isolation
level?

Cheers,
Ian

Anssi Kääriäinen

unread,
Jul 16, 2012, 5:00:56 PM7/16/12
to Django developers
On 16 heinä, 23:18, "Cal Leeming [Simplicity Media Ltd]"
<cal.leem...@simplicitymedialtd.co.uk> wrote:
> Okay - anyone else want to throw their thoughts at this?
>
> Also - messing with the isolation levels on MySQL is really not a great
> idea.. it can cause all sorts of unexpected behavior.

The idea is that one could use different isolation levels when wanted,
something like @commit_on_success(isolation='READ COMMITTED'). This
should come with two warnings:
- You should know what you are doing.
- Django is designed to work in the default isolation level - all
guarantees are off when using different isolation levels.

I must say that the real reason why I would like the above implemented
is the "true serializable" isolation level in PostgreSQL 9.1+. It
would be very useful in many of my projects...

But, this is a much bigger change, and one that I don't expect to
happen soon (if ever). Definitely not a quick way to solve this issue.

- Anssi

Anssi Kääriäinen

unread,
Jul 16, 2012, 5:06:55 PM7/16/12
to Django developers
On 16 heinä, 23:43, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> On Mon, Jul 16, 2012 at 2:18 PM, Cal Leeming [Simplicity Media Ltd]
>
> <cal.leem...@simplicitymedialtd.co.uk> wrote:
> > Okay - anyone else want to throw their thoughts at this?
>
> > Also - messing with the isolation levels on MySQL is really not a great
> > idea.. it can cause all sorts of unexpected behavior.
>
> Just a thought -- I don't have MySQL handy to test on, but what if the
> savepoint occurred before the first get instead of before the create?
> Would rolling back the savepoint then release the row lock and allow
> the second get to read the row?
>
> If this works but performance is an issue, then perhaps we could
> dynamically order the operations according to the known isolation
> level?

Unfortunately this doesn't work. There is another ticket where this
was discussed. At first it seems to work, but it turns out that it
works only if the get_or_create is called as the first thing in the
query. See https://code.djangoproject.com/ticket/13906#comment:29 and
https://code.djangoproject.com/ticket/13906#comment:30 for details.
Basically MySQL in repeatable read isolation works weird in this case.

- Anssi

Cal Leeming [Simplicity Media Ltd]

unread,
Aug 8, 2012, 8:13:57 AM8/8/12
to django-d...@googlegroups.com
Does anyone else have any input at this stage??

It seems to me that the most appropriate way forward is a documentation update which explains that get_or_create() is not atomically safe across all databases, and may need a commit() before hand, or a read isolation change (with warnings about both).

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-d...@googlegroups.com.
To unsubscribe from this group, send email to django-develop...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.




--

Cal Leeming
Technical Support | Simplicity Media Ltd
US 310-362-7070UK 02476 100401 Direct 02476 100402

Available 24 hours a day, 7 days a week.


Karen Tracey

unread,
Aug 8, 2012, 8:29:33 AM8/8/12
to django-d...@googlegroups.com
On Wed, Aug 8, 2012 at 8:13 AM, Cal Leeming [Simplicity Media Ltd] <cal.l...@simplicitymedialtd.co.uk> wrote:
Does anyone else have any input at this stage??

It seems to me that the most appropriate way forward is a documentation update which explains that get_or_create() is not atomically safe across all databases, and may need a commit() before hand, or a read isolation change (with warnings about both).


The docs should note that READ COMMITTED isolation level, or database-level autocommit of every query, is required for get_or_create() to work properly under MySQL. Recommending an alternative of a commit beforehand strikes me as wrong.

Karen

Cal Leeming [Simplicity Media Ltd]

unread,
Aug 8, 2012, 9:25:18 AM8/8/12
to django-d...@googlegroups.com
I'm not entirely sure that suggesting every query needs to be committed is the right way forward either, given that you only need to commit once before get_or_create() is called to prevent the issue.

Could you expand on why you feel every query would need to be committed?

Cal

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-d...@googlegroups.com.
To unsubscribe from this group, send email to django-develop...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.

Karen Tracey

unread,
Aug 8, 2012, 10:55:03 PM8/8/12
to django-d...@googlegroups.com
On Wed, Aug 8, 2012 at 9:25 AM, Cal Leeming [Simplicity Media Ltd] <cal.l...@simplicitymedialtd.co.uk> wrote:
I'm not entirely sure that suggesting every query needs to be committed is the right way forward either, given that you only need to commit once before get_or_create() is called to prevent the issue.

Could you expand on why you feel every query would need to be committed?

get_or_create does:

1 - get
2 - if get fails due to DoesNotExist, create
3 - if create fails due to IntegrityError, get again

The problem with repeatable read transaction level is that step 3 can never return something other than what was found (or not) in step 1. If some other thread, between the times at which 1 and 2 happen, creates what this thread is attempting to get_or_create, then the get_or_create is going to fail, because the get will still find nothing in step 3, causing get_or_create to re-raise the IntegrityError from step 2.

Issuing a commit before calling get_or_create doesn't fix this race condition that exists within the get_or_create code itself. Such a pre-call commit is only going to be helpful if the current transaction has already done a read that has fixed what is going to be returned by the DB in step 1. If that has happened, then committing before calling get_or_create will reduce the likelihood of hitting the race condition, since step 1 may then read a row that has been created subsequent to the last read by this thread but before this thread gets to step 1 in get_or_create. Essentially the pre-commit narrows the race condition window in this case. (If no read has yet been done by the transaction to already set in stone what this transaction will read in step 1, then committing before calling get_or_create does not help at all.)

Regardless of what's done before the call into get_or_create, there's still a small window of opportunity, between steps 1 and 2, for get_or_create to fail. get_or_create requires either DB level autocommit or read committed transaction isolation level in order for it to be free of this race condition. Recommending anything else in Django's docs would be misleading.

Karen

Ian Kelly

unread,
Aug 8, 2012, 11:09:33 PM8/8/12
to django-d...@googlegroups.com


On Aug 8, 2012 10:25 AM, "Cal Leeming [Simplicity Media Ltd]" <cal.l...@simplicitymedialtd.co.uk> wrote:
>
> I'm not entirely sure that suggesting every query needs to be committed is the right way forward either, given that you only need to commit once before get_or_create() is called to prevent the issue.

No, that's not sufficient. The crux of the problem is that the two get attempts within get_or_create are within the same transaction, and so to satisfy the requirements of REPEATABLE READ they *must* return the same results. So it's not enough just to start with a clean transaction; in order to prevent this you would need to commit or rollback between the two gets, *within* the get_or_create call. MySQL actually does handle this much correctly AFAICT. The weirdness is just that it defaults to RR instead of RC.

Resorting to database-level autocommit would be overkill (and might create some other issues), but it does solve this particular problem by adding the necessary commit in the one place it's needed.

Cal Leeming [Simplicity Media Ltd]

unread,
Aug 9, 2012, 5:56:16 PM8/9/12
to django-d...@googlegroups.com
Thanks for the detailed explanation on this.

Am I correct in assuming you are referring to the following decorator as being the fix for the problem?

@transaction.autocommit
def viewfunc(request):
    pass




Karen

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-d...@googlegroups.com.
To unsubscribe from this group, send email to django-develop...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.

Cal Leeming [Simplicity Media Ltd]

unread,
Aug 9, 2012, 5:58:11 PM8/9/12
to django-d...@googlegroups.com
Sorry, please ignore that last message, I see now that you were referring to this:


So essentially, the official documentation would state that to resolve this problem, you would use the following for your db settings:

'OPTIONS': {
    'autocommit': True,
}

Is that correct?

Cal

Karen Tracey

unread,
Aug 11, 2012, 6:29:22 PM8/11/12
to django-d...@googlegroups.com
On Thu, Aug 9, 2012 at 5:58 PM, Cal Leeming [Simplicity Media Ltd] <cal.l...@simplicitymedialtd.co.uk> wrote:
Sorry, please ignore that last message, I see now that you were referring to this:


So essentially, the official documentation would state that to resolve this problem, you would use the following for your db settings:

'OPTIONS': {
    'autocommit': True,
}

Is that correct?

No...that syntax is pulled out of a PostgreSQL doc note and I don't think it would work with MySQL, though I am not entirely sure of that.

Also I am not sure I would recommend a global DB level setting for this -- you're dispensing with any transactions at that point, which may well be inappropriate for an app that is having trouble with get_or_create. It's very hard for Django to give explicit instructions for what is best to do "in general" since it all depends on the needs of the application with respect to transactions. I would say in general I'd recommend "read committed" isolation level vs. database-level autocommit, but the ticket noted that read committed "can break legacy apps" (why, I'm not sure, and it doesn't explain), so for the sake of completeness I mentioned that database level autocommit would also fix the race condition that exists in get_or_create.

I don't believe the doc can give a blanket "do this and your code will work" statement here. It can say Django's own code in get_or_create requires either that the transaction isolation level be set to "read committed" or DB level autocommit be used. Whether that is best done for the project globally via an init_command or only for certain requests via explicit cursor commands (see http://groups.google.com/group/django-users/msg/55fa3724d2754013) depends on the project itself and what else it is doing besides calling get_or_create.

Karen

Cal Leeming [Simplicity Media Ltd]

unread,
Aug 12, 2012, 11:41:15 AM8/12/12
to django-d...@googlegroups.com
Based on all the responses given so far, here are the options available.

Further feedback would be much appreciated.

A) Use READ COMMITTED as a global/my.cnf:

Last time we tried read committed isolation levels, it caused various PHP applications to break for an unknown reason - as it was in a production environment we had to instantly revert to save downtime, and after it was reverted the problem went away. Sadly no time time was put into finding the exact cause of why this broke.

This also broke PHP applications to which we did not have any source code access, and caused some deadlocking problems - again, due to lack of source code we were unable to determine the root cause of the problem.

In general, it seems that READ COMMITTED may break apps that execute database queries in a certain way.


B) Use "SET TRANSACTION ISOLATION LEVEL READ COMMITTED" before every transaction (or apply to the session somehow).

It is not clear how this would integrate nicely into the ORM. Is there a cleaner way of ensuring a transaction isolation level is set to read commited, other than having to call the following before every transaction?

>>> connection.cursor().execute('SET TRANSACTION ISOLATION LEVEL READ COMMITTED')

You could apply the above to the session but, as above, I'm not sure how you'd ensure every db session had this query executed, other than doing it manually (which again, seems ugly).

>>> connection.cursor().execute('SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED')

Should there perhaps be an additional ticket that raises the need to have a decorator that does this for us, or a settings.py attribute of some sort?


C) Execute a commit before each get_or_create() call

It is worth noting that committing the transaction prior to calling get_or_create() has given our apps a 100% success rate in preventing this race condition, where as previously the app wouldn't even last 60 seconds without throwing an IntegrityError exception.

So although this approach is not fool proof (as you detailed in your earlier threads), it has been close enough to prevent the problem from happening in our use case.


D) Use database level auto commit

Karen - could you clarify further on this, as I might have misunderstood.

Looking at the docs, MySQL states that autocommit for transactions are enabled by default

I couldn't find any other mentioning of 'autocommit' in the MySQL docs - so I'm not sure this would have any impact?


Cal Leeming [Simplicity Media Ltd]

unread,
Aug 12, 2012, 11:46:08 AM8/12/12
to django-d...@googlegroups.com
On Sun, Aug 12, 2012 at 4:41 PM, Cal Leeming [Simplicity Media Ltd] <cal.l...@simplicitymedialtd.co.uk> wrote:
Based on all the responses given so far, here are the options available.

Further feedback would be much appreciated.

A) Use READ COMMITTED as a global/my.cnf:

Last time we tried read committed isolation levels, it caused various PHP applications to break for an unknown reason - as it was in a production environment we had to instantly revert to save downtime, and after it was reverted the problem went away. Sadly no time time was put into finding the exact cause of why this broke.

This also broke PHP applications to which we did not have any source code access, and caused some deadlocking problems - again, due to lack of source code we were unable to determine the root cause of the problem.

In general, it seems that READ COMMITTED may break apps that execute database queries in a certain way.


B) Use "SET TRANSACTION ISOLATION LEVEL READ COMMITTED" before every transaction (or apply to the session somehow).

It is not clear how this would integrate nicely into the ORM. Is there a cleaner way of ensuring a transaction isolation level is set to read commited, other than having to call the following before every transaction?

>>> connection.cursor().execute('SET TRANSACTION ISOLATION LEVEL READ COMMITTED')

You could apply the above to the session but, as above, I'm not sure how you'd ensure every db session had this query executed, other than doing it manually (which again, seems ugly).

>>> connection.cursor().execute('SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED')

Should there perhaps be an additional ticket that raises the need to have a decorator that does this for us, or a settings.py attribute of some sort?


> You certainly can do this already with the OPTIONS key on the Database dictionary, see  http://docs.djangoproject.com/en/dev/ref/settings/#options
>  To set ISOLATION LEVEL use something like {"init_command": "SET SESSION TRANSACTION ISOLATION LEVEL $DESIRED_ISOLATION_LEVEL"}

Anssi Kääriäinen

unread,
Aug 13, 2012, 2:19:13 PM8/13/12
to Django developers
On 12 elo, 18:41, "Cal Leeming [Simplicity Media Ltd]"
<cal.leem...@simplicitymedialtd.co.uk> wrote:
> *C) Execute a commit before each get_or_create() call*
>
> It is worth noting that committing the transaction prior to calling
> get_or_create() has given our apps a 100% success rate in preventing this
> race condition, where as previously the app wouldn't even last 60 seconds
> without throwing an IntegrityError exception.
>
> So although this approach is not fool proof (as you detailed in your
> earlier threads), it has been close enough to prevent the problem from
> happening in our use case.

I have been long wondering how come users hit this race condition so
often. But now it is clear: the problem isn't usually that the second
get can't see the DB state, it is the first get that doesn't see the
already existing instance. As such, committing before each
get_or_create() should fix the issue for most users, just running
short transactions would usually be enough.

But, I really do think the correct fix for this is to allow using
different isolation levels per transaction.

As for the get_or_create(): I don't think we can do much else than to
document this. I am not sure suggesting use of non-default isolation
level globally is a good idea, but just suggesting using short
transactions and repeat the transaction on integrity error would be a
good start. Later on we could then suggest use of READ COMMITTED for
the transactions containing get_or_create().

Option B) seems scary to me. I am not sure but wouldn't issuing such
commands confuse the backend library, that is the lib thinks the
isolation level is REPEATABLE READ, but it was changed to READ
COMMITTED behind its back. If so, surprising bugs might arise.
Autocommit (option D) will not help if the get_or_create() is ran in
transaction.

- Anssi

lucky

unread,
Sep 4, 2012, 10:53:32 AM9/4/12
to django-d...@googlegroups.com
E) Use locking with SELECT query
According to http://dev.mysql.com/doc/refman/5.5/en/innodb-consistent-read.html

 If you want to see the “freshest” state of the database, use either the READ COMMITTED isolation level or a locking read:
 SELECT * FROM t LOCK IN SHARE MODE;

It locks the SELECT query until all concurrent transactions will commit requested rows. They are conventional conflict resolution mechanisms for databases.

Cal Leeming [Simplicity Media Ltd]

unread,
Sep 4, 2012, 11:36:42 AM9/4/12
to django-d...@googlegroups.com
Curious - I didn't know about that.

At some point I will have to build a unit test for all these approaches, and see which comes out best - prob won't have time to sort this out for a few weeks though.

Cal

To view this discussion on the web visit https://groups.google.com/d/msg/django-developers/-/eyBFQG9jtmIJ.

Anssi Kääriäinen

unread,
Sep 4, 2012, 7:32:34 PM9/4/12
to Django developers


On 4 syys, 17:53, lucky <e.genera...@gmail.com> wrote:
> *E) Use locking with SELECT query*
> According tohttp://dev.mysql.com/doc/refman/5.5/en/innodb-consistent-read.html
>
>  If you want to see the “freshest” state of the database, use either the
> READ COMMITTED isolation level or a locking read:
>  SELECT * FROM t LOCK IN SHARE MODE;

If I recall correctly the problem with this approach is that while you
can read the object from the DB, you can't save it back. UPDATE and
SELECT in .save() doesn't see the row, INSERT results in integrity
error.

I might remember this wrong, so this is something to test when trying
this approach.

- Anssi

lucky

unread,
Sep 5, 2012, 3:09:36 AM9/5/12
to django-d...@googlegroups.com


среда, 5 сентября 2012 г., 5:32:44 UTC+6 пользователь Anssi Kääriäinen написал:

The snapshot of the database state applies to SELECT statements within a transaction, not necessarily to DML statements. If you insert or modify some rows and then commit that transaction, a DELETE or UPDATE statement issued from another concurrent REPEATABLE READ transaction could affect those just-committed rows, even though the session could not query them. If a transaction does update or delete rows committed by a different transaction, those changes do become visible to the current transaction.

Therefore REPEATABLE READ is all about the SELECT results. To spy on the results committed by another transaction you could use "SELECT .. LOCK IN SHARE MODE". All DML statements (INSERT, DELETE, UPDATE, REPLACE, and etc.) will be affect to committed rows in any way.
If DML statement from the current transaction changes the row, committed by another transaction, then it becomes available among ordinary SELECT results.
(but it seems that there is no way to hack the consistent reading with "dummy" update at INSERT INTO t (k, v) VALUES (1,2) ON DUPLICATE KEY UPDATE v=v; because, technically it has no affect any rows).

               Session A                            Session B
           CREATE TABLE t (k INT, v INT, PRIMARY KEY (k));

SET autocommit=0;
SET autocommit=0; time | SELECT * FROM t; | empty set | INSERT INTO t VALUES (1, 2); | v SELECT * FROM t; empty set

SELECT * FROM t LOCK IN SHARE MODE; (it blocks here) COMMIT; --------------------- | 1 | 2 | --------------------- 1 row in set

           SELECT * FROM t;
           empty set
 
           UPDATE t SET v=3 WHERE k=1;
           1 row affected

SELECT * FROM t;
           ---------------------
           |    1    |    3    |
           ---------------------
           1 row in set
 - Anssi

Anssi Kääriäinen

unread,
Sep 5, 2012, 3:34:50 AM9/5/12
to Django developers
On 5 syys, 10:09, lucky <e.genera...@gmail.com> wrote:
>                Session A                            Session B           CREATE TABLE t (k INT, v INT, PRIMARY KEY (k));
>
>            SET autocommit=0;                    SET autocommit=0;
> time
> |          SELECT * FROM t;
> |          empty set
> |                                               INSERT INTO t VALUES (1, 2);
> |
> v          SELECT * FROM t;
>            empty set
>
>            SELECT * FROM t LOCK IN SHARE MODE;
>            (it blocks here)                                                COMMIT;
>            ---------------------
>            |    1    |    2    |
>            ---------------------
>            1 row in set
>
>            SELECT * FROM t;
>            empty set

This is going to be a problem. The .save() logic is implemented as
SELECT, if found UPDATE, else INSERT. There also seems to be a
possibility of a deadlock. These comments contain more information
about this problem: https://code.djangoproject.com/ticket/13906#comment:22,
https://code.djangoproject.com/ticket/13906#comment:25.

The proposal in that ticket is to use FOR UPDATE select, but it seems
FOR SHARE could be better. Maybe the deadlock problem is solved by
this, too.

There are two ways to solve the .save() issue: either use SELECT FOR
SHARE in .save() when trying to save an object fetched with FOR SHARE,
or change the logic to UPDATE, if nothing updated, INSERT. The latter
choice is implemented in https://code.djangoproject.com/ticket/16649

Unfortunately we have documented explicitly that .save() does SELECT,
if found UPDATE, else INSERT.

>
>            UPDATE t SET v=3 WHERE k=1;
>            1 row affected
>
>            SELECT * FROM t;
>
>            ---------------------
>            |    1    |    3    |
>            ---------------------
>            1 row in set

Apart of the above .save() problem this looks promising.

- Anssi
Reply all
Reply to author
Forward
0 new messages