Consistency issues with Cloud SQL


Matt Doran

Sep 25, 2013, 7:55:37 AM
to google-cloud...@googlegroups.com
Hi there,

We have an App Engine application that uses Cloud SQL for some of its storage needs.  We are seeing cases where a record written in a previous transaction is not available a short time later, and then reappears after that.

Here's the basic gist of what happens:
 * Task 1 runs, creating a record in the Cloud SQL instance.  We have added debugging that retrieves this row and confirms it is saved within this task.
 * Task 2 runs 30 seconds later and looks for the row created in task 1.  It does not exist.  We run the same debug logging code as in task 1 and it returns nothing.  Because the record cannot be found, our task execution fails and Google App Engine reschedules it for execution.
 * Task 2 re-runs a short period later, and this time the record is found and the task is processed correctly.

This situation occurs regularly but without an obvious pattern.  It probably occurs a few times in every 40-50 tasks of this kind that run over the course of a day.  It's only the retry behaviour of the task queues that allows this to eventually succeed.

Our Cloud SQL instance has the replication mode set to "Synchronous".

We are at a loss to explain it.  All the extra debug logging and testing we've added only confirms this strange behaviour... and hasn't provided any answers.

Does anyone have any ideas on what is happening here?  It's as if we're seeing some sort of eventual consistency on our Cloud SQL instance.

Regards,
Matt




Debangsu Sengupta

Sep 25, 2013, 2:08:09 PM
to google-cloud...@googlegroups.com
Hi Matt,

I'd check to see whether task 1's transaction is being committed prior to task 2 reading it. Please try it with autocommit mode enabled.
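
To illustrate why commit timing matters here, below is a minimal sketch of the visibility rule in question. It uses Python's built-in `sqlite3` module as a local stand-in, which is an assumption made only so the example runs anywhere; Cloud SQL is MySQL-based and is reached through a different driver. The `records` table and the function name are hypothetical. The point it shows is general: a row written inside a transaction that has not been committed is visible to the connection that wrote it, but not to a second connection.

```python
import os
import sqlite3
import tempfile


def visible_to_other_connection(autocommit: bool) -> bool:
    """Write a row WITHOUT calling commit(), then look for it on a second connection."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # Create the (hypothetical) table on its own connection.
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
    setup.commit()
    setup.close()

    if autocommit:
        # isolation_level=None puts the sqlite3 module into autocommit mode.
        writer = sqlite3.connect(path, isolation_level=None)
    else:
        # Default mode: an implicit transaction is opened before the INSERT.
        writer = sqlite3.connect(path)
    writer.execute("INSERT INTO records (id, payload) VALUES (1, 'hello')")

    # The writing connection sees its own row, committed or not.
    assert writer.execute("SELECT COUNT(*) FROM records").fetchone()[0] == 1

    # A separate connection only sees committed data.
    reader = sqlite3.connect(path)
    found = reader.execute(
        "SELECT COUNT(*) FROM records WHERE id = 1"
    ).fetchone()[0] == 1
    reader.close()

    writer.rollback()  # discard the pending write, if there is one
    writer.close()
    return found
```

Called with `autocommit=False`, the second connection does not find the row; with `autocommit=True`, it does. This matches the symptom where debug code inside the writing task can read the row back while a later task cannot, though whether that is what happens in this application is not established in the thread.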

References:

Thanks,
-debangsu


--
You received this message because you are subscribed to the Google Groups "Google Cloud SQL discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-cloud-sql-d...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-cloud-sql-discuss/5b6a0a2c-cd2c-41db-a3d2-b809cba4353d%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Matt Doran

Sep 27, 2013, 8:43:46 AM
to google-cloud...@googlegroups.com
Hi there,

I'm certain that the record is written within a transaction, and that this happens in a separate task queue task (web request) from the queries that run tens of seconds later and find that the data doesn't exist.

I'm trying to get to the bottom of this, but I just can't get my head around it.

Regards,
Matt



Debangsu Sengupta

Sep 27, 2013, 5:55:07 PM
to google-cloud...@googlegroups.com
Hi Matt,

Could you verify that the data was written out within a transaction in Task 1 before being read by others?

How about trying the following to get a repro:
- Task 1 opens a connection, writes the record, commits the transaction, and closes the connection.
- Task 1 opens a new connection and reads back the data after the commit. It should see the write.
- Task 2 opens a new connection and reads back the data. It should see the write.
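
The three steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module as a local stand-in for the Cloud SQL connection so that it runs as-is, and the `records` table, the `row_exists` helper, and `run_repro` are hypothetical names. In the real application each step would use the app's own Cloud SQL driver and run inside the corresponding task.

```python
import os
import sqlite3
import tempfile


def row_exists(path: str, record_id: int) -> bool:
    """Open a fresh connection, look the row up, and close the connection."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT 1 FROM records WHERE id = ?", (record_id,)
        ).fetchone()
        return row is not None
    finally:
        conn.close()


def run_repro(path: str, record_id: int = 1) -> dict:
    # Step 1: open connection, write record, commit, close connection.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    conn.execute(
        "INSERT INTO records (id, payload) VALUES (?, ?)",
        (record_id, "from task 1"),
    )
    conn.commit()
    conn.close()

    # Step 2: "Task 1" reads the row back on a new connection.
    task1_sees = row_exists(path, record_id)

    # Step 3: "Task 2" reads the row on yet another new connection.
    task2_sees = row_exists(path, record_id)

    return {"task1_readback": task1_sees, "task2_read": task2_sees}


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "repro.db")
    print(run_repro(db_path))
```

Logging the two results separately makes it easier to tell whether a missing row is already missing on Task 1's second connection or only on Task 2's.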

Also, how did the scenario work with autocommit enabled?

Thanks,
-debangsu
