Does Datastore (or its backups) include multiple versions of an entity?

Anastasios Hatzis

Apr 11, 2016, 12:54:04 PM
to Google App Engine
Hi, please let me know if this question is better suited for the dedicated Google Cloud Datastore group or some other online resource. However, I use Datastore in combination with GAE Python apps.

This weekend I migrated one of my production apps and have since received user reports about stale data in Datastore, while the corresponding documents in the Search API are up to date. I'm still looking into the issue, but it seems that the datastore somehow jumped back in time for a few entities of a specific kind, maybe 0.5%. Most of them were originally created within the same span of a few weeks in late November/early December 2015, though not all entities created in that span show the issue.

Migration steps:
  1. Created a new project P2 (in EU)
  2. Deployed the Python code (version V1) to P2 with appcfg.py and waited until all Datastore indexes were shown as "serving"
  3. Datastore backup in project P1 (in US), as of April 9th
    1. disabled writes for the datastore
    2. created a backup (all namespaces, all kinds, including a "_DeferredTaskEntity"), using Cloud Console's "Datastore Admin" page, as usual stored in my GCS bucket for backups
  4. In project P2, again with "Datastore Admin" page, right after backup in P1 completed:
    1. disabled writes for the almost empty datastore (none of its entities were of the kind that later showed the issue)
    2. imported the same backup information from the backup bucket, and restored into P2's datastore, again: all kinds, all namespaces
    3. when the restore tasks were completed, I enabled datastore writes
  5. Deployed the Python code (version V2) to P2 and ran a batch handler that changed a property value on all entities: each entity's version counter is incremented by 1, the updated timestamp changes automatically, and the corresponding search doc is updated, too.
  6. For Search API of P2: wiped all documents from all indexes in Search API (just in case); when wipe tasks completed, queried the datastore entities and wrote excerpts of them as search documents
Interestingly, for the affected entities of that kind, the corresponding search doc in Search API has more recent data than the original entity in datastore.

Datastore Entity in P1 and P2:
  • version counter: 8
  • last update on: 2016-02-15
  • status: 'executing'
Search document of this entity in P1 and P2:
(search doc ID is always the URLsafe encoded NDB key, and I can tell from all other fields/properties, it is the correct search doc)
  • version counter: 13
  • last update on: 2016-03-15
  • status: 'completed'

In P1 I had expected the entity to have the same data as its search doc, but in fact it was stale.

In P2, I expected for both entity and search doc:
  • version counter: 14
  • last update on: 2016-04-10
  • status: 'completed'
because of the migration script that updated one property for all entities in this kind, and should also have triggered an update of the search doc.
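
The mismatch described above can be checked mechanically. The following is only a minimal sketch in plain Python: it assumes the entities and search docs have already been exported into dicts keyed by the URL-safe key string, each with a "version" field. The function name and the dict shapes are hypothetical; real code would build these dicts from a datastore query and a scan of the search index.

```python
def find_stale_entities(entities, search_docs):
    """Return (key, entity_version, doc_version) for every entity whose
    version counter is lower than the one stored in its search doc.

    Both arguments are plain dicts mapping a URL-safe key string to a dict
    with a 'version' field (an assumed shape, for illustration only).
    """
    stale = []
    for key, doc in search_docs.items():
        entity = entities.get(key)
        if entity is None:
            continue  # search doc without an entity: a separate problem
        if entity["version"] < doc["version"]:
            stale.append((key, entity["version"], doc["version"]))
    return sorted(stale)


# Sample data mirroring the values reported in this post.
entities = {
    "key-a": {"version": 8, "status": "executing"},
    "key-b": {"version": 14, "status": "completed"},
}
search_docs = {
    "key-a": {"version": 13, "status": "completed"},
    "key-b": {"version": 14, "status": "completed"},
}

stale = find_stale_entities(entities, search_docs)
print(stale)
```

Running such a check in both P1 and P2 would at least give the exact set of affected keys, rather than the rough 0.5% estimate.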

There are two observations:
  1. P1's entity already had stale data, older than the search doc. This could be explained by an inconsistent or failed write to the datastore, at least in theory. The app uses transactions for reading/writing entities of this kind. In _post_put_hook(), if future.check_success() is None, the search doc is written/updated. I can think of exotic situations where the search doc could be older than its original entity in datastore, but since the datastore write happens in a transaction, and the search export happens only after a successful write operation, I fail to explain how the entity in datastore could have missed the change (or reverted to an older version). We are talking about 5 different types of changes over one month that have all been lost. There are also no deferred tasks that could write old entities back into the datastore.
  2. P2's entity again shows stale data, older than the search doc. This is particularly confusing, because the search doc is only ever written with data read from the datastore. Since the search docs were not copied from P1, the only source was the data freshly restored from the P1 backup. Then again, if I look into the P1 datastore, as shown above, the data there is already stale. So where did P2's datastore get the newer data from? It seems that while the batch handler was running, the datastore had the version counter 13 data, but at some point after the search doc was written, the datastore reverted the entity to version counter 8. However, all datastore writes for this entity happened long ago in the original datastore of P1. So it looks to me as if the datastore in P2 somehow got both sets of data for this entity, version counter 8 and version counter 13. Wouldn't this imply that the backup data can contain multiple versions of the same entity, or could there be another leak that works across projects? And for some reason, after the version counter 13 data was written to the search docs, the entity got reverted to version counter 8.
I'm running out of possible explanations for this, other than that Datastore can hold multiple versions of the same entity and that those are even part of a backup.

Color me confused :) However, maybe you have an idea what could cause this.

Ani

Christian F. Howes

Apr 11, 2016, 3:20:58 PM
to Google App Engine
it sounds like you have some write transactions in P1 that never actually committed.  those then got backed up and restored to the second project.  is that possible?  if so then i'm sure your next question is how to detect the missed writes... to which i don't have a great answer. :(
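
One way a search doc could end up ahead of its entity in P1 is sketched below in plain Python. It uses stand-in classes, not the real NDB or Search API, and it rests on an assumption that should be verified against the NDB documentation: that a post-put hook fires when the put call returns, before the outcome of the enclosing transaction is known. The names FakeDatastore and export_to_search are hypothetical.

```python
class FakeDatastore(object):
    """Stand-in for a transactional store; NOT the real NDB API."""

    def __init__(self):
        self.committed = {}  # what a later read or a backup would see
        self.pending = {}    # writes buffered inside the open transaction

    def put(self, key, data, post_put_hook):
        self.pending[key] = dict(data)
        # Assumption: the hook runs as soon as the put call returns,
        # i.e. before the transaction has committed.
        post_put_hook(key, data)

    def commit(self, succeed=True):
        # A failed commit simply discards the buffered writes.
        if succeed:
            self.committed.update(self.pending)
        self.pending.clear()
        return succeed


search_index = {}  # stand-in for a Search API index


def export_to_search(key, data):
    search_index[key] = dict(data)


store = FakeDatastore()

# First write commits normally: entity and search doc agree.
store.put("key-a", {"version": 8, "status": "executing"}, export_to_search)
store.commit(succeed=True)

# Second write never commits, but the hook has already exported it.
store.put("key-a", {"version": 9, "status": "completed"}, export_to_search)
store.commit(succeed=False)

print(store.committed["key-a"]["version"])
print(search_index["key-a"]["version"])
```

This only models the P1 observation (search doc newer than entity). It does not explain how P2 could have seen the newer data after a restore from the P1 backup.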

cfh

Anastasios Hatzis

Apr 11, 2016, 3:54:51 PM
to Google App Engine
Christian, good thinking. And yes, you've got your pre-cog skills maxed out ;-) That would be exactly my next question.

I wonder whether _DeferredTaskEntity (which I included in my backup/restore) is really about deferred tasks, as in task queues. The queues were all empty at the time of backup, but Datastore Admin showed 1 object of this kind with 704 KBytes. So maybe this kind is for ancient pending transactions, which were finally executed after the restore and the export to search docs. That wouldn't sound as crazy as my other ideas.

Adam (Cloud Platform Support)

Apr 12, 2016, 2:35:12 PM
to Google App Engine
_DeferredTaskEntity is used for storing the payload of tasks enqueued with deferred.defer() when that payload exceeds 100KB. The data is only loaded by the task handler that was created along with the entity, so there is no chance that loading these entities from a backup would spawn new tasks or otherwise cause data to be overwritten.
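
The pattern described here can be illustrated with a small plain-Python model. This is a sketch of the general idea only, not the actual deferred library code: the dict standing in for the _DeferredTaskEntity kind, the list standing in for the task queue, and the helper names are all assumptions made for illustration.

```python
import pickle
import uuid

MAX_TASK_BYTES = 100 * 1024  # the 100KB threshold mentioned above

fake_datastore = {}  # stand-in for the _DeferredTaskEntity kind
fake_queue = []      # stand-in for the task queue


def defer(func, *args):
    """Enqueue a call; spill oversized payloads into the fake datastore."""
    payload = pickle.dumps((func, args))
    if len(payload) > MAX_TASK_BYTES:
        entity_key = uuid.uuid4().hex
        fake_datastore[entity_key] = payload
        # The task carries only a reference to the stored payload.
        fake_queue.append(("entity", entity_key))
    else:
        fake_queue.append(("inline", payload))


def run_next_task():
    """Run the oldest task, loading its payload from the store if needed."""
    kind, value = fake_queue.pop(0)
    if kind == "entity":
        payload = fake_datastore.pop(value)  # only this task holds the key
    else:
        payload = value
    func, args = pickle.loads(payload)
    return func(*args)


defer(len, "small")                  # fits inline
defer(len, b"x" * (200 * 1024))      # too large, stored as an entity
print(len(fake_datastore))           # one spilled payload is waiting
print(run_next_task())
print(run_next_task())
print(len(fake_datastore))
```

In this model, a payload entity is inert on its own: placing one into fake_datastore without a matching queue entry (as a restore would) leaves fake_queue untouched, so nothing ever runs it.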