Project with Python and Node services: inconsistent data

Faried Nawaz

Jun 8, 2016, 4:33:15 AM
to Google App Engine
Hello,

I have a Python application deployed as my App Engine project's default service. I also have a Node service deployed in the same project for long-running tasks (uploads of large files to GCS). After a file is uploaded, the Node code updates an existing Datastore object with the path to the file on GCS.

I've noticed that if I fetch the object from the Python service after the update, I don't always get the latest data. Other than by sprinkling use_cache=False, use_memcache=False in a bunch of places on the Python side (and taking a performance hit), how can I work around this problem? Is there any way for the Node service to flush or update the object in memcache? I use ndb on the Python side.


Faried.
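
For readers following along, here is a minimal Python sketch of the mechanism that most likely produces the symptom described above: a reader that checks a cache first, and a second writer that updates the backing store without touching that cache. It is a toy model under stated assumptions, not ndb itself. Plain dicts stand in for Datastore and memcache, and `cached_get`, `uncached_write`, and the `Upload:1` key are hypothetical names invented for illustration.

```python
# Toy model only: plain dicts stand in for Datastore and memcache.
# None of these names are part of ndb or any Google library.

datastore = {}   # stand-in for Datastore: key -> entity dict
memcache = {}    # stand-in for the memcache shared by the Python service


def cached_get(key, use_memcache=True):
    """Read-through get: serve from the cache when possible."""
    if use_memcache and key in memcache:
        return memcache[key]          # cache hit: the store is not consulted
    entity = datastore.get(key)
    if entity is None:
        return None
    entity = dict(entity)             # copy, so later writes do not alias it
    if use_memcache:
        memcache[key] = entity        # populate the cache for later reads
    return entity


def uncached_write(key, **fields):
    """A writer that updates the store directly and never touches the cache."""
    datastore.setdefault(key, {}).update(fields)


# The Python service reads the entity, which also caches it.
datastore["Upload:1"] = {"gcs_path": None}
first = cached_get("Upload:1")

# The Node service later records the GCS path, bypassing the cache.
uncached_write("Upload:1", gcs_path="/bucket/big-file.bin")

stale = cached_get("Upload:1")                      # still the cached copy
fresh = cached_get("Upload:1", use_memcache=False)  # reads the store directly
```

In this model the stale copy survives until the cache entry is evicted or explicitly deleted, which is why the problem does not appear when every write goes through the same caching layer.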

timh

Jun 8, 2016, 5:46:04 AM
to Google App Engine
Are you getting that data via a query, or by key using get?
If you are using a query, then you will need to read up on eventual consistency.

T

Faried Nawaz

Jun 8, 2016, 6:31:32 AM
to google-a...@googlegroups.com
On Wed, Jun 8, 2016 at 2:46 PM, timh <zute...@gmail.com> wrote:
> Are you getting that data via a query or via key using get,
> If you are using a query then you will need to start reading up on Eventual
> consistency

It's a "get". I just need some way to flush the cached data without
putting "use_memcache=False" everywhere.

It's an issue I don't run into if all my code runs under the Python
service. It can't do that in this case, since file uploads can take
longer than 60 seconds.

Nick (Cloud Platform Support)

Jun 8, 2016, 2:53:00 PM
to Google App Engine
Hey Faried,

As Tim H suggested, this is likely due to eventual consistency. Datastore uses an eventually consistent replication model, which lets it be highly available and scalable at the cost of a slight trade-off: you cannot always get the latest copy of an entity at any given moment. You can, however, use the key of an entity to retrieve it immediately, if it exists anywhere in the system, as opposed to a regular "get" query, which can show old results (including results where the entity doesn't exist yet). I suggest replacing your "get" query with a "get-by-key" query to force strong consistency. You can read more about Datastore and consistency in the documentation.

Cheers,

Nick
Cloud Platform Community Support 

Christian F. Howes

Jun 8, 2016, 3:36:09 PM
to Google App Engine
Does your Node service use NDB or DB? If it is using NDB, then NDB takes care of flushing the cache for you when you update an object, but you have to `get_by_key` in order to get the latest copy.

Faried Nawaz

Jun 9, 2016, 3:36:38 AM
to Google App Engine
On Wednesday, June 8, 2016 at 11:53:00 PM UTC+5, Nick (Cloud Platform Support) wrote:
 
> I suggest replacing your "get" query with a "get-by-key" query to force strong consistency.

Where is this "get-by-key" or "get_by_key" query documented?  I can't find any references to it in the docs.


Faried. 

Faried Nawaz

Jun 9, 2016, 3:42:39 AM
to Google App Engine
On Thursday, June 9, 2016 at 12:36:09 AM UTC+5, Christian F. Howes wrote:
> Does your node service use NDB or DB?  if it is using NDB then NDB takes care of flushing the cache for you when you update an object, but you have to `get_by_key` in order to get the latest copy.

NDB is a Python API, and I believe it only works on App Engine itself, not even with Python services in the flexible environment (those are deployed on GCE).

Nickolas Daskalou

Jun 9, 2016, 5:38:38 AM
to Google App Engine
Hi Faried,

Try deleting the Memcache key which is used by NDB after you update the Datastore entity on the Node server (FYI, NDB Memcache implementation found here).

Something like this (NOTE: below code is 100% untested):

================
// Check NDB release notes in case this prefix changes, or get the latest
// value from Python SDK (google.appengine.ext.ndb.Context._memcache_prefix)
var memcache_prefix = 'NDB9:';

// You might need to implement get_urlsafe_key() so that
// it returns the same value as NDB's key.urlsafe()
var urlsafe_entity_key = get_urlsafe_key(key);

// Construct the Memcache key
var mkey = memcache_prefix + urlsafe_entity_key;

//  Now delete from Memcache
memcached.delete(mkey, function(err) {
    ...
});
================
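
The key construction above can also be checked in isolation. The following is a minimal Python sketch of the same idea, assuming (as the post does) that the Memcache key is the `NDB9:` prefix followed by the urlsafe entity key. A dict stands in for Memcache, the urlsafe string is made up, and `ndb_memcache_key` and `invalidate` are hypothetical helper names, not part of any library.

```python
# Sketch only: a dict stands in for Memcache, and the urlsafe key string is
# made up. The 'NDB9:' prefix is the value quoted in the post above and may
# change between NDB releases.

NDB_MEMCACHE_PREFIX = "NDB9:"


def ndb_memcache_key(urlsafe_entity_key, prefix=NDB_MEMCACHE_PREFIX):
    """Build the cache key as described above: prefix + urlsafe entity key."""
    return prefix + urlsafe_entity_key


def invalidate(cache, urlsafe_entity_key):
    """Drop the cached copy, if any. Returns True when an entry was removed."""
    return cache.pop(ndb_memcache_key(urlsafe_entity_key), None) is not None


# A fake cache holding one stale entry under a made-up urlsafe key.
fake_memcache = {"NDB9:agxzfmV4YW1wbGVyDQ": b"stale serialized entity"}

removed = invalidate(fake_memcache, "agxzfmV4YW1wbGVyDQ")        # entry existed
removed_again = invalidate(fake_memcache, "agxzfmV4YW1wbGVyDQ")  # already gone
```

Deleting a key that is not present is harmless here, so the writer can invalidate unconditionally after every update.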

Let me know if that works.

Nick


Nickolas Daskalou

Jun 10, 2016, 5:04:36 AM
to Google App Engine
Hi Faried,

Did you give this a go?

I'm curious to find out if it worked for you.

Nick

Nick (Cloud Platform Support)

Jun 10, 2016, 11:05:20 AM
to Google App Engine
Hey Faried,

To access Datastore on Flexible Environment apps, you should use the gcloud library for Python.

In both NDB and gcloud, you can use a Datastore entity key to get an entity. As described in the documentation linked in my prior comment, getting an entity by its Key is a strongly-consistent operation, as opposed to normal queries which are eventually-consistent.

I hope this helps. Feel free to come back with any further questions you might have!


Cheers,

Nick
Cloud Platform Community Support

Faried Nawaz

Jun 14, 2016, 8:37:11 AM
to Google App Engine
I decided not to go down that route -- it's all too likely that the key prefix will change sometime in the future, when I've passed the project on to other devs.  I ended up writing a small API endpoint in the Python code to accept an HTTP post from the Node code with the updated fields.  It's not the best solution, but...


Thanks for the help,

Faried. 
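
A rough sketch of what such an endpoint might look like, for anyone landing on this thread later. It is not the code from the project: it uses only the standard library's WSGI conventions, a dict stands in for the ndb model, and the `/internal/entities/<id>` path, the `gcs_path` field, and the helper names are all hypothetical. In the real service, `update_entity` would load the entity through ndb and call `put()`, so that the write goes through the same caching layer as the reads.

```python
# Sketch only: stdlib WSGI, with a dict standing in for the ndb model.
# The URL path, field names, and helper names are hypothetical.
import json

entities = {"upload-1": {"gcs_path": None}}  # stand-in for stored entities

URL_PREFIX = "/internal/entities/"


def update_entity(entity_id, fields):
    """Apply the updated fields. A real service would use ndb and put() here."""
    entity = entities.get(entity_id)
    if entity is None:
        return False
    entity.update(fields)
    return True


def respond(start_response, status, body=b""):
    """Send a plain-text response with the given status line."""
    start_response(status, [("Content-Type", "text/plain")])
    return [body] if body else []


def app(environ, start_response):
    """Accept POST /internal/entities/<id> with a JSON object of fields."""
    path = environ.get("PATH_INFO", "")
    if environ.get("REQUEST_METHOD") != "POST" or not path.startswith(URL_PREFIX):
        return respond(start_response, "404 Not Found", b"not found")

    length = int(environ.get("CONTENT_LENGTH") or 0)
    try:
        fields = json.loads(environ["wsgi.input"].read(length).decode("utf-8"))
    except ValueError:
        return respond(start_response, "400 Bad Request", b"invalid JSON")
    if not isinstance(fields, dict):
        return respond(start_response, "400 Bad Request", b"expected an object")

    if not update_entity(path[len(URL_PREFIX):], fields):
        return respond(start_response, "404 Not Found", b"unknown entity")
    return respond(start_response, "204 No Content")
```

The Node side would then POST a small JSON body such as `{"gcs_path": "..."}` once the upload finishes. An endpoint like this should accept requests only from the other service, which the sketch does not attempt to enforce.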