Checking if an entity exists

John Beckett

Oct 11, 2016, 8:50:45 AM
to google-appengine-go
I have a situation where I am passed an ID (which happens to be the IntKey) for an entity, and I simply want to check whether that entity exists or not.

One option would be to do a KeysOnly query, but there is a good chance that the entity in question is already in memcache, and so it doesn't seem to make much sense to bypass memcache by performing a query instead of a get.

Another option (what I'm doing now) is to perform a Get with a key that I've generated from the ID, and simply check for any errors.  However, I don't like the fact that I'm essentially retrieving the entire entity only to look for an error.
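
The Get-and-check-the-error approach described above could be sketched roughly as follows. This is a minimal illustration, not App Engine code: `Getter`, `fakeStore` and `Exists` are hypothetical names, and `ErrNoSuchEntity` is a local stand-in for the "no such entity" error that the real datastore package returns.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoSuchEntity is a local stand-in for the datastore's
// "no such entity" error (an assumption made for this sketch).
var ErrNoSuchEntity = errors.New("no such entity")

// Getter is a hypothetical wrapper around a datastore Get that loads
// the entity for (kind, id) and reports only the resulting error.
type Getter interface {
	Get(kind string, id int64) error
}

// fakeStore is an in-memory stand-in so the sketch runs on its own.
type fakeStore map[string]map[int64]bool

func (s fakeStore) Get(kind string, id int64) error {
	if s[kind][id] {
		return nil
	}
	return ErrNoSuchEntity
}

// Exists maps the outcome of a Get onto a boolean. Only the
// "no such entity" error means "absent"; any other error is
// passed back to the caller rather than being read as "absent".
func Exists(g Getter, kind string, id int64) (bool, error) {
	err := g.Get(kind, id)
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, ErrNoSuchEntity):
		return false, nil
	default:
		return false, err
	}
}

func main() {
	store := fakeStore{"User": {42: true}}
	fmt.Println(Exists(store, "User", 42)) // true <nil>
	fmt.Println(Exists(store, "User", 7))  // false <nil>
}
```

Separating "not found" from other failures matters here, because treating every error as "does not exist" would misreport transient failures.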

What is the cheapest and fastest way to check if a given ID corresponds to an existing entity?

Jon

Oct 12, 2016, 3:54:36 AM
to google-appengine-go
From what I understand of the new pricing structure, a KeysOnly query should cost nothing and be faster than a regular query. Don't forget that indexes are eventually consistent, so it will take time after an entity has been inserted into the datastore for it to become queryable.

Another option is to use a memcache backed datastore API such as https://github.com/qedus/nds (mine), https://github.com/mjibson/goon or https://github.com/luci/gae. NDS provides strongly consistent guarantees. I am not sure about the others.

John Beckett

Oct 12, 2016, 6:25:10 AM
to Jon, google-appengine-go
Hi Jon,

I already use NDS - great package by the way.  I'm just looking for the fastest way to check if an entity exists given that I already have what should be its intID, and therefore its key.  From what I understand, a keys-only query is always slower than a Get, unless the Get is returning some large object.  So I'd like to use a Get, but without the actual contents of the entity being returned.  It may be that there is no way of doing that, and I simply have to choose between a keys-only query and a Get.

Jon

Oct 12, 2016, 9:23:04 AM
to google-appengine-go, jonathan...@gmail.com
Unfortunately there's no way to test for the existence of an entity without at least also returning its key.

Maybe you could profile the query option vs the Get option and see which is more performant. If your entities are small then Get may well be the way to go although a bit more expensive.

Daniel Jacques

Oct 12, 2016, 11:56:08 AM
to google-appengine-go
> Another option is to use a memcache backed datastore API such as https://github.com/qedus/nds (mine), https://github.com/mjibson/goon or https://github.com/luci/gae. NDS provides strongly consistent guarantees. I am not sure about the others.

Just filling in a blank here, but "luci/gae"'s memcache-backed datastore algorithm is based on "qedus/nds" and, consequently, also provides these guarantees :)

As noted, the most cost-optimal way of doing this appears to be running a memcache-only "Get" without falling through to the underlying datastore, followed by a keys-only query if memcache comes back negative. I don't think there is an API for "check memcache but not the real datastore" in "qedus/nds". In "luci/gae" it could be implemented as a special-case GetMulti filter underneath memcache. Either would probably be trivial to add.
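
A rough sketch of that two-step check, under stated assumptions: `Cache` and `KeyQuerier` are hypothetical interfaces (neither "qedus/nds" nor "luci/gae" is claimed to expose them), the fakes exist only to make the example runnable, and a cache hit is assumed to imply that the entity exists.

```go
package main

import "fmt"

// Cache is a hypothetical memcache-only lookup. It never falls
// through to the datastore.
type Cache interface {
	Has(kind string, id int64) bool
}

// KeyQuerier is a hypothetical wrapper around a keys-only query.
type KeyQuerier interface {
	KeyExists(kind string, id int64) (bool, error)
}

// fakeCache and fakeQuerier are in-memory stand-ins for this sketch.
type fakeCache map[int64]bool

func (c fakeCache) Has(kind string, id int64) bool { return c[id] }

type fakeQuerier struct {
	ids   map[int64]bool
	calls int // number of keys-only queries issued
}

func (q *fakeQuerier) KeyExists(kind string, id int64) (bool, error) {
	q.calls++
	return q.ids[id], nil
}

// Exists checks the cache first and only issues the keys-only
// query when the cache has no entry for the key.
func Exists(c Cache, q KeyQuerier, kind string, id int64) (bool, error) {
	if c.Has(kind, id) {
		return true, nil
	}
	return q.KeyExists(kind, id)
}

func main() {
	cache := fakeCache{1: true}
	query := &fakeQuerier{ids: map[int64]bool{1: true, 2: true}}
	for _, id := range []int64{1, 2, 3} {
		ok, err := Exists(cache, query, "User", id)
		fmt.Println(id, ok, err)
	}
	fmt.Println("keys-only queries issued:", query.calls)
}
```

With the fake data above, ID 1 is answered from the cache, so only IDs 2 and 3 reach the query. Jon's earlier caveat still applies to the fallback: if the query is served from an eventually consistent index, a freshly written entity may briefly be reported as missing.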

Speed-optimally, also as noted, performing a Get is best. At a low level, all App Engine requests are converted to protobuf. The GetRequest protobuf doesn't appear to include any provision for telling it not to return the data, so I think you're out of luck with respect to not incurring cost.

However, since you are using a memcache layer here, is this really a problem? If most "Get" requests either hit a hot cache object or fail with "does not exist", only the case of entities that exist but aren't in memcache will incur a cost. If this is uncommon, using memcache-backed Get or equivalent is probably the best course here.

Julius Kovac

Oct 12, 2016, 5:13:43 PM
to google-appengine-go
I would suggest creating another entity kind that has the same keys as the original one but no data (or just some minimal field), and doing a Get on it. It is quite a big redundancy, but if this check is going to be a frequent operation or has to be fast, it can be worth it.
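
A minimal sketch of this marker-kind idea, using an in-memory map in place of the datastore. All names here (`store`, `markerKind`, `putWithMarker`, `exists`, the "Marker" suffix) are hypothetical and chosen only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSuchEntity is a local stand-in for a "no such entity" error.
var errNoSuchEntity = errors.New("no such entity")

type key struct {
	kind string
	id   int64
}

// store is an in-memory stand-in for the datastore.
type store map[key][]byte

func (s store) put(kind string, id int64, data []byte) {
	s[key{kind, id}] = data
}

func (s store) get(kind string, id int64) ([]byte, error) {
	data, ok := s[key{kind, id}]
	if !ok {
		return nil, errNoSuchEntity
	}
	return data, nil
}

// markerKind names the shadow kind that carries no payload.
func markerKind(kind string) string { return kind + "Marker" }

// putWithMarker writes the real entity and an empty marker entity
// under the same ID. Keeping the two in sync (including on delete)
// is the caller's responsibility in this sketch.
func putWithMarker(s store, kind string, id int64, data []byte) {
	s.put(kind, id, data)
	s.put(markerKind(kind), id, nil)
}

// exists fetches only the tiny marker entity, never the payload.
func exists(s store, kind string, id int64) (bool, error) {
	_, err := s.get(markerKind(kind), id)
	if errors.Is(err, errNoSuchEntity) {
		return false, nil
	}
	return err == nil, err
}

func main() {
	s := store{}
	putWithMarker(s, "Report", 10, make([]byte, 1<<20))
	fmt.Println(exists(s, "Report", 10)) // true <nil>
	fmt.Println(exists(s, "Report", 11)) // false <nil>
}
```

The trade-off is the one already mentioned: every write is doubled so that the existence check never has to fetch the large entity.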

-- j

dcco...@gmail.com

Oct 18, 2016, 2:20:16 PM
to google-appengine-go
Just use luci/gae Exists:


The luci/gae.datastore package has everything python ndb has, and much more.

Your question keeps getting asked over and over again. I do not know why the GAE Go team does not come up with something pragmatic like luci/gae. The official GAE Go API is basic and low level.

I wish the GAE Go team were half as ambitious as the Google Luci team about the Go API. We feel neglected all the time.

Daniel Jacques

Oct 19, 2016, 11:35:00 AM
to google-appengine-go, dcco...@gmail.com
Just to be clear, luci/gae's Exists is still issuing a full Get behind the scenes. Unfortunately, the fundamental protocol doesn't offer a better option.

John Beckett

Oct 20, 2016, 10:06:52 AM
to Daniel Jacques, google-appengine-go, dcco...@gmail.com
> Just to be clear, luci/gae's Exists is still issuing a full Get behind the scenes. Unfortunately, the fundamental protocol doesn't offer a better option.

I realised that, and I don't think that the Go team really have any way to change this, as (I believe) it is a limitation from BigTable.
