Firestore in Datastore mode: Index hotspots for common property values?


Travis Martin

Dec 17, 2019, 5:51:44 PM
to Google App Engine
I'm seeing symptoms that suggest Cloud Firestore in Datastore mode can be slow when a query filters on a property value shared by many entities. This looks like it may be an index hotspot, but the only documentation I can find warns against monotonically increasing values, not against properties with a small number of enum values.

My situation (simplified) is as follows:
  • I have 1M entities written to a database (with only the built-in indices)
  • All entities have the property: prop1 = 'all'
  • All entities have a unique property, id, in the range '000000' to '999999', and another property, id2 = id
  • 1/10th of all entities (so 100k entities) have the property first_dig = '0'
So, there are several ways I can query for the same entity (using either GQL in the Cloud console or the Java API):
  1. SELECT * FROM kind WHERE id = '000000'
  2. SELECT * FROM kind WHERE id = '000000' AND first_dig = '0'
  3. SELECT * FROM kind WHERE id = '000000' AND first_dig = '0' AND id2 = '000000'
  4. SELECT * FROM kind WHERE id = '000000' AND first_dig = '0' AND prop1 = 'all'
I find that query #1 takes 5 seconds, #2 takes 15 seconds, #3 takes 15 seconds, and #4 takes ~50 seconds. The fact that #4 is much slower than #2, while #3 is no slower than #2, makes me think there is index hotspotting when filtering on prop1 = 'all' (for which all index entries might be on the same tablet) but not when filtering on id2 = '000000'.
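
To make the difference between the filters concrete, here is a minimal Java sketch that rebuilds the described dataset in memory and counts how many entities match each equality filter used in the four queries. It is a toy model only: it says nothing about how Datastore stores or scans its indexes. The class name IndexEntryCounts and the rule that first_dig is the first character of id are assumptions made for illustration (the post only states that 1/10th of the entities have first_dig = '0').

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the dataset from the post. It only counts matching entities per
// equality filter and does not reproduce how Datastore executes queries.
class IndexEntryCounts {
    static final int TOTAL = 1_000_000;

    // Returns, for each equality filter in the four queries, how many of the
    // 1M entities carry that property value.
    static Map<String, Long> countEntries() {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("id = '000000'", 0L);
        counts.put("id2 = '000000'", 0L);
        counts.put("first_dig = '0'", 0L);
        counts.put("prop1 = 'all'", 0L);

        for (int i = 0; i < TOTAL; i++) {
            String id = String.format("%06d", i);   // '000000' .. '999999'
            String id2 = id;                         // id2 = id, as in the post
            String firstDig = id.substring(0, 1);    // assumption: first character of id
            String prop1 = "all";                    // shared by every entity

            if (id.equals("000000")) counts.merge("id = '000000'", 1L, Long::sum);
            if (id2.equals("000000")) counts.merge("id2 = '000000'", 1L, Long::sum);
            if (firstDig.equals("0")) counts.merge("first_dig = '0'", 1L, Long::sum);
            if (prop1.equals("all")) counts.merge("prop1 = 'all'", 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countEntries().forEach((filter, n) -> System.out.println(filter + " -> " + n));
    }
}
```

Under these assumptions the id and id2 filters each match 1 entity, first_dig = '0' matches 100,000, and prop1 = 'all' matches all 1,000,000. That ordering lines up with the observation that adding id2 costs nothing extra while adding prop1 is the slowest, though it does not by itself confirm the hotspot explanation.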

My questions are:
  1. Is this indeed how things work?
  2. Is there a recommended practice for querying for indexed properties with low uniqueness?
Thanks!

Elliott (Cloud Platform Support)

Dec 20, 2019, 6:01:41 PM
to Google App Engine

Hello Travis,


Please note that Google Groups is reserved for general Google Cloud Platform and product discussions, not for reporting issues. I suggest moving the troubleshooting to the Issue Tracker, where issues can be made private in case we need to gather any project-specific details.

