[Firestore] Best approach(es) to run fast queries on large collections


Francois Schaus

Jun 22, 2018, 12:10:45 AM
to Firebase Google Group
Hello,

I am using Firestore for the messaging features of an iOS app I am currently building. To retrieve the messages in chat groups, I use an addSnapshotListener, as it provides real-time updates.

As I hacked my solution together, I took a very naive approach and basically download every document (i.e., message) associated with the chat group (my collection). This can mean up to 600-700 messages being downloaded (although I think most of them should be locally cached by Firestore...), and it's therefore now time for a better approach.

The simplest approach I can think of is to keep using my addSnapshotListener method, but to first order my query by timeStamp and limit it to, say, the 20-50 most recent documents.

I ran some performance tests, and the strange thing is that this approach is actually slower than querying all documents in my collection. Here are my results:
  • Query the full collection (~700 documents), then sort the resulting documents array by date (in Swift): 1.04 s between the start and completion of the method
  • Query the first 20 results, ordered by date (to be sure they are the latest 20): 1.68 s
  • Query the first 20 results, not ordered by date: 0.62 s (not very helpful, since it gives me 20 random documents)
So I have two questions:
  • Any idea what's causing the query for 20 documents ordered by date to be so slow?
  • What would be the recommended approach (i.e., fastest / lowest data usage) for doing a proper query? A few ideas I had but haven't tested yet:
    • Should I create an index manually?
    • Is there a more efficient method than asking Firestore to sort my messages by date? (E.g., my docs seem sorted when I explore my DB on the Firebase website, so maybe just fetching the first 100 messages could work, but how do I do that?)
    • Could I cache my messages (I thought Firestore was already doing that) and query only those not included in that cache? If so, how do I tell my snapshot listener to only query the messages not yet in my cache?
I didn't create any manual index or anything else of the sort, and the code is identical between runs (so that is not what explains why querying 20 results is roughly 60% slower than fetching my full collection). My Firestore version is the one before 5.0.

This obviously worries me a little: I expected to maintain my app's performance by limiting the size of my queries, but these results suggest that won't really be an option, and I am at a loss as to how to keep performance up as my app scales.

Any help would be greatly appreciated!

Thanks,
Francois

See the code excerpt below:

// My reference (ordered by timestamp, newest first, limited to 20 documents):
let channelRef = UserDataService.instance.FbDB
    .collection("dmChannels").document(channelId).collection("messages")
    .order(by: "timeStamp", descending: true).limit(to: 20)

// My snapshotListener:
let messageListener = channelRef.addSnapshotListener { [weak self] (snapshot, error) in
    // Stop here if the listener reported an error or delivered no snapshot
    guard let snapshot = snapshot, error == nil else { return }

    for message in snapshot.documents {
        let messageDictionnary = message.data()
        let messageID = message.documentID

        // Does the parsing and checks if the doc is already in the array or not
        self?.updateMessageToLatest(messageDictionnary: messageDictionnary, messageID: messageID, modified: false)
    }
}
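As a possible way to process only what changed instead of walking every document on each update, here is a minimal sketch in plain Swift. Firestore snapshot listeners report per-document changes (added, modified, removed) through `snapshot.documentChanges`; the `Message` struct, the `Change` enum, and the `apply` function below are hypothetical stand-ins for illustration and are not the actual `updateMessageToLatest` implementation.

```swift
import Foundation

// Hypothetical local model; the fields are assumptions, not the real schema.
struct Message: Equatable {
    let id: String
    let timeStamp: Date
    let text: String
}

// Mirrors the three change types a Firestore snapshot reports per document.
enum Change {
    case added(Message)
    case modified(Message)
    case removed(id: String)
}

// Applies a batch of changes to the current messages and returns them newest first.
func apply(_ changes: [Change], to messages: [Message]) -> [Message] {
    // Index the current messages by document ID (the last duplicate wins).
    var byID = Dictionary(messages.map { ($0.id, $0) },
                          uniquingKeysWith: { _, latest in latest })
    for change in changes {
        switch change {
        case .added(let message), .modified(let message):
            byID[message.id] = message   // insert or overwrite
        case .removed(let id):
            byID[id] = nil               // drop the deleted document
        }
    }
    // Same ordering as order(by: "timeStamp", descending: true).
    return byID.values.sorted { $0.timeStamp > $1.timeStamp }
}
```

In a listener, each entry of `snapshot.documentChanges` would be mapped to one `Change` before calling `apply`, so the work per update depends on the number of changed messages rather than on the size of the channel.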

Kato Richardson

Jun 28, 2018, 10:51:53 AM
to Firebase Google Group
Hello François,

I can't tell what's going on from the snippet you have here, but I wouldn't expect queries ordered by date to be significantly slower. I wonder if the index isn't being auto-created on the server? One way to test this might be to create one manually and see if there's any improvement.

☼, Kato

--
You received this message because you are subscribed to the Google Groups "Firebase Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firebase-tal...@googlegroups.com.
To post to this group, send email to fireba...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/firebase-talk/687390db-f65f-4718-91d4-12496b0d67a7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--

Kato Richardson | Developer Programs Eng | kato...@google.com | 775-235-8398

Tyler Rockwood

Jun 28, 2018, 5:10:50 PM
to Firebase Google Group
Hello François,

All queries in Firestore use an index (it's impossible to run a query that doesn't), and all fields are indexed by default.

I'm a little surprised at the results. Did you run the queries multiple times? Sometimes one-off queries can have varying latencies.

Another factor to take into account is that the iOS SDK does local caching of all results for you. If you really want to test latency, I suggest turning off persistence and running each query a couple dozen times.

-Tyler
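To make the advice above about repeating each query concrete, here is a small sketch in plain Swift that summarizes a set of recorded timings. The `LatencyStats` type and `summarize` function are hypothetical names introduced for illustration; how each sample is collected (for example, the time from attaching the listener to its first callback) is left to the caller.

```swift
import Foundation

// Summary of repeated timing samples, in seconds.
struct LatencyStats: Equatable {
    let min: Double
    let median: Double
    let max: Double
}

// Returns min, median, and max of the samples, or nil if there are none.
func summarize(_ samples: [Double]) -> LatencyStats? {
    guard let fastest = samples.min(), let slowest = samples.max() else { return nil }
    let sorted = samples.sorted()
    let mid = sorted.count / 2
    // An even count averages the two middle values; an odd count takes the middle one.
    let median = sorted.count % 2 == 0
        ? (sorted[mid - 1] + sorted[mid]) / 2
        : sorted[mid]
    return LatencyStats(min: fastest, median: median, max: slowest)
}
```

Comparing medians over a couple dozen runs per query shape is less sensitive to a single slow outlier than comparing three or four individual timings.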

Francois Schaus

Jun 28, 2018, 7:16:56 PM
to Firebase Google Group
Thank you both! Yes, I was very surprised too that a sorted query takes that much longer than querying all documents and sorting them myself in Swift.

My profiling was repeated 3-4 times and the results seemed to hold. I am also 99% sure that this is pulled from the local cache on the device and not a weird server connectivity issue (and I was on strong wifi for all above experiments anyway)... 

What I did notice, though (I haven't yet rerun the experiment), is that querying the 900+ documents from the cache is much faster on my release build than on my debug build (I haven't measured by how much, but it seems to be an order of magnitude). The results above are from my debug build.

Any idea what could be driving this? It could be what explains why my sorted query is so slow compared with querying the full collection (e.g., if the indexes are not optimized in the debug version)...

I would also love to test creating an index manually but I am a bit lost as to how to do it. Could you give me a few pointers? I'll also re-profile my app on my release build to see if the above results remain.

Thanks again for your help - really appreciated!

Best,
Francois  
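One way to check whether results really come from the local cache, rather than assuming it, is to read the snapshot's metadata. The sketch below assumes the Firebase iOS SDK is linked (the `canImport` guard lets the file compile without it); `logSnapshotSource` is a hypothetical helper name, and `query` stands for whichever query is being profiled.

```swift
#if canImport(FirebaseFirestore)
import FirebaseFirestore

// Hypothetical helper: attaches a listener and prints where each snapshot came from.
func logSnapshotSource(for query: Query) -> ListenerRegistration {
    return query.addSnapshotListener { snapshot, error in
        guard let snapshot = snapshot else {
            print("Listener error: \(String(describing: error))")
            return
        }
        // isFromCache is true when the snapshot was served from local data.
        let source = snapshot.metadata.isFromCache ? "local cache" : "server"
        print("\(snapshot.documents.count) documents from \(source)")
    }
}
#endif
```

Keep the returned `ListenerRegistration` and call `remove()` on it once the measurement is done.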

Kato Richardson

Jun 28, 2018, 8:23:33 PM
to Firebase Google Group
Nothing obvious comes to mind. I'm fairly sure the auto-index isn't failing in any way, so ignore that request.

Can you try with local persistence disabled and debug logging turned on? My initial guess would be that local disk persistence is behaving oddly, particularly given that the query is so much slower than a straight download. It would be interesting to see some combinations of those.

Also, can you verify that you've tried with the latest SDK version?

☼, Kato
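For reference, disabling persistence and enabling debug logging as suggested above might look like the following sketch. It assumes the Firebase iOS SDK of that period is linked (the `canImport` guard lets the file compile without it); `configureFirestoreForLatencyTest` is a hypothetical name, and newer SDK releases may prefer a different cache-settings API than `isPersistenceEnabled`.

```swift
#if canImport(FirebaseFirestore)
import FirebaseFirestore

// Hypothetical helper: call once, before any other Firestore use.
func configureFirestoreForLatencyTest() {
    // Verbose SDK logging in the console.
    Firestore.enableLogging(true)

    // Turn off on-disk persistence so results are not served from the local cache.
    let settings = FirestoreSettings()
    settings.isPersistenceEnabled = false
    Firestore.firestore().settings = settings
}
#endif
```

Settings have to be applied before the Firestore instance is first used, so this would typically run right after the app configures Firebase at launch.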
