Realtime Database freeze when running a query for the first time


Vlad Potra

Jul 1, 2021, 6:17:50 AM
to Firebase Google Group
We're running a chat system on Realtime Database, and for business reasons we clear some of the oldest threads daily. All conversations are stored under a `/conversations` path in the db, which has grown very large over time. I'm not sure exactly how large, but the whole realtime db holds close to 150GB of data; that figure includes denormalized duplicates of each conversation's metadata, so `/conversations` itself clearly holds less than 150GB (20-30GB, I'd estimate).

Conversation objects under `/conversations` have a `createdAt` timestamp field, which is indexed via an ".indexOn" entry in the security rules. We query by this child field using the Python Firebase Admin SDK, and the query is as follows:
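For reference, the index rule we use looks roughly like this (a minimal sketch; the surrounding rules are omitted):

```json
{
  "rules": {
    "conversations": {
      ".indexOn": "createdAt"
    }
  }
}
```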

Using a Python generator, we run:

    from firebase_admin import db

    (db.reference('/conversations')
        .order_by_child('createdAt')
        .start_at(min_created_at)   # advanced after each batch is fully consumed
        .end_at(max_created_at)
        .limit_to_first(batch_size)
        .get())
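To make the "advance start_at after each batch" pattern concrete, here is a hypothetical sketch of the paging loop (function and variable names are ours, not from the original post). The query itself is hidden behind a `fetch_page` callable so the loop works against any page source, including the Admin SDK query above:

```python
def iter_conversation_batches(fetch_page, min_created_at, max_created_at, batch_size):
    """Yield batches of (key, conversation) pairs ordered by createdAt.

    fetch_page(start, end, limit) stands in for the Admin SDK query:
    db.reference('/conversations').order_by_child('createdAt')
      .start_at(start).end_at(end).limit_to_first(limit).get()

    Assumes createdAt values are unique; with duplicate timestamps you
    would also need to track the last key to avoid re-reading items.
    """
    start = min_created_at
    while True:
        page = fetch_page(start, max_created_at, batch_size)
        if not page:
            return
        # RTDB returns a dict; order it by the indexed child ourselves.
        items = sorted(page.items(), key=lambda kv: kv[1]['createdAt'])
        yield items
        if len(items) < batch_size:
            return
        # start_at is inclusive, so advance past the last seen timestamp.
        start = items[-1][1]['createdAt'] + 1
```

With the real SDK, `fetch_page` would simply wrap the query shown above; the loop only advances `start` once a batch has been fully consumed.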

We have an archiving task that currently runs daily. What happens is that the first run of this query blocks (freezes) the db for around 5 minutes; if the 5-minute mark is exceeded, the task fails with a "Failed to establish a connection" error from HTTPSConnectionPool. After the db unblocks, the same query runs instantaneously, with an execution time of around 0ms. Profiling with `firebase database:profile` shows the blocking query taking 300 seconds to execute, while any subsequent query of this type takes 0-2ms. Blocking the entire database for 5 minutes is, as you can guess, unacceptable.

I'm writing here to put forward a theory and proposed fix, and to understand from the Firebase team why this happens.

My theory: assuming this is an expensive query (understandable, with millions of items to filter) that is rarely executed, there's some caching involved at some level, and the cache is discarded at some point because the query isn't being run anymore. We've found that waiting 24h between executions and waiting 6h between executions have the same outcome: the first request blocks the db. If this theory is valid, a fix would be to turn this into a minute-level task and query every minute, keeping the cache valid and execution times low.
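The proposed keep-warm fix could look like the sketch below (our names, not an established API beyond the Admin SDK query methods), run on a one-minute schedule via cron, Cloud Scheduler, or similar. A `limit_to_first(1)` query should be enough to touch the index without transferring meaningful data; the reference is passed in so the function is easy to exercise against a stub:

```python
def keep_index_warm(conversations_ref):
    """Issue a tiny indexed query so the server-side index stays loaded.

    conversations_ref is expected to behave like
    firebase_admin.db.reference('/conversations').
    """
    return (conversations_ref
            .order_by_child('createdAt')
            .limit_to_first(1)
            .get())
```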


Kato Richardson

Jul 2, 2021, 12:03:09 PM
to Firebase Google Group
If the indexOn is working (a non-working index would be a likely culprit here), then querying every minute would help. Essentially, we load all the data into memory on the server and build the index in memory. After some time without use, the index gets dropped from memory and is reloaded the next time the query runs.

Ultimately, this data structure can't scale infinitely in any event. You should consider segmenting (essentially archiving) the data by some reasonable time constraint (e.g. weekly/monthly/yearly based on volume). So something like this perhaps:

/conversations (current cycle here)
/archive/YYYY (previous cycles in these)
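The cycle layout above can be made concrete with a small helper (yearly buckets shown; weekly or monthly buckets work the same way — the names here are illustrative, not from the thread):

```python
from datetime import datetime, timezone

def archive_path(created_at_ms):
    """Map a conversation's createdAt (milliseconds since the epoch,
    as Firebase timestamps usually are) to its /archive/YYYY bucket."""
    year = datetime.fromtimestamp(created_at_ms / 1000, tz=timezone.utc).year
    return f'/archive/{year}'
```

Moving a conversation out of the current cycle then becomes: read it from `/conversations`, write it under `archive_path(created_at)`, and delete the original.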

Since you won't be viewing all data on a client at one time, you can draw from the current list (which will be fast) and when clients "view more" past the current cycle, load up prior cycles accordingly. 

An approach like this will be quite resilient and efficient. Additionally, if your app scales to the point where you need to scale across multiple database shards, you'll be well positioned to start doing so.

☼, Kato




--

Kato Richardson | Developer Programs Eng | kato...@google.com | 775-235-8398
