Firebase Realtime Database: Fan-out costs and limitations

fionbio

Sep 18, 2017, 1:16:05 AM
to Firebase Google Group
Originally posted here (topic marked for close): https://stackoverflow.com/questions/46263884/firebase-realtime-database-fan-out-costs-and-limitations

I am in the process of evaluating Firebase. Feeds are a core part of the application I'm planning, and I'm going with the fan-out strategy for new user posts, so I'm trying to understand the limitations and costs before committing. I want to fully wrap my head around this world view so I can clearly articulate the considerations to the client, and so I don't build them into a corner. (I hate landmines.)

New user posts would invoke a cloud function through HTTP. The post would be written to the database, and another cloud function would execute in response to onCreate/onWrite for that database path to do the fan-out. (Aside: can this be made more efficient?)
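
To make the flow concrete, here is a minimal sketch of the two functions in TypeScript against the firebase-functions / firebase-admin SDKs. The paths /posts/{userId}/{postId}, /followers/{userId} and /feed/{followerId}/{postId}, and the HTTP request shape, are my assumptions for illustration, not a settled schema:

    import * as functions from 'firebase-functions';
    import * as admin from 'firebase-admin';

    admin.initializeApp();

    // HTTP endpoint: accept a new post and write it under the author.
    // (No auth or validation here; this is only a sketch.)
    export const createPost = functions.https.onRequest(async (req, res) => {
      const { userId, text } = req.body;                 // assumed request shape
      const postRef = admin.database().ref(`/posts/${userId}`).push();
      await postRef.set({ text, ts: Date.now() });
      res.status(201).send({ postId: postRef.key });
    });

    // Database trigger: fan the new post out to every follower's feed.
    export const fanOutPost = functions.database
      .ref('/posts/{userId}/{postId}')
      .onCreate(async (snapshot, context) => {
        const { userId, postId } = context.params;

        // One read: the author's follower index ({ followerId: true, ... }).
        const followersSnap = await admin.database()
          .ref(`/followers/${userId}`)
          .once('value');
        const followers = followersSnap.val() || {};

        // One write: a single multi-path update setting
        // /feed/{followerId}/{postId} = true for each follower.
        const updates: { [path: string]: true } = {};
        for (const followerId of Object.keys(followers)) {
          updates[`/feed/${followerId}/${postId}`] = true;
        }
        return admin.database().ref().update(updates);
      });

Batching the per-follower writes into one update() call is just one option; whether that is the most efficient shape is exactly the aside above.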

Let's consider cost, then consider limitations. Costs for fan-out for a user with 1,000 followers:

Cloud Functions pricing (network egress, per-invocation cost, and the free tier ignored). Resource tier: 128 MB, 200 MHz @ $0.000000231 per 100 ms of execution, i.e. cost = ms × (price per 100 ms / 100) × invocations. For functions that take 3 seconds to complete, with 1 million invocations: 3000 × 0.00000000231 × 1,000,000 ≈ $6.93, call it $7.
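
For concreteness, the same arithmetic as a tiny script (same assumptions as above: the 128 MB / 200 MHz tier, 3 s per invocation, 1 million invocations):

    // Cloud Functions compute cost, using the numbers above.
    const pricePer100ms = 0.000000231;   // USD, 128 MB / 200 MHz tier
    const durationMs = 3000;             // assumed 3 s per fan-out
    const invocations = 1e6;

    const computeCost = (durationMs / 100) * pricePer100ms * invocations;
    console.log(computeCost.toFixed(2)); // "6.93"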

Realtime Database pricing (ignoring data storage cost). Per the pricing page, the real cost for the proposed app is data leaving the database going anywhere, including to your cloud function, at $1/GB. This is confusing at first, because Cloud Functions pricing says inbound data is free, which suggests your only costs are for data moving to the outside world. For each post I would read the tree of followers for the given user from /followers/{userId}, then write the post into each follower's feed at /feed/{followerId}/{postId}: true. The data out would cost the size of that tree plus protocol overhead, and the data in would be free; I don't see any cost for database compute time. The documentation states protocol overhead is ~5% and that data is not compressed, so: cost = (outBound + outBound × 0.05) × invocations.

To estimate the cost of fetching all 1,000 followers, I typed this as a rough approximation of each row of data that comes back per request: {"v4NXj7AS1dUhl83V2XhZ5i9tzq42":true} ≈ 37 characters / 37 bytes (UTF-8). Let's call it 50 bytes to allow for error. 50 × follower count = outBound = 50 × 1,000 = 50 KB.

Cost = (50 KB + 50 KB × 5%) × 1,000,000 = 52,500,000 KB = 52.5 GB ≈ $52.50, call it $53.
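
And the data-out side of the estimate as a script, using the padded 50-byte-per-follower figure from above:

    // Realtime Database data-out cost for the follower fetches.
    const bytesPerFollower = 50;      // padded per-follower estimate from above
    const followerCount = 1000;
    const protocolOverhead = 0.05;    // ~5% per the docs
    const posts = 1e6;
    const pricePerGB = 1;             // USD per GB leaving the database

    const bytesOut = bytesPerFollower * followerCount * (1 + protocolOverhead) * posts;
    const egressCost = (bytesOut / 1e9) * pricePerGB;
    console.log(egressCost.toFixed(2)); // "52.50"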

So the cost for someone with 1,000 followers to post 1 million times is $53 + $7 = $60. OK, and since that is never going to happen, I have an upper bound I'm happy with. Now let's talk limitations.

Cloud Functions limits. Focusing on limitations that cannot be increased and that may be a problem, I only see "Max concurrent invocations for non-HTTP functions: the maximum concurrent invocations of a single function that is not triggered by HTTP is 1,000." I assume a database trigger like onCreate counts toward this. Does that mean that if each function takes 3 s to execute, as mentioned above, my throughput would be 1000 / 3 s, or around 333 invocations per second? Just getting a feel here. Largely I don't suspect Cloud Functions limits will be a problem.

Realtime Database limits. I don't see any write limits except for "Bytes written: 64 MB/minute. The total bytes written through simultaneous write operations on the database at any given time." Let's call the write output twice the size of the follower fetch from above, because the full path is written along with the key; that would be ~100 KB per fan-out. 64 MB/minute / 0.1 MB = 640. So basically the database can only absorb around 640 fan-outs per minute, on average, from users with 1,000 followers. OK, that gives me a sense of scale per database.
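
Putting the two ceilings side by side (same assumptions: 3 s per invocation, ~100 KB written per 1,000-follower fan-out):

    // Rough throughput ceilings implied by the two quotas above.
    const maxConcurrent = 1000;         // non-HTTP concurrent invocation cap
    const fnDurationSec = 3;            // assumed execution time
    const fanOutsPerSecond = maxConcurrent / fnDurationSec;       // ≈ 333

    const writeBudgetMBPerMin = 64;     // Realtime Database bytes-written limit
    const fanOutWriteMB = 0.1;          // ~100 KB per 1,000-follower fan-out
    const fanOutsPerMinute = writeBudgetMBPerMin / fanOutWriteMB; // = 640

    console.log(fanOutsPerSecond, fanOutsPerMinute);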

I think this gives me a reasonably clear sense of the kind of scale I can expect, at what cost, and for what upfront and ongoing investment.

Finally, we can move to https://cloud.google.com/datastore/ as a clean fallback if the Realtime Database gives us any grief along any dimension: a backout plan for contingencies that doesn't require broad changes. Pleased.

Is this estimate sensible? What do you think?

Other open questions:

  1. What do I do if something goes rogue, e.g. someone misusing the API and running up costs? Blacklist them? Would I know in time? With things like Cloud Functions, costs are unbounded, whereas with instances and fixed infrastructure, costs are much more bounded. Hmm, unclear.
  2. How long does it take for the function to execute? How much variability is there and where would it come from?
  3. How consistent are write-times in the firebase realtime database?
  4. How do I choose the cloud function resource tier I want? I just decided I didn't need more than 128 MB of RAM, but maybe I need more CPU to cut the running time? The function is I/O bound, so I guess this tier makes sense as a starting point.