Multitenancy Sharding?

Stefano Benatti

Oct 20, 2014, 10:25:20 AM
I don't know if sharding is the correct concept, so I will describe my problem and I am open to suggestions.

I'm developing a very large government app, which is going to be used by many organizations and offices. Each organization / office may have its own domain / subdomain, all pointing to the same application, but each showing a different layout and content. I have two models: one called ODA (the documents to be indexed) and another called Network (which represents the domain being accessed). Network has an N:N relation with ODAs, and most Networks won't have all ODAs.

I'm also using MongoDB, because Networks can configure which fields they are going to use for each ODA (and even add new ones). A search inside a network should only:
  1. find ODAs that belong to that network
  2. search fields / values that are enabled in that network (more about this: say I have a field called category_ids that stores all category IDs selected by all networks for an ODA; when searching inside a given network I only care about the IDs that belong to that network, ignoring all others)
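
To make the two rules concrete, here is a rough sketch in plain Ruby. It is not Sunspot or MongoDB code: the `Oda` and `Network` structs, their attributes, and `visible_odas` are hypothetical stand-ins I made up for illustration.

```ruby
require "set"

# Hypothetical in-memory stand-ins for the real ODA and Network models.
Oda = Struct.new(:id, :network_ids, :category_ids)
Network = Struct.new(:id, :category_ids)

# Rule 1: keep only ODAs attached to the given network.
# Rule 2: keep only the category IDs that the network itself owns.
def visible_odas(odas, network)
  owned = network.category_ids.to_set
  odas
    .select { |oda| oda.network_ids.include?(network.id) }
    .map do |oda|
      Oda.new(oda.id, [network.id], oda.category_ids.select { |c| owned.include?(c) })
    end
end

odas = [
  Oda.new(1, [10, 20], [100, 200]), # shared by networks 10 and 20
  Oda.new(2, [20], [200])           # only in network 20
]
net10 = Network.new(10, [100])
result = visible_odas(odas, net10)
```

With this sample data, network 10 should see only ODA 1, and only category 100 on it.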

Since I don't want to end up with a ridiculously big index (which will happen, because there is a lot of duplicated information, given that each network can change the fields / values for the same ODA), I was thinking of sharding the search index, so that there is one ODA search index per network that only indexes ODAs belonging to that specific Network, and only the fields / values enabled in that network. This would allow horizontal growth, and I wouldn't need to be concerned about database size (which will skyrocket), because each search index will always be a lot smaller. I would only access the database to fetch individual records, never to run a huge search.
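
Roughly, the per-network index I have in mind would be built like the sketch below. This is plain Ruby with in-memory hashes standing in for the real search index; the data shapes (`values` keyed by network ID, the `enabled_fields` map) and `build_network_indexes` are assumptions for illustration, not my actual schema or anything Sunspot provides.

```ruby
# Hypothetical shapes: each ODA stores its field values per network,
# and each network lists the fields it has enabled.
odas = [
  { id: 1, values: { 10 => { title: "Budget", year: 2014 },
                     20 => { title: "Budget report" } } },
  { id: 2, values: { 20 => { title: "Tender", year: 2013 } } }
]
enabled_fields = { 10 => [:title], 20 => [:title, :year] }

# Build one small document list per network: an ODA appears only in the
# networks it belongs to, carrying only the fields enabled there.
def build_network_indexes(odas, enabled_fields)
  indexes = Hash.new { |hash, network_id| hash[network_id] = [] }
  odas.each do |oda|
    oda[:values].each do |network_id, fields|
      allowed = enabled_fields.fetch(network_id, [])
      indexes[network_id] << { id: oda[:id] }.merge(fields.slice(*allowed))
    end
  end
  indexes
end

indexes = build_network_indexes(odas, enabled_fields)
```

Here network 10 ends up with a single document for ODA 1 holding only `title`, while network 20 gets both ODAs with whatever `title` / `year` values it has set.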

How do I go about implementing something like this with Sunspot?

I already have an initial implementation of those ODAs with 'dynamic fields', but so far I only ignore / exclude values that do not belong to the current network. Ideally, the search would not even index or return information that does not belong to its context.