Need modelling help for multi-tenant CMS in MongoDB

Tim Hardy

May 6, 2015, 11:33:52 PM

to mongod...@googlegroups.com

I'm creating a CMS using MongoDB. I'm having trouble settling on a basic modelling strategy and would appreciate some help.

The basic use case is like many out there: each tenant can create their own schema. For example, tenant1 can say "I want to have Products, and each product will have a Name, Description, Quantity, and a Price", while tenant2 could say "I want to have Bands, and each band will have a Name, HomeCity, MusicCategory, etc." Tenants could define multiple content types and also specify whether a field is searchable. Tenants would then be able to create, update, and query entries of these types for use on their websites.
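
To make the use case concrete, here is a minimal Python sketch of how tenant-defined content types might be represented and checked. The names FieldDef, ContentType, and py_type, as well as the validation rules, are illustrative assumptions and not part of MongoDB or any existing CMS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldDef:
    name: str
    py_type: type             # hypothetical: Python type used for a basic check
    searchable: bool = False  # the tenant decides which fields are searchable


@dataclass
class ContentType:
    tenant_id: str
    name: str
    fields: List[FieldDef] = field(default_factory=list)

    def validate(self, entry: dict) -> List[str]:
        """Return a list of problems; an empty list means the entry fits."""
        known = {f.name: f for f in self.fields}
        problems = [f"unknown field: {k}" for k in entry if k not in known]
        for name, f in known.items():
            if name in entry and not isinstance(entry[name], f.py_type):
                problems.append(f"{name}: expected {f.py_type.__name__}")
        return problems

    def searchable_fields(self) -> List[str]:
        return [f.name for f in self.fields if f.searchable]


# tenant1's "Products" type from the example above
products = ContentType("tenant1", "Product", [
    FieldDef("Name", str, searchable=True),
    FieldDef("Description", str),
    FieldDef("Quantity", int),
    FieldDef("Price", float, searchable=True),
])

# tenant2's "Bands" type
bands = ContentType("tenant2", "Band", [
    FieldDef("Name", str, searchable=True),
    FieldDef("HomeCity", str),
    FieldDef("MusicCategory", str, searchable=True),
])
```

Which fields are marked searchable here is arbitrary; the point is only that the schema itself is data owned by the tenant.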

Here are the three solutions I've read about and seen presented and my concerns with each:

1. Have two collections, schema and data, and keep all tenants' data together in these two collections. My main concern with this is the indexing/searching. If each tenant can create multiple unique data types and mark one or more fields in each as searchable, it seems like indexing would be a nightmare.

2. Have the above pair of collections (schema and data) for each tenant. I've read that having tens of thousands of collections is generally a bad idea, but I don't understand exactly why, or whether WiredTiger makes any difference now. I also don't know if this would completely solve the indexing concerns that solution 1 has.

3. Create a database per tenant. If we plan on having tens of thousands of tenants, this seems untenable.
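
For the indexing concern in option 1, one approach sometimes described for MongoDB is the "attribute pattern": searchable fields are copied into an array of key/value pairs so that a single compound index can cover every tenant's searchable fields. The sketch below only builds the documents and query filters as plain dicts. The field names (tenant_id, type, attrs, data) and both helper functions are assumptions for illustration, and whether this performs acceptably across tens of thousands of tenants is not something this sketch demonstrates.

```python
def to_stored_doc(tenant_id, content_type, entry, searchable):
    """Split an entry into an indexed attribute array and the full payload."""
    return {
        "tenant_id": tenant_id,
        "type": content_type,
        # Only fields the tenant marked searchable go into the indexed array.
        "attrs": [{"k": k, "v": v} for k, v in entry.items() if k in searchable],
        "data": dict(entry),
    }


def search_filter(tenant_id, content_type, field_name, value):
    """Build a filter that the compound index below is meant to serve."""
    return {
        "tenant_id": tenant_id,
        "type": content_type,
        # $elemMatch requires k and v to match within the same array element.
        "attrs": {"$elemMatch": {"k": field_name, "v": value}},
    }


# One index for the whole shared "data" collection, written as the list of
# (key, direction) pairs that PyMongo's create_index accepts.
SHARED_INDEX = [("tenant_id", 1), ("type", 1), ("attrs.k", 1), ("attrs.v", 1)]

doc = to_stored_doc(
    "tenant1", "Product",
    {"Name": "Widget", "Price": 9.5, "Quantity": 3},
    searchable={"Name", "Price"},
)
query = search_filter("tenant1", "Product", "Name", "Widget")
```

The trade-off is that every searchable value is stored twice (once in attrs, once in data), in exchange for not creating a new index each time a tenant adds a searchable field.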

The front-runner so far is number 1, using a separate tool, like Solr or Elasticsearch, to handle querying the data.

Any thoughts or modelling strategies will be greatly appreciated. I know there are existing CMSs out there that use MongoDB and have these exact same features. I'd really like to know how they handle the indexing and querying of such widely varying data across tens of thousands of tenants.

s.molinari

May 7, 2015, 10:04:23 AM

to mongod...@googlegroups.com

I've also looked into this and I've come to the conclusion that the best solution is a database per tenant/customer. With WiredTiger it works even better, because you don't get the pre-allocation of a large amount of disk space for a relatively small amount of data that MMAPv1 has. The advantages of a database per tenant are: easier administration of the individual databases; data import and export can run with the normal tools; one tenant won't directly get in the way of another; physical separation of data; and the ability to more easily scale out large customers to their own instances.

If the system is automated (with software like Scalr for management and MMS for monitoring), the number of databases shouldn't really be much of an issue. One caveat, though: it isn't advisable to have thousands of databases running in one instance of MongoDB (from what I've been told). We are looking at running "pods" of clustered servers. The advantage is that if one pod has issues, only the tenants on that pod are affected, not all of them. This is also how Salesforce.com runs their SaaS platform: although they keep all tenants in one shared database, the tenants run on pods of servers. They needed a single database because of their Oracle back-end and their NoSQL-over-RDBMS approach. Well, it is still SQL, but they had to rework how data is stored so that it behaves more like NoSQL storage with flexible schemas. At any rate, with Mongo, the tricks they went through aren't necessary.
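
As a rough illustration of the pod idea, here is a minimal Python sketch that assigns each tenant to a pod. The pod names, the OVERRIDES table, and the hash-based default are all assumptions made for the example, not a description of how any particular platform does it.

```python
import hashlib

PODS = ["pod-a", "pod-b", "pod-c"]  # hypothetical pod names

# Explicit placements, e.g. a large customer moved to its own instances.
OVERRIDES = {"big-customer": "pod-dedicated-1"}


def pod_for(tenant_id: str) -> str:
    """Return the pod that hosts this tenant's database."""
    if tenant_id in OVERRIDES:
        return OVERRIDES[tenant_id]
    # Use a stable digest; Python's built-in hash() is salted per process
    # for strings, so it would give different answers across restarts.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return PODS[int.from_bytes(digest[:8], "big") % len(PODS)]
```

Note that a plain hash-modulo scheme reassigns many tenants whenever the number of pods changes, so in practice a stored tenant-to-pod directory is probably the safer choice; the hash is only a default for tenants that have no explicit placement yet.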

Hope that helps.

Scott

Tim Hardy

May 7, 2015, 11:19:16 AM

to mongod...@googlegroups.com

Thanks for your response. That is helpful. All the articles and comments I have read stated that using multiple databases wasn't scalable, so it's good to hear that WiredTiger changes that to some degree.

Perhaps we could use a single database for freemium users, and split out separate databases for anyone who upgrades. I still imagine that having thousands of empty databases, created by people just taking advantage of a free trial or tinkering, wouldn't be good. Regardless, I will try it. I'll spin up tens of thousands of databases and see where the pain points are.

s.molinari

May 7, 2015, 11:52:39 AM

to mongod...@googlegroups.com

If you go with thousands of users in one DB, your application itself would need to be built considerably differently than if you had thousands of databases: you'd need tenant separation at the document level. If you do have a freemium offer, then one database per tenant might be overkill. Then again, you could tell freemium customers that if there is no activity for X days, the database will be deleted automatically, and that can be automated. The next possibility (and what we were considering with MMAPv1) is to host a group of smaller customers in one database. This would mean the application must have tenant separation at the collection level, which would be less work and less of a concern than separation at the document level. However, with WT, we aren't going to worry about this, as the database footprint on disk is much smaller than with MMAPv1. Again, don't forget that you can't host thousands of databases in one instance of Mongo. That might be what those people meant about it not scaling well. That would be the case with any database though, not just Mongo.
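
The difference between the two separation levels can be sketched in a few lines of Python. Both helpers below (scoped_filter and tenant_collection), the tenant_id field name, and the naming scheme are hypothetical; they only show where the tenant boundary is enforced in each case.

```python
def scoped_filter(tenant_id: str, user_filter: dict) -> dict:
    """Document-level separation: every query is forced onto one tenant."""
    if "tenant_id" in user_filter and user_filter["tenant_id"] != tenant_id:
        raise ValueError("filter targets a different tenant")
    # Top-level keys in a MongoDB filter are ANDed together, so adding
    # tenant_id narrows whatever the caller asked for.
    return {**user_filter, "tenant_id": tenant_id}


def tenant_collection(tenant_id: str, kind: str) -> str:
    """Collection-level separation: each tenant gets its own collections."""
    if not tenant_id.isalnum():
        raise ValueError("tenant_id must be alphanumeric")
    return f"{tenant_id}_{kind}"
```

With document-level separation, every code path that builds a query has to go through something like scoped_filter, and missing it once leaks data across tenants. With collection-level separation, the tenant is chosen once when the collection is picked, which is why it tends to be less work to get right.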

Scott

Tim Hardy

May 7, 2015, 4:33:25 PM

to mongod...@googlegroups.com

I started thinking about the implementation on this... how do you handle connections with 1000s of databases? There'd be no way to take advantage of connection pooling on your server, which would cause an inherent slowdown over any solution with connection pooling.

s.molinari

May 7, 2015, 5:17:38 PM

to mongod...@googlegroups.com

Like I said, you need to have multiple instances of MongoDB.
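
On the connection side, a minimal Python sketch of routing tenants across multiple instances could look like the following. It assumes a driver where the connection pool belongs to the client for a given server rather than to a database, which is how PyMongo's MongoClient behaves as far as I know, so tenants whose databases live on the same instance can share one pooled client. The TenantRouter class, the tenant-to-URI mapping, and the tenant_<id> database naming are all hypothetical.

```python
from typing import Callable, Dict


class TenantRouter:
    """Keep one client (and so one pool) per instance, not per tenant."""

    def __init__(self, tenant_to_uri: Dict[str, str],
                 client_factory: Callable[[str], object]):
        self._tenant_to_uri = tenant_to_uri
        # With PyMongo, client_factory would be pymongo.MongoClient.
        self._client_factory = client_factory
        self._clients: Dict[str, object] = {}

    def database(self, tenant_id: str):
        uri = self._tenant_to_uri[tenant_id]
        if uri not in self._clients:
            # Created once per instance URI and reused afterwards.
            self._clients[uri] = self._client_factory(uri)
        # Indexing a client by name yields a handle to that database.
        return self._clients[uri][f"tenant_{tenant_id}"]
```

The number of open clients then grows with the number of instances, not with the number of tenant databases.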

Scott
