Multi Tenant setup with Database Sharding sharing same API


Wesley van Rensburg

Jan 23, 2015, 7:54:24 PM
to loopb...@googlegroups.com
Hi, we are trying to set up the following.

- An API that is shared across tenants
- All tenants get their own access token that would identify their Database Shard

We managed to get something working where we dynamically associate the datasources to models at request time, but ran into the issue outlined here:
when you change the datasource on a model, it is changed globally across all incoming requests.

Does anyone have experience or examples on how to do this in Loopback?

Thanks!
Wes

Wesley van Rensburg

Jan 27, 2015, 1:58:50 PM
to loopb...@googlegroups.com
Anyone??

Raymond Feng

Jan 27, 2015, 2:33:29 PM
to Wesley van Rensburg, loopb...@googlegroups.com
We don't directly support per-request datasources at the moment. One thing you can try is to override the following methods for your model:

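// Instance level: prefer a data source set on the instance, else fall back to the model class.
Model.prototype.getDataSource = function () {
  return this.__dataSource || this.constructor.dataSource;
};

// Class level: return the data source the model class is attached to.
Model.getDataSource = function () {
  return this.dataSource;
};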

Thanks,

---
Raymond Feng
Co-Founder and Architect @ StrongLoop, Inc.

StrongLoop makes it easy to develop APIs in Node, plus get DevOps capabilities like monitoring, debugging and clustering.

--
You received this message because you are subscribed to the Google Groups "LoopbackJS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to loopbackjs+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Wesley van Rensburg

Jan 27, 2015, 7:39:26 PM
to loopb...@googlegroups.com, wesley.va...@gmail.com
Thanks for that guidance. We did try that already, but we are having a lot of issues with Roles and ACLs. Roles for us are at the tenant level, so we have the same issue: ACLs are held in memory and shared across requests, which causes 401s. Perhaps there is a suggested way to override the built-in ACL and Role data sources as well?

Also, when we overrode the getDataSource functions in a new custom model (without ACLs and Roles), it still didn't work; it only worked when we called "attachTo" to link the datasource and model and vice versa. But that, again, stores the datasource in memory, creating a race condition between requests.

Would it perhaps be worthwhile to fork the datasource-juggler or connector-mongodb to provide the tenant functionality we require?

We really like the framework a lot and want to try to make it work. It would need to work with hundreds of tenants and millions of users.

Thanks!

Raymond Feng

Jan 27, 2015, 7:51:44 PM
to Wesley van Rensburg, loopb...@googlegroups.com
You can try to override the methods on PersistedModel, as other models extend from it. BTW, ACL and Roles can be backed by other data sources too. See server/model-config.json.

Thanks,

---
Raymond Feng

drywo...@gmail.com

Nov 20, 2015, 8:53:10 AM
to LoopbackJS
Hi Wesley,

I am currently trying to achieve the same thing (one shared API across multiple tenants with separate data sources).
Were you able to get it working for your use case in this way (by deciding which data source to use for the tenant at request time)?

Or did you solve it in a different way? I'd appreciate any guidance from someone who has already walked that road :)
I was also thinking about hosting separate REST endpoints for each tenant, e.g.


Is this something you also made use of? Or did you identify the tenants for each request in a different way (request parameters?)

Thanks & Regards
Wolfgang

Wesley van Rensburg

Nov 20, 2015, 10:30:29 AM
to loopb...@googlegroups.com
Hi Wolfgang,

Yes. In summary, we solved it by overriding the getDataSource method.

The details:

1) Created a central database that holds all usernames and pointers to their tenants and serves as a tenant locator

2) Created a middleware component that catches the request right after loopback#token. AccessToken and the other LoopBack models all point to the central database, which allows the built-in access token and user lookups to happen by default. Once we know who the user is, we set their tenant on the accessToken / current context

3) We created a new base model with a structure like this:

- Model
  - PersistedModel
    - NewBaseModel
      - All Our Models

We overrode the getDataSource method on NewBaseModel. It looks at the tenant set on the current context / access token and returns the datasource for that tenant. To do this efficiently (sort of), we boot all tenant datasources at startup and make them available at request time inside getDataSource. We have to call Model.attachTo(datasource) to allow this to happen, so it does cost a little, but in our experience not enough to be a concern at our load.
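
The lookup part of step 3 can be illustrated with a minimal, self-contained sketch in plain Node.js. It is not LoopBack code and not the implementation described above: Node's AsyncLocalStorage is swapped in as a stand-in for the LoopBack current context, and the names `requestContext`, `tenantDataSources`, and `resolveDataSource` are hypothetical.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

// Stand-in for the per-request context (assumption: in LoopBack the
// current context would play this role).
const requestContext = new AsyncLocalStorage();

// Hypothetical registry filled once at startup, one entry per tenant.
const tenantDataSources = new Map([
  ['tenantA', { name: 'db-tenantA' }],
  ['tenantB', { name: 'db-tenantB' }],
]);

// Resolve the data source for whichever tenant the current request belongs to.
function resolveDataSource() {
  const store = requestContext.getStore();
  const tenantId = store && store.tenantId;
  const ds = tenantDataSources.get(tenantId);
  if (!ds) {
    throw new Error('No data source for tenant: ' + tenantId);
  }
  return ds;
}

// Usage: middleware would wrap each request in requestContext.run(...).
const nameForA = requestContext.run({ tenantId: 'tenantA' }, () => resolveDataSource().name);
const nameForB = requestContext.run({ tenantId: 'tenantB' }, () => resolveDataSource().name);
console.log(nameForA, nameForB);
```

Because the lookup only reads from a registry built at startup, nothing shared is mutated at request time in this sketch; whether that is achievable inside a real getDataSource override depends on the attachTo requirement mentioned above.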

We do have plans to change that override and push it to the connection level, rather than swapping the datasources (which reloads and reconfigures all models), but we haven't started that yet.

Hope this helps!

Thanks!
Wes




drywo...@gmail.com

Nov 23, 2015, 4:18:09 AM
to LoopbackJS
Thanks Wesley,

that already helped a lot as guidance for a potential implementation in our use case.
I wonder what the LoopBack team's take is on this kind of multi-tenancy use case, or on other multi-tenancy use cases with respect to the LoopBack architecture in general.

Searching for "tenant"-related questions here in the discussion boards gives a lot of results. I'm wondering whether there is potential for a collaborative effort to better integrate and support these kinds of use cases.


Thanks & Regards,
Wolfgang

drywo...@gmail.com

Nov 26, 2015, 7:47:42 AM
to LoopbackJS
Hi Wesley,

if I may ask one more question about your implementation: what client technology are you using to access the server-side LoopBack models that you described?
I guess that the models on the server are made available via REST for clients to access.

To access those REST endpoints, are you using something like the loopback-connector-remote (https://github.com/strongloop/loopback-connector-remote), or are you using some other JS AJAX client in the browser?
How do you handle the active tenant on the client side and during requests to the server?
I would assume that the active tenant ID has to be included in the requests sent to the REST services, so that the server can tell which tenant's datasource to query.

I'd be thankful for some hints on how you handled the client part of your solution.


Thanks again & Regards,
Wolfgang

Fernando Tóffolo

Nov 26, 2015, 1:59:50 PM
to LoopbackJS
I'm working on something similar (all tenants share a DB), and what I did (still experimenting) is put each tenant on a separate subdomain (clientA.project.com, clientB.project.com; you could also use something like project.com/clientA). That way I can read the host of the request and work out which tenant to filter by. Then I added a mixin (I think that is the name) to intercept every request, check the host for the tenant, and add the filter by tenantId. That way every request is filtered by tenant no matter what.

But my project is a little simpler because each user can belong to only one tenant, so I can save a tenantId property on the user model (a User extension) and use that to determine the tenant. That way users from different companies can all be at project.com/somemodel and only see the data from the tenant they belong to.
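
A rough sketch of the two pieces described here, assuming the tenant key is the first label of the host name and that queries use a LoopBack-style `where` filter. The helpers `tenantFromHost` and `withTenantFilter` are hypothetical names for illustration, and the mixin wiring itself is left out.

```javascript
// Hypothetical helper: take the first label of the host name as the tenant key.
function tenantFromHost(host) {
  const hostname = String(host || '').split(':')[0]; // drop an optional port
  const labels = hostname.split('.');
  // Only treat the first label as a tenant when there is a subdomain at all.
  return labels.length > 2 ? labels[0].toLowerCase() : null;
}

// Combine the caller's filter with a mandatory tenant condition.
function withTenantFilter(filter, tenantId) {
  const base = filter || {};
  const tenantWhere = { tenantId: tenantId };
  // Keep any existing condition and AND it with the tenant condition.
  const where = base.where ? { and: [base.where, tenantWhere] } : tenantWhere;
  return Object.assign({}, base, { where: where });
}

const tenant = tenantFromHost('clientA.project.com:3000');
const scoped = withTenantFilter({ where: { status: 'open' }, limit: 10 }, tenant);
console.log(JSON.stringify(scoped));
```

Wrapping the caller's condition in `and` rather than merging keys means a client-supplied `tenantId` in the original filter cannot replace the enforced one.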

Wesley van Rensburg

Nov 26, 2015, 2:59:14 PM
to loopb...@googlegroups.com
Hi Wolfgang,

Yes, we use the loopback-connector-remote. All users are required to log in. When they log in, we look in the central database for their tenant and generate an AccessToken through LoopBack's regular token system (it lives in central). Once the user has a token, they use it on every API call. Our SaaS product has one domain (no subdomains per tenant), and all API requests are made with the token that identifies their tenancy. That's where step 2 from my earlier message comes in: using middleware, we look up the user's tenant database and swap in their datasource on every request.

We did benchmark the cost of swapping versus putting each tenant under its own subdomain (with a dedicated datasource), but the cost was so minimal that we abandoned that idea.

Thanks!
Wes

drywo...@gmail.com

Dec 7, 2015, 10:23:06 AM
to loopb...@googlegroups.com
Thanks for the reply and all the help, Wesley.

We are now implementing this approach in our architecture. Unfortunately, we have problems using the LoopBack context for storing the tenant ID.
Did I understand correctly that you are using the LoopBack context functionality (https://docs.strongloop.com/display/public/LB/Using+current+context) to pass the tenant ID from the client to the server on each request, and that on the server you extract the tenant ID from the context object again (using the loopback.getCurrentContext() API)?

In our code we define a custom middleware, just as shown in the example from the link above, but when we call loopback.getCurrentContext() it always returns null, although the context middleware is attached to app.
So far we have been unable to figure out what is causing this. There are also some older issues on the LoopBack GitHub page about similar problems.

Is this a bug in LoopBack that you also experienced? Were you able to apply a fix for it?

Thanks again.

Regards,
Wolfgang

Wesley van Rensburg

Dec 14, 2015, 12:29:21 PM
to loopb...@googlegroups.com
We haven't run into those issues yet. I think it's important, though, where you place your middleware functions. We initialize the LoopBack context in middleware and then plug in other middleware functions that use it. Below is an example of what our middleware.json looks like:

{
  ...
  "initial": {
    "loopback#context": {
      "params": {
        "enableHttpContext": true
      }
    }
  },
  "session": {
  },
  "auth": {
    "loopback#token": {},
    "this is where we look at the token, and lookup which tenant the user making the request belongs to. We put the tenantId back onto current context here": {}
  },
  ...
  "routes:before": {
    "here we do user and tenant context loading. We load all known things about the tenant, and all known things about the user, and attach it back on to the current context": {}
  },
  ...
}
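
For illustration only, the placeholder entry in the "auth" phase could be an Express-style middleware along these lines. This is a sketch under assumptions, not the actual middleware: it assumes the token middleware has already set `req.accessToken`, and `createTenantResolver`, `lookupTenant`, and the `context` object are hypothetical stand-ins for the central-database lookup and the current context.

```javascript
// Build an Express-style middleware. Both arguments are hypothetical stand-ins:
// lookupTenant(userId, cb) queries the central database,
// context exposes set(key, value) like a per-request context would.
function createTenantResolver(lookupTenant, context) {
  return function tenantResolver(req, res, next) {
    const token = req.accessToken; // assumed to be set earlier by the token middleware
    if (!token) {
      return next(); // anonymous request: nothing to resolve
    }
    lookupTenant(token.userId, function (err, tenantId) {
      if (err) {
        return next(err);
      }
      context.set('tenantId', tenantId);
      next();
    });
  };
}

// Usage with in-memory stand-ins instead of a real database and request.
// A plain Map is used here; a real context would be scoped to one request.
const users = { 42: 'tenantA' };
const context = new Map();
const resolver = createTenantResolver(
  (userId, cb) => cb(null, users[userId]),
  context
);
let nextCalled = false;
resolver({ accessToken: { userId: 42 } }, {}, () => { nextCalled = true; });
console.log(context.get('tenantId'), nextCalled);
```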

Hope that structure helps

Thanks!
Wes


Wolfgang

drywo...@gmail.com

Jan 11, 2016, 5:26:14 PM
to loopb...@googlegroups.com
Thanks Wesley for all the help.

We had some trouble figuring this out initially, but following your advice I think we now have a working solution.
(One problem we are still experiencing is caused by an MS SQL connector bug: when querying relations on models, the LoopBack context is not passed along and the wrong data source might be queried.)

I have created a simple sample application that maybe others can refer to in the future if they want to implement or test similar things concerning multi-tenancy: https://github.com/drywolf/loopback-full-stack-tenancy


Regards,
Wolfgang
 

drywo...@gmail.com

Jan 15, 2016, 2:44:47 PM
to loopb...@googlegroups.com
Hi,

I have just started testing my implementation of this approach for performance and stability, i.e. I'm trying it out with multiple browser instances, each sending many requests per second.
I think there is a concurrency problem in this approach when switching the data source between models like that.

If HTTP requests from the clients come in faster than the database responds to the SQL requests that LoopBack emits, there is the possibility of a model's datasource being changed right before the SQL query is dispatched.
This causes the SQL query to be executed against the wrong data source, so it returns data that should not have been returned (from a different tenant).
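
The interleaving described above can be reproduced in a small, deterministic simulation. This is plain JavaScript, not LoopBack: `SharedModel`, `pendingQueries`, and `handleRequest` are hypothetical names, and the "dispatch later" step is modelled with a queue instead of real asynchronous I/O.

```javascript
// One model object shared by every request in the process.
const SharedModel = { dataSource: null };

// Queries that have been prepared but not yet dispatched.
const pendingQueries = [];

function handleRequest(tenantId) {
  // Step 1: point the shared model at this tenant's data source.
  SharedModel.dataSource = 'db-' + tenantId;
  // Step 2: the query is dispatched later and reads the shared state at that time.
  pendingQueries.push(function dispatch() {
    return { requestedBy: tenantId, executedOn: SharedModel.dataSource };
  });
}

// Two requests arrive before either query has been dispatched.
handleRequest('tenantA');
handleRequest('tenantB');

// Dispatch both queries; the first one now sees tenantB's data source.
const results = pendingQueries.map(function (dispatch) { return dispatch(); });
console.log(results);
```

In this simulation the query issued for tenantA runs against `db-tenantB`, because the shared model was re-pointed between the swap and the dispatch.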

@Wesley: have you done such smoke tests with your implementation? Did you get any problems like that?

PS: I'm executing requests from the browser at intervals of 10 ms to 100 ms between requests. This already gives me some inconsistent responses with two browser windows open at the same time (sometimes it even messes up in a single browser window by itself).
PPS: Which database system are you using as a backend?


Regards,
Wolfgang

Wesley van Rensburg

Jan 15, 2016, 3:27:17 PM
to loopb...@googlegroups.com
Hi,

Yes, we did run concurrency tests and continue to run them. Our tests were 100% successful.

Just to point out, we overrode the "getDataSource" method associated with the model in juggler, where we return the datasource specific to the tenant identified in the context.
Before we return the datasource, we attach it to the instance of the model using the "attachTo" method.

Thanks!
Wes

drywo...@gmail.com

Jan 17, 2016, 8:48:46 AM
to loopb...@googlegroups.com
Hi,

it looks like the original problems I had when testing were in how I wrote the test rather than in the server implementation (which is the good news).
Now the tests run (almost) OK, but I think there are still some very rare occasions where they can fail. I'll have to figure out where this is coming from; it's approximately 1 failure in 5000, though, and maybe it's still a concurrency bug in the testing code.

About the model getDataSource() override: the juggler is where getDataSource is called, but you put the override into the model itself, right? It's not put into the juggler directly?
... which just monkey-patches a model's getDataSource function to use my own code to switch to the correct data-source on the fly.

PS @Wesley: how do you inject the tenant ID on your client side? I currently use a "remotes.before('**', ...);" hook in the client JS, where I put the tenant ID into the request headers that will be sent to the server.

Thanks,
Wolfgang

Tom Kirkpatrick

Feb 6, 2016, 5:51:03 AM
to LoopbackJS
We've been working on a system to allow multi-tenant style access controls within a single datastore. I've released the initial code here:

Young

Feb 20, 2016, 1:34:07 AM
to LoopbackJS
Hi Wolfgang,

We used a similar approach without issue with the loopback context (also injected via middleware).

Initially, we had issues similar to yours. We realized that creating new datasources within getDataSource() was what messed up the context:
if we call new DataSource() or loopback.createDataSource() in our own .getDataSource(), any data that we injected earlier into the LB context is lost (getCurrentContext still works, but the context seems to be a new one).

Here's what works for us for your reference:

var loopback = require('loopback');

Model.getDataSource = function() {
  // Read the tenant id that middleware stored on the current LoopBack context.
  var nsContext = loopback.getCurrentContext();
  var tenant = nsContext && nsContext.get('tenantId');
  // Model.app.dataSources.trap - write to this ds when unable to switch to right tenant ds
  var cached = Model.app._dsCache[tenant];
  var ds = cached ? cached.ds : Model.app.dataSources.trap;
  Model.attachTo(ds);

  // Switch all related models to the same dataSource
  var relations = Model.relations;
  for (var name in relations) {
    var relatedModel = Model.app.models[relations[name].modelTo.modelName];
    relatedModel.attachTo(ds);
  }
  return ds;
};

For now, we pre-create the dataSources for each tenant and cache them in app._dsCache in server.js (a temporary solution), so that we don't have to call new DataSource() anywhere along the middleware chain.
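
As a rough, framework-free sketch of that pre-creation step: the cache below mirrors the `{ ds: ... }` entry shape used in the snippet above, while `buildDataSourceCache`, `pickDataSource`, and the `createDataSource` factory are hypothetical names, not LoopBack APIs.

```javascript
// Create every tenant data source exactly once, before any request is served.
// createDataSource is a hypothetical factory supplied by the caller.
function buildDataSourceCache(tenantConfigs, createDataSource) {
  const cache = {};
  Object.keys(tenantConfigs).forEach(function (tenantId) {
    cache[tenantId] = { ds: createDataSource(tenantConfigs[tenantId]) };
  });
  return cache;
}

// Request-time lookup: never creates anything, falls back to a "trap" data source.
function pickDataSource(cache, tenantId, trap) {
  const known = Object.prototype.hasOwnProperty.call(cache, tenantId);
  return known ? cache[tenantId].ds : trap;
}

// Usage with a counting stand-in factory.
let created = 0;
const cache = buildDataSourceCache(
  { tenantA: { database: 'a' }, tenantB: { database: 'b' } },
  function (config) { created += 1; return { database: config.database }; }
);
const trap = { database: 'trap' };
console.log(pickDataSource(cache, 'tenantA', trap).database);
console.log(pickDataSource(cache, 'unknown', trap).database);
console.log(created);
```

Keeping creation in one startup function makes it easy to check that no data source is constructed while a request is in flight.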

We are still trying to figure out a better solution/place to create the tenant dataSource.  Any suggestions?

Hope this helps.  Cheers!