Multiple customers - keeping the data separate - how?

Neil Wilson

Nov 6, 2006, 11:03:02 AM
to rubyonra...@googlegroups.com
I'm trying to get a handle on the different ways of maintaining data
separation in Rails. From what I've read, it looks like the security is
usually handled entirely as an aspect within the Model.

I constantly find it amusing that whenever a 'new' way of building
applications is created, it always ignores the security systems that
have gone before. First we had operating system security with its user
and group database. Then we had databases with their own security
model. Now we have web apps reinventing that particular wheel once
again: sitting in a single operating system user space and logging onto
the database with a single all-powerful user.

Unfortunately the application I have in mind involves account data, and
I can't afford a bug in an application exposing one customer's data to
another. I need something more substantial than that. (And there are
other reasons - such as backup). However I still want to share physical
infrastructure.

My thoughts are that there should be a URL per customer driving their
own mongrels locked onto their own version of the database. However the
standard infrastructure support tools don't support that way of doing
things.

Are there any other thoughts about how the security separation should
be enforced?

Rgs

NeilW

CWK

Nov 18, 2006, 11:47:35 AM
to Ruby on Rails: Talk
Neil,

I found this thread because I'm looking at moving a .NET application to Rails
and was looking for suggested ways to accomplish the same thing. I was
curious because it seemed to me that once you had this kind of
requirement, a lot of the magic SQL methods became difficult to
use; but since 37signals' apps are all in this same vein, perhaps there
was a more elegant way. Rails seems absolutely genius for building a
single-client app, but when you add on multi-tenancy and all the
entitlement issues, it starts feeling more like an incremental
improvement. Perhaps I'm missing the right tricks.

The .NET app I was referring to does use (loosely speaking) model-based
security, as do (I'd guess) very nearly all multi-tenancy applications.
Certainly Bank of America does not create individual database
users/instances for each online banking customer. I don't know how
Salesforce.com does it, because they have a huge number of clients on a
very complicated application, but their infrastructure is probably
somewhat exotic and quite certainly expensive.

To me the idea of managing account security through the database is
dreadful. My experience is that you will start out with an entitlement
model that is either too complicated or too simplistic; I've yet to
see one that was right the first time. In either case, security and
entitlement will likely end up as part of your application layer
whether you want it to or not, at least in the sense that you are going
to be turning UI elements on and off based on roles. Also, separate
database instances are, in my humble opinion, a big mistake unless there
is a really great reason. It is far more advantageous to maintain one
database rather than one DB per client.

The approach you seem to want is industrial-strength, to put it
lightly. Is this because clients want it that way, or because you do?
Clients will always ask for more security if you offer it without asking
for more money or taking away features. But I've found that clients don't
really care how security is accomplished so long as you tell them, "it
is taken care of." Perhaps the domain you're in is different, but I've
done several applications like this over the past 8 years and have
never needed to go the route you're speaking of.

Manu J

Nov 18, 2006, 12:03:59 PM
to rubyonra...@googlegroups.com
It would be nice if someone could explain the architecture behind Basecamp.
Are Basecamp instances separate for each customer? That is, does everyone
have their own database, separate Rails directory, etc.?

My very, very naive solution is this:
1. Clone a Rails project to a new directory.
2. Create a new DB for the new client. (This can be from a default SQL script
which sets up the tables etc., so there's no need for migrations.)
3. Point the cloned Rails app to the new DB.
4. Point a subdomain at the new Rails app.

But this becomes a maintenance headache when you have to update the code.
You have to update the cloned apps.

So is there a better way to do it? Can one Rails app handle multiple
sites via subdomains, with a separate DB for each site?

--
Manu

Ross Riley

Nov 18, 2006, 12:25:31 PM
to rubyonra...@googlegroups.com
I'd bet that this is not the case.
I really don't see the need for separate databases, and I'm sure
most, if not all, of the multi-user web apps out there don't do such a thing.

If you build good model-level security - that is, make sure that every
query is constrained to the context of a particular user - then every
piece of additional functionality you build on top will be locked down to the
access provided by your architecture. The key is to make sure you get
that right before you start adding complexity.
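
For example (a rough sketch only - current_user and a Project model that
belongs_to :user are hypothetical names), doing every lookup through the
owner's association means the wrong tenant simply gets a RecordNotFound:

  # in a controller action: never call Project.find(params[:id]) directly
  def show
    # the association adds the user_id condition to the query for us
    @project = current_user.projects.find(params[:id])
  end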


--
Ross Riley
www.sorrylies.com

Neil Wilson

Nov 18, 2006, 1:00:21 PM
to Ruby on Rails: Talk
The point is that making sure that every query is constrained *is*
adding complexity. I might get it wrong. My tests might not be
complete. There might be a bug.

If you have a separate database (as in 'CREATE DATABASE', tied in with
its own database.yml per tenant - not separate mysql/pgsql processes),
then I can build my model as a single tenant system. Problem solved
(although managing it is another matter).

Plus backups per customer are dead easy - just checkpoint and dump the
relevant database.

I'm trying to understand why solving the problem again is better than
what is there already.

Ross Riley

Nov 18, 2006, 1:23:21 PM
to rubyonra...@googlegroups.com
As you say, managing it is another matter. Whilst you say that this
method is solving the problem again, I would say that it's probably a
different problem with its own solution.

The separate user system for databases isn't really designed for
storing multiple versions of the same application data. Surely you are
multiplying complexity by the number of users you cater for - imagine
rolling out migrations to hundreds or thousands of databases, or scaling the
same databases over multiple servers.

I'm sure there's just as much risk of a bug creeping into this way
of working (what if you read the wrong yml file, or made a mistake in
the backup manager script?). The benefit of a single database is that
you only have to worry about application security at a single
point of entry; triple the amount of time you spend writing tests and
verifying that point and I'm sure you'll still save much more through the
efficiency of a simple, scalable system.


--
Ross Riley
www.sorrylies.com

CWK

Nov 18, 2006, 1:30:42 PM
to Ruby on Rails: Talk

Neil Wilson wrote:
> The point is that making sure that every query is constrained *is*
> adding complexity. I might get it wrong. My tests might not be
> complete. There might be a bug.

In theory, yes. In practice I have never found this to be a problem if
your users/groups/accounts model is remotely sane.

> If you have a separate database (as in 'CREATE DATABASE', tied in with
> its own database.yml per tenant - not separate mysql/pgsql processes),
> then I can build my model as a single tenant system. Problem solved
> (although managing it is another matter).

You will spend ten times as much maintaining the app as you will
building it. Building a system that's easier to develop but more costly
to maintain is not the course I would take.

> Plus backup per customer are dead easy - just checkpoint and dump the
> relevant database.

OK, you got me there. I'd still rather take the time to develop the DB
dumper than to maintain multiple databases.

> I'm trying to understand why solving the problem again is better than
> what is there already.

You're speaking as though Rails' approach to application-level security
is "new" but it's not. It's been around as long as mutli-user
applications have which means probably older than UNIX. These are
different approaches that serve different purposes.

Your suggested approach is not ludicrous. SAP has taken this tack with
their on-demand CRM offering. Each client gets their own database
instance, but the application code is booted off a central repository,
and the instance runs in its own sandbox to guarantee the availability
of RAM, CPU, etc. Time will tell how well this works, but "low
maintenance cost" and "SAP" have rarely been uttered in the same
sentence. I think they did it primarily to differentiate from
Salesforce, and they will eventually end up with a classic ASP-style
system with each client running their own fragmented schema and
codebase.

Still, virtually every multi-tenant application out there is built the
way Rails works by default. If you feel more comfortable implementing
this through the database, then knock yourself out.

Jim Powers

Nov 18, 2006, 2:14:20 PM
to rubyonra...@googlegroups.com
On Sat, 18 Nov 2006 10:30:42 -0800
"CWK" <cking...@gmail.com> wrote:

> Neil Wilson wrote:
> > The point is that making sure that every query is constrained *is*
> > adding complexity. I might get it wrong. My tests might not be
> > complete. There might be a bug.
>
> In theory, yes. In practice I have never found this to be a problem if
> your users/groups/accounts model is remotely sane.

I don't think I'd go that far. Assume that your application's usage
grows unbounded (like an Internet site) and you WILL eventually lose to
big numbers. Now, assuming you partition your data along some sort of
meaningful, but arbitrary metric, like all the data that belongs to a
particular customer, you may STILL lose to big numbers, but what I've
found is that the partitioned data grows far more slowly than the
aggregate, greatly reducing the chance that you will have to confront
this problem at all.

> > If you have a separate database (as in 'CREATE DATABASE', tied in
> > with its own database.yml per tenant - not separate mysql/pgsql
> > processes), then I can build my model as a single tenant system.
> > Problem solved (although managing it is another matter).
>
> You will spend ten times as much maintaining the app as you will
> building it. Building a system that's easier to develop but more
> costly to maintain is not the course I would take.

There is another way to do this: make the database connections dynamic
based on URL or some such. So, you wind up with a single DB connection
in your database.yml file that points to a DB that contains connection
information to the DB used by the client. Assuming that you will
eventually need to update software/schemas you can store a schema/app
version in this master DB lookup table as well. The assumption here is
that for a short period of time you will have two versions of
software/schema deployed as you roll out an update. This would go a
long way to reduce the pain of managing such a system. Now, as far as
convincing AR to work this way... I'm not sure, I am investigating the
possibility as well, and I think it can be done. It may need some
overrides (some of the 'find' methods in AR::Base may need to be
overridden to take a connection object).
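
A rough sketch of what I mean (the tenants table, its columns and the
subdomain lookup are all made-up names, and it assumes an ActiveRecord
recent enough to support abstract_class; it also glosses over pooling and
error handling):

  # the master DB from database.yml holds one row of connection info per client
  class Tenant < ActiveRecord::Base
  end

  # client-owned models inherit from this instead of ActiveRecord::Base
  class TenantRecord < ActiveRecord::Base
    self.abstract_class = true
  end

  class ApplicationController < ActionController::Base
    before_filter :connect_to_tenant_database

    private

    # look the client up by subdomain and repoint the connection, so every
    # model deriving from TenantRecord now talks to that client's database
    def connect_to_tenant_database
      tenant = Tenant.find_by_subdomain(request.subdomains.first)
      TenantRecord.establish_connection(
        :adapter  => "mysql",
        :host     => tenant.db_host,
        :database => tenant.db_name,
        :username => tenant.db_user,
        :password => tenant.db_password
      )
    end
  end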

> > Plus backup per customer are dead easy - just checkpoint and dump
> > the relevant database.
>
> OK, you got me there. I'd still rather take the time to develop the DB
> dumper than to maintain multiple databases.

There are other significant benefits to this approach:

- Incremental application migration
- Overall better performance
- The ability to manage performance better (one big hot client can be
moved to their own Db server)

The issue is how to make an approach like this work within the Rails
framework.

The DB dumper is something that has to be maintained! You're not
getting out of the fact that you will have to do work to make it seem
like each tenant is an island. Furthermore, what about a DB loader?
The separate DB approach means that you get dump/load for free, then
your only costs are related to how you manage getting a tenant to their
data, and migration issues (not trivial, I know).

> > I'm trying to understand why solving the problem again is better
> > than what is there already.
>
> You're speaking as though Rails' approach to application-level
> security is "new" but it's not. It's been around as long as mutli-user
> applications have which means probably older than UNIX. These are
> different approaches that serve different purposes.
>
> Your suggested approach is not ludicrous. SAP has taken this tack with
> their on-demand CRM offering. Each client gets their own database
> instance, but the application code is booted off a central repository,
> and the instance runs in its own sandbox to guarantee the availability
> of RAM, CPU, etc. Time will tell how well this works, but "low
> maintenance cost" and "SAP" have rarely been uttered in the same
> sentence. I think they did it primarily to differentiate from
> Salesforce, and they will eventually end up with a classic ASP-style
> system with each client running their own fragmented schema and
> codebase.
>
> Still, virtually every multi-tenant application out there is built the
> way Rails works by default. If you feel more comfortable implementing
> this through the database, then knock yourself out.

Generally agreed. But the "all tenants in one DB" approach causes significant
problems. The fact that it is essentially impossible to "isolate" the
performance and storage between tenants at run time generally makes
Rails less, how shall I say this? "agile". Although I don't think it
is too hard to rectify this.

Jim Powers

Neil Wilson

Nov 18, 2006, 3:20:30 PM
to Ruby on Rails: Talk
The obvious way to do it is of course to split it at the domain level.

tenant1.mydomain.com CNAME server1
tenant2.mydomain.com CNAME server2

One machine - one tenant.

However that's still expensive, even with on-demand machines like
Amazon EC2 (and you end up with a lot of machines to watch). It would
be nice to be able to fold that onto fewer machines, yet keep the 'share
nothing' approach.

The payoff of making the architecture work in the infrastructure is an
infinitely simpler application - less code (as long as that code is
truly gone and doesn't just pop up again in your management systems).
You can write your application as a simple one-company setup - which
could then, in theory, be wrapped up in Instant Rails and sold as a
desktop application.

One of the things I like about Rails is its disdain for threading and
its share-nothing architecture. It keeps the code clean and
understandable. Moving the multi-tenancy issue into the infrastructure
seems to me to be in the same vein.

NeilW

Manu J

Nov 18, 2006, 3:38:09 PM
to rubyonra...@googlegroups.com
On 11/19/06, Neil Wilson <aldu...@gmail.com> wrote:
>
> The obvious way to do it is of course to split it at the domain level.
>
> tenant1.mydomain.com CNAME server1
> tenant2.mydomain.com CNAME server2
>
> One machine - one tenant.

Are you suggesting that application code resides on each machine separately?
Then this situation is the same as the one I described before. But how
do you manage code updates and database updates? Every database and
every copy of the code that was split off will have to be updated.

From what has been discussed so far.
Pros
-------
1. Easy backup
2. One client cannot take down everyone (Also easy to isolate problems )
3. Better performance
4. Scale Better
Cons
--------
1. Rolling out updates is more difficult. ( As Jim said incremental
migration can also be considered an advantage ).
2. Will involve hacking AR

--
Manu

Frederick Cheung

Nov 18, 2006, 4:39:19 PM
to rubyonra...@googlegroups.com
As far as the added complexity of deployment, updating, etc. goes, I would
have thought that Capistrano would take away most of the pain.

It also depends on whether we are talking about hundreds or thousands of
customers or just a few. If it were just a few I'd definitely be tempted
to have one database per client. I would do this by creating several
environments, i.e. instead of all of your Mongrels running the
production environment you would have one bunch running
production_client_a, another running production_client_b and so on, and
in database.yml each one would have its own bunch of settings.
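
Something along these lines, say (client names and credentials are made up,
and each environment also needs its own file under config/environments/):

  # config/database.yml - one entry per client environment, all served from
  # the same checked-out codebase
  production_client_a:
    adapter: mysql
    database: myapp_client_a
    host: db1.example.com
    username: client_a
    password: secret_a

  production_client_b:
    adapter: mysql
    database: myapp_client_b
    host: db1.example.com
    username: client_b
    password: secret_b

Each bunch of mongrels then just gets started with the matching environment
(e.g. mongrel_rails start -e production_client_a).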

You could also have a bunch of virtual servers each running one client
instance.

Fred


CWK

Nov 18, 2006, 5:52:23 PM
to Ruby on Rails: Talk

Jim Powers wrote:
> On Sat, 18 Nov 2006 10:30:42 -0800
> "CWK" <cking...@gmail.com> wrote:
>
> > Neil Wilson wrote:
> > > The point is that making sure that every query is constrained *is*
> > > adding complexity. I might get it wrong. My tests might not be
> > > complete. There might be a bug.
> >
> > In theory, yes. In practice I have never found this to be a problem if
> > your users/groups/accounts model is remotely sane.
>
> I don't think I'd go that far. Assume that your application's usage
> grows unbounded (like an Internet site) and you WILL eventually lose to
> big numbers.

Well, in my current case, all of my data is partitioned--nothing Client
A sees will ever be seen by Client B. I still think one DB for all clients
is much easier.

As for losing to big numbers, I'm not sure exactly what you mean. In
any case, it all depends on your business. At small numbers the key is
to survive long enough to acquire customers. At medium numbers it's to
start making money. At large numbers things start to behave very
differently as off-the-shelf solutions kind of cease to work, period.

> > > If you have a separate database (as in 'CREATE DATABASE', tied in
> > > with its own database.yml per tenant - not separate mysql/pgsql
> > > processes), then I can build my model as a single tenant system.
> > > Problem solved (although managing it is another matter).
> >
> > You will spend ten times as much maintaining the app as you will
> > building it. Building a system that's easier to develop but more
> > costly to maintain is not the course I would take.
>
> There is another way to do this: make the database connections dynamic
> based on URL or some such. So, you wind up with a single DB connection
> in your database.yml file that points to a DB that contains connection
> information to the DB used by the client. Assuming that you will
> eventually need to update software/schemas you can store a schema/app
> version in this master DB lookup table as well. The assumption here is
> that for a short period of time you will have two versions of
> software/schema deployed as you roll out an update. This would go a
> long way to reduce the pain of managing such a system. Now, as far as
> convincing AR to work this way... I'm not sure, I am investigating the
> possibility as well, and I think it can be done. It may need some
> overrides (some of the 'find' methods in AR::Base may need to be
> overridden to take a connection object).

This is a !@#$-load of tricky plumbing to avoid setting and watching
client entitlements on database rows. Let alone the havoc this could
wreak with managing the database--depending on which one you use, this
could complicate how you deal with tablespaces and such.

> There are other significant benefits to this approach:
>
> - Incremental application migration

This is beneficial if you want to maintain multiple software versions.
If that's the case then you might as well just install complete app
instances per client and be done with it. Been there many times, will
never do it again unless building something like an ERP where it still
makes some sense.

> - Overall better performance

TANSTAAFL. If your database server is running twenty database
instances, there is going to be some kind of performance hit to that
versus one DB with tables 20 times larger. The overhead associated with
connection pools, query caches, et al. could in many cases be much
larger than the hit from scanning tables 20 times longer. I just don't
accept this as an open-and-shut benefit right off the bat.

> - The ability to manage performance better (one big hot client can be
> moved to their own Db server)

There's no reason you can't do this with a multi-tenant system too. For
that matter you can run a special client on their own complete system
instance with no or very little fancy plumbing.

> The DB dumper is something that has to be maintained! You're not
> getting out of the fact that you will have to do work to make it seem
> like each tenant is an island.

Snap response: Implement some kind of to_sql method which can be called
recursively through the object tree, starting with the root object
representing a client. For all I know facilities for this already exist
within ActiveRecord which after all has to know how to generate SQL. Or
just serialize stuff into yaml, or something like that.

Not to mention that you may find (as I did) that clients want/like
human-readable backups, not SQL dumps.

> Furthermore, what about a DB loader?

Read the infile, marshal it into your Model objects, then call the
appropriate new/create/save methods. Now you get all your application
validation goodies for free and there's no chance of creating
relationships that are out of whack with anything else.
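
Something like this, roughly (Client and Project are stand-in models; YAML
is used only because it satisfies the human-readable point, and this glosses
over details like preserving ids across a restore):

  # dump: walk the tree from the client root and write plain attribute hashes
  client = Client.find(client_id)
  backup = {
    "client"   => client.attributes,
    "projects" => client.projects.collect { |p| p.attributes }
  }
  File.open("client_#{client.id}.yml", "w") { |f| f.write(backup.to_yaml) }

  # load: push the data back through the models so validations still run
  data = YAML.load_file("client_42.yml")
  restored = Client.new(data["client"])
  restored.save!
  data["projects"].each { |attrs| restored.projects.create(attrs) }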

> The separate DB approach means that you get dump/load for free, then
> your only costs are related to how you manage getting a tenant to their
> data, and migration issues (not trivial, I know).

I still think the "not trivial" aspect understates it by two-thirds. It
seems to me like you're building a unique configuration that will end
up having a lot more dependencies on the versions of the framework,
O/S, database config, etc. than is obvious from this vantage point. The
end result could be that every time you do a major rev of any piece,
you risk the whole thing falling apart and being the one guy in the
world with that specific problem, and needing to stay on MySQL 3.1 for
a year after its release until the low-priority bug gets fixed. Yeah, I
know it's a hypothetical, but it's the kind of hypothetical that's
bitten me in the rear multiple times. The all-in-one approach has been
by far the easiest to maintain and operate of all the approaches I've
been involved with.

> Generally agreed. But the "all tenants in one DB" approach causes significant
> problems. The fact that it is essentially impossible to "isolate" the
> performance and storage between tenants at run time generally makes
> Rails less, how shall I say this? "agile". Although I don't think it
> is too hard to rectify this.

Well, like I said above, I agree that it poses certain challenges--you
end up needing to build a high-performance application even though all
your customers are 5-seat installations. I do agree that this is
ultimately probably an issue best solved in the database, but I'm not
sure that the approach posited here isn't trading getting stabbed for
getting shot.

Neil Wilson

Nov 19, 2006, 1:57:39 AM
to Ruby on Rails: Talk

Manu J wrote:

> Are you suggesting that application code resides on each machine separately?
> Then this situation is the same as the one I described before. But how
> do you manage code updates and database updates? Every database and
> every copy of the code that was split off will have to be updated.

Capistrano does this already. It deploys to everything - DB servers,
app servers, web servers. What you need to do is reduce the number of
variables that have to be tweaked for each tenant. I see no reason for
separate tenant application code, or even separate branches on the
version control system.

>
> From what has been discussed so far.
> Pros
> -------
> 1. Easy backup
> 2. One client cannot take down everyone (Also easy to isolate problems )
> 3. Better performance
> 4. Scale Better

Separates the tenant management from the application code, allowing the
application code to be much simpler. I reckon the venerable 'depot'
application from The Book could be made multi-tenanted with this
approach.

> Cons
> --------
> 1. Rolling out updates is more difficult. ( As Jim said incremental
> migration can also be considered an advantage ).

Capistrano will do what you want I feel.

> 2. Will involve hacking AR

I don't see that as being necessary. I reckon all the tenant management
can be handled by the DNS or the load balancer, the file system and
differing copies of the database.yml file.

I think the fundamental point at which further division becomes
difficult is if you try to force more than one tenant into a single
running Mongrel instance. I reckon if you stick to the fundamental
principle that One Mongrel = One Tenant and make sure each tenant ends
up at the right Mongrel, it'll work.

I'm extremely encouraged by this discussion. Thanks to everybody taking
part.

NeilW

Neil Wilson

Nov 19, 2006, 2:10:12 AM
to Ruby on Rails: Talk

CWK wrote:

> This is a !@#$-load of tricky plumbing to avoid setting and watching
> client entitlements on database rows. Let alone the havoc this could
> wreak with managing the database--depending on which one you use, this
> could complicate how you deal with tablespaces and such.

That tricky plumbing is handled by multiple copies of the database.yml
file.

Why not One Tenant = One Database Login = One Database = One
database.yml file? All running under Mongrel instances unique to that
Tenant.

> I still think the "not trivial" aspect understates it by two-thirds. It
> seems to me like you're building a unique configuration that will end
> up having a lot more dependencies on the versions of the framework,

No. Very definitely no. The approach has to be the McDonald's approach.
Every tenant has to be a replica of the standard model except for the
smallest possible changeset.

I'm not proposing code changes between the tenants. Everything is a
simple checkout from the main branch of the version control system.

The only things that are unique per tenant are the database.yml file,
and the way the load balancing system works out how to send a
particular tenant to the right Mongrel instance running that
database.yml file.

I'm convinced this can be done in a simple and effective manner.

NeilW

Tom Mornini

Nov 19, 2006, 12:59:21 PM
to rubyonra...@googlegroups.com
On Nov 18, 2006, at 11:10 PM, Neil Wilson wrote:

> CWK wrote:
>
>> This is a !@#$-load of tricky plumbing to avoid setting and watching
>> client entitlements on database rows. Let alone the havoc this could
>> wreak with managing the database--depending on which one you use,
>> this
>> could complicate how you deal with tablespaces and such.
>
> That tricky plumbing is handled by multiple copies of the database.yml
> file.
>
> Why not One Tenant = One Database Login = One Database = One
> database.yml file? All running under Mongrel instances unique to that
> Tenant.

You're going to have gigantic memory usage. Rails instances are very
large, on the order of 28M per Mongrel instance *minimum*, and you need
2 minimum per application, more if traffic becomes significant.

I understand the innate natural scalability of what you're proposing,
but I think you may be missing another poster's very apropos suggestion
that if you design the application as multi-tenant, there's nothing
preventing you from operating certain instances of it single-tenant.

Since your customers will no doubt fit into some sort of 80/20 rule
with respect to usage patterns and resource requirements, why not
host all new customers in a single instance, and move out the 20%
that require extra juice into separate instances?

Another aspect that I'm curious about is what sort of infrastructure
you plan to deploy on. If you're talking about hosting a single
instance on a single box, the reliability of a single instance will
be poor, and the reliability of your system once you grow beyond a
few boxes will be terrible! There will literally be something wrong
every day once you have a few hundred boxes, and therefore a few
hundred customers.

With multiple tenancy you can spend money on system side scalability
and reliability measures and performance and reliability will improve
for all of your customers.

There's no question, however, that the "one big system" approach
will eventually prove limiting, but at that point, you'll have a
blueprint for how to design a scalable and reliable platform that
can host many customers' instances, and you can replicate those
entire multi-customer systems as you grow further.

--
-- Tom Mornini, CTO
-- Engine Yard, Ruby on Rails Hosting
-- Reliability, Ease of Use, Scalability
-- (866) 518-YARD (9273)

Phlip

Nov 19, 2006, 1:24:02 PM
to Ruby on Rails: Talk
Neil Wilson wrote:

> Unfortunately the application I have in mind involves account data, and
> I can't afford a bug in an application exposing one customer's data to
> another.

So you have wall-to-wall unit tests, right?

--
Phlip

Vishnu Gopal

Nov 19, 2006, 2:34:09 PM
to rubyonra...@googlegroups.com
I may be way off track here, but this seems to be pretty simple to do.

1. Maintain a master DB which contains data common to your app.
2. All models which have client-specific information overload find to
take another parameter, client, something along the lines of (a fuller
sketch follows this list):
  def self.find(client, *args)
    # establish a new ActiveRecord connection for this client here, then push off to super
  end
3. When you create a new client, run a script to create a new database
using a convention: a sanitized client name. Use the same convention
above.
4. Associations should still work since in the end everything uses
find (might be wrong here).
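
A slightly fuller sketch of 2 and 3 together (everything here - the Invoice
model, the naming convention, the adapter - is just an illustration, and
re-establishing the connection on every call is obviously wasteful):

  class Invoice < ActiveRecord::Base
    # step 2: find takes the client as an extra leading argument
    def self.find(client, *args)
      establish_connection(connection_options_for(client))
      super(*args)
    end

    # step 3: one database per client, named by a sanitized client name
    def self.connection_options_for(client)
      { :adapter  => "postgresql",
        :database => "myapp_#{client.downcase.gsub(/[^a-z0-9]+/, '_')}" }
    end
  end

  Invoice.find("Acme Widgets", :first)   # reads from myapp_acme_widgets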

Vish

Neil Wilson

Nov 19, 2006, 3:06:38 PM
to Ruby on Rails: Talk

Tom Mornini wrote:

> On Nov 18, 2006, at 11:10 PM, Neil Wilson wrote:
>
> You're going to have gigantic memory usage. Rails instances are very
> large, on the order of 28M per Mongrel instance *minimum*, and you need
> 2 minimum per application, more if traffic becomes significant.

Memory is cheap, and there is still swap space at a push - depending
upon usage patterns.

> I understand the innate natural scalability of what you're proposing,
> but I think you may be missing another poster's very apropos suggestion
> that if you design the application as multi-tenant, there's nothing
> preventing you from operating certain instances of it single-tenant.

I'm trying to reduce the complexity of the application model and trade
that off against a different architecture in the infrastructure. I'm
trying to see if the trade-offs work or not.

> Since your customers will no doubt fit into some sort of 80/20 rule
> with respect to usage patterns and resource requirements, why not
> host all new customers in a single instance, and move out the 20%
> that require extra juice into separate instances?

It's not a juice issue. It is entirely about data separation, whereabouts
in the layers that separation should be enforced, and whether tenanting
can be separated out and solved as an infrastructure problem.

> There will literally be something wrong
> every day once you have a few hundred boxes, and therefore a few
> hundred customers.

That's overly negative. I don't find simple computer systems that are set
up well to be that unreliable. Good ones run for years and years without
problems.

Plus an application can be just as unreliable in code - particularly if
it is overly complex.

>
> With multiple tenancy you can spend money on system side scalability
> and reliability measures and performance and reliability will improve
> for all of your customers.

Actually I've always found that when you start messing around with
complicated hardware structures and start pushing for that next 9 on
your reliability percentage, things start to go pear-shaped. At that
point, not only do you have a potentially complex and fault prone
application, you also have a potentially complex and fault prone
architecture that is very, very difficult to change or improve.

I'd rather stop at 'reliable enough' and replicate from there using the
simplest techniques possible. If you have a problem it affects a very
small percentage of users. And there is nothing to say that you will
have that many problems. Simplicity pays large dividends.

NeilW

CWK

Nov 19, 2006, 3:06:39 PM
to Ruby on Rails: Talk

Neil Wilson wrote:
> CWK wrote:
>
> > This is a !@#$-load of tricky plumbing to avoid setting and watching
> > client entitlements on database rows. Let alone the havoc this could
> > wreak with managing the database--depending on which one you use, this
> > could complicate how you deal with tablespaces and such.
>
> That tricky plumbing is handled by multiple copies of the database.yml
> file.
>
> Why not One Tenant = One Database Login = One Database = One
> database.yml file? All running under Mongrel instances unique to that
> Tenant.

Because every time you want to do an update you will need to replicate
it perfectly across every instance. If the push fails part of the way
through for some reason, you end up in an uncertain state which could
leave one or more clients SOL. With multi-tenancy it is a lot easier to
set up a staging environment which closely resembles production so you
can test updates closely before release.

I'm not nearly so worried about new installs as I am about maintaining your
existing clients. I've been burned too many times over the years with
weird update issues to take anything for granted. The idea of
maintaining one big instance is much more palatable.

>
> I'm convinced this can be done in a simple and effective manner.
>

I'm convinced you can have one of the two. In any case, if you do go
this route, I hope you'll let us know how it works out. I'm genuinely
curious.

Neil Wilson

Nov 19, 2006, 3:11:15 PM
to Ruby on Rails: Talk

Phlip wrote:


> So you have wall-to-wall unit tests, right?

No, and neither do you. Even if you think you have ;-)

NeilW

Jim Powers

Nov 19, 2006, 3:43:36 PM
to rubyonra...@googlegroups.com
"Neil Wilson" <aldu...@gmail.com> wrote:

> I'm convinced this can be done in a simple and effective manner.

I'm sure it can. Based on your comments you seem to be going for a
coarser-grained solution compared to the problem that I'm trying to
solve.

Doing this through DNS implies a number of things:

1. You can wait for DNS propagation (assuming you're talking the
Internet and not an Intranet).
2. You have the liberty to create a separate application cluster per
customer (all using essentially the same code base, with a config per
customer).

Assuming this is true for you, then I would agree that you can do this
pretty simply.

My situation is different.

I need quick provisioning for new customers (on the order of seconds
to a couple of minutes), and I cannot consider automatic provisioning
of a new application cluster each time for a new customer.

In my situation I have to be able to share application clusters
(running on many machines) with a number of databases on the back end.

To CWK's points:

I'm not claiming that building a multi-tenant system by breaking up the DB
by tenant is trivial in my case, but it is also not as dire as you
portray it. But I generally agree with you: I would like to have
everything in one DB for maintenance, but reality is forcing my hand.

Firstly: the application I'm working on is a live Internet application
and is already database limited. Performance of the middleware is not
even remotely a factor. Our Web servers are basically asleep. So my
primary concern is scaling the DB layer.

We've already investigated a number of possible cluster/federation
schemes and they do not scale nearly as well as the vendors would like
you to believe. In our tests, data partitioning per customer has given,
by far, the best overall performance boost.

My comment about big numbers is this: No matter what your DB solution is,
there is some number of aggregate rows in the DB where performance will
diminish "quickly". Generally speaking, you go ahead and index your
data to get better (read) performance, but indexing provides the maximal
benefit when either the indexes of all your hot tables can fit in RAM
or result in very few hits to disk. However, as time goes on fewer and
fewer of your indexes will fit in RAM, and even your less hot tables
and indexes become significant. Your DB starts to become disk bound.
"Disk bound" is one foot in the grave. So what to do? I can't ignore
it.

Right now we have logically partitioned customer data in the same
DB, all living cozy inside the same schema. This does make many
things easy, but it makes scaling VERY hard.

> At large numbers things start to behave very differently as
> off-the-shelf solutions kind of cease to work, period.

Well, I think that you would agree that this is hyperbole. Off-the-shelf
solutions, RDBMSes you mean, are clearly not the fastest things on
the planet, but they keep the data malleable. I'm not aware of
alternatives that have both the relative ease of data manipulation of
RDBMSes and the reasonably good performance they possess. (Ah, perhaps
Google has some goodies in-house; alas, I'm not Google.)

Keeping RDBMSes operating at a healthy level can be done for a long
time, but you will eventually need to give up some comfort. In this
case the "all-in-one-db" approach.

> This is a !@#$-load of tricky plumbing to avoid setting and watching
> client entitlements on database rows. Let alone the havoc this could
> wreak with managing the database--depending on which one you use, this
> could complicate how you deal with tablespaces and such.

No doubt there is some plumbing that needs to be put into place.
But I'm doing this to gain performance primarily. I gain the
performance on two levels: the DB server in question is operating on
relatively "small" DBs, meaning intrinsically improved performance, and
then there are logical performance improvements I can make. Right now
I have to do security checks for the "owners" as well as other users
(our application allows clients to publish their data to other users of
our system as well as the general public). Once an owner (or an
assistant, who has the same privileges as the owner) has logged in, I
need to perform no checks on access to their data.

> > - Incremental application migration
>
> This is beneficial if you want to maintain multiple software versions.
> If that's the case then you might as well just install complete app
> instances per client and be done with it. Been there many times, will
> never do it again unless building something like an ERP where it still
> makes some sense.

Agreed, this can be a pain. I wasn't referring to maintaining an
arbitrary number of versions of the app, just no more than two at once:
old and new. This situation would only be temporary, while a roll-out
was occurring.

> > - Overall better performance
>
> TANSTAAFL. If your database server is running twenty database
> instances, there is going to be some kind of performance hit to that
> versus one DB with tables 20 times larger. The overhead associated
> with connection pools and query caches et al. could in many cases be
> much larger than the hit from scanning tables 20 times longer. I just
> don't accept this as an open-and-shut benefit right off the bat.

The benefit comes from the fact that I can run one or more DB instances on a
given DB server. How I want to tune performance is entirely up to me.
In fact, whatever tuning trick you can do in a single DB instance to
gain performance I can do with a one-DB-per-customer setup, but the
converse is not true: there are things that can be done in a one-DB-per-
customer configuration that cannot be done in a single-DB approach.
I'm not saying that I can accomplish this easily, but it can be done.

> > - The ability to manage performance better (one big hot client can
> > be moved to their own Db server)
>
> There's no reason you can't do this with a multi-tenant system too.
> For that matter you can run a special client on their own complete
> system instance with no or very little fancy plumbing.

In my case the DB, not the web application is the problem. Otherwise,
yes, I agree with you.

> > The DB dumper is something that has to be maintained! You're not
> > getting out of the fact that you will have to do work to make it
> > seem like each tenant is an island.
>
> Snap response: Implement some kind of to_sql method which can be
> called recursively through the object tree, starting with the root
> object representing a client. For all I know facilities for this
> already exist within ActiveRecord which after all has to know how to
> generate SQL. Or just serialize stuff into yaml, or something like
> that.
>
> Not to mention that you may find (as I did) that clients want/like
> human-readable backups, not SQL dumps.

But with a per-client DB approach *I* get the ability to back up and
restore data on a per-client basis, a far more regular occurrence.
And I can do this using high-performance tools without writing
anything (except my app, to work like this). As far as the "human-readable"
part goes, I could also write such a script. Also, in case you
haven't tried it, serializing with Ruby is REAL SLOW, and loading the data
with Ruby is no race winner either. Clearly no one has to be confined to
using Ruby to do this. But yet again the per-client DB approach wins for
flexibility out of the box.

> I still think the "not trivial" aspect understates it by two-thirds.

Fair enough.

> It seems to me like you're building a unique configuration that will
> end up having a lot more dependencies on the versions of the
> framework, O/S, database config, etc. than is obvious from this
> vantage point. The end result could be that every time you do a major
> rev of any piece, you risk the whole thing falling apart and being
> the one guy in the world with that specific problem, and needing to
> stay on MySQL 3.1 for a year after its release until the low-priority
> bug gets fixed. Yeah, I know it's a hypothetical, but it's the kind of
> hypothetical that's bitten me in the rear multiple times. The
> all-in-one approach has been by far the easiest to maintain and
> operate of all the approaches I've been involved with.

Possibility, but I generally doubt it.

> Well like I said above I agree that it poses certain challenges--you
> end up needing to build a high-performance application even though all
> your customers are 5-seat installations. I do agree that this is
> ultimately probably an issue best solved in the database, but I'm not
> sure that the approach posited here isn't trading getting stabbed for
> getting shot.

In my case (unlike Neil) I'm doing this explicitly for the performance
benefits I can attain.

Your warnings have been heard, I will take them into account. Much
appreciated.

Jim Powers

Phlip

Nov 19, 2006, 4:55:29 PM
to Ruby on Rails: Talk
Neil Wilson wrote:

> Phlip wrote:

I can add an 'assert(false)' to any block in my program, and tests will
get to it.

(The remaining useless debate centers on a useful definition for "wall
to wall".)

The more security you want, the more tests you need.

--
Phlip

S. Robert James

Nov 19, 2006, 9:16:55 PM
to Ruby on Rails: Talk
We're actually planning on doing something exactly like this. In our
case, each tenant represents an institution, with up to hundreds of
users. We're not concerned about quick provisioning of new tenants -
signing one up and migrating them is a large, manual process
regardless. Having one db, one OS user, and one domain name per tenant
simplifies *a lot*.

* Frees the code from having to track tenants. Otherwise, every row
would need a tenant_id, and *every* find would have to scope to
tenant_id
* Built-in, bulletproof data partitioning
* Ability to move tenants to separate servers, or let them host their
own

Implementation is straightforward:
* All tenants run off the same codeline
* The codeline is checked out once - to one place
* Each tenant has their own environment - which is identical except for
the db
* and their own domain name, which is tenantsname.ourapp.com
* deployment / new versions are run by script against all the dbs /
mongrels
* one mongrel per tenant - all running off the same code dir - but in
different env, and different db

Now, the only thing which really concerns me is the fact that we're
stuck with 1 Mongrel per tenant. With a lot of tenants, and each
Mongrel using 20-50MB of memory, that could get ugly. It's possible
that the swap file will handle all of this - swapping out Mongrels
belonging to tenants that aren't online - but this won't help much, as
during peak times nearly every tenant will be using the system.

One very simple solution to this would be to mod ActiveRecord to not
use persistent database connections. This could be something as simple
as an around_filter, establishing a connection to the appropriate db,
and tearing down afterwards. This would let all of our mongrels be
used for any tenant.
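
Roughly like this, I imagine (the filter-object style of around_filter;
tenant_db_config is a made-up lookup, e.g. by subdomain, and the exact
filter API varies between Rails versions):

  class TenantConnectionFilter
    # connect before the action runs...
    def before(controller)
      ActiveRecord::Base.establish_connection(controller.tenant_db_config)
    end

    # ...and hang up afterwards, so the mongrel holds no persistent connection
    def after(controller)
      ActiveRecord::Base.remove_connection
    end
  end

  class ApplicationController < ActionController::Base
    around_filter TenantConnectionFilter.new
  end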

Persistent DB connections aren't necessarily that helpful, anyway.
With MySQL, for instance, on a LAN, they're hardly noticeable. I know
that for SQL Server, MS stopped recommending them as well, their
feeling being that if you have a lot of apps on one DB, it's better for
each one to hang up when it's done. Better the overhead of
connect/teardown than of keeping numerous dormant connections.

Another possible concern I have is session collision - although I'm not
sure if this is even possible - I need to investigate the different
ways Rails handles sessions.

Last, we've already written a little code to help with some of the
unique db issues - enforcing that only one tenant ever uses one db.

Neil, if you or anyone else is interested in collaborating to help make
the scripts and tools needed to make this a reality, please speak up.
(Please keep posts to the list, not private email.)

snacktime

Nov 20, 2006, 3:13:53 AM
to rubyonra...@googlegroups.com
Coming in a bit late here... This is an issue we have had for quite a
while, as we store financial data and it absolutely cannot get mixed
up. IMO this is one area where some logic should go in the database,
and the easiest solution is using a database that gives you the right
tools. You can absolutely keep client data separate and have it all
in one database, normalized, by using functions and views, at least
with databases like PostgreSQL and Oracle. We make a few adjustments
here and there, such as not being able to use some of the AR methods
for inserts and updates, but a small library of custom methods is a
whole lot easier than having hundreds of databases.
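
A trivial illustration of the views part (the table and client id are
invented, and a real setup would drive the client id from a function or
session setting rather than hard-coding it):

  # a migration exposing one client's slice of a shared, normalized table
  class CreateAcmeInvoicesView < ActiveRecord::Migration
    def self.up
      execute "CREATE VIEW acme_invoices AS " +
              "SELECT * FROM invoices WHERE client_id = 42"
    end

    def self.down
      execute "DROP VIEW acme_invoices"
    end
  end

A model pointed at the view (set_table_name "acme_invoices") then only ever
reads that client's rows.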

A bigger issue is proper testing and good change management habits.
Most bugs I see in working production systems appear when some developer
gets the itch to upgrade something or fix an existing bug and pushes
it into production without adequate testing. The other leading cause
would be making too many changes over too short a time period. If
reliability and data integrity are at the top of your list, then the
fact is you just have to be more conservative about how often you
change stuff or add new features. You can have the best system in the
world for keeping your client data separate, but if your people have
bad habits it won't matter.

Chris

Neil Wilson

Nov 21, 2006, 7:30:08 AM
to Ruby on Rails: Talk

I can understand the desire to try to get the Mongrel count down, but
the worry I have with reusing Mongrels is that the ObjectSpace is
potentially polluted with ActiveRecord data from a previous tenant. I
don't want the added complexity of database separation only to find
that the separation has broken down because I'm recycling ObjectSpaces
and there is a cyclic graph in my object hierarchy keeping old AR
instances out of the clutches of the garbage collector.

I see this as akin to an operating system. Let's get processes working
and see how they handle things before we invent threads. It may be that
Moore's Law rides to the rescue again.


S. Robert James wrote:

> Neil, if you or anyone else is interested in collaborating to help make
> the scripts and tools needed to make this a reality, please speak up.
> (Please keep posts to the list, not private email.)

I want to see how this works. Let's build it, but let's build the
simplest thing that will work first - total separation and a separated
tenancy provisioning system.

NeilW

S. Robert James

Nov 21, 2006, 9:29:23 AM
to Ruby on Rails: Talk

Neil Wilson wrote:
> I see this as akin to an operating system. Let's get processes working
> and see how they handle things before we invent threads. It may be that
> Moore's Law rides to the rescue again.
>
> I want to see how this works. Let's build it, but let's build the
> simplest thing that will work first - total separation and a separated
> tenancy provisioning system.


Agreed. Consider the project started. And with the motto "make the
simplest thing that could possibly work".

I think the first task is to expand Capistrano to be able to tell it to
run one task for a list of environments: migrate all the environments,
restart all the Mongrels, take 'em all down.
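
A first cut could even be a dumb Rake task that shells out once per tenant
environment (the tenant list and task name here are invented):

  # lib/tasks/tenants.rake
  TENANT_ENVS = %w(production_client_a production_client_b)  # or read from a file

  namespace :tenants do
    desc "Run migrations against every tenant database"
    task :migrate do
      TENANT_ENVS.each do |env|
        puts "== migrating #{env} =="
        # each environment has its own database.yml entry, so this hits the right DB
        system("rake db:migrate RAILS_ENV=#{env}") or raise "migration failed for #{env}"
      end
    end
  end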

SCM check outs remain the same - we'll use one SCM branch for all the
instances.

I'm also working on a simple tool for cron / daemon jobs - again, one
cmd to start/stop them all for all of the environments.

Alan C Francis

Nov 21, 2006, 11:05:10 AM
to rubyonra...@googlegroups.com
Neil Wilson wrote:
> Unfortunately the application I have in mind involves account data, and
> I can't afford a bug in an application exposing one customer's data to
> another. I need something more substantial than that. (And there are
> other reasons - such as backup). However I still want to share physical
> infrastructure.
>
> My thoughts are that there should be a URL per customer driving their
> own mongrels locked onto their own version of the database. However the
> standard infrastructure support tools don't support that way of doing
> things.

This seems crazy to me.

Amazon manage to be very secure and they certainly don't have one
database or appserver per client. I can't imagine many (any?)
service-based webapps that would do this.

If application-level security is good enough for, say, my bank or eBay,
is it really not enough for you?

A.

p.s. I spotted some reference to 37signals later in the thread. They've
tended to use a separate URL per client (client1.backpackit.com,
client2.backpackit.com), but all that does is tell the app which subset
of data to restrict to (i.e. an additional join clause). It's not an app
instance per client.

Brian Hogan

unread,
Nov 21, 2006, 12:28:07 PM11/21/06
to rubyonra...@googlegroups.com
You can deal with a lot of your application security by just using associations correctly.

A before filter sets a user object based on the value in session.

  @user = User.find session[:user]


In ProjectController, the list method is something like this


  def list
    @projects = @user.projects
  end


There's simply no need to worry about screwing up the relationships as long as you track which user owns things. When you save the data, make sure you save the owner of that record on every table and then let the associations do the work.

Worried about extra database hits? Then use eager loading where appropriate. Use a before_filter for the project controller that eager loads the projects for a user.  Maybe even load more stuff.  Or create methods on the user object to do your loading. 
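
A minimal sketch of how those pieces hang together (the controller name,
session key and :tasks association are illustrative, not prescriptive):

  # application.rb - every controller gets the current user loaded up front
  class ApplicationController < ActionController::Base
    before_filter :load_current_user

    private

    def load_current_user
      @user = User.find(session[:user])
    end
  end

  # project_controller.rb - going through the association means the
  # "WHERE user_id = ?" condition is always applied for us
  class ProjectController < ApplicationController
    def list
      # :include is the eager-loading variant mentioned above (assumes
      # Project has_many :tasks)
      @projects = @user.projects.find(:all, :include => :tasks)
    end
  end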

Roderick van Domburg

unread,
Nov 21, 2006, 2:15:58 PM11/21/06
to rubyonra...@googlegroups.com
Alan C Francis wrote:
>> My thoughts are that there should be a URL per customer driving their
>> own mongrels locked onto their own version of the database. However the
>> standard infrastructure support tools don't support that way of doing
>> things.
>
> This seems crazy to me.
>
> Amazon manage to be very secure and they certainly don't have one
> database or appserver per client. I can't imagine many (any?)
> service-based webapps that would do this.

I agree. I've seen two approaches:

1. Perform scoping in-database by using views and triggers. A stored
procedure is used to set up the views for the specific customer or user.

2. Perform scoping in the application. We've been using around_filter in
Rails to wrap entire controllers in a with_scope. However, reading
recent threads on Rails-core, with_scope will go protected which will
make this approach extremely impractical.

I'm no fan of option #1 because its behavior isn't explicit or
traceable. From experience I know that even (e.g.) PostgreSQL itself
doesn't like that black box -- its query planner just fails to perform
necessary optimizations that would otherwise have been obvious.

Seeing how my way of going about option #2 is going to be deprecated in
Rails, I share your curiosity as to what _is_ the optimal solution.
Starting every single action with a with_scope may well be traceable, but
the repetition seems greatly inefficient.
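
For reference, this is roughly the kind of scoping the filter wraps
around each action (a sketch only - Project, customer_id and
current_customer are made-up names):

  Project.with_scope(
    :find   => { :conditions => ["customer_id = ?", current_customer.id] },
    :create => { :customer_id => current_customer.id }
  ) do
    @projects = Project.find(:all)   # only sees the current customer's rows
  end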

Very interested to hear your ideas!

- Roderick

nuno

unread,
Nov 21, 2006, 3:42:14 PM11/21/06
to rubyonra...@googlegroups.com
Maybe the solution would be to use a virtualization system, like the one
that is available under Linux (Xen)?

- an initial virtual partition can be prepared by software in a minute
- each virtual machine is insulated from the others
- you don't have to fear maintaining dozens of real servers
- backup once, backup all
- lower costs
- versioning of the virtual partitions (near-instant rollback in case of
failure, e.g. just after a big maintenance task)

If you can't afford DNS propagation, use one TCP port for each client on
the frontend and forward them to each virtual server.

My 2 cents...

Cw K.

unread,
Nov 21, 2006, 3:53:08 PM11/21/06
to rubyonra...@googlegroups.com
Roderick van Domburg wrote:
> 2. Perform scoping in the application. We've been using around_filter in
> Rails to wrap entire controllers in a with_scope. However, reading
> recent threads on Rails-core, with_scope will go protected which will
> make this approach extremely impractical.
>

I'm on the record here as preferring the in-application approach, for the
reasons already stated.

In our case, the world of possible actions is too complex to make a
simple filtering security model practical: we have not only clients to
worry about, but user groups and individual user permissions.
Determining the list of allowable actions for User A at Point B involves
a number of tests.

This is actually something we're revisiting to see if there's a better
way as we are looking to allow clients to define custom entitlement
schemes. My experience, at least in a B2B environment, is that
entitlement schemes always become more complex over time. Part of me
doubts whether there is a good generalized approach to this at the
framework level.

Roderick van Domburg

unread,
Nov 21, 2006, 3:57:41 PM11/21/06
to rubyonra...@googlegroups.com
Cw K. wrote:
> This is actually something we're revisiting to see if there's a better
> way as we are looking to allow clients to define custom entitlement
> schemes. My experience, at least in a B2B environment, is that
> entitlement schemes always become more complex over time. Part of me
> doubts whether there is a good generalized approach to this at the
> framework level.

Good point, and it may not be far from the truth. After all, such schemes
are not the common case, so frameworks may not provide for them.

I too am in favor of doing it in-application. But truth be told, we have
a system with an equally complex authorization scheme (ACLs based on
role, division and subdivision) and we're doing that rather successfully
in-database. It's even passed the test of evolution as the schemes
indeed grew more complex.

- Roderick

S. Robert James

unread,
Nov 21, 2006, 10:00:25 PM11/21/06
to Ruby on Rails: Talk

Roderick van Domburg wrote:
> I agree. I've seen two approaches:
>
> 1. Perform scoping in-database by using views and triggers. A stored
> procedure is used to set up the views for the specific customer or user.
>
> 2. Perform scoping in the application. We've been using around_filter in
> Rails to wrap entire controllers in a with_scope. However, reading
> recent threads on Rails-core, with_scope will go protected which will
> make this approach extremely impractical.
>
>
> Seeing how my idea of going about option #2 is going to be deprecated in
> Rails, I share your curiosity as to what _is_ the optimal solution.
> Starting every single action with a with_scope sure may be traceable but
> its repetition seems greatly inefficient.

Just keep on doing it. You don't have to agree with the core - you can
just send(:with_scope, params).

But, even better: it's protected, not deprecated. Define a method
with_scope_for_user(user) in your model, mark it public, and have it
call with_scope. That's much better anyway.
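
Something along these lines, say (a minimal sketch - the model, column
and current_user names are made up):

  # In the model: a public class method wrapping the now-protected with_scope
  class Project < ActiveRecord::Base
    def self.with_scope_for_user(user, &block)
      with_scope(:find   => { :conditions => ["user_id = ?", user.id] },
                 :create => { :user_id => user.id }, &block)
    end
  end

  # In the controller:
  Project.with_scope_for_user(current_user) do
    @projects = Project.find(:all)
  end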

Neil Wilson

unread,
Nov 22, 2006, 2:03:39 AM11/22/06
to Ruby on Rails: Talk

nuno wrote:

> May be the solution would be to use a virtualization system ? Like the
> one that is available under Linux (Xen)

I see Xen as part of the solution, but not in the way that you imagine.

NeilW

Neil Wilson

unread,
Nov 22, 2006, 2:34:46 AM11/22/06
to Ruby on Rails: Talk

Cw K. wrote:

> Part of me doubts whether there is a good generalized approach to this
> at the framework level.

Does a tenant ever need to see another tenant's data in a manner that
couldn't be achieved simply by giving an individual a user id in both
tenants' user lists?

You see, I still keep the user list, group list, access control lists and
authentication/authorisation role system within the application space.
You have to do that, and the structure is indeed different and evolving
for every application there is.

But the tenant can be pushed down to framework level, because a tenant is
just a good old-fashioned user at infrastructure level and half the job
is already done by the standard Unix user tools.

You've got to admit that

rake remote:exec ACTION="invoke" COMMAND="adduser new_tenant" SUDO="yes"
cap -s user=new_tenant -a cold_deploy
rake remote:exec ACTION="invoke" COMMAND="invoke-rc.d apache2 reload" SUDO="yes"

has a certain succinct charm to it. I wonder how close to this ideal I
can get and how much it costs in real terms?

NeilW

Neil Wilson

unread,
Nov 22, 2006, 2:56:52 AM11/22/06
to Ruby on Rails: Talk

S. Robert James wrote:

> Agreed. Consider the project started. And with the motto "make the
> simplest thing that could possibly work".
>
> I think the first task is expand capistrano to be able to tell it to
> run one task for a list environments. Migrate all the environments,
> restart all the mongrels, take 'em all down.

That depends how you separate the tenants. If you make a tenant a Unix
user, then the job is (potentially) trivial:

for word in `cat list_of_tenants`; do cap -s user=$word -a update; done
for word in `cat list_of_tenants`; do cap -s user=$word -a restart; done


> I'm also working on a simple tool for cron / daemon jobs - again, one
> cmd to start/stop them all for all of the environments.

Again, in theory, if you make a tenant a Unix user then the cron jobs
all run in the user's crontab in the user space, and so do all the
daemons for that tenant. So restarting them just needs a dose of
'killall' and a restart script run as the correct user.

You can use the @reboot facility of cron to bring the Mongrels up for
a tenant when the machine starts, and a daily cron entry to restart
them to keep memory under control.
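
In crontab terms that might look something like this (the paths and the
mongrel_cluster config are assumptions, not a tested recipe):

  # crontab for the tenant's Unix user
  # bring the cluster up on boot
  @reboot cd /home/tenant/current && mongrel_rails cluster::start -C config/mongrel_cluster.yml
  # nightly restart to keep memory under control
  15 4 * * * cd /home/tenant/current && mongrel_rails cluster::restart -C config/mongrel_cluster.yml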

I like the idea of tenant = Unix user. It has a certain conceptual
charm to it, and if I can make it work it gives me a ton of leverage
from the base Unix tools.

Barking?

NeilW

S. Robert James

unread,
Dec 12, 2006, 1:32:46 PM12/12/06
to Ruby on Rails: Talk
Two great articles discussing exactly this:
http://msdn2.microsoft.com/en-us/library/aa479086.aspx
http://msdn2.microsoft.com/en-us/library/aa479069.aspx


Neil Wilson wrote:
> I'm trying to get a handle on the different ways of maintaining data
> separation in Rails. From what I've read it looks like usually the
> security is handled entirely as an aspect within the Model.
>
> I constantly find it amusing that whenever a 'new' way of doing
> applications is created, they always ignore the security systems that
> have gone before. First we had operating system security with its user
> and group database. Then we have databases with their own security
> model. Now we have web apps reinventing that particular wheel once
> again sitting in a single operating system user space and logging onto
> the database with a single all powerful user.
>

> Unfortunately the application I have in mind involves account data, and
> I can't afford a bug in an application exposing one customer's data to
> another. I need something more substantial than that. (And there are
> other reasons - such as backup). However I still want to share physical
> infrastructure.
>

> My thoughts are that there should be a URL per customer driving their
> own mongrels locked onto their own version of the database. However the
> standard infrastructure support tools don't support that way of doing
> things.
>

snacktime

unread,
Dec 12, 2006, 2:01:12 PM12/12/06
to rubyonra...@googlegroups.com
On 12/12/06, S. Robert James <srober...@gmail.com> wrote:
>
> Two great articles discussing exactly this:
> http://msdn2.microsoft.com/en-us/library/aa479086.aspx
> http://msdn2.microsoft.com/en-us/library/aa479069.aspx
>

One thing I would add to this is that even when using separate
databases or schemas, it pays to design your tables as if the data
were all in one database/schema.

Also, as an FYI for those who are interested: we spent a good amount
of time working on different ways to use Rails in an environment where
user data was separated by schemas. One thing that's worked fairly
well is the set_table_name method, which can be used to set the
schema.tablename at the start of each request. At a slight hit in
performance we actually do something like the following:

- Start of request
- set_table_name 'schema.table'
- Do stuff
- set_table_name 'none'
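
In before/after filter form that amounts to something like this (the
model names and the way the tenant's schema is looked up are assumptions
on my part):

  # application.rb - sketch of per-request schema switching with set_table_name
  class ApplicationController < ActionController::Base
    before_filter :point_models_at_tenant_schema
    after_filter  :reset_table_names

    private

    def point_models_at_tenant_schema
      schema = current_tenant_schema   # however the tenant is identified
      Project.set_table_name "#{schema}.projects"
      Account.set_table_name "#{schema}.accounts"
    end

    def reset_table_names
      # reset to the defaults (the steps above use a dummy 'none' instead);
      # either way the point is not to leak the previous tenant's schema
      # into the next request
      Project.set_table_name "projects"
      Account.set_table_name "accounts"
    end
  end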

Neil Wilson

unread,
Dec 12, 2006, 2:15:27 PM12/12/06
to rubyonra...@googlegroups.com

V. Interesting. Thanks for that.

BTW You'll be glad to hear that the Multi-tenant system is progressing
(at snail's gallop, but at least it's moving forward). I have a
brittle proof of concept up on a Debian Etch Xen platform.

One of the interesting side effects of using Capistrano to deploy code
once per tenant is that file system sessions suddenly scale rather
well.

Since multi_tenant is built entirely as a set of Capistrano recipes
and plugins I'll probably run any posts on the Capistrano group rather
than here - where it may get lost in the noise.

Stay tuned

NeilW

S. Robert James

unread,
Dec 12, 2006, 2:37:16 PM12/12/06
to Ruby on Rails: Talk
Neil - I've tried emailing you directly but my messages bounce - could
you email me - I think we may be able to collaborate here.

S. Robert James

unread,
Dec 12, 2006, 2:43:49 PM12/12/06
to Ruby on Rails: Talk
One thing I've done to keep DRY:
# environment.rb
require 'config/tenants'

# tenants.rb
PRODUCTION_TENANTS = ['joe', 'fred', 'bob']

# database.yml
<% PRODUCTION_TENANTS.each do |tenant| %>
production_<%= tenant %>:
  adapter: postgresql
  database: <%= tenant %>
  username: <%= tenant %>
  password: useasharedpasswordforalltenants
  host: localhost
<% end %>

One hitch I've had is that Rails wants to load (environment name).rb and
crashes if it can't. I'd rather use the same production.rb for all of
them. Any ideas?
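
One workaround that comes to mind (purely a sketch, and the tenant names
are the same illustrative list as above) is to symlink each per-tenant
environment file to the shared production.rb, so Rails finds a file at
the expected path without duplicating any configuration:

  # run from RAILS_ROOT
  for tenant in joe fred bob; do
    ln -sf production.rb config/environments/production_$tenant.rb
  done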
