Hello
We have a project where we are trying to add multitenancy to an application that cannot handle multitenancy on its own. We do this by running multiple instances of the application within OpenStack tenants, in combination with consul service discovery and a custom service gateway (nginx+lua).
Our first plan was to start a consul client agent in every tenant to announce the services, and to limit the capabilities of the client agents using ACL tokens and service name prefixes. The whole consul cluster would be a single datacenter with 3-5 server agents and multiple (up to 100) client agents.
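To make the first plan a bit more concrete, here is a rough sketch of how the per-tenant ACL rules could be generated. The tenant prefix "tenant-a-" and the function name are made up for illustration, and the rule syntax assumes the legacy ACL system, where a service rule matches by name prefix (newer consul versions use service_prefix instead). It also assumes a default policy of deny, so a token only gets what is listed.

```python
def tenant_acl_rules(tenant_prefix):
    """Return ACL rules (HCL text) that allow registering only services
    whose names start with the given tenant prefix.

    Assumption: legacy consul ACL syntax, where `service "<prefix>"`
    is matched as a prefix, combined with a default policy of deny.
    """
    return (
        f'service "{tenant_prefix}" {{\n'
        '  policy = "write"\n'
        '}\n'
    )


if __name__ == "__main__":
    # "tenant-a-" is a hypothetical naming convention for one tenant.
    print(tenant_acl_rules("tenant-a-"))
```

The resulting rules text would then be attached to one token per tenant, and that token would be the only one handed to the client agent in that tenant.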
During development we also had the idea to take the separation of tenants a step further: run a single server agent as its own datacenter within every tenant, and run a 3-5 node consul cluster in a "master" datacenter that would also be the acl_datacenter. The outcome would be a master datacenter with only a small number of nodes and up to 100 datacenters connected to it. This idea might sound completely crazy, but it worked out pretty well in our first small tests (3-5 DCs).
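For reference, the per-tenant server configuration in this second setup would look roughly like the output of the sketch below. The datacenter naming scheme, the master server addresses and the function name are placeholders, not our real values; the keys (datacenter, server, bootstrap_expect, acl_datacenter, retry_join_wan) are standard consul agent options.

```python
import json


def tenant_server_config(tenant, master_dc="master",
                         master_servers=("10.0.0.1", "10.0.0.2", "10.0.0.3")):
    """Build the agent config for the single server agent of one tenant.

    Assumptions: datacenter names follow "tenant-<name>" and the
    master servers are reachable at the given placeholder addresses.
    """
    return {
        "datacenter": f"tenant-{tenant}",        # every tenant is its own DC
        "server": True,
        "bootstrap_expect": 1,                   # single-server datacenter
        "acl_datacenter": master_dc,             # ACLs are resolved in the master DC
        "retry_join_wan": list(master_servers),  # join the WAN gossip pool
    }


if __name__ == "__main__":
    print(json.dumps(tenant_server_config("a"), indent=2))
```

With 100 tenants and 3-5 master servers, the WAN pool would therefore contain roughly 103-105 server agents, which is the number the scaling question is about.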
The question that arose is about the scaling capability of the WAN gossip pool. Is consul able to handle such a high number of datacenters, and what problems might we face at scale? Keep in mind that the normal WAN considerations (latency, timeouts, ...) wouldn't apply here, because all tenants would be on the same private cloud.
I would be very grateful if a developer could provide some input on this idea. Also, if you think that this question is better suited for the serf mailing list, just tell me.
Best regards
Lukas