Any recommendations to implement clustered service instances?


yssk22

Jun 19, 2012, 11:15:21 AM
to vcap-dev
Hi,

I'm planning to implement a clustered version of some service gateways/
nodes (the first target would be mongodb). Our basic
requirements are:

A. replica set (in mongodb's terms) support via plan config (sketched below)
B. support for adding/deleting instances in a cluster instance.
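
To illustrate what I mean by "via plan config" in A (this is only a sketch of the idea in Python; the keys and plan names are my own invention, not an existing vcap-services option):

# Purely illustrative: the plan name decides whether an instance is a
# single node or a replica set, and how many members it gets.
PLANS = {
    "free":    {"replica_set": False, "members": 1},
    "cluster": {"replica_set": True,  "members": 3},
}

def members_for_plan(plan):
    # A gateway could look up the requested plan and provision that many
    # instances (3 here would map to a 3-member mongodb replica set).
    return PLANS[plan]["members"]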

Regarding these requirements, does the dev team have any direction for
the implementation?

I can think of the following two types of topologies.

A. gateway <-> node <-> cluster manager <--> [instance, instance, instance, ...]

B. gateway <-> node <-> [instance, instance, ...]
               node <-> [instance, instance, ...]
               node <-> [instance, instance, ...]

As for A, cluster management is out of scope for the vcap-services
implementation; the node just proxies provisioning requests to a cluster
manager outside CF. This case would be easy to apply to products with
management-friendly APIs (I think Couchbase would be a good example), but
it would be more difficult for the infrastructure deployment (BOSH or
something similar) to adapt to each product.

As for B, cluster management has to be done inside the vcap-services
implementation, so the cloud controller database (and each node's local
sqlite3 db) has to hold cluster info and track cluster health and
availability. The gateway and node implementations would be more complex,
but the infrastructure deployment would stay the same as it is today.
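
As a rough idea of the node-local bookkeeping for B (Python/sqlite3, purely illustrative; the table layout is only my assumption, not an existing vcap schema):

import sqlite3

# Hypothetical node-local schema for tracking cluster membership and health.
conn = sqlite3.connect("cluster_info.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS cluster_members (
    service_id  TEXT    NOT NULL,  -- provisioned service instance id
    member_host TEXT    NOT NULL,  -- host of one cluster member
    member_port INTEGER NOT NULL,
    role        TEXT    NOT NULL,  -- e.g. 'primary' or 'secondary'
    healthy     INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (service_id, member_host, member_port)
);
""")

# A health checker on the node could flip the healthy flag, and the gateway
# (or cloud controller) could read it when deciding to fail over or notify.
conn.execute("UPDATE cluster_members SET healthy = 0 WHERE member_host = ?",
             ("10.0.0.12",))
conn.commit()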

What do you guys think about provisioning clustered versions of
service instances?

Nicholas Kushmerick

Jun 19, 2012, 2:24:59 PM
to vcap...@cloudfoundry.org
We are actually in the design phase of doing this as well.  In the medium/long term we certainly don't want to duplicate effort, but in the short term it would be great for multiple approaches to help us reach consensus on the best approach.  Here's an overview of our plan:
  • A Cloud Foundry Service Gateway would be configured to provision a service instance across N Service Nodes
  • N is just a parameter; as far as the gateway is concerned it can be any value, though only some values might make sense for specific services
  • for review, the provisioning process today (i.e., N=1) works like this:
    • Gateway selects one Node using a simple load balancing algorithm
    • Gateway forwards provisioning request to the selected Node
    • Node receives the request and performs whatever operations are appropriate for the service (eg for a database: CREATE DATABASE foo; CREATE USER bar; etc)
    • Gateway receives back from Node the instance's credentials (host, port, username, password, etc -- whatever makes sense for the service)
  • The proposed protocol for supporting N > 1 is as follows:
    • Gateway selects N best Nodes according to the same load-balancing algorithm
    • Gateway forwards provisioning request to one 'seed' Node, chosen arbitrarily from the selected nodes
    • 'Seed' Node receives the request and performs whatever operations are appropriate
    • Gateway receives back from 'seed' node the instance's credentials
    • Gateway forwards provisioning request to the other N-1 "child" Nodes, along with the credentials for the 'seed' node
    • Child nodes perform whatever operations are appropriate, including remembering/registering/etc the seed node
    • Gateway receives back from the child nodes their instance credentials.
    • Gateway bundles all N sets of credentials together to supply to the user's app, which will know what to do with them all
For services that are clustered in a master-slave organization, the instance on the seed node naturally becomes the master.  But for services in which (for example) all nodes are peers, this seed/child protocol is just a way for the instances to discover each other during the provisioning process.  After provisioning is complete, the nodes can interact with each other in whatever way they want.
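
To make the flow concrete, here is a rough gateway-side sketch (Python pseudocode only; the helper names and parameters are placeholders I've made up, not the actual vcap-services-base API):

import random

def provision_cluster(ranked_nodes, n, provision_on_node):
    # 'ranked_nodes' is the list of candidate Nodes already ordered by the
    # usual load-balancing score; 'provision_on_node' stands in for the
    # gateway -> node provisioning request/response round trip.
    selected = ranked_nodes[:n]            # pick the N best Nodes
    seed = random.choice(selected)         # seed chosen arbitrarily

    # Phase 1: provision on the seed Node and get its credentials back.
    seed_creds = provision_on_node(seed, peer_credentials=None)

    # Phase 2: provision on the N-1 child Nodes, passing along the seed's
    # credentials so each child can remember/register the seed.
    all_creds = [seed_creds]
    for child in selected:
        if child is seed:
            continue
        all_creds.append(provision_on_node(child, peer_credentials=seed_creds))

    # Bundle all N credential sets to supply to the user's app.
    return all_creds

The point of the design is that the gateway needs no service-specific knowledge; it only relays the seed's credentials to the children, and each Node decides what to do with them.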

We plan to build this into the services base library (https://github.com/cloudfoundry/vcap-services-base) to make life easy for any service that wants to use this capability.

I'd certainly be interested in hearing more about your plans/ideas.

-- Nick

yssk22

Jun 20, 2012, 2:03:20 AM
to vcap-dev
Thanks for sharing your plan, which is very similar to our plan B above.

I'd like to discuss a few points in more detail.

> For services that are clustered in a master-slave organization,

In this case, there would be a limitation around switching the master
server when the original master goes down. Is that right? Or do you plan
to implement some 'recovery' mechanism?
If the master fails and is not recovered by vcap, we should provide
notifications and a recovery operation feature in vmc.
# a command like 'vmc cluster-service instance1 instance2 instance3 --
master instance0' might be required.

>  But for services in which (for example) all nodes are peers, this seed/child protocol

In this case, applications should be able to receive a notification not
to use dead nodes when some of the instances go down... The current
service configuration mechanism (using env vars) would be a potential
issue for this implementation.
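
For example, an application today reads its credentials only once, at startup, from the VCAP_SERVICES env var. The 'members' layout below is only my guess at how the bundled N-node credentials might look:

import json
import os

# The app parses VCAP_SERVICES (JSON) once at startup. I am assuming the
# bundled cluster credentials would appear as a 'members' list inside the
# 'credentials' hash; that layout is just a guess.
services = json.loads(os.environ.get("VCAP_SERVICES", "{}"))

for label, instances in services.items():
    for instance in instances:
        creds = instance.get("credentials", {})
        members = creds.get("members", [creds])   # fall back to a single node
        hosts = [(m.get("host"), m.get("port")) for m in members]
        print(label, instance.get("name"), hosts)

# Because the env var is fixed for the life of the app process, marking one
# of these members as dead later needs either an app restart (to pick up
# fresh env vars) or some out-of-band notification channel.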

>  After provisioning is complete, the nodes can interact with each
> other in whatever way they want.

I agree with this idea, but applications would want to control when the
newly provisioned nodes join the cluster, because joining could cause a
performance issue in production services.


dick....@senzilla.com

Aug 1, 2012, 4:20:56 AM
to vcap...@cloudfoundry.org
Thanks all for bringing this up. Taking vcap services in this direction is crucial, imho.
Has there been any more work on this front? Where or how can I help to move things forward?

Nicholas Kushmerick

Aug 2, 2012, 12:44:38 PM
to vcap...@cloudfoundry.org
I'm afraid there hasn't yet been any progress beyond some email discussions.

This work will happen in the "multinode" branch: https://github.com/cloudfoundry/vcap-services-base/tree/multinode .  Feel free to dig in and submit your code to reviews.cloudfoundry.org.
--
Nick
phone +1.206.293.5186 · skype nicholaskushmerick