Support for multiple vSphere datacenters in the vSphere CPI


Mike Youngstrom

Apr 3, 2014, 3:42:58 PM4/3/14
to bosh-...@cloudfoundry.org
Does the vSphere CPI yet support deploying a single deployment across multiple datacenters?  If not, is there work planned in this space?

Today we deploy a single Cloud Foundry installation across 3 vSphere datacenters using 5 different BOSH deployments.  When we set this up, the CPI didn't support multiple vSphere datacenters with a single BOSH deployment, so this was our only choice.  We've since grown to like breaking our CF deployment into multiple BOSH deployments. [1]

This has forced us to fork cf-release, since cf-release, in a number of places, assumes that an entire Cloud Foundry installation is a single BOSH deployment [2].

In BOSH's opinion, should cf-release assume a single BOSH deployment == a single Cloud Foundry installation?  Or should cf-release do what it can to support breaking a Cloud Foundry installation up into multiple BOSH deployments?

What are BOSH users'/devs' thoughts on this topic?

Mike




Mike Youngstrom

Apr 8, 2014, 3:01:44 PM4/8/14
to bosh-...@cloudfoundry.org
Turns out that cf-release uses job name + index to uniquely identify a CF component.  So we just need to uniquely name our jobs in each deployment, and that should solve our problem.
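For anyone hitting the same thing, roughly what we mean (job, network, and deployment names below are invented for illustration, not our actual manifests):

```yaml
# Deployment manifest for datacenter 1
name: cf-dc1
jobs:
- name: dea_dc1        # DC-specific name keeps name+index unique
  template: dea_next
  instances: 4
  networks:
  - name: dc1_net
---
# Deployment manifest for datacenter 2
name: cf-dc2
jobs:
- name: dea_dc2        # would collide with DC1 if both were just "dea"
  template: dea_next
  instances: 4
  networks:
  - name: dc2_net
```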

Mike

Ryan Grenz

Apr 9, 2014, 6:51:39 AM4/9/14
to bosh-...@cloudfoundry.org
Hi Mike,

I've been meaning to reply to your post for a while, and am surprised nobody else has chipped in here yet, as this is a subject many people deploying Cloud Foundry must be familiar (or having difficulty) with by now.

We have 2 vSphere datacenters, with 2-3 clusters in each, and wish to spread a single Cloud Foundry deployment across them, as active/active as possible, using 2 BOSH deployments and leveraging DEA placement strategy functionality.

So far we have a working CF deployment across these datacenters, with DEA placement strategy working, using mostly spiff-generated manifests and a bit of manual manifest hackery on top. We are now looking at testing spreading job instances across multiple vSphere clusters in a deployment, as per https://www.pivotaltracker.com/s/projects/956238/stories/62927424 which was delivered back in mid-Feb.
This deployment uses single NATS and HM9000 instances in one of the DCs. Obviously we'd like to run these properly clustered; however, it still seems there are issues doing this.
cf-acceptance-tests currently run OK across this deployment.

Unfortunately, spiff does not support the concept of multiple deployments of cf-release: for example, the ability to list the IPs of the etcd and NATS instances from one or many DCs in other DCs' manifests. So this introduces an additional layer of YAML trickery to get those placed in, which we can script easily; it's just an additional layer of surgery on an existing BOSH manifest.
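A minimal sketch of the kind of post-spiff surgery we mean. The manifest layout and the `etcd.machines` property name here are assumptions for illustration, not cf-release's exact schema:

```ruby
require 'yaml'

# Collect the static IPs of a deployment's etcd job.
# Assumes jobs are named etcd_<dc>, which is our convention, not cf-release's.
def etcd_ips(manifest)
  job = manifest['jobs'].find { |j| j['name'].start_with?('etcd') }
  job['networks'].flat_map { |n| n['static_ips'] || [] }
end

dc1 = YAML.safe_load(<<~MANIFEST)
  jobs:
  - name: etcd_dc1
    networks:
    - name: dc1_net
      static_ips: [10.1.0.10, 10.1.0.11]
MANIFEST

dc2 = YAML.safe_load(<<~MANIFEST)
  jobs:
  - name: etcd_dc2
    networks:
    - name: dc2_net
      static_ips: [10.2.0.10]
  properties:
    etcd:
      machines: []
MANIFEST

# Point DC2's components at the union of etcd machines from both DCs.
dc2['properties']['etcd']['machines'] = etcd_ips(dc1) + etcd_ips(dc2)
puts dc2['properties']['etcd']['machines'].inspect
# => ["10.1.0.10", "10.1.0.11", "10.2.0.10"]
```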

Another thing we're trying to iron out is how to deal with Postgres and whether to use DRBD to replicate its data to a passive DB located in other DCs. This obviously adds more complexity to the deployment though.

I'm very much a supporter of cf-release supporting multiple BOSH deployments, and also interested to hear what other people are doing :)

Cheers,

Ryan

Ryan Grenz

Apr 9, 2014, 6:55:03 AM4/9/14
to bosh-...@cloudfoundry.org
I meant to add as well that BOSH doesn't yet support multi-DC deployment from a single manifest; however, it does support deploying to multiple clusters within a single DC from one BOSH manifest.
So doing multiple BOSH deployments to individual DCs, and leveraging the cluster support within each, should give you a pretty nice fine-grained deployment as a result. In theory!
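The shape we're planning to experiment with looks roughly like the fragment below. The datacenter and cluster names are invented, and the exact cloud_properties schema may differ by CPI version, so treat this as a sketch rather than a reference:

```yaml
resource_pools:
- name: dea_pool
  network: dc1_net
  stemcell:
    name: bosh-vsphere-esxi-ubuntu
    version: latest
  cloud_properties:
    cpu: 4
    ram: 8192
    disk: 16384
    # Per-resource-pool cluster placement within one DC;
    # names here are illustrative.
    datacenters:
    - name: dc1
      clusters:
      - cluster_a: {}
      - cluster_b: {}
```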

We are about to start testing this out :)

Ryan

Wayne E. Seguin

Apr 9, 2014, 9:23:54 AM4/9/14
to bosh-...@cloudfoundry.org
On Wed, Apr 9, 2014 at 6:51 AM, Ryan Grenz <in...@ryangrenz.com> wrote:
Another thing we're trying to iron out is how to deal with Postgres and whether to use DRBD to replicate its data to a passive DB located in other DCs. This obviously adds more complexity to the deployment though.

It would be better to rely on PostgreSQL 9.3's built-in streaming replication rather than DRBD for this; it is very simple to kick off from the remote DB once you configure the primary's pg_hba.conf to give the passive DB access.  That said, using DRBD to replicate its WAL directory (pg_xlog/) could work quite well as a secondary mechanism.
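Concretely, for 9.3 streaming replication that's roughly the following. The host names, standby IP, and replication user are placeholders:

```conf
# primary: postgresql.conf
wal_level = hot_standby
max_wal_senders = 3

# primary: pg_hba.conf (standby's IP and the replication user are examples)
host  replication  repl  10.2.0.5/32  md5

# standby: seed the data directory from the primary, e.g.
#   pg_basebackup -h primary.dc1 -U repl -D /var/lib/postgresql/9.3/main -X stream
# then create recovery.conf in the data directory:
standby_mode = 'on'
primary_conninfo = 'host=primary.dc1 port=5432 user=repl'
```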

Mike Youngstrom

Apr 9, 2014, 12:05:35 PM4/9/14
to bosh-...@cloudfoundry.org
Thanks for the response, Ryan.  Turns out we're not just spreading our deployment across datacenters but across vCenter instances as well (not sure why our vSphere architects designed it that way).  We're working on etcd integration right now, so you're further along than us there.

For the database we decided to go with clustered MySQL (Percona) [1].  It has been working well for us across the cluster.  Our big hang-up was with `spec.index`: we didn't realize our job names needed to be unique across deployments.  With that out of the way, we're good to go.

What kind of issues are you having running it "properly clustered"?

Mike



