Diego Transition Plans


Onsi Fakhouri

Jan 30, 2015, 6:05:53 PM
to vcap...@cloudfoundry.org
Hello again vcap-dev,

Diego continues to make progress.  We're close to an internal beta for the production environment that Pivotal hosts.

A plan for how Diego will roll out is crystallizing.  I wanted to run this by the list.

First, a proviso: this e-mail is less about the distribution and deployment of Diego. We currently have a separate BOSH release called diego-release; when, how, and whether it gets integrated with cf-release is still up in the air and out of scope here. Rather, I'd like to focus on how developers with apps deployed to CF will transition their applications from the DEAs to Diego.

## Phasing Diego in, phasing the DEAs out

We will have an extended period of time in which increasingly stable versions of Diego will be deployed *alongside* the DEAs.  We would like operators to deploy Diego alongside the DEAs during this period and encourage their developers to give Diego a try and give us feedback.

For application developers, switching to Diego should be a relatively transparent process (though there are a handful of ∆s that we will be documenting in time).

After several months we will reach the stage where we call Diego stable and stop running DEAs in our own production environment.  By this point all applications will need to have been transitioned to Diego.

We think this approach will allow people to play with Diego, build trust around Diego, and give us feedback about Diego.  This also minimizes risk (we don't want anyone to be in a situation where they've turned off the DEAs and end up stuck with an application that doesn't work with Diego).

## How do applications opt into Diego?

We'll be introducing a new `diego` boolean column on all applications in the CC database.  This will default to false.  Setting the column to `true` will cause the application to stage and run on Diego.  This will be possible via `cf curl`, though we will likely add a transitional flag to `cf push` that will enable setting this more easily from the CLI.
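As a rough sketch of what the `cf curl` route might look like, the helper below only assembles the command line; it does not run anything. The `/v2/apps/<guid>` path and the `{"diego": true}` payload are assumptions about how the new column would be exposed through the CC API, and `diego_opt_in_command` is a made-up name. The `-X` and `-d` flags are existing `cf curl` options.

```python
import json


def diego_opt_in_command(app_guid, enabled=True):
    """Build the argv for a `cf curl` call that sets the `diego` flag.

    Assumption: the flag is exposed as a `diego` field on the app
    resource at /v2/apps/<guid>; the final API may differ.
    """
    payload = json.dumps({"diego": enabled})
    return ["cf", "curl", f"/v2/apps/{app_guid}", "-X", "PUT", "-d", payload]


if __name__ == "__main__":
    # "example-guid" is a placeholder; `cf app APP_NAME --guid` prints the real one.
    print(" ".join(diego_opt_in_command("example-guid")))
```

Passing `enabled=False` builds the call that would send the application back to the DEAs.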

For downtimeless transitions to Diego we recommend using your favorite blue-green technique to spin up a version of the application on Diego, then spin down the old version off of the DEAs.

Instead of a blue-green deploy one could simply set the `diego` boolean to true on an existing application.  This will cause the application to immediately start on Diego and *eventually* get reaped from the DEAs.  There will be no guarantees that the application is safely running on Diego before reaping it from the DEAs.

Once we live in a Diego-only world we will delete the `diego` boolean.

## Who sets the `diego` boolean?

This will strongly depend on operator<->developer culture.  Some operators may want to give their developers control over the boolean.  Some may want to retain control themselves.  To enable both use-cases we will add a bosh-configurable option to the CC that controls who has authority to set the `diego` boolean.  We'd like to keep things simple and have this configuration apply across the entire CF installation, not on a per-org basis.
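To make the two use-cases concrete, here is a minimal sketch of the check such an installation-wide option implies. The setting names and the `may_set_diego_flag` helper are hypothetical; the actual BOSH property and its values have not been decided.

```python
from enum import Enum


class DiegoFlagAuthority(Enum):
    """Hypothetical values for the installation-wide setting."""
    OPERATORS_ONLY = "operators_only"
    DEVELOPERS_AND_OPERATORS = "developers_and_operators"


def may_set_diego_flag(is_operator, is_space_developer, authority):
    """Return True if this user may change an app's `diego` boolean."""
    if is_operator:
        # Operators keep control under either setting.
        return True
    if authority is DiegoFlagAuthority.DEVELOPERS_AND_OPERATORS:
        return is_space_developer
    return False
```

Because the setting applies to the whole installation, the check takes no org or space argument.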

With this one could cook up a transition scenario like:
- Operators give developers the ability to switch to Diego and encourage them to do so and send feedback
- As the permanent switch to Diego approaches operators could audit all applications.  They might then reach out to developers who haven't migrated their applications to Diego yet.
- Eventually, operators could revoke the developer's ability to modify the `diego` boolean.  Operators could then slowly bleed remaining applications off of the DEAs onto Diego ahead of the deploy.

## How do operators audit the `diego` boolean?

Operators will be able to use the CC API to fetch applications with `diego=true` or `diego=false`.
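A small audit sketch, assuming the app records have already been fetched from the CC API (for instance with a filter along the lines of `q=diego:false`, which is an assumption about the query syntax). The record shape, a dict with `name` and `diego` keys, is also assumed for illustration.

```python
def partition_by_runtime(apps):
    """Split app records into (on_diego, on_deas) lists of app names.

    Each record is assumed to be a dict with "name" and "diego" keys;
    a missing "diego" key is treated as False, matching the proposed default.
    """
    on_diego, on_deas = [], []
    for app in apps:
        target = on_diego if app.get("diego", False) else on_deas
        target.append(app["name"])
    return on_diego, on_deas
```

The second list is the one an operator would act on as the cut-over date approaches.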

Open-source tooling could be built on top of this to (for example) slowly bleed load from the DEAs onto Diego's Cells.  We will wait on community feedback before committing to building any such tooling.
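One way such tooling might pace itself is to migrate applications in fixed-size batches rather than all at once. This is only a sketch: `migration_batches` is a hypothetical helper, and the actual flip of each app's `diego` flag (plus any pause between batches) is left to the caller.

```python
def migration_batches(app_guids, batch_size):
    """Yield successive batches of app GUIDs to move onto Diego."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(app_guids), batch_size):
        yield app_guids[start:start + batch_size]
```

An operator script could loop over the batches, set `diego=true` for each app in a batch, and wait for the Cells to settle before moving on.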

Thoughts?

Onsi

Mike Youngstrom

Jan 30, 2015, 6:15:04 PM
to vcap...@cloudfoundry.org
I think this plan sounds great.  For our org I don't think we'll have any trouble running Diego and the DEAs side by side for a while, breaking Diego in over a couple of months until we eventually change the default and communicate a date to our users on which we kill the DEAs, forcing them to transition.

Question: will the exact same droplet theoretically also run on Diego, or will apps that are forced to switch have to be restaged?

I'm getting excited for Diego! :)

Mike

--
You received this message because you are subscribed to the Google Groups "Cloud Foundry Developers" group.
To view this discussion on the web visit https://groups.google.com/a/cloudfoundry.org/d/msgid/vcap-dev/CAFwdB-yiEbq25C%3DOVpO5VeCm_sNyJqmuAQZnKq459ZGPJzrysg%40mail.gmail.com.

To unsubscribe from this group and stop receiving emails from it, send an email to vcap-dev+u...@cloudfoundry.org.

Onsi Fakhouri

Jan 30, 2015, 6:31:20 PM
to vcap...@cloudfoundry.org
Yup.  Exact same droplet should run on Diego and vice versa.  If you flip the Diego bool back and forth your app will ping pong between both runtimes without a need to restage.

Onsi

Michael M Behrendt

Jan 30, 2015, 11:26:13 PM
to vcap...@cloudfoundry.org
Thanks for the update, Onsi.

For a hosted version of CF, we wouldn't want to expose the transition from DEAs to Diego to developers; it should be transparent (a developer shouldn't have to worry about changes to the platform they're running on). So ideally we should have the ability to flip a switch (the boolean flag you described below?) that triggers the migration from the DEAs to Diego in a non-disruptive fashion, whereas below you're saying there won't be any availability guarantees when moving an app in such a way.

What are your thoughts on how to address this scenario?


Michael

Onsi Fakhouri

Feb 2, 2015, 1:27:05 PM
to vcap...@cloudfoundry.org, Michael M Behrendt
Hi Michael,

> For a hosted version of CF, we wouldn't want to expose to developers the transition from DEAs to Diego, it should be transparent (a developer shouldn't have to worry about changes of the platform he's running on).

Perhaps.  The nature of this change is fairly massive, however, and I believe there is wisdom in giving developers a chance to kick the tires while Diego is in beta.  In particular, developers who have heavily invested in the platform and depend on it for their production environments would probably be happy to contribute back by helping ensure that the new version of the platform suits their needs.

> So ideally we should have the ability to flip a switch (the boolean flag you described below?), which triggers the migration from the DEAs to Diego in a non-disruptive fashion, while below you're saying there won't be any guarantees wrt availability when moving an app in such a way.

This is a very important point to clarify.  I am not proposing a single boolean flag that automatically migrates all applications from the DEAs to Diego.  The boolean flag lives on the application and it will be up to operators (either in concert with developers, or not) to migrate applications over.  For very large installations this should be done with planning and care (for example, suddenly downloading 10,000 droplets may not go over well with your network).

In terms of guarantees with respect to availability: you are correct.  To solve this problem most correctly we would need to teach the Cloud Controller to run one application on both the DEAs and Diego simultaneously.  This adds a burden of complexity that we would rather avoid if possible.

> What are your thoughts on how to address this scenario?

Our goal is to give operators the APIs they need to orchestrate this migration as they see fit (in particular: this is as much about interacting with people as it is interacting with technology).  We’re open to building tooling/pre-packaged opinions on top of the API - though it would be best, I believe, to do this collaboratively with the community.







Noburou Taniguchi

Feb 3, 2015, 8:52:16 PM
to vcap...@cloudfoundry.org, ofak...@pivotal.io
Would you please consider a third option: "diego: half", or making "diego" a decimal fraction ranging from 0.0 to 1.0?

Pros:
- can try Diego with low risk, and transition with confidence

Cons:
- makes things more complex

Is this a bad idea?

On Saturday, January 31, 2015 at 8:05:53 AM UTC+9, Onsi Fakhouri wrote:

Onsi Fakhouri

Feb 4, 2015, 12:40:25 PM
to vcap...@cloudfoundry.org, Noburou Taniguchi
Woops, forgot to reply all:

> Would you please consider a third option: "diego: half" or making "diego" a decimal fraction ranging from 0.0 to 1.0.

What exactly would this do?  Pick a random subset of applications and put them on Diego?  My concern with the decimal fraction is that it removes control from the operator: which subset of applications will be migrated to Diego?  How do I opt an individual application in or out?  What we are proposing will give visibility into, and explicit, reproducible control over, what is on Diego and what is not.

One could fairly easily build tooling on top of our proposed API that could select a random subset of the applications and move them to Diego.
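For illustration, a sketch of such a selection step. `pick_random_subset` is a hypothetical helper, and the `seed` parameter is an addition of this sketch so that the same subset can be reproduced and reviewed before anything is moved.

```python
import random


def pick_random_subset(app_guids, fraction, seed=None):
    """Pick roughly `fraction` of the given app GUIDs to move to Diego.

    Passing the same `seed` with the same GUIDs yields the same subset.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    count = round(len(app_guids) * fraction)
    rng = random.Random(seed)
    # Sorting the result makes the chosen subset easier to diff and audit.
    return sorted(rng.sample(list(app_guids), count))
```

The tooling would then set `diego=true` on exactly the returned applications, so the operator still knows which ones were moved.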



> Pros:
> - can try Diego with low risk, and transit with confidence

I think the existing proposal satisfies this: with minimal effort operators and/or developers can opt into Diego and build confidence that their applications will work well with Diego.  As the time to transition approaches, operators would be able to find the subset of applications that have not migrated yet and take action (either contacting the developers, or migrating the applications themselves).

Thoughts?

Noburou Taniguchi

Feb 9, 2015, 9:53:22 AM
to vcap...@cloudfoundry.org, d...@nota.m001.jp, ofak...@pivotal.io
Woops, forgot to reply all, so I am resending my reply to this mailing list.

----

Onsi, 

I am sorry I didn't fully follow your first post. What I was hoping for can be achieved with the blue-green deployment technique you described:

> For downtimeless transitions to Diego we recommend using your favorite blue-green technique to spin up a version of the application on Diego, then spin down the old version off of the DEAs.

Now I have a question. Is it possible to change the diego attribute to true without causing an immediate restart of the app?

If so, we would have another way to try Diego: restart a portion of an application's instances with diego:true, since we already have an API that terminates a specific instance of an application.

I know this method carries some uncertainty, because an instance may restart at a time we don't control. Yes, the blue-green approach is right and safe. I ask just out of curiosity.

Another question: why not use the stack for this transition? My guess is that changing the stack may cause restaging (I'm not sure, because I have no experience of changing stacks).

Anyway, thank you for making me think twice about this topic.


On Thursday, February 5, 2015 at 2:40:25 AM UTC+9, Onsi Fakhouri wrote:

Simon

Feb 13, 2015, 6:13:19 PM
to vcap...@cloudfoundry.org, ofak...@pivotal.io
Has anyone deployed on VMware yet? If so, would you care to contribute your cf-infrastructure-vsphere.yml?