HM9000 Relocation Strategy

Brian McClain

Apr 25, 2014, 1:59:39 PM

to vcap...@cloudfoundry.org

Hi all,

An interesting question came up today that got me thinking: how does the health manager decide the priority of instances to restore if a DEA drops off for one reason or another? Is it first come, first served? Or is there more involved?

To give a specific example, say we have three apps: App1, App2 and App3. They each have 2 instances. We have 2 DEAs, each with enough capacity to host 4 application instances. Now let's say the instances are distributed like so:

DEA 1

App1 #1

App1 #2

App2 #1

DEA 2

App2 #2

App3 #1

App3 #2

Now let's say DEA 2 goes offline (unexpectedly, so no evacuation done). On CFv1 when I last investigated this, I saw the following behavior:

DEA 1

App1 #1

App1 #2

App2 #1

App2 #2

Which left App3 completely down. Now the ideal (other than building this out with a proper amount of capacity, so that more than a bare minimum is held in reserve) is that an instance of App3 would be brought up on DEA 1 instead of the second instance of App2. This way, both apps are up and online.

I realize this is a super, super simple example and real-world deployments aren't as straightforward, but it's more a question fueled by curiosity.

- Brian

Alan Moran

Apr 25, 2014, 2:32:04 PM

to vcap...@cloudfoundry.org

Hi Brian,

As far as I know, when you push an application, its instances are distributed across more than one DEA; CC distributes the instances evenly so that you get zero downtime if a DEA goes down. I don't know specifically how CC relocates instances when they go down. My guess is that HM sends an event back to the Cloud Controller so that it creates the missing instances in an even distribution.

Can anyone confirm this behaviour?

-Alan

James Bayer

Apr 26, 2014, 6:28:14 PM

to vcap...@cloudfoundry.org

brian, it's first come, first served. there is currently no prioritization that would ensure all apps have a single instance running before other apps get a 2nd instance, and so on. instances will be balanced across available DEAs using the criteria here [1]. for operators that need to tolerate a DEA failure, or even a full AZ failure, we recommend having enough DEA capacity. in the future we will be implementing placement pools, which are basically collections of app executors that won't be used up by just any app, only by apps in the spaces that are associated with the placement pool. that should ensure capacity for the most important apps as long as the placement pool has enough extra capacity.

[1] https://github.com/cloudfoundry/cloud_controller_ng/blob/3a1907d2941ac5ff1a4568be9b0068b9cdbff16f/lib/cloud_controller/dea/dea_pool.rb#L38-L52
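
To make the first-come, first-served effect concrete, here is a minimal Python sketch of Brian's scenario. It is not the Cloud Controller's code: the `place` function, the uniform per-DEA `capacity`, and the tie-breaking rule (prefer the DEA running the fewest instances of the app, then the least loaded one) are all simplifying assumptions for illustration, and the real criteria are the ones in the linked `dea_pool.rb`.

```python
def place(start_requests, deas, capacity):
    """Place start requests strictly in arrival order.

    start_requests: app names, one entry per instance to start.
    deas: dict of DEA name -> list of app names running there (mutated).
    capacity: hypothetical uniform limit on instances per DEA.
    Returns the requests that could not be placed.
    """
    unplaced = []
    for app in start_requests:
        # Only DEAs with a free slot are candidates.
        candidates = [name for name, running in deas.items()
                      if len(running) < capacity]
        if not candidates:
            unplaced.append(app)
            continue
        # Assumed balancing rule: fewest instances of this app first,
        # then the least loaded DEA.
        best = min(candidates,
                   key=lambda name: (deas[name].count(app), len(deas[name])))
        deas[best].append(app)
    return unplaced


# DEA 2 is gone; DEA 1 has one free slot out of four.
surviving = {"DEA 1": ["App1", "App1", "App2"]}
print(place(["App2", "App3", "App3"], surviving, capacity=4))
print(surviving)
```

With the App2 request arriving first, the only free slot goes to App2 and both App3 requests are left unplaced, which matches what Brian observed. If an App3 request had arrived first, App3 would have taken the slot instead.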

--

Thank you,

James Bayer

Aristoteles Neto

Apr 27, 2014, 6:10:09 PM

to vcap...@cloudfoundry.org

That is indeed an interesting point. Arguably though, if an App has been deployed with multiple instances, it may require all of them to function under "normal" load, meaning that by reducing the number of instances of each App, you may render both Apps unstable and non-operational (instead of only one of them).

-- Neto

Onsi Fakhouri

Apr 28, 2014, 1:17:18 AM

to vcap...@cloudfoundry.org

The Health Manager has no insight into the resources available in the DEA pool so it does not make its decisions based on how much spare capacity is available. In fact, HM simply instructs the CC to start/stop instances. It is the CC's responsibility to place these instances.

One minor point here (in practice, this does not really affect the situation very much). HM orders and prioritizes the start messages based on the *fraction* of missing instances -- the higher this fraction, the higher the priority of the start message. In your case, when DEA2 disappears App2 will have 1/2 missing instances whereas App3 will have 2/2 missing instances. As a result, the start messages for App3 will be sent over NATS first. In practice, these messages are sent out asynchronously and quickly and - with multiple CCs listening - it becomes a crapshoot as to which start message is actually processed first.
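
The ordering rule described above can be sketched roughly as follows. This is an illustrative Python sketch, not HM9000's actual implementation: the `prioritized_starts` function and the `(desired, running)` input shape are hypothetical, and it only models the order in which start messages would be emitted, not what happens once several CCs pick them up.

```python
from fractions import Fraction


def prioritized_starts(apps):
    """Order start messages by the fraction of missing instances.

    apps: dict of app name -> (desired instance count, running instance count).
    Returns one app name per missing instance, highest fraction first.
    """
    pending = []
    for name, (desired, running) in apps.items():
        missing = desired - running
        if desired <= 0 or missing <= 0:
            continue  # nothing to start for this app
        priority = Fraction(missing, desired)
        # One start message per missing instance, all sharing the app's priority.
        pending.extend((priority, name) for _ in range(missing))
    # Stable sort: apps with equal priority keep their original relative order.
    pending.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in pending]


# After DEA 2 disappears: App2 is missing 1/2, App3 is missing 2/2.
apps = {"App1": (2, 2), "App2": (2, 1), "App3": (2, 0)}
print(prioritized_starts(apps))  # ['App3', 'App3', 'App2']
```

The emitted order favors App3, but because the messages are consumed asynchronously, the order in which they are acted upon is not guaranteed to match.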

Onsi

Brian McClain

Apr 28, 2014, 10:30:38 AM

to vcap...@cloudfoundry.org

Makes sense, guys. Thanks for the insight!

James, I think that's a great idea. As I'm sure is the case for others, we have a core set of apps that are deemed more important than the rest (i.e. if the app that's used as an internal dashboard goes down, no big deal; if the app that handles authentication across all of our apps goes down, really big deal), so having the option to separate these out into placement pools sounds very interesting.

Onsi, thanks for the explanation on that point; it does make it clear why CF doesn't force this behavior (nature of the distributed beast, I suppose).

- Brian
