[upcoming] Disk pools

dkal...@pivotal.io

Aug 7, 2014, 1:45:37 PM

to bosh...@cloudfoundry.org

Hey all,

After receiving multiple requests for a way to specify IaaS properties for certain persistent disks, I wanted to share some proposed changes before the BOSH team implements them.

The name of this new feature is disk pools. They would work similarly to how resource pools work for VMs.

Disk pools would solve the following use cases (I'm sure each IaaS has its own disk configuration that can be exposed):

- AWS SSD disks (IOPS requests)

- vSphere thin provisioned disks

Here are the proposed changes to the deployment manifest (let's say this is for AWS):

```
resource_pools:
- name: fast_machines
  cloud_properties:
    instance_type: m3.2xlarge

# disk_pools is optionally specified
disk_pools: # OR persistent_disk_pools: <-------------------- new
- name: fast_disks
  cloud_properties:
    volume_type: gp2

jobs:
- name: mydb
  templates: [ mysql ]
  instances: 1
  resource_pool: fast_machines
  persistent_disk: 3_000
  persistent_disk_pool: fast_disks # OR disk_pool <-------------------- new
  networks: [ {...} ]
  properties: {}

- name: backup
  persistent_disk: 10_000
  # no persistent_disk_pool key provided so CPI will pick default settings (this is for backwards compatibility)
  ...
```

An alternative format for jobs with persistent disk (looks slightly cleaner):

```
- name: mydb
  persistent_disk:
    size: 3000
    disk_pool: fast_disks # <-------------------- new
```

From the CPI perspective, the create_disk method would have to accept cloud_properties (just as create_vm takes the cloud_properties of the resource pool).

We will start work on the disk pool feature within a few weeks, in collaboration with Piston. Let us know what you guys think.

Matthew Kocher

Aug 7, 2014, 1:51:01 PM

to bosh...@cloudfoundry.org

I like the second, cleaner option.

Since we can tell an integer from a hash, it stays backwards compatible, but in the long run it's probably a good idea to start versioning the deployment manifest schema.
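A minimal sketch of what that inference would have to handle, assuming the hash form proposed earlier in the thread (the `size` and `disk_pool` keys come from that proposal and are not final syntax; the jobs are trimmed to the relevant key):

```yaml
jobs:
# Existing integer form: size only, so the CPI picks its default disk settings.
- name: backup
  persistent_disk: 10_000

# Proposed hash form: the size plus a reference to a named disk pool.
- name: mydb
  persistent_disk:
    size: 3000
    disk_pool: fast_disks
```

Because the two shapes differ in type (integer vs. hash), both could be accepted side by side without a schema version field.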


Ferran Rodenas

Aug 7, 2014, 5:38:27 PM

to bosh...@cloudfoundry.org

I also like the "cleaner" option. But instead of 'persistent_disk' (singular), can we use 'persistent_disks' (plural) and make the contents an array, even though we only support 1 disk per VM now? This will be helpful if someday BOSH supports multiple disks per job, as it won't require manifest changes.
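For illustration only, the plural form might look like this, assuming each array entry takes the same `size` and `disk_pool` keys as the proposed hash form (`persistent_disks` is the key suggested above, not an existing manifest key):

```yaml
jobs:
- name: mydb
  # Hypothetical plural key: an array holding exactly one entry for now.
  persistent_disks:
  - size: 3000
    disk_pool: fast_disks
```

Supporting more disks later would then mean appending entries rather than renaming the key.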

- Ferdy

Matthew Kocher

Aug 7, 2014, 6:29:30 PM

to bosh...@cloudfoundry.org

While we're wishing for things - we eventually need to figure out a way to have persistent disks (one or many) associated with a job template, not a job. Un-colocating two templates that both use the persistent disk is impossible right now.

Dr Nic Williams

Aug 7, 2014, 6:29:31 PM

to bosh...@cloudfoundry.org

I like the 2nd/3rd approach - persistent_disks:

This is similar to the move from "template: foo" to "templates: { "name": "foo", "release": "bar" }"
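Sketched side by side, assuming the list-of-hashes shape for `templates` (`foo` and `bar` are the placeholder names from the line above; the job names are hypothetical):

```yaml
jobs:
# Older singular form: one template referenced by name.
- name: old_style
  template: foo

# Newer plural form: a list of templates, each naming its release.
- name: new_style
  templates:
  - name: foo
    release: bar
```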


--

Dr Nic Williams

Stark & Wayne LLC - consultancy for Cloud Foundry users

http://drnicwilliams.com

http://starkandwayne.com

cell +1 (415) 860-2185

twitter @drnic

Ferran Rodenas

Aug 7, 2014, 6:31:30 PM

to bosh...@cloudfoundry.org

+100 I've been facing this problem recently, and it really hurts.

- Ferdy

Dr Nic Williams

Aug 7, 2014, 6:34:10 PM

to bosh...@cloudfoundry.org

Zero/One persistent disk (of some size/type) per job template?

jobs:
- name: core
  templates:
  - name: postgres
    release: cf
    persistent_disk:
      size: 3000
      disk_pool: fast_disks
  - name: uaa
    release: cf

?

Dmitriy Kalinin

Aug 7, 2014, 8:00:53 PM

to bosh...@cloudfoundry.org

Here are my thoughts on the distinction between jobs and job templates, and why I think job templates should not have their own disks (or even _know_ what a disk is):

- deployment job - represents a single (scalable?) unit (e.g. web_worker, executor, load_balancer, db_node). Each unit gets either 0 or 1 persistent disk (in the future we can support multiple persistent disks). Currently BOSH places each deployment job on its own VM, but that does not have to stay that way.

- job template - represents _one_ of the processes that are needed for a deployment job to be useful.

Because a deployment job is a single unit of work, providing individual job templates with persistent disks does not really fit.

To tie this all together: the ability to compose a deployment job from multiple job templates provides a powerful way to combine software into a single unit. However, this feature is sometimes misused to place multiple deployment jobs onto a single VM (unfortunately that's the only way to do that today). I believe that placing multiple deployment jobs onto a single VM is a first-class feature that needs to be properly implemented after we solve namespacing of jobs, packages, etc.

David Laing

Aug 7, 2014, 8:02:18 PM

to bosh...@cloudfoundry.org

Great news! Thanks for getting the ball rolling on this one.

I think that all attributes of a specific disk should be managed in the top-level disk_pools: section, with each job just containing a reference to the disk pool it wants to use (just like resource_pools:)

i.e.:

resource_pools:
- name: zone1_large
  cloud_properties:
    instance_type: m3.2xlarge

disk_pools:
- name: zone1_mysql_1
  size: 3000
  cloud_properties:
    volume_type: gp2

jobs:
- name: mydb
  templates: [ mysql ]
  instances: 1
  resource_pool: zone1_large
  disk_pool: zone1_mysql_1
  networks: [ {...} ]
  properties: {}


--
David Laing
Trading API @ City Index
da...@davidlaing.com
http://davidlaing.com
Twitter: @davidlaing

Dr Nic Williams

Aug 7, 2014, 8:07:54 PM

to bosh...@cloudfoundry.org, bosh...@cloudfoundry.org

David, ideas on disks-per-job-template?

Dr Nic Williams

Aug 7, 2014, 8:10:39 PM

to bosh...@cloudfoundry.org, bosh...@cloudfoundry.org

Dmitriy, colocating unrelated job templates is currently a practical necessity for using resources efficiently. In AWS/OpenStack, users can't define their own instance types.

David Laing

Aug 7, 2014, 8:18:45 PM

to bosh...@cloudfoundry.org

disks-per-job-template doesn't really make sense to me; I guess because I tend to have the same template attached to multiple jobs for HA.

eg:

jobs:
- name: elasticsearch_z1
  resource_pool: z1_large
  templates: [ "elasticsearch", "metrics-shipper" ]
- name: elasticsearch_z2
  resource_pool: z2_large
  templates: [ "elasticsearch", "metrics-shipper" ]
- name: elasticsearch_z3
  resource_pool: z3_large
  templates: [ "elasticsearch", "metrics-shipper" ]

Associating a disk with a template doesn't really make sense (to me) in this scenario.

Matthew Boedicker

Aug 7, 2014, 10:08:13 PM

to bosh-dev

This sounds like a great feature and will allow us to overcome a persistent disk IO bottleneck on AWS.

Should we call these something other than pools? I've always found the term "resource pool" confusing because resource pool and disk pool have other meanings in vSphere. To me those sections of the manifest are configs or profiles.


Ferran Rodenas

Aug 8, 2014, 1:29:20 AM

to bosh...@cloudfoundry.org

I have a different thought on "deployment jobs" and "job templates".

For me, a "job template" represents the "single unit of work", and therefore job templates should be able to have their own disks. The release developer is responsible for defining which processes make up the "single unit of work", and no one can modify that unit. If he decides that the "job template" must have more than 1 "process", then he can use monit to start/stop whatever "processes" are necessary to run that "unit of work", including dependencies if necessary (see dea_next job template: https://github.com/cloudfoundry/cf-release/blob/master/jobs/dea_next/monit).

A "deployment job" is the way to deploy that "unit of work". The deployment operator is responsible for defining how he wants to deploy the "job template": he can deploy a single one per "deployment job", or he can colocate more than 1 (the best use case is what DrNic pointed out: "efficiently use resources").

I believe this combination is quite powerful and flexible. Adding another layer (vm?) will add more complexity and seems unnecessary IMHO.

Besides that, I second Matt's complaint about naming it 'pool'. 'resource_pool' makes some sense (albeit debatable), because CF-BOSH spins up a pool of VMs and then assigns a role (job) to each VM. But this is not the case for persistent disks (I guess).

- Ferdy
