[upcoming] Disk pools

dkal...@pivotal.io

Aug 7, 2014, 1:45:37 PM
to bosh...@cloudfoundry.org
Hey all,

After receiving multiple requests for a way to specify IaaS properties for certain persistent disks, I wanted to share some proposed changes before the BOSH team implements them.

The name of this new feature is disk pools. They would work similarly to how resource pools work for VMs.

Disk pools would solve the following use cases (I'm sure each IaaS has its own disk configuration that can be exposed):
- AWS SSD disks (iops requests)
- vSphere thin provisioned disks

Here are the proposed changes to the deployment manifest (let's say this is for AWS):

```
resource_pools:
- name: fast_machines
  cloud_properties:
    instance_type: m3.2xlarge

# disk_pools is optionally specified
disk_pools: # OR persistent_disk_pools: <-------------------- new 
- name: fast_disks
  cloud_properties:
    volume_type: gp2

jobs:
- name: mydb
  templates: [ mysql ]
  instances: 1
  resource_pool: fast_machines
  persistent_disk: 3_000
  persistent_disk_pool: fast_disks # OR disk_pool <-------------------- new
  networks: [ {...} ] 
  properties: {}

- name: backup
  persistent_disk: 10_000
  # no persistent_disk_pool key provided so CPI will pick default settings (this is for backwards compatibility)
  ...
```

An alternative format for jobs with persistent disk (looks slightly cleaner):

```
- name: mydb
  persistent_disk:
    size: 3000
    disk_pool: fast_disks # <-------------------- new
```

From the CPI perspective, the create_disk method would have to accept cloud_properties (just like create_vm takes the cloud_properties of the resource pool).
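
As a rough illustration of that CPI change, here is a minimal sketch in Python. It is not the actual BOSH CPI code: the `FakeCpi` class, the `size_mb` parameter name, and the generated disk ids are all assumptions made for the example. The only point it shows is create_disk taking an optional cloud_properties hash and falling back to defaults when none is given.

```python
from typing import Any, Dict, Optional


class FakeCpi:
    """Hypothetical in-memory CPI, used only to illustrate the signature change."""

    def __init__(self) -> None:
        # disk id -> what the CPI was asked to create
        self.disks: Dict[str, Dict[str, Any]] = {}

    def create_disk(self, size_mb: int,
                    cloud_properties: Optional[Dict[str, Any]] = None) -> str:
        # No cloud_properties means the job named no disk pool, so the CPI
        # falls back to its own defaults (the backwards-compatible path).
        props = dict(cloud_properties or {})
        disk_id = f"disk-{len(self.disks) + 1}"
        self.disks[disk_id] = {"size_mb": size_mb, "cloud_properties": props}
        return disk_id


cpi = FakeCpi()
fast = cpi.create_disk(3000, {"volume_type": "gp2"})  # job using fast_disks
plain = cpi.create_disk(10000)                        # job with no disk pool
```

Making the new argument optional is what would let existing manifests without a disk pool keep working unchanged.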

We will start working on the disk pool feature within a few weeks in collaboration with Piston. Let us know what you guys think.

Matthew Kocher

Aug 7, 2014, 1:51:01 PM
to bosh...@cloudfoundry.org
I like the second cleaner option.

Since we can tell an integer from a hash, it's backwards compatible, but in the long run it's probably a good idea to start versioning the deployment manifest schema.
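
To make the integer-vs-hash inference concrete, here is a minimal sketch in Python. The function name and the normalized output shape are assumptions for illustration, not anything the BOSH director is known to do. It accepts either the existing integer form or the proposed hash form of persistent_disk.

```python
from typing import Any, Dict, Optional, Union


def normalize_persistent_disk(
    value: Union[int, Dict[str, Any], None]
) -> Optional[Dict[str, Any]]:
    """Accept the old integer form or the proposed hash form."""
    if value is None:
        return None  # the job has no persistent disk
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("persistent_disk must be an integer or a hash")
    if isinstance(value, int):
        # Old form: `persistent_disk: 3000`, so no disk pool is named.
        return {"size": value, "disk_pool": None}
    if isinstance(value, dict):
        # Proposed form: `persistent_disk: {size: 3000, disk_pool: fast_disks}`.
        if "size" not in value:
            raise ValueError("persistent_disk hash requires a size")
        return {"size": value["size"], "disk_pool": value.get("disk_pool")}
    raise TypeError("persistent_disk must be an integer or a hash")


old_style = normalize_persistent_disk(10000)
new_style = normalize_persistent_disk({"size": 3000, "disk_pool": "fast_disks"})
```

Both forms end up in the same shape, so the rest of the code would not need to know which one the manifest used.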


To unsubscribe from this group and stop receiving emails from it, send an email to bosh-dev+u...@cloudfoundry.org.

Ferran Rodenas

Aug 7, 2014, 5:38:27 PM
to bosh...@cloudfoundry.org
I also like the "cleaner" option. But instead of 'persistent_disk' (singular), can we use 'persistent_disks' (plural) and make the contents an array, even though we only support 1 disk per VM now? This will be helpful if someday BOSH supports multiple disks per job, as it won't require manifest changes.

- Ferdy

Matthew Kocher

Aug 7, 2014, 6:29:30 PM
to bosh...@cloudfoundry.org
While we're wishing for things - we eventually need to figure out a way to have persistent disks (one or many) associated with a job template, not a job. Uncolocating two templates with persistent disk is impossible right now.

Dr Nic Williams

Aug 7, 2014, 6:29:31 PM
to bosh...@cloudfoundry.org
I like the 2nd/3rd approach - persistent_disks:

This is similar to the move from "template: foo" to "templates: { "name": "foo", "release": "bar" }"

--
Dr Nic Williams
Stark & Wayne LLC - consultancy for Cloud Foundry users
twitter @drnic

Ferran Rodenas

Aug 7, 2014, 6:31:30 PM
to bosh...@cloudfoundry.org
+100 I've been facing this problem recently, and it really hurts.

- Ferdy

Dr Nic Williams

Aug 7, 2014, 6:34:10 PM
to bosh...@cloudfoundry.org
Zero/One persistent disk (of some size/type) per job template?

jobs:
- name: core
  templates:
  - name: postgres
    release: cf
    persistent_disk:
      size: 3000
      disk_pool: fast_disks
  - name: uaa
    release: cf

?

Dmitriy Kalinin

Aug 7, 2014, 8:00:53 PM
to bosh...@cloudfoundry.org
Here are my thoughts on the distinction between jobs and job templates and why I think job templates should not have their own disks (or even _know_ what a disk is):

- deployment job - represents a single (scalable?) unit (e.g. web_worker, executor, load_balancer, db_node). Each unit gets either 0 or 1 persistent disk (in the future we can support multiple persistent disks). Currently BOSH places each deployment job on its own VM, but it does not have to stay that way.

- job template - represents _one_ of the processes that's needed for a deployment job to be useful.

Because a deployment job is a single unit of work, giving individual job templates their own persistent disks does not really fit.

To tie this all together, having the ability to compose a deployment job from multiple job templates provides a powerful way to combine software into a single unit; however, this feature is sometimes misused to place multiple deployment jobs onto a single VM (unfortunately that's the only way to do that today). I believe that placing multiple deployment jobs onto a single VM is a first-class feature that needs to be properly implemented after we solve namespacing of jobs, packages, etc.

David Laing

Aug 7, 2014, 8:02:18 PM
to bosh...@cloudfoundry.org
Great news!  Thanks for getting the ball rolling on this one.

I think that all attributes about a specific disk should be managed at the top level disk_pools:, with the job: just containing a reference to which disk it wants to use (just like resource_pools: )

ie:

resource_pools:
- name: zone1_large
  cloud_properties:
    instance_type: m3.2xlarge

disk_pools: 
- name: zone1_mysql_1
  size: 3000
  cloud_properties:
    volume_type: gp2

jobs:
- name: mydb
  templates: [ mysql ]
  instances: 1
  resource_pool: zone1_large
  disk_pool: zone1_mysql_1 
  networks: [ {...} ] 
  properties: {}
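
A minimal sketch of how such a reference could be resolved, in Python. The `resolve_disk` helper, the returned shape, and the `router` job are assumptions for illustration, not how the director is known to work. It looks up a job's disk_pool by name in the top-level disk_pools list, the same way a resource_pool name would be looked up.

```python
from typing import Any, Dict, List, Optional


def resolve_disk(job: Dict[str, Any],
                 disk_pools: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the size and cloud_properties for a job's disk, if it has one."""
    pool_name = job.get("disk_pool")
    if pool_name is None:
        return None  # the job asks for no persistent disk
    pools_by_name = {pool["name"]: pool for pool in disk_pools}
    if pool_name not in pools_by_name:
        raise KeyError(
            f"job {job['name']!r} references unknown disk pool {pool_name!r}"
        )
    pool = pools_by_name[pool_name]
    return {
        "size": pool["size"],
        "cloud_properties": pool.get("cloud_properties", {}),
    }


disk_pools = [
    {"name": "zone1_mysql_1", "size": 3000,
     "cloud_properties": {"volume_type": "gp2"}},
]
mydb = {"name": "mydb", "disk_pool": "zone1_mysql_1"}
router = {"name": "router"}  # hypothetical job without a persistent disk

mydb_disk = resolve_disk(mydb, disk_pools)
router_disk = resolve_disk(router, disk_pools)
```

With the size living in the pool, two jobs that need different sizes of the same disk type would need two pools, which is the trade-off against keeping the size on the job.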


--
David Laing
Trading API @ City Index
da...@davidlaing.com
http://davidlaing.com
Twitter: @davidlaing

Dr Nic Williams

Aug 7, 2014, 8:07:54 PM
to bosh...@cloudfoundry.org, bosh...@cloudfoundry.org
David, ideas on disks-per-job-template?

Dr Nic Williams

Aug 7, 2014, 8:10:39 PM
to bosh...@cloudfoundry.org, bosh...@cloudfoundry.org
Dmitriy, colocating unrelated job templates is currently a practical reality in order to use resources efficiently. In AWS/OpenStack, users aren't able to create different instance types.

David Laing

Aug 7, 2014, 8:18:45 PM
to bosh...@cloudfoundry.org
disks-per-job-template doesn't really make sense to me; I guess because I tend to have the same template attached to multiple jobs for HA.

eg:

jobs:
- name: elasticsearch_z1
  resource_pool: z1_large
  templates: [ "elasticsearch", "metrics-shipper" ]
- name: elasticsearch_z2
  resource_pool: z2_large
  templates: [ "elasticsearch", "metrics-shipper" ]
- name: elasticsearch_z3
  resource_pool: z3_large
  templates: [ "elasticsearch", "metrics-shipper" ]

Associating a disk with a template doesn't really make sense (to me) in this scenario.

Matthew Boedicker

Aug 7, 2014, 10:08:13 PM
to bosh-dev
This sounds like a great feature and will allow us to overcome a persistent disk IO bottleneck on AWS.

Should we call these something other than pools? I've always found the term "resource pool" confusing because resource pool and disk pool have other meanings in vSphere. To me those sections of the manifest are configs or profiles.


Ferran Rodenas

Aug 8, 2014, 1:29:20 AM
to bosh...@cloudfoundry.org
I have a different thought on "deployment jobs" and "job templates".

For me, a "job template" represents the "single unit of work", and therefore it should be able to have its own disks. The release developer is responsible for defining which processes make up the "single unit of work", and no one can modify that unit. If he decides that the "job template" must have more than 1 "process", then he can use monit to start/stop whatever "processes" are necessary to run that "unit of work", including dependencies if necessary (see dea_next job template: https://github.com/cloudfoundry/cf-release/blob/master/jobs/dea_next/monit).

A "deployment job" is the way to deploy that "unit of work". The deployment operator is responsible for defining how he wants to deploy the "job template": he can deploy a single one per "deployment job", or he can colocate more than 1 (the best use case is what DrNic pointed out: "efficiently use resources").

I believe this combination is quite powerful and flexible. Adding another layer (vm?) will add more complexity and seems unnecessary IMHO.

Besides that, I second Matt's complaint about naming it 'pool'. 'resource_pool' makes some sense (albeit debatable), because CF-BOSH spins up a pool of VMs and then assigns a role (job) to each VM. But this is not the case for persistent disks (I guess).

- Ferdy