Is this a BUG REPORT or FEATURE REQUEST?:
@kubernetes/sig-storage-feature-requests
What happened:
This is a follow-up to the 1.11 design discussions regarding volume attach limits and how they will work with delayed volume binding. Here's a rough sketch of how we can integrate the two, based on previous discussions. I'm also trying to consider how this could be extended to work for arbitrary resources.
```yaml
kind: StorageClass
...
nodeResources:
- name: "attachable-volumes-gce-pd"
  quantity: 1
- name: "memory"
  quantity: 100M
```

`name` is the name of the resource that is consumed, and `quantity` is how much of it one PV consumes.

Currently, the max volume count predicate handles resources prefixed with "attachable-volumes", and the PodFitsResources predicate handles the rest. Both would have to be extended to also account for resources consumed by unbound and bound PVs via `StorageClass.nodeResources`.
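As a sketch of what that extended check might look like (all type and field names below are hypothetical simplifications, not the actual scheduler API), the predicate would sum the `nodeResources` requested by each PV's StorageClass and compare the totals against what the node has left:

```go
package main

import "fmt"

// Hypothetical, simplified types for illustration only; the real
// predicates operate on the Kubernetes API objects.
type StorageClass struct {
	NodeResources map[string]int64 // resource name -> quantity consumed per PV
}

type Node struct {
	Allocatable map[string]int64 // resource name -> total capacity
	Used        map[string]int64 // resource name -> amount already consumed
}

// fitsNodeResources checks whether PVs bound/provisioned from the given
// StorageClasses fit within the node's remaining resources. This mirrors
// how both the max volume count predicate and PodFitsResources would need
// to account for StorageClass.nodeResources.
func fitsNodeResources(node Node, classes []StorageClass) bool {
	requested := map[string]int64{}
	for _, sc := range classes {
		for name, qty := range sc.NodeResources {
			requested[name] += qty
		}
	}
	for name, qty := range requested {
		if node.Used[name]+qty > node.Allocatable[name] {
			return false
		}
	}
	return true
}

func main() {
	node := Node{
		Allocatable: map[string]int64{"attachable-volumes-gce-pd": 16},
		Used:        map[string]int64{"attachable-volumes-gce-pd": 15},
	}
	sc := StorageClass{NodeResources: map[string]int64{"attachable-volumes-gce-pd": 1}}
	fmt.Println(fitsNodeResources(node, []StorageClass{sc}))     // one attach point left: fits
	fmt.Println(fitsNodeResources(node, []StorageClass{sc, sc})) // two new PVs: does not fit
}
```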
I'm also trying to see if there's a way we can handle reporting dynamic provisioning capacity for local volumes through a similar or the same mechanism. It would need a special resource prefix again for special handling by the volume binding predicate. And if we want to support more than just local volumes, then we would need a way to specify a topology object kind (e.g. a Rack or Zone CRD, or maybe some first-class TopologyResource object) where the allocatable resources are specified. If so, then this may no longer be just nodeResources. I need to see how much overlap there may be with the device manager here.
Would like to hear your thoughts on this idea. cc @gnufied @liggitt
/cc
It's a little different from it in my mind. Is `quantity` the amount of the resource a single volume consumes, rather than a total capacity? If so, how should the predicate make use of it? Can you give more details? Also, what is the relationship between the `memory` resource and the PodFitsResources predicate, or is that just one example of "arbitrary resources"?

Looks good. One question regarding:
```yaml
- name: "attachable-volumes-gce-pd"
  quantity: 1
```

Will the `quantity` setting for `attachable-volumes-*` ever be something other than 1?
What is the use case for supporting arbitrary resources within PVs or StorageClasses? Before we go there, I think we should gather some real-world data on how those resources would be applied and how the scheduler has to be modified to take them into account. What happens when individual PVs can share the memory (like a shared FUSE mount)? Aren't memory limits applicable to PVs very much storage-implementation specific? What happens to those resource requirements when the storage provider version or implementation changes, such as AWS EFS vs. off-the-shelf NFS, GlusterFS vs. ceph-glusterfs, or GlusterFS v1.x vs. v2.x?
Can the volume attach limit and the memory limit of a PV conflict with each other? Who wins in that case?
You mean the quantity is the amount of the resource a volume consumes, but not the total capacity?
Sorry, I wasn't clear. The total capacity is reported in the Node's resources, and the consumption of a to-be-provisioned PV is reported in the StorageClass. Something like capacity would need to be special-cased with a prefix, since the capacity is inferred from the PVC request.
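A sketch of that special-casing (the `capacity-` prefix and the helper below are hypothetical, not an existing API): for ordinary `nodeResources` the per-PV consumption is the fixed quantity from the StorageClass, while a capacity-style resource would instead take its consumption from the PVC's storage request:

```go
package main

import (
	"fmt"
	"strings"
)

// consumptionFor returns how much of a node resource a to-be-provisioned PV
// consumes. Ordinary resources use the fixed quantity from the StorageClass;
// a hypothetical "capacity-" prefixed resource is instead inferred from the
// PVC's storage request, since capacity varies per claim.
func consumptionFor(resourceName string, classQuantity, pvcRequestBytes int64) int64 {
	if strings.HasPrefix(resourceName, "capacity-") {
		return pvcRequestBytes
	}
	return classQuantity
}

func main() {
	// Attach points: fixed per-PV cost of 1, regardless of PVC size.
	fmt.Println(consumptionFor("attachable-volumes-gce-pd", 1, 100<<30))
	// Capacity: cost is the PVC's requested size (100Gi here).
	fmt.Println(consumptionFor("capacity-gce-pd", 0, 100<<30))
}
```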
Is there a scenario where a driver provisions a single PV that maps to (or spans over) multiple EBS/PD/AzDisk volumes?
@liggitt had a hypothetical example of a driver RAIDing two EBS volumes together.
What is the use case of supporting arbitrary resources within PV or storageClasses?
Volume drivers like NFS or Ceph have kernel components that can consume system memory and network bandwidth that is unaccounted for. That being said, I agree, we probably have many other volume scaling issues that are more pressing than this. The most important resources we're trying to deal with currently are attachable limits and capacity. Limits for other system resources could potentially be accounted for through attach limits in the near term, but I wanted to see if we could come up with a more general API that could be extended in the future, if needed.
One more use case to further complicate all of this:
In addition to attach limits per node, GCE PD also has maximum capacity per node of 64TB on most instances. To handle this, I think we need to count the capacity of PVCs on the node (and inline volumes won't work).
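Counting that could follow the same shape as the attach-limit check (the function and numbers below are illustrative only): sum the requested capacity of the PVCs already placed on the node plus the incoming claim, and reject if the total exceeds the per-node maximum, e.g. 64TB for GCE PD on most instances:

```go
package main

import "fmt"

// Per-node GCE PD capacity limit on most instance types (64TB).
const maxNodeCapacityBytes = int64(64) << 40

// capacityFits reports whether adding a new PVC of requestBytes to a node
// already hosting the given PVC sizes would stay within the node limit.
// Inline volumes are not counted, which is why they wouldn't work here.
func capacityFits(existingPVCBytes []int64, requestBytes int64) bool {
	var total int64
	for _, b := range existingPVCBytes {
		total += b
	}
	return total+requestBytes <= maxNodeCapacityBytes
}

func main() {
	existing := []int64{32 << 40, 30 << 40}    // 62TB already claimed on the node
	fmt.Println(capacityFits(existing, 1<<40)) // 1TB more: 63TB, fits
	fmt.Println(capacityFits(existing, 4<<40)) // 4TB more: 66TB, exceeds the limit
}
```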
Issues go stale after 90d of inactivity.
Mark the issue as fresh with `/remove-lifecycle stale`.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
I updated the design doc on how to integrate the max PD predicate with volume topology: kubernetes/community#2711
I left out supporting a volume taking up multiple attach points and a volume consuming other node resources like memory/cpu. We don't have any concrete use cases or needs for those yet, and we can reconsider if something comes up. The design for integration is much simpler in this case.
/assign
I will pick this up as we discussed. Thanks for updating the design doc!
/remove-lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with `/reopen`.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
> Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
> /close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Closed #65201.