Dynamic Resource Allocation (DRA) was a hot topic at KubeCon last week, primarily because DRA promises to unlock a whole host of use-cases that require fine-grained sharing and custom configuration of accelerated hardware in Kubernetes. If I had to pick an overall theme of this KubeCon, it would be "How do we make Kubernetes THE best platform for running LLMs and GenAI workloads?" -- and having access to the flexible resource management that DRA provides is a key component of this.
However, such flexibility comes at a cost (specifically as it pertains to scheduling and cluster auto-scaling), and that has raised questions that need to be addressed before moving DRA to beta (and eventually GA).
To help resolve these issues (and help move DRA forward in general), @pohly and I are going to create a formal WG for DRA, with ourselves as the "Organizers":
https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md#creation-process-description
More info (including a poll of when / how often people think we should meet) will be coming soon.
Looking forward to working with all of you more closely on this!