Hello!
As you may have seen, Kubernetes 1.19 got a new alpha feature, generic
ephemeral volumes:
- https://kubernetes.io/blog/2020/09/01/ephemeral-volumes-with-storage-capacity-tracking/
- https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/
Planning for 1.20 just started, so now is a good time to discuss how
this feature should move forward. During the initial review of the
feature, several questions were raised that still need to be answered:
- Do we really need both "generic ephemeral volumes" and "CSI ephemeral
volumes" (the older feature, currently in beta)?
- How can we avoid confusion given that both (at first glance) are so
similar?
- What should the final API look like? Can we somehow unify different
volume sources under a common API?
Let me start by pointing out that the two volume types are indeed very
different: CSI ephemeral volumes are for simple, local driver
deployments that don't need and don't support provisioning and
attaching. Drivers are specifically written for Kubernetes. Generic
ephemeral volumes on the other hand are based on provisioning, which
opens up the whole range of features that are supported for that. No
changes are needed in the driver, the "ephemeral" part is handled
entirely by Kubernetes.
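To make that concrete, here is a minimal sketch of a pod using the current "CSI ephemeral volume" API. The pod/container names are illustrative; the driver name is the same placeholder used further below:

```yaml
# Sketch of a pod with a CSI ephemeral (inline) volume.
# Pod name, image, and attributes are illustrative placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: my-csi-app
spec:
  containers:
    - name: app
      image: busybox
      volumeMounts:
        - name: my-inline-vol
          mountPath: /data
  volumes:
    - name: my-inline-vol
      csi:
        driver: inline.storage.kubernetes.io
        volumeAttributes:
          foo: bar
```

The volume is defined entirely inline in the pod spec and handed directly to the driver, with no PVC or provisioning involved.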
Because they are designed for different use-cases, the API is also
different: CSI ephemeral volumes are parameterized via arbitrary,
custom, per-volume key/value pairs instead of cluster-wide storage
classes. Generic ephemeral volumes have the same parameters as
persistent volumes (custom parameters in the storage class, size and data
source per-volume).
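For comparison, the cluster-wide parameters for generic ephemeral volumes live in a StorageClass, just as for persistent volumes. A minimal sketch (the provisioner name and parameters are illustrative placeholders):

```yaml
# Illustrative StorageClass referenced by an ephemeral volume claim template.
# Provisioner name and parameters are placeholders, not a real driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: scratch-storage-class
provisioner: example.csi.k8s.io
parameters:
  type: fast
```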
Could "generic ephemeral volumes" replace "CSI ephemeral volumes"?
Probably not. The driver would become more complex and provisioning is
likely to be slower (but that would need to be measured). Developers of
CSI drivers that were specifically written for "CSI ephemeral volumes"
are in a better position than myself to comment on this - personally I
don't mind deprecating and removing that feature... so speak up now! :-)
What I can say is that the "generic ephemeral volumes" approach was
necessary to enable features like storage capacity tracking that simply
wouldn't have fit into the concept of "CSI ephemeral volumes".
Perhaps if the naming were more obviously different, there would be less
confusion. If someone has any bright ideas, then please share them. For
my part, I already tried to explain both concepts in the documentation
and blog posts linked above.
When introducing "generic ephemeral volumes", the following API was
chosen with the notion that the new "ephemeral" volume source might get
extended to also include some of the other, already existing ephemeral
volume sources:
  ephemeral:
    volumeClaimTemplate:
      metadata:
        labels:
          type: my-frontend-volume
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "scratch-storage-class"
        resources:
          requests:
            storage: 1Gi
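For context, the same volume source embedded in a complete pod spec might look like this (pod and container names are illustrative):

```yaml
# Sketch of a pod using a generic ephemeral volume; names are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: my-frontend
spec:
  containers:
    - name: app
      image: busybox
      volumeMounts:
        - name: scratch-volume
          mountPath: /scratch
  volumes:
    - name: scratch-volume
      ephemeral:
        volumeClaimTemplate:
          metadata:
            labels:
              type: my-frontend-volume
          spec:
            accessModes: [ "ReadWriteOnce" ]
            storageClassName: "scratch-storage-class"
            resources:
              requests:
                storage: 1Gi
```

Kubernetes creates a PVC from the template when the pod is scheduled and deletes it together with the pod.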
For example, "CSI ephemeral volumes" could become:
  ephemeral:
    csi:
      driver: inline.storage.kubernetes.io
      volumeAttributes:
        foo: bar
This basically would replace a flat list of alternatives in the
VolumeSource struct
(https://pkg.go.dev/k8s.io/api/core/v1?tab=doc#VolumeSource) with a
hierarchical structure where all (?) volume sources that are considered
"ephemeral" are grouped in EphemeralVolumeSource
(https://pkg.go.dev/k8s.io/api/core/v1?tab=doc#EphemeralVolumeSource).
IMHO this does not improve the API and the code just becomes more
complex, too. I would prefer to go the other way: remove
EphemeralVolumeSource and put VolumeClaimTemplate directly into the
VolumeSource struct, together with all other volume sources. Tim Hockin
suggested that during the API review.
As a side effect, EphemeralVolumeSource.ReadOnly would get dropped,
something that Saad was interested in because it is only present for
feature parity with other volume sources that have it. If the volume is
supposed to be read-only inside the pod, then this can still be
specified via the VolumeMount.ReadOnly flag.
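In other words, instead of a ReadOnly flag on the volume source, the pod would mark the mount itself read-only. A minimal sketch (the volume name is illustrative):

```yaml
# Marking the ephemeral volume read-only via VolumeMount.ReadOnly;
# the volume name is an illustrative placeholder.
volumeMounts:
  - name: scratch-volume
    mountPath: /scratch
    readOnly: true
```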
Because the fields in the two volume source APIs are so different, I
don't see how they can be merged into one struct. Suppose an
"EphemeralVolumeSource" allows specifying all the fields from both
types. We would need an additional enum to distinguish which of the
fields are supposed to be used - this doesn't make sense to me.
It's also not the case that one is a true superset of the other. It
might technically be possible to shoe-horn the fields from CSI ephemeral
volume into the PVC template (parameters -> labels, driver name ->
storage class name, no separate enum to determine the type) and then
treat that like the current CSI ephemeral volume source when there is a
corresponding CSIDriver object, but such an API would not be at all
obvious and would be error-prone because the API server cannot validate
it well.
Looking at the beta criteria for generic ephemeral volumes
(https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1698-generic-ephemeral-volumes#alpha---beta-graduation),
the API design was covered above. I can handle some of the open
technical work (errors as pod events, tests). That leaves gathering of
feedback for the feature.
Let me start by throwing out some questions. Perhaps a more formal
questionnaire would be better, but I'm not sure which tool to use for
that.
Do you think it is useful? Do you plan to use it yourself? For which
use-cases? How many of your pods in the cluster (as percentage and/or
absolute number) will use generic ephemeral volumes? Are those short- or
long-lived? This influences how scalable and fast this solution needs
to be.
We don't have graduation criteria defined for CSI ephemeral volumes. If
we decide to keep the current API for those, are they perhaps already
ready for GA? If not, what's missing?
--
Best Regards
Patrick Ohly