Comparing: subresources vs fine-grained permissions
lavalamp, jpbetz, * 2022-09-07
Lien proposal is blocked on what to do over a subresource, so that permissions can be made safe. Author (at deads2k’s request) wants to add a finalizer subresource to make things consistent. I am balking at the logical end game of adding a bunch of subresources.
Generic controller which operates on (reads and mutates) e.g. finalizers or labels, but should not be able to modify other parts of objects
System admin who wants to be sure that such controllers can’t e.g. modify secrets
allow viewing/modification of certain fields without read permissions on other fields
(if this is critical it can maybe be accommodated with additional effort)
Currently I think I prefer the tradeoffs in design 3. But we need to get feedback/buy-in from SIG Auth.
Make an additional subresource per metadata field needing this property (finalizers, liens, labels, annotations, maybe ownerReferences).
Pros:
Prior art: “scale” subresource, “finalize” subresource (kind of), “status”
Security model already exists (subresource in RBAC configuration)
Security model is mostly straightforward
Cons:
Having many of these would scale up apiserver complexity, each one is special code
Client complexity also increases a lot, for every general subresource you add M*N functions to all clients (M=operations on the subresource, N=number of resource types)
SSA can work on subresources but it was a pain to set up (and making more would require N more Apply functions in clients for each)
clients with permissions for more than one of these subresources would have to make a separate request per separate field they want to change
Changing the version of the subresource’s data type is tricky / undefined
Existing subresources are inconsistent WRT whether they show the whole object (e.g. /status) or a filtered view (e.g. /scale).
Existing subresources are actually very confusing to new users, they do not expect to have to call a special e.g. UpdateStatus method to update the status.
I personally think the cons in design 1 scale with the number of subresources and become overwhelming after ~2; I can only imagine doing this if we can limit in advance the number of general subresources needed and be sure we’d never need additional ones.
Make a new FineGrainLimitedRole CRD and FineGrainBinding CRD (omitting the *Cluster versions of this for simplicity here)
FGLR: contains a field selection expression (explained at bottom of doc) which selects fields which *may* be changed (allowlist).
FGB: contains a {user or group}, resource type, optional namespace, optional name (all for telling whether the given user is limited when touching the given thing; and a link to the FGLR describing the limitation.
A validating webhook runs on updates; if the user/group and type/namespace/name match any FGBs, then: it unions the FGLR’s field selection expressions, it compares the old & new object for differences, and verifies that all changed fields are members of the unioned field selection expression.
Pros:
No apiserver work needed; can be installed in clusters of any recent version; prototype out-of-tree
No additional complexity for people who don’t want to use it, no client impact
Works for all types, even custom resources, with no changes; no per-resource work in the future
Works with existing authz systems (although it’s configured separately)
Cons:
Webhook that hooks everything is a reliability anti-pattern
Security model is now overly-broad RBAC + limitation; if the latter is forgotten, the consequence is overly broad permission; we’d prefer failures to result in less permissions, not more.
2x the configuration is required for clients we wish to be limited
Add a new verb, e.g. LIMITEDEDIT
RBAC roles with that verb include a field selection expression as in design 2
RBAC authz plugin treats this verb as the same as UPDATE/APPLY/PATCH
RBAC package now adds an admission plugin which runs with validating webhooks; if a request was marked with the new verb, the admission plugin verifies that all fields changed are selected by the field selection expression.
If multiple LIMITEDEDIT roles match, the first one is chosen (see “merging vs first match” below for explanation of this choice)
Pros:
Configuration story is almost as straightforward as with subresources
Works for all types including custom resources; no per-resource work in the future
Cons:
apiserver changes are necessary and not trivial
RBAC API objects get more complex
For edit requests, if a user doesn’t have the PUT/PATCH permissions, a virtual “PERFIELD” permission is checked; if the user has that permission, the edit request will proceed to the point at which we know the proposed update (i.e., after the patch or SSA request has been expanded but before webhooks are called).
At that point, a changed fieldset is produced using the structured-merge-diff tools.
Now we do secondary auth checks for each changed field. Two possibilities:
A: for each changed field, beginning from the root field path element, query the authz system about the virtual “type.edit.<field>” permission. ALLOW means it’s allowed, DENY means it’s denied, no answer means to check the next field path element because some are allowed and some not.
B: In a non-RBAC configuration object, we associate virtual permissions with fieldsets. To see if a user is allowed, we list all the virtual permissions having fieldsets which cover the changed fieldset. Then we do authz checks on all these permissions to see if the user has any of them.
This mechanism permits negative permissions, i.e. you can edit spec only if you DON’T have the “nospec” permission.
Pros:
Works with any random existing authz integration (mostly like subresources)
Works for all resource types
Only apiserver needs to be changed (+ potentially a new built in resource)
Cons:
Configuration will require the configurer to understand how this checking works. We may need to provide a utility to help people test the permissions.
At least 2 additional authz calls per write using this permission (existing caching should mostly make this a non-issue though?). More if option A is chosen.
Changes in apiserver will be complex-ish
Many virtual permissions might cause many set-cover computations; I think that should be OK for reasonable numbers and object sizes but this needs to be double checked.
The CEL for Admission Control KEP proposes "Secondary Authz" checks as a potential capability. How it would work:
Policies are defined using a CEL expression to enforce the field access.
How this feature is exposed is still being defined, it would likely be via a function, e.g. "authzCheck("resourcename","labeleditor")" or via a variable, e.g. "admissionReview.resourceRoles.labeleditor"
Example CEL: "object.metadata.labels == oldObject.metadata.labels || admissionReview.resourceRoles.labeleditor"
This validating admission rule would need to match CREATE/UPDATE requests for all resources with metadata access.
Cluster administrators register the policy resources into clusters
Pros:
Works for all resource types
Only apiserver needs to be changed (+ potentially a new built in resource)
Cluster administrators can define their own custom fine grained policies, e.g. only principals with specific verb can set an enum to a particular state.
Cons:
Not a built-in feature. CEL for Admission Control is proposed as an extension mechanism. But, admission controllers could be used to get this same effect as a built-in feature.
Users get overly broad authz permissions which are then later constrained by this feature.
Design 2 but do an admission plugin rather than a webhook
Everything is the same as in design 2, but with the first two pros and first con removed.
I’ve written design 2 as merging all limitations, but design 3 as picking the first matching one.
Merging means fewer roles are necessary, and roles are combinable. More bindings are needed, but you don’t have to figure out “which binding” the system used. This requires an additional property from the field selection expression.
First selection is simpler implementation-wise. It requires more roles, and perhaps fewer bindings. Users may have to puzzle over which binding might win if multiple match.
“More roles” means combinatorially more, which would be damning in the general case, however here we probably don’t expect that many common combinations.
I said this was a non-goal because it’s much harder. But it may be desired.
Most obvious implementation is adding a field mask clause to rbac get/list/watch/update verbs, and have the rbac authz plugin add a translation function to the request (e.g., to the context).
I mention this non-goal here because if it becomes a real goal, design 2 is probably excluded, and design 1 gains more cons (each subresource is even more work and is even more different and surprising to users).
SSA FieldSet
This data type was created for server side apply to mark which fields users own. It’s already serialized in the API.
Con: it’s only kind of human readable/editable
CEL expression
This is being used for validation in the near future.
Pro: very flexible & type checked; could be used to describe limitations other than just field modification
Con: not super obvious to me how to express “only fields XYZ can be changed”
Con: may require both old and new objects as input (ugh)
Con: not clear how to (generally) union multiple expressions (only needed if the “merging” path is chosen)
Other options are possible, too.
Even more fine-grained authz options
lavalamp, * 2022-09-28
Previously on this topic: designs 1-5
The complaint with these designs is that they don’t prevent users with regular permissions from modifying these special fields.
permit modification of fields (but not the rest of the object) to people with special permission
restrict modification of fields (but not the rest of the object) to people with regular permission
Note that Design 5 actually probably accomplishes these goals, but is at least 1.5 years away.
“Participant fields” are children of objectmeta. Each gets a special verb in the authz system.
Which fields are special is configured, either by API object (preferred) or flag. Default is new fields added to objectmeta, but admins can also configure any participant field to be special.
For writes with regular update / create permissions: if special fields are modified, check the permission for each one.
For writes without regular update / create permissions, which instead have “partialmeta” update/create permission: if non-participant fields are modified, reject. For each participant field, check that the user has the corresponding permission
Pros:
limited number of fields, limited number of authz checks.
minimal configuration
doesn’t break existing clusters but admins can make it safer by default
addons / controllers which are broken by making things more secure (e.g. marking finalizers special) can generally be fixed by granting them an additional permission - completely within the administrator’s control
We can ship config making new objectmeta fields special by default
Cons:
Doesn’t work for fields in spec or status (i.e. doesn’t work for conditions)
Can’t e.g. restrict to a specific annotation / finalizer
ugly / ad-hoc design
All fields are “participants” now.
Instead of a verb for each participant field, we define an encoding from field path -> permission verb.
Examples: “.objectmeta”, “.objectmeta.finalizers”, “.objectmeta.labels[‘app’]”, “.spec”
As written this is not typesafe: you can’t grant “.spec.replicas” on deployments without also granting it on replicasets. I think this is probably a feature most of the time; otherwise this can be fixed at the cost of an additional authz call.
Special fields must still be defined as in option 6. This definition has a global and a per-type part.
Note: there is no type safety here; if you get it wrong, it just doesn’t work. You can look at the audit logs to see what permissions apiserver is looking for to make sure you use the right words. (maybe?)
For writes with regular update / create permissions: if special fields are modified, check the permission for each one. There are expected to be very few special fields, so this is not many additional checks.
For writes without regular update / create permissions, which instead have “individualfield” update/create permission: For each modified field, check that the user has the corresponding permission. Permissions are checked starting from the root, so that one check can potentially approve many fields, to minimize authz checks.
Pros:
(same as 6) doesn’t break existing clusters but admins can make it safer by default
(same as 6) addons / controllers which are broken by making things more secure (e.g. marking finalizers special) can generally be fixed by granting them an additional permission - completely within the administrator’s control
Works for all fields in all parts of the object.
We can ship config making ANY new field special by default.
Could restrict to a specific condition (.status.conditions[type=”foo”]) – the existing SSA / structured-merge-diff library supports this
Cons:
Still probably can’t do prefix matching for e.g. labels / annotations / finalizers without extra magic (apiserver would have to understand those fields specifically and format the verb specially)
Essentially unbounded number of verbs
Typo’d or misspelled verbs in the config system might be landmines if a corresponding field shows up in another object or something