Co-installation of different versions of an operator


Marcin Owsiany

Apr 19, 2021, 5:14:28 AM
to Operator Framework
Hi,

Hopefully this is a good place to ask how OLM deals with cluster-scoped resources required by an operator when installing operators in modes other than AllNamespaces.

To be honest, I'm not sure how to ask the question. I read https://github.com/operator-framework/operator-lifecycle-manager/blob/master/doc/design/operatorgroups.md and I have to admit I was not able to grasp the model. It is not even clear to me whether OLM ever allows more than one operator controller deployment per cluster at all. Perhaps a diagram could make this clearer?

To start with, let me ask this: is it possible to have OLM install operator Foo in namespace A at version vA, and the same operator in namespace B at version vB (say, by referring to different operator distribution channels)? If it is possible, then which CRDs (which are cluster-scoped resources) will be installed: those from vA or vB?

regards,

Marcin


Daniel Messer

Apr 19, 2021, 7:08:37 AM
to Marcin Owsiany, Operator Framework, operator-framework-olm-dev
Hi Marcin,

you touch on an important topic. First, the most current docs are here: https://olm.operatorframework.io/docs/advanced-tasks/operator-scoping-with-operatorgroups/ - we should probably start to update references in GitHub.

To your question: OLM currently allows installing Operators scoped to a single namespace or a subset of namespaces, and it also allows doing this multiple times, even with different versions of the Operator. In your example, whichever Operator was installed last, A or B, would overwrite the CRDs. Hence you need to use CRD versioning so that Operator vA is still capable of reading and writing its CRDs, as described here: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/

OperatorGroups are a way to define the scope of an Operator. They set the scope of each Operator installed in the same namespace where the OperatorGroup lives. Scope means: which namespaces the Operator is watching (OLM can manage the WATCH_NAMESPACE env var for that) and in which namespaces the Operator has permissions to watch / read / write / list objects. The individual RBAC is requested by the Operator as part of its metadata, and OLM will scope it depending on what the OperatorGroup says.
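
To illustrate the watching half of that scope from the operator's side, here is a minimal Python sketch. It assumes the common convention that WATCH_NAMESPACE holds a comma-separated list of namespaces and that an empty or missing value means "all namespaces"; the function name is hypothetical and not part of OLM.

```python
import os


def watched_namespaces(env=None):
    """Return the namespaces to watch, or None to mean 'all namespaces'.

    Assumption: WATCH_NAMESPACE is a comma-separated list, and an empty
    or missing value means the operator is cluster-wide.
    """
    env = os.environ if env is None else env
    raw = env.get("WATCH_NAMESPACE", "")
    names = [n.strip() for n in raw.split(",") if n.strip()]
    return names or None


# Scoped to two namespaces versus scoped to the whole cluster.
print(watched_namespaces({"WATCH_NAMESPACE": "team-a,team-b"}))
print(watched_namespaces({"WATCH_NAMESPACE": ""}))
```

An operator written this way can be scoped purely through its environment, without code changes per install mode.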

On top of that, there is logic in OLM that prevents two Operators owning the same APIs from being scoped to the same namespaces. This topic is one of the more advanced aspects of OLM because it effectively creates the illusion of tenancy where in Kubernetes there are only global things (CRDs). I hope this gets you started though.
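
A rough way to picture that overlap check, as a Python sketch. This is a simplified model and not OLM's actual implementation: each install is a hypothetical dict with the APIs it owns and its target namespaces, and an empty target list is assumed to mean "all namespaces".

```python
def namespaces_overlap(targets_a, targets_b):
    """An empty target list stands for 'all namespaces' and overlaps everything."""
    if not targets_a or not targets_b:
        return True
    return bool(set(targets_a) & set(targets_b))


def conflicts(install_a, install_b):
    """Two installs conflict if they own a common API in a common namespace."""
    shared_apis = set(install_a["apis"]) & set(install_b["apis"])
    return bool(shared_apis) and namespaces_overlap(
        install_a["targets"], install_b["targets"]
    )


# Hypothetical installs of the same operator "Foo".
foo_in_a = {"apis": ["foos.example.com"], "targets": ["a"]}
foo_in_b = {"apis": ["foos.example.com"], "targets": ["b"]}
foo_global = {"apis": ["foos.example.com"], "targets": []}

print(conflicts(foo_in_a, foo_in_b))    # disjoint namespaces
print(conflicts(foo_in_a, foo_global))  # global scope covers namespace a
```

Note that passing this check only keeps the controllers out of each other's namespaces; the CRD itself is still shared by both installs.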

Best,
Daniel



--
Daniel Messer

Product Manager Operator Framework & Quay

Red Hat OpenShift

Marcin Owsiany

Apr 20, 2021, 2:22:15 AM
to Rob Cernich, Daniel Messer, Operator Framework, operator-framework-olm-dev


On Mon, 19 Apr 2021 at 23:50, Rob Cernich <rcer...@redhat.com> wrote:
> On Mon, Apr 19, 2021 at 5:08 AM Daniel Messer <dme...@redhat.com> wrote:
>> Hi Marcin,
>>
>> you touch on an important topic. First, the most current docs are here: https://olm.operatorframework.io/docs/advanced-tasks/operator-scoping-with-operatorgroups/ - we should probably start to update references in GitHub.
>>
>> To your question: OLM currently allows installing Operators scoped to a single namespace or a subset of namespaces, and it also allows doing this multiple times, even with different versions of the Operator. In your example, whichever Operator was installed last, A or B, would overwrite the CRDs. Hence you need to use CRD versioning so that Operator vA is still capable of reading and writing its CRDs, as described here: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/
>
> I'm curious about what you'd need to do if conversion was required between resource versions. Would OLM be managing installation of the converter and configuration of the conversion webhook?

Looks like it helps with that if you mention the webhook in the CSV: https://olm.operatorframework.io/docs/advanced-tasks/adding-admission-and-conversion-webhooks/

Marcin 

Marcin Owsiany

Apr 20, 2021, 3:25:32 AM
to Daniel Messer, Operator Framework
On Mon, 19 Apr 2021 at 13:08, Daniel Messer <dme...@redhat.com> wrote:
> Hi Marcin,
>
> you touch on an important topic. First, the most current docs are here: https://olm.operatorframework.io/docs/advanced-tasks/operator-scoping-with-operatorgroups/ - we should probably start to update references in GitHub.

Thanks for this link Daniel, this document is great! I finally feel like I know what's going on :-)
 
> To your question: OLM currently allows installing Operators scoped to a single namespace or a subset of namespaces, and it also allows doing this multiple times, even with different versions of the Operator. In your example, whichever Operator was installed last, A or B, would overwrite the CRDs. Hence you need to use CRD versioning so that Operator vA is still capable of reading and writing its CRDs, as described here: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/

Right, keeping the schemas backward-compatible would prevent information loss.

However, as far as I can tell, an unfortunate sequence of installations (or upgrades!) would break one of the operators.
Say:
- vA is older and its CRD only defines the v1 API, and the vA controller is written to talk to that API, while
- vB is newer, and its CRD defines both the v1 and v2 APIs, and the vB controller is written to talk to the v2 API.

Once vA's CRD is applied (perhaps because vA.1 is released), the v2 API disappears from the API server, and the vB controller starts crashlooping :-/
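
The sequence can be sketched as a toy Python model. It assumes plain last-writer-wins semantics for the shared CRD and deliberately ignores any safeguards OLM or the API server may apply; the CRD name and helper functions are hypothetical.

```python
# Cluster-scoped state shared by every install: CRD name -> served API versions.
cluster_crds = {}


def apply_crd(name, versions):
    """Last writer wins: the applied definition replaces the previous one."""
    cluster_crds[name] = set(versions)


def controller_can_run(name, needed_version):
    """A controller only works if the API version it talks to is served."""
    return needed_version in cluster_crds.get(name, set())


CRD = "foos.example.com"  # hypothetical CRD name

apply_crd(CRD, ["v1"])        # vA installed in namespace A
apply_crd(CRD, ["v1", "v2"])  # vB installed in namespace B
both_ok = controller_can_run(CRD, "v1") and controller_can_run(CRD, "v2")

apply_crd(CRD, ["v1"])        # vA.1 is released and re-applies vA's CRD
vb_ok_after = controller_can_run(CRD, "v2")

print(both_ok, vb_ok_after)  # True False
```

The vA controller is never affected in this model; only the install that depends on the newer API version loses it.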

> OperatorGroups are a way to define the scope of an Operator. They set the scope of each Operator installed in the same namespace where the OperatorGroup lives. Scope means: which namespaces the Operator is watching (OLM can manage the WATCH_NAMESPACE env var for that) and in which namespaces the Operator has permissions to watch / read / write / list objects. The individual RBAC is requested by the Operator as part of its metadata, and OLM will scope it depending on what the OperatorGroup says.
>
> On top of that, there is logic in OLM that prevents two Operators owning the same APIs from being scoped to the same namespaces. This topic is one of the more advanced aspects of OLM because it effectively creates the illusion of tenancy where in Kubernetes there are only global things (CRDs). I hope this gets you started though.

I can see how the ability to install multiple copies of the operator is useful. However, it seems pretty fragile since, as you say, CRDs are global.
Was this ever a problem in practice? Perhaps I worry too much? Are there any established guidelines for operator design that would help mitigate breakage such as the one I described above?

Marcin

Daniel Messer

Apr 20, 2021, 4:51:00 AM
to Marcin Owsiany, Operator Framework
Hi Marcin,

you basically described the reason why OLM is looking to move away from this model where you can simply install an Operator multiple times. It's exactly these kinds of ordering problems that come into play with shared resources. I don't think too many Operators are using CRD versioning with conversion webhooks at the moment, but this problem will definitely become more pressing. The best way to avoid it today is to make your Operator itself global, that is, to declare that it only supports the installMode "AllNamespaces". This will prevent another Operator instance of an older / newer / same version from being installed on the cluster. That, paired with OLM update graph management, should give you predictable migration paths from older to newer CRDs.
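
As a rough Python sketch of what restricting the install modes means: the four mode names are the ones OLM defines, but the mapping from an OperatorGroup's target namespaces to a required mode is simplified here, and the function names are hypothetical.

```python
# Install modes an operator declares; here only AllNamespaces is supported.
INSTALL_MODES = {
    "OwnNamespace": False,
    "SingleNamespace": False,
    "MultiNamespace": False,
    "AllNamespaces": True,
}


def required_mode(operator_namespace, targets):
    """Map an OperatorGroup's target namespaces to the install mode it needs.

    Simplification: an empty target list is treated as 'all namespaces'.
    """
    if not targets:
        return "AllNamespaces"
    if len(targets) > 1:
        return "MultiNamespace"
    if targets[0] == operator_namespace:
        return "OwnNamespace"
    return "SingleNamespace"


def install_allowed(install_modes, operator_namespace, targets):
    """An install is allowed only if the mode it needs is marked supported."""
    return install_modes.get(required_mode(operator_namespace, targets), False)


print(install_allowed(INSTALL_MODES, "operators", []))     # cluster-wide install
print(install_allowed(INSTALL_MODES, "team-a", ["team-a"]))  # namespaced install
```

With only AllNamespaces supported, every namespaced OperatorGroup is rejected, so at most one instance of the operator ends up managing the shared CRDs.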

Marcin Owsiany

Apr 20, 2021, 6:50:46 AM
to Daniel Messer, Operator Framework
Perfect, this confirms my feeling that it makes most sense to only target AllNamespaces.
Many thanks!