I'm developing an operator with the Operator SDK that manages multiple custom resources. Some of its tasks are time-consuming, especially when several clusters are involved. The problem is that the operator processes clusters one at a time during its reconciliation cycle. This becomes a significant bottleneck during periods of high activity, such as upgrades or resize operations: the operator takes a long time to work through them, which hurts overall system performance and responsiveness.
I'm wondering if there's a way to parallelize the reconciliation process within an operator. If so, what are the recommended best practices or patterns to achieve this while working within the constraints of the Operator SDK?
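To make the question concrete, here is a rough sketch of the kind of bounded fan-out I have in mind. It is plain Go using only the standard library, not Operator SDK code: `reconcileCluster`, `reconcileAll`, the cluster names, and the concurrency limit of 2 are all hypothetical placeholders, and the sleep merely stands in for the slow per-cluster work.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// reconcileCluster is a hypothetical stand-in for the slow per-cluster work
// (upgrade, resize, ...). It is not part of the Operator SDK.
func reconcileCluster(name string) error {
	time.Sleep(10 * time.Millisecond) // simulate a long-running task
	if name == "" {
		return fmt.Errorf("empty cluster name")
	}
	return nil
}

// reconcileAll calls fn once per cluster, with at most limit calls running
// at the same time, and returns the error (or nil) for each cluster name.
func reconcileAll(clusters []string, limit int, fn func(string) error) map[string]error {
	if limit < 1 {
		limit = 1
	}
	results := make(map[string]error, len(clusters))
	var mu sync.Mutex                 // guards results
	var wg sync.WaitGroup             // waits for all workers to finish
	sem := make(chan struct{}, limit) // counting semaphore bounding concurrency

	for _, c := range clusters {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit workers are already running
		go func(name string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			err := fn(name)
			mu.Lock()
			results[name] = err
			mu.Unlock()
		}(c)
	}
	wg.Wait()
	return results
}

func main() {
	clusters := []string{"cluster-a", "cluster-b", "cluster-c", "cluster-d"}
	for name, err := range reconcileAll(clusters, 2, reconcileCluster) {
		fmt.Printf("%s: err=%v\n", name, err)
	}
}
```

What I don't know is whether doing something like this inside a single reconcile call is considered acceptable. controller-runtime's `controller.Options` does have a `MaxConcurrentReconciles` field, but as far as I understand it runs reconciles for different objects in parallel rather than splitting up the work inside one reconcile, so I'm not sure it covers my case.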
Any insights or advice would be greatly appreciated. Thanks in advance!
Cheers,
Vic