Hello
We have clusters that consist of geo-distributed pairs of machines, where each pair sits behind a load balancer.
When we perform updates, we remove a machine from the load balancer, update it, and once the update has finished we add it back to the load balancer.
We must always have at least one machine per load balancer active to prevent any outages.
The main problems we face are:
We want to run a multi-threaded release within Rundeck that never updates all machines behind one load balancer at once.
We want multiple projects to be able to release at the same time without both trying to update the same machine at once, or between them removing all machines from one load balancer.
Our idea is to add a new option on the edit job page that allows the selection of an orchestrator.
This orchestrator would be queried whenever an execution is about to start processing its list of nodes.
The execution would pass this list to the orchestrator, which would either return a machine that can be processed or block until one is available.
For multi-threaded jobs, the execution would call the orchestrator multiple times.
Once the execution on a node has finished, the orchestrator would need to be informed that the node is available again.
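To make the contract above more concrete, here is a rough sketch in Python. It is only a model of the behaviour we have in mind, not Rundeck code: the class name `Orchestrator`, the methods `next_node` and `return_node`, and the node-to-load-balancer mapping are all hypothetical names chosen for illustration.

```python
import threading


class Orchestrator:
    """Hands out nodes so that every load balancer keeps at least one machine in rotation."""

    def __init__(self, lb_of):
        # lb_of maps node name -> load balancer name (hypothetical inventory data).
        self._lb_of = dict(lb_of)
        self._size = {}
        for lb in self._lb_of.values():
            self._size[lb] = self._size.get(lb, 0) + 1
        self._busy = {}            # load balancer -> nodes currently being updated
        self._in_progress = set()  # nodes currently handed out to some execution
        self._cond = threading.Condition()

    def _pick(self, candidates):
        # Return the first candidate that is not already being processed and
        # whose load balancer would still have one machine left, else None.
        for node in candidates:
            lb = self._lb_of[node]
            if node in self._in_progress:
                continue
            if self._busy.get(lb, 0) < self._size[lb] - 1:
                return node
        return None

    def next_node(self, candidates, timeout=None):
        """Block until a candidate may be processed and return it.

        Returns None if the timeout expires before any node becomes available.
        """
        with self._cond:
            node = self._cond.wait_for(lambda: self._pick(candidates), timeout)
            if not node:
                return None
            self._in_progress.add(node)
            lb = self._lb_of[node]
            self._busy[lb] = self._busy.get(lb, 0) + 1
            return node

    def return_node(self, node):
        """Tell the orchestrator that the execution on this node has finished."""
        with self._cond:
            self._in_progress.discard(node)
            self._busy[self._lb_of[node]] -= 1
            # Wake up any executions blocked in next_node().
            self._cond.notify_all()
```

If a single instance of something like this were shared between projects, two releases could not pick the same machine or empty a load balancer between them. How that state would actually be shared inside Rundeck is one of the things we would like feedback on.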
We would like to implement this idea as a plugin, allowing the greatest flexibility.
For example, some generic orchestrators might be:
Never update every machine at once regardless of how many threads are configured.
Only update a maximum of 33% of machines at once regardless of how many threads are configured.
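As a rough illustration of these two policies, the limit on concurrent nodes could be worked out as below. This is a sketch under our own assumptions, not existing Rundeck behaviour: the function names are hypothetical, percentages are rounded down, and we assume at least one machine is always allowed so that a release can make progress.

```python
def max_concurrent(total_nodes, percent=None):
    """Upper bound on nodes that may be updated at the same time.

    percent=None models "never update every machine at once";
    percent=33 models "only update a maximum of 33% of machines at once".
    """
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    if percent is None:
        # Always leave one machine untouched.
        return total_nodes - 1
    # Integer arithmetic rounds down; allowing at least 1 is our assumption.
    return max(1, total_nodes * percent // 100)


def effective_threads(thread_count, total_nodes, percent=None):
    """The configured thread count, capped by the orchestrator's limit."""
    return min(thread_count, max_concurrent(total_nodes, percent))
```

For instance, a job configured with 10 threads against 6 nodes would be capped at 5 concurrent nodes under the first policy, and a job with 4 threads against 12 nodes would be capped at 3 under the 33% policy.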
We would like to have a go at implementing this feature ourselves and contribute it back to the Rundeck project.
We are after some feedback from the Rundeck community to make sure we build something that is useful and welcome.
We would also appreciate some general pointers on where to start with the implementation, for example roughly which parts of the code base we should focus on.
Cheers,