Good work on the Go CD project; we have been using it almost since the beginning.
In our setup we have several go-agents running on different servers. To each agent we assign a heavy, long-running build (each country compilation takes many hours).
This requires each agent to run all stages on the same server. We could share an NFS disk between these servers, but that causes performance issues, and access to the local databases is restricted to keep things simple.
Everything was fine when we had just one agent, but we wanted to speed up the build process and decided to go multi-agent. We changed our code to support multiple builds in parallel, but then found that the stages were randomly assigned to other agents, causing incorrect builds.
You could argue "why don't you put all the jobs in one stage? That will work." That would be silly too, because we want to be able to rerun a stage after a failure without having to recompile the whole country again (we are talking about hours). Sometimes failures come from external factors such as network problems, tests failing because addresses have changed, a disconnected database ...
A simple feature such as "sticky agents" could solve our issue. Right now we have to work around it by assigning each country as a resource to a single server; this is a dumb manual process and completely destroys the flexibility of adding more agents on demand.
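To make the request concrete, here is a rough sketch of the behaviour we have in mind. It is not Go CD code and does not use any Go CD API; the class `StickyScheduler`, its methods, and the agent and run names are all made up for illustration. The idea is only that the first stage of a run picks a free agent, and every later stage (or rerun) of that same run goes back to that agent.

```python
from typing import Dict, List, Optional


class StickyScheduler:
    """Hypothetical scheduler: pins every stage of a pipeline run to one agent."""

    def __init__(self, agents: List[str]) -> None:
        self.idle = list(agents)          # agents with no run pinned to them
        self.pinned: Dict[str, str] = {}  # run id -> agent name

    def assign(self, run_id: str) -> Optional[str]:
        """Return the agent for this run, pinning an idle one on first use."""
        if run_id in self.pinned:
            # Later stages and reruns go back to the same agent.
            return self.pinned[run_id]
        if not self.idle:
            return None  # no free agent; the run has to wait
        agent = self.idle.pop(0)
        self.pinned[run_id] = agent
        return agent

    def release(self, run_id: str) -> None:
        """Unpin the run once all of its stages are finished."""
        agent = self.pinned.pop(run_id, None)
        if agent is not None:
            self.idle.append(agent)
```

With something like this, adding an agent would just mean adding one more name to the pool, instead of editing the resources of each country by hand.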
What can be done?
Thanks in advance

