Sticky agents


Ricardo Cadete

Oct 5, 2019, 4:47:23 AM
to go-cd
Good work on the GoCD project; we have been using it almost since the beginning.

In our setup we have several Go agents running on different servers, and to each agent we assign a heavy, long-running build (many hours per country compilation).
This requires each agent to run all stages of a build on the same server. We could share an NFS disk between these servers, but that causes performance issues; also, access to the local databases is restricted for simplicity.
Everything was fine when we had just one agent, but we wanted to speed up the build process and decided to go multi-agent. So we changed the code to support multiple builds in parallel, but then we found that stages were randomly assigned to other agents, causing incorrect builds.

You could argue "why don't you put all jobs in one stage, that will work". Well, that is also silly, because we want to be able to rerun stages in case of failure without having to recompile the whole country again (we are talking about hours). Sometimes there are external factors such as network problems, tests failing because addresses have changed, the database being disconnected ...

A simple feature such as "sticky agents" could solve our issues. Right now we have to work around this by assigning each country as a resource to a single server; this is a dumb manual process and completely destroys the flexibility of adding more agents on demand.
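
To illustrate the workaround: as far as I understand it, GoCD only hands a job to an agent that has every resource the job asks for. Below is a minimal Python sketch of that matching rule. It is not GoCD code, and the host names, the `country-*` resource names and the `eligible_agents` helper are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    hostname: str                      # hypothetical host name
    resources: frozenset = frozenset()


@dataclass(frozen=True)
class Job:
    name: str
    resources: frozenset = frozenset()


def eligible_agents(job, agents):
    """Hostnames of the agents that have every resource the job requires."""
    return [a.hostname for a in agents if job.resources <= a.resources]


# Assumed setup: one country pinned to each server via a resource.
agents = [
    Agent("build-01", frozenset({"country-pt"})),
    Agent("build-02", frozenset({"country-es"})),
    Agent("build-03"),  # extra on-demand agent without any resources
]

compile_pt = Job("compile", frozenset({"country-pt"}))
test_pt = Job("test", frozenset({"country-pt"}))
untagged = Job("lint")  # a job that requires no resources

print(eligible_agents(compile_pt, agents))
print(eligible_agents(test_pt, agents))
print(eligible_agents(untagged, agents))
```

Every job tagged `country-pt` can only land on `build-01`, so all stages for that country stay on one server. The cost is also visible: `build-03` never picks up country work until someone assigns it a resource by hand.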

What can be done?
Thanks in advance

[Attachments: Screenshot 2019-10-05 at 10.27.13.png, Screenshot 2019-10-05 at 10.27.31.png]

Ricardo Cadete

Oct 16, 2019, 1:08:48 PM
to go-cd
Hi there,

Do you have any feedback for me?

Odin Hørthe Omdal

Jan 17, 2020, 8:37:20 AM
to go-cd
I read in another thread that one way to do this is to produce artifacts at each stage, so that another machine can just download that artifact (or material) and start doing its thing.

Would that work for you? They also mentioned disabling "fetch" of the artifact, which would actually have the effect that each job that was started stays on the same agent.

I was searching for a similar-ish solution (one job swamping all our agents), but we'll just add a resource called "lowprio" and make this job use only the "lowprio" agents (which also take regular tasks).

Odin