Controlling job flow to push based CICD systems.


HUSSEIN KADIRI

May 26, 2021, 11:56:26 AM
to go-cd
We have a push-based CICD workflow in which an external system triggers CICD workflows.

Sometimes the CICD server is temporarily offline (for a reboot, upgrade or whatever reason). Other times the CICD server is unable to process the volume of jobs the external system is sending to it.

Having a buffer (e.g., a queue) in front of the GoCD server would be beneficial because:
- Jobs can still be queued up and resumed when the server is back up.
- Job flow to the GoCD server can be controlled, letting it recover at a pace it can handle.

Does this buffer implementation exist in GoCD? Or is it something we'll have to develop outside?
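
For illustration, the kind of external buffer described above could be sketched roughly as follows. This is a minimal in-memory sketch, not a GoCD feature: `TriggerBuffer`, `send`, and `max_per_drain` are hypothetical names, and `send` stands in for whatever call actually delivers a trigger to the CICD server. A real deployment would more likely use a durable message queue.

```python
import collections
from typing import Callable, Deque


class TriggerBuffer:
    """Holds trigger requests and forwards them at a bounded rate."""

    def __init__(self, send: Callable[[dict], bool], max_per_drain: int = 5):
        # `send` is a hypothetical callable: it delivers one trigger to the
        # CICD server and returns True on success, False if the server is down.
        self._send = send
        self._max_per_drain = max_per_drain
        self._pending: Deque[dict] = collections.deque()

    def enqueue(self, trigger: dict) -> None:
        # Always accept new triggers, even while the server is offline.
        self._pending.append(trigger)

    def drain_once(self) -> int:
        """Forward up to max_per_drain triggers; stop early on failure."""
        sent = 0
        while self._pending and sent < self._max_per_drain:
            trigger = self._pending[0]
            if not self._send(trigger):
                break  # server unavailable: keep the trigger for the next drain
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Calling `drain_once()` on a timer gives the server at most `max_per_drain` triggers per interval, and anything that fails to send stays queued in its original order.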
 

Prakash K

May 26, 2021, 5:52:53 PM
to go...@googlegroups.com
We have a maintenance mode in GoCD which I believe will work for you in this case. Have you tried it?

--
You received this message because you are subscribed to the Google Groups "go-cd" group.
To unsubscribe from this group and stop receiving emails from it, send an email to go-cd+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/go-cd/6564a7d3-98ae-4ae5-b9a8-83fbc9fc927fn%40googlegroups.com.

HUSSEIN KADIRI

May 26, 2021, 8:26:28 PM
to go-cd
I haven't given it a try. I understand how that process works with maintenance. My question is about when the server is unresponsive (for whatever reason), so we're unable to put it in maintenance mode via the normal means. Fixing the issue might involve rebooting (for physical/VM servers) or recreating the server pod (for the Kubernetes use case).

What's the behavior when the new server comes online?  Does the new server pick up from where the old one left off?

HUSSEIN KADIRI

May 26, 2021, 8:35:00 PM
to go...@googlegroups.com
I realized I responded to the wrong message.
Yes, I'm aware of the maintenance mode feature in general. I'm not sure how it works specifically in GoCD. Oftentimes, in other tools, disabling maintenance mode after it's been enabled for a long time leads to a flood of messages the server has to handle. Sometimes it can't handle that much load and it craps out. Being able to control the flow while disabling maintenance would be great. Or is this already handled internally? Again, I haven't looked into how GoCD does this. Maybe this is a non-issue.
Let me know.

Thanks 



--
Hussein Kadiri

Marques Lee

May 26, 2021, 11:27:02 PM
to go...@googlegroups.com
> Does the new server pick up from where the old one left off?

Kind of. If a stage is in progress with 3 of 5 jobs finished and 2 still running, and the server dies and you replace it with a new instance, it should start running the 2 jobs that never finished. That's not to say they pick up exactly where they left off. For example, if one of those jobs was halfway through running a Gradle task, it won't magically continue and finish the rest of that task; that job is basically restarted when the server comes back up, quite possibly on a different agent. But you wouldn't need to rerun the other jobs that had previously finished in that stage run.
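
As a rough illustration of the behavior described above, the set of jobs that would run again is just the set that had not finished. The job names and state labels below are made up for the example and are not GoCD's internal state names.

```python
from typing import Dict, List


def jobs_to_rerun(stage_jobs: Dict[str, str]) -> List[str]:
    """Return the jobs that had not finished when the server went away."""
    # State labels are illustrative only, not GoCD's internal names.
    return [name for name, state in stage_jobs.items() if state != "completed"]


# Hypothetical stage: 3 of 5 jobs finished, 2 were still running.
stage = {
    "unit": "completed",
    "lint": "completed",
    "package": "completed",
    "integration": "running",
    "e2e": "running",
}
print(jobs_to_rerun(stage))  # ['integration', 'e2e']
```

Each job in the returned list starts again from its beginning; the three completed jobs are left alone.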

HUSSEIN KADIRI

May 27, 2021, 1:04:38 AM
to go...@googlegroups.com
Got it. As usual thanks for your helpful answers Marques.


Marques Lee

May 27, 2021, 1:07:21 AM
to go...@googlegroups.com

HUSSEIN KADIRI

May 27, 2021, 1:09:58 AM
to go...@googlegroups.com
Would you happen to know the answer to this question?
Oftentimes, in other tools, disabling maintenance mode after it's been enabled for a long time leads to a flood of messages the server has to handle. Sometimes the server can't handle that much load and it craps out. Being able to control the flow while disabling maintenance would be great. Is that something that can be done in GoCD?

Marques Lee

May 27, 2021, 11:24:35 PM
to go...@googlegroups.com
Hmm, I actually don’t know the exact answer. Probably @arvindsv, @maheshp, or @ketan might know off the top of their heads. They’ve all been on the codebase far longer than I have. I would guess that the answer to the question “will GoCD throttle messaging for me” is “no”, but I defer to them.

What I do know is this: if your webhooks are firing off and the messages build up in the queue, the MaterialUpdateService won’t necessarily hammer the git repos. If messages are consumed while the material service is updating, it will log a message that an update is already in progress and do nothing. So that will help in that scenario, at least partly; 100 commits != 100 `git fetch` operations, especially if they come all at once. Quite possibly they’ll be effectively reduced to a single fetch operation per repo.

Marques Lee

May 27, 2021, 11:28:17 PM
to go...@googlegroups.com
To be extra explicit/clear, what I believe to be true is, per material (i.e., repo):

100 webhook POSTs != 100 git fetch

Therefore:

100 webhook POSTs != 100 pipeline runs
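
The coalescing idea above can be sketched as a simple "already in progress" guard per repo. This is only a toy model of the behavior as described in this thread, not GoCD's actual MaterialUpdateService code; `CoalescingUpdater` and its method names are hypothetical.

```python
from typing import Dict, Set


class CoalescingUpdater:
    """Toy model: skip a notification if that repo is already updating."""

    def __init__(self) -> None:
        self.in_progress: Set[str] = set()
        self.fetch_count: Dict[str, int] = {}
        self.skipped = 0

    def on_webhook(self, repo: str) -> bool:
        """Start an update unless one is already running for this repo."""
        if repo in self.in_progress:
            self.skipped += 1  # "update already in progress": do nothing
            return False
        self.in_progress.add(repo)
        # Stand-in for the real fetch; here we only count how many would run.
        self.fetch_count[repo] = self.fetch_count.get(repo, 0) + 1
        return True

    def on_update_finished(self, repo: str) -> None:
        # Once the update completes, the next notification may fetch again.
        self.in_progress.discard(repo)


updater = CoalescingUpdater()
for _ in range(100):
    updater.on_webhook("repo-a")  # all arrive before the first update ends
print(updater.fetch_count["repo-a"], updater.skipped)  # 1 99
```

In this model, 100 notifications that arrive during a single update produce one fetch for that repo. Notifications spread out over time would each start their own fetch.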