Running a pipeline multiple times in parallel is not possible?


eran....@riskified.com

Sep 28, 2016, 6:18:10 AM
to go-cd, Haim Ashkenazi
One of the key features that made us decide to migrate to GoCD several months ago was the idea that pipelines can run in parallel and that more than one instance of a pipeline can be started.
Only now are we starting to realise that the same pipeline cannot actually be executed multiple times in parallel, since each stage always runs sequentially, even when it is the same stage in different pipeline instances. I'm stating this based only on this post, which is the only mention of this issue that I've been able to find.

So first off, I'd like to verify whether this is correct. Is it not possible to run several instances of the same pipeline concurrently (in parallel) without one instance being constrained by another (i.e. so that a second instance of the same pipeline can complete even if a previously started instance has not)?

If this is the case, I'd really appreciate any help/ideas on overcoming this limitation in some way.
The pipeline I'm working on has only one stage, which simply runs a Docker container and then deletes the container and image. The container does some work on our machine learning models. There is no problem running several containers from this image at the same time, and that's exactly what I'd like to do, i.e. run another container each time the pipeline is triggered (we're using the API to trigger it).
I'd like to see the output created by each container and, of course, see whether each pipeline run has finished successfully or failed. The order of execution and even the material version are not relevant; each instance of the pipeline/container has its own job to do.

Finally, I'd like to ask whether you feel that a feature enabling the same stage to run concurrently in different pipeline instances is feasible, and whether a request for it has a chance of being accepted (I will, of course, contribute anything I can within my technical skills).
I'm sure that there are many use cases that would benefit from such a feature.

Thanks





Zabil C M

Oct 3, 2016, 2:00:08 AM
to go...@googlegroups.com, Haim Ashkenazi
There's an issue logged for this here 

In short, you can't do this at the moment, but we are willing to help out with and merge this feature if someone picks it up.

--
You received this message because you are subscribed to the Google Groups "go-cd" group.
To unsubscribe from this group and stop receiving emails from it, send an email to go-cd+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mario Giammarco

Mar 1, 2022, 9:38:09 AM
to go-cd
Hi,
I am resurrecting this thread.
I have one pipeline and several agents. I need to run the pipeline multiple times in parallel with different parameters.
This seems to me a common and simple requirement.
From reading this thread, the bug description and other threads, it seems it is not possible.
I am really surprised by that.
Is it really not possible?
Is there a workaround?
Thanks,
Mario


Mario Giammarco

Mar 1, 2022, 9:43:18 AM
to go-cd
Sorry, I need to change environment variables, not parameters.

Sriram Narayanan

Mar 1, 2022, 10:21:57 AM
to go...@googlegroups.com
On Tue, 1 Mar 2022 at 10:43 PM, Mario Giammarco <mgiam...@gmail.com> wrote:
> Sorry, I need to change environment variables, not parameters.

When I’ve needed to do this, I’ve loaded the required values into environment variables from a file (e.g. with the “source” command in a shell script). The file is then delivered as a version-controlled file.

You could then trigger the pipeline via the API if you want, where the parameter could point to the name of the environment file that the pipeline’s scripts must use.
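
A minimal Python sketch of the API trigger described above, assuming GoCD's pipeline scheduling endpoint (`POST /go/api/pipelines/<name>/schedule`) with per-run environment variables. The server URL, pipeline name, token and the `ENV_FILE` variable are hypothetical, and the `Accept` version header is an assumption that should be checked against the API documentation for your server version. The request is only built, not sent.

```python
import json
import urllib.request


def build_schedule_request(server_url, pipeline_name, env_vars, token):
    """Build (but do not send) a request that schedules one pipeline run."""
    body = {
        # One entry per variable to override for this run only.
        "environment_variables": [
            {"name": name, "value": value, "secure": False}
            for name, value in env_vars.items()
        ]
    }
    return urllib.request.Request(
        url=f"{server_url}/go/api/pipelines/{pipeline_name}/schedule",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: API version v1 for the scheduling endpoint.
            "Accept": "application/vnd.go.cd.v1+json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical values: the pipeline's scripts would read ENV_FILE and source that file.
request = build_schedule_request(
    "https://gocd.example.com",
    "integration-tests",
    {"ENV_FILE": "config/test-run-a.env"},
    token="hypothetical-token",
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Passing only the file name keeps the actual values in version control, as suggested above, while the trigger chooses which set to load.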

— Ram

Chad Wilson

Mar 1, 2022, 10:26:54 AM
to go...@googlegroups.com
Hi Mario

That GitHub ticket is specifically about the case where you want to build in parallel based on different material versions, i.e. inputs to the pipeline (e.g. a developer pushes to a git material 3x, say 30s apart: can all 3 builds start in parallel?).

If you are not trying to build based on different material versions in parallel and instead want to vary other aspects (env vars, environments, parameters etc.) for the same material versions, there are different ways to approach parallel runs in GoCD, even without running the same pipeline in parallel as in that ticket:
  • separate jobs within the same stage of a single pipeline that override env vars differently at job scope, but execute the same tasks in each job
  • separate pipelines entirely, that have the same material inputs/triggers, but have different configuration for the stage/job/tasks
The best way to model it depends on what is inside your existing pipeline (multiple stages? multiple jobs within stages?) and also what you want to do as a result of completion:
  • trigger a downstream pipeline only once all 'n' parallel runs complete? - jobs might work.
  • trigger another pipeline/stage as soon as any single one of the parallel jobs completes and then trigger again for each parallel run? - pipelines might be a better approach.
You'd probably need to share more context on how you want things to behave after your parallel runs and the content of your existing pipeline to suggest the best approach.
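
As a rough illustration of the first option (several jobs in one stage that run the same tasks but override an environment variable differently), here is a Python sketch that generates such job definitions. The structure loosely follows the shape of a pipelines-as-code JSON definition, but the field names, the job names, the `TEST_SUITE` variable and the `run-tests` command are assumptions for illustration and should be checked against the config format you actually use.

```python
import json


def build_parallel_jobs(task, env_name, values):
    """Return one job per value; every job runs the same task."""
    return [
        {
            "name": f"test-{index}",
            # The only difference between jobs is this job-scoped variable.
            "environment_variables": [{"name": env_name, "value": value}],
            "tasks": [task],
        }
        for index, value in enumerate(values, start=1)
    ]


# Hypothetical task shared by all jobs.
shared_task = {"type": "exec", "command": "run-tests"}

stage = {
    "name": "integration",
    "jobs": build_parallel_jobs(shared_task, "TEST_SUITE", ["smoke", "api", "ui"]),
}
print(json.dumps(stage, indent=2))
```

Because the jobs sit in the same stage, a downstream stage or pipeline would start only after all of them have finished.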

There is some config overhead/duplication, but you can generally mitigate duplication using either
-Chad

Mario Giammarco

Mar 1, 2022, 5:39:24 PM
to go-cd
I need to build parallel integration testing.
I start from the same source (same git repository, same tag).
I have several PCs (each one with an agent running, NOT as a service). Each PC must compile and launch an .exe with a command-line parameter (which I take from the environment) that is different for each test.
So I planned to use a pipeline: I start it via the API many times, then on each PC the pipeline compiles and launches the .exe, which connects to the server, downloads a test, runs the test and stops.
This way it is very simple: many tests are launched, they fill the agents, and I can scale horizontally.
So please tell me the right approach to do this.
Thanks,
Mario

Chad Wilson

Mar 2, 2022, 5:53:05 AM
to go...@googlegroups.com
It sounds like having multiple jobs with the same tasks in a single stage and single pipeline might do what you need. This is what is done within GoCD itself to achieve parallel integration testing.

If you don't want to have multiple job definitions and manually vary the environment variables between jobs, you can consider whether the job option "Run multiple instances" is suitable for your case.

When you use this option and a stage is triggered, it will start the same job definition in parallel 'n' times with GO_JOB_RUN_COUNT=n and varying GO_JOB_RUN_INDEX=1..n. You can then change your scripting to partition the tests deterministically into buckets depending on the number of jobs in the _COUNT and select the partition based on the index. It's a bit more advanced, but worth considering if you are just trying to split a chunk of work over 'n' agents.
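
The partitioning idea above could be sketched in Python as follows. Only `GO_JOB_RUN_INDEX` and `GO_JOB_RUN_COUNT` come from the description above; the test names and the round-robin bucketing are assumptions chosen for illustration, and any deterministic scheme would do.

```python
import os


def select_partition(tests, run_index, run_count):
    """Return the tests for one job instance; run_index is 1-based."""
    ordered = sorted(tests)  # identical order on every agent keeps buckets stable
    return [
        test for position, test in enumerate(ordered)
        if position % run_count == run_index - 1
    ]


# Default to a single bucket when run outside GoCD.
run_index = int(os.environ.get("GO_JOB_RUN_INDEX", "1"))
run_count = int(os.environ.get("GO_JOB_RUN_COUNT", "1"))

# Hypothetical test names; a real script would discover them.
all_tests = ["test_login", "test_checkout", "test_search", "test_upload", "test_export"]

for name in select_partition(all_tests, run_index, run_count):
    print(name)
```

Every test lands in exactly one bucket, so the 'n' job instances together cover the whole list without coordinating with each other.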

-Chad

Mario Giammarco

Mar 2, 2022, 9:03:56 AM
to go-cd
I have read your suggestions, but I think my case is different.
The server-side test pipelines are run by Azure DevOps. When a backend test pipeline is ready (has launched a server), it calls the GoCD pipeline via the API, passing the IP address of the server to test in an environment variable. The GoCD pipeline should then "find" a free agent, and compile and run the test client, which will call the test server by IP. If there are no free agents, we wait for one.

Mario Giammarco

Mar 2, 2022, 9:28:15 AM
to go-cd
On Tuesday, 1 March 2022 at 16:26:54 UTC+1, Chad Wilson wrote:

  • separate jobs within the same stage of a single pipeline that override env vars differently at job scope, but execute the same tasks in each job
  • separate pipelines entirely, that have the same material inputs/triggers, but have different configuration for the stage/job/tasks


Regarding this, can you confirm that I can use the same material with different pipelines? I have tried ONLY with the web GUI, and it seems to ignore the existing material configuration when creating a new pipeline.

Mario Giammarco

Mar 2, 2022, 9:38:54 AM
to go-cd
If I create several pipelines from a template, can I run them on multiple agents?

Mario Giammarco

Mar 2, 2022, 11:57:29 AM
to go-cd
I would like to simplify the problem description even more. Basically, I have X agents. I need to call a "manager" that tells me "there is a free agent". Then I tell the "manager" to run some software on that agent. If there are no more free agents, I wait.

Ashwanth Kumar

Mar 2, 2022, 7:43:01 PM
to go...@googlegroups.com
> I would like to simplify even more problem description. Basically I have X agents. I need to call a "manager" that tells me "there is a free agent". Then I tell the "manager" to run a software on that agent. If there are no more free agents I wait.

This is exactly how GoCD works: the Manager (the GoCD Server) has X agents configured, and for each job (pipeline / stage / job) that you trigger, the Manager waits until it finds a free agent and executes the job there. What to execute is specified as tasks within the pipeline / stage / job spec.




--

Ashwanth Kumar / ashwanthkumar.in

Mario Giammarco

Mar 3, 2022, 3:45:02 AM
to go-cd
Perfect!
But if I launch the same pipeline twice, the second run waits even if there are free agents, correct?

Ashwanth Kumar

Mar 3, 2022, 6:06:17 AM
to go...@googlegroups.com
Unfortunately, as you've already discovered, GoCD locks at the pipeline-stage level today. So if you don't have the pipeline locked, you can run the same pipeline in parallel, where the parallelism is defined by the number of stages in the pipeline. If you have locking enabled, or just a single stage, you can only run one instance of the entire pipeline at a time. AFAIK you can't even schedule a new pipeline run until the current one is finished; the alternative is a custom SCM plugin that can sequence the commits (or any other parameter) and run the pipeline one after the other automatically for you.

Long story short, if you're very particular about parallel builds, you should be able to create a new pipeline from a template via the API, run the build for a particular version (which is easy if you're using Git, for example) and delete the pipeline later on (as part of a housekeeping activity). The biggest problem I can think of is that you will lose the build history and console.log when you delete it. If you're too invested in GoCD and want to go down the custom SCM route, or towards a separate tool, I'm happy to connect offline and discuss this further.

Mario Giammarco

Mar 4, 2022, 3:11:27 AM
to go-cd
At the moment I need to do a quick proof of concept, hoping to move the project from Azure DevOps.
I have no problem with creating a pipeline and deleting it.
The problem is, for example, that I have built a template and would like to create a pipeline from it, but I see that "create from template" is missing from the API as well.
So you can "extract a template" but not create a pipeline from one. So what do you do with templates?
Thanks again,
Mario

Ashwanth Kumar

Mar 4, 2022, 3:33:18 AM
to go...@googlegroups.com
A template is a set of stage / job combinations that can be created once and re-used across multiple pipelines. Think of a case where you have a deploy pipeline called "Deploy-A-Staging" and need another deploy pipeline for the production push, say "Deploy-A-Production". You want Deploy-A-Staging and Deploy-A-Production to have the same set of steps, and you control the differences via environment variables / parameters, depending on how you've set up your pipeline. Before we had the Pipelines as Code (PAAC) feature, templates were the way to get this done faster, since the entire manipulation was mostly done via the UI. Today you can achieve the same effect via PAAC out of the box, but it's very specific to each config format plugin and not available in GoCD as a first-class entity.

A few points to remember when working with templates:
- Templates can have implicit references to other pipelines, for example as part of "Fetch Artifact" tasks.
- A template can assume the presence of a particular material (like a Git material), because a "Custom Command" task can contain anything.
- A template can also refer to parameters in its task definitions. This is why you can choose a template for a new pipeline only if all the required dependencies are present; otherwise pipeline save validations might fail.

These are probably some of the reasons why Templates are always extracted from a valid pipeline rather than be created from scratch.

Now, coming to creating a pipeline from a template: in this API example in the page, assuming you have a template called "Template-Name-For-Deploy-A", instead of passing "stages: [...]" you can try passing the "stages: null" and "template: 'Template-Name-For-Deploy-A'" properties. I know the documentation for this API isn't very beginner-friendly; we appreciate any type of contribution to improve it.
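
A hedged Python sketch of the suggestion above: build a "create pipeline" request whose body sets `template` and leaves `stages` as null. It assumes the pipeline config endpoint is `POST /go/api/admin/pipelines`; the `Accept` version, server URL, token, group, pipeline name and Git URL are all assumptions or hypothetical values to adjust for your server. The request is only built, not sent.

```python
import json
import urllib.request

# Assumption: the pipeline config API version differs between GoCD releases.
API_VERSION = "v11"


def build_create_from_template_request(server_url, token, group,
                                       pipeline_name, template_name, git_url):
    """Build (but do not send) a request creating a pipeline from a template."""
    body = {
        "group": group,
        "pipeline": {
            "name": pipeline_name,
            "template": template_name,
            "stages": None,  # serialised as JSON null: stages come from the template
            "materials": [
                {"type": "git", "attributes": {"url": git_url, "branch": "master"}}
            ],
        },
    }
    return urllib.request.Request(
        url=f"{server_url}/go/api/admin/pipelines",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": f"application/vnd.go.cd.{API_VERSION}+json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical values throughout.
create_request = build_create_from_template_request(
    "https://gocd.example.com", "hypothetical-token", "deployments",
    "Deploy-A-Run-1", "Template-Name-For-Deploy-A",
    "https://git.example.com/deploy-a.git",
)
print(json.dumps(json.loads(create_request.data), indent=2))
```

A throwaway pipeline created this way could later be removed again as part of housekeeping, with the loss of build history mentioned earlier in the thread.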

Thanks,


