[Feature Request] run-x-instances of job


srinivas upadhya

Aug 1, 2014, 1:14:41 AM
to go...@googlegroups.com
I'm planning to implement a "run-x-instances" of a job feature (#396). I have implemented a "spike" for it here. You should be able to build installers and test the feature by setting the attribute "instanceCount" to some number, say 10, for a job config (through the config XML) and see 10 instances of the job spawned when the stage is scheduled. The re-run behavior should be similar to runOnAll agents.

IMO this is the first step towards support for "test parallelization".

I wanted to know what others in the community think about it. Is anyone interested in the feature? Any opinions on how it should behave and what the UX should be? Better still, does anyone want to get involved in implementing it? It's a good chance to get familiar with the code!

Marius Ciotlos

Aug 3, 2014, 8:23:03 AM
to go...@googlegroups.com
Hi, 

I'm a bit uncertain how this would work, as so far I've seen just two strategies for running tests in parallel on my side:
- Group the tests, and trigger jobs with a group of tests assigned to each. This way you can segment your tests and run them in parallel.
- Front-end tests usually require a lot of browser interaction, and we've used a "grid" to split them across the available browser instances. This way you have 1 job that sends everything to a grid that manages the parallel tasks.

I assume that if multiple jobs are spawned from the same task, some sort of parameters will need to be passed to each job so it knows which of the remaining tests to process. Is this correct, or how will this work?

At the moment, after setting up many pipelines, the only use we've found for the run-on-all-agents checkbox is to update all agents with something by running a command on each agent, without having to manually specify the agent name or anything else.

I would be very interested in how tests will be split, as this way I could offer feedback and maybe even improve our current test parallelisation view.

Marius

srinivas upadhya

Aug 3, 2014, 8:41:59 AM
to go...@googlegroups.com
"I assume that if multiple jobs are spawned from the same task, some sort of parameters will need to be passed to each job so it knows which of the remaining tests to process."
- The run-x-instances feature provides the ability to spawn multiple instances of a job. We could provide environment variables for the total instance count and the job instance index, for consumption by the task.

"I would be very interested in how tests will be split, as this way I could offer feedback and maybe even improve our current test parallelisation view."
- This feature does not solve the test parallelization problem right away. The spawned jobs could run tests in parallel; how you run them is still decided by your tasks.
> One approach is to do what Test Load Balancer (TLB) does. We currently use TLB to run tests in parallel for Go. It basically prunes the list of tests based on the instance index and the total number of instances. For example, say there are 100 tests and 10 jobs: TLB, on the agent side, divides the tests for execution. The division can be count based (10 tests per agent) or time based (using timings stored on the server side). After division, the tests can be run in failed-first order (again, the data is stored on the server side).
> The other approach that the core team has been talking about is for the agent to run a given test and then request a new test to run from the server.
In both approaches we would need the ability to spawn a job into the required number of instances. Currently we have 'x' copies of the same job in the config. With this feature we could make it just 1 copy.

"At the moment, after setting up many pipelines, the only use we've found for the run-on-all-agents checkbox is to update all agents with something by running a command on each agent, without having to manually specify the agent name or anything else."
- This feature could also be used for parallel deployments. If there are 10 servers to deploy to and they should all be deployed together, you would have 10 agents and make one job runOnAll across the 10 agents at once.
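
To make the time-based division mentioned above a bit more concrete, here is a minimal sketch of one plausible way to spread tests across partitions using recorded durations. It is a simple greedy heuristic (longest test first, always onto the least-loaded partition) and is not claimed to be what TLB actually does. The function name `balance_by_time` and the "name seconds" input format are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Sketch only: greedy time-based balancing, NOT TLB's real algorithm.
# balance_by_time TOTAL
#   stdin : one test per line as "name seconds" (hypothetical format)
#   stdout: "partition name" lines, partitions numbered from 1
balance_by_time() {
  local total="$1"
  # Longest tests first; ties broken by name so the result is deterministic.
  LC_ALL=C sort -k2,2nr -k1,1 | awk -v n="$total" '
    {
      # Find the partition with the smallest accumulated time so far.
      best = 1
      for (p = 2; p <= n; p++) if (load[p] < load[best]) best = p
      load[best] += $2
      print best, $1
    }'
}
```

Each job instance would then run only the lines whose first column matches its own index. Every instance has to see the same timing data, otherwise the partitions will not line up.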

srinivas upadhya

Oct 20, 2014, 11:47:55 AM
to go...@googlegroups.com
The first cut of the feature is ready.

EA: As I have already mentioned, Go uses TLB to run tests in parallel. This feature allows Go to spin up the "required" number of instances of a job, which are configured to execute tests in a parallel (distributed) fashion. Something like this. The spawned jobs expose 2 environment variables, GO_JOB_RUN_INDEX and GO_JOB_RUN_COUNT, which are used to decide which tests to run.
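
For anyone not using TLB, a task could use the two variables directly. The following is a minimal sketch of a count-based, round-robin split. It assumes GO_JOB_RUN_INDEX starts at 1 and GO_JOB_RUN_COUNT is the total number of instances; the helper name `select_tests` and the file pattern in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: pick this instance's share of the tests, round-robin.
# Assumption: the index is 1-based and the count is the number of instances.
# select_tests INDEX COUNT
#   stdin : one test name per line
#   stdout: the tests this instance should run
select_tests() {
  local index="$1" count="$2"
  # Sort first so that every instance sees the same ordering.
  LC_ALL=C sort | awk -v i="$index" -v n="$count" '(NR - 1) % n == i - 1'
}

# Hypothetical usage inside a job task:
#   find . -name '*Test.class' | select_tests "$GO_JOB_RUN_INDEX" "$GO_JOB_RUN_COUNT"
```

This ignores test timings, so the instances may finish at quite different times.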

A blog post on "Test Parallelization" with Go and TLB using this feature is coming soon.

If anyone has feedback, do let me know. Thanks.

Note: Use "view" & "password" for authentication.

Liping Guo

Nov 2, 2015, 4:43:21 PM
to go-cd
Hi Srinivas,

Does a recent version of Go already have TLB bundled with it?

Thanks,

Liping

Aravind SV

Nov 2, 2015, 5:47:31 PM
to go...@googlegroups.com
No, it doesn't bundle TLB.


Fredrik Wendt

Nov 3, 2015, 4:51:17 AM
to go...@googlegroups.com
GoCD does set up a few environment variables which are very useful to TLB. Set up your job's settings (Job Settings tab) with "Run Type" set to "Run x instances", where x is the number of slices you want to divide your test suite into.


export TLB_BASE_URL=http://tlbserver:7019
# Partition number is this instance's index; total partitions is the instance count.
export TLB_PARTITION_NUMBER=$GO_JOB_RUN_INDEX
export TLB_TOTAL_PARTITIONS=$GO_JOB_RUN_COUNT
export TLB_JOB_NAME=${GO_PIPELINE_NAME}-${GO_STAGE_NAME}-${GO_JOB_NAME}
export TLB_JOB_VERSION=${GO_PIPELINE_COUNTER}-${GO_STAGE_COUNTER}
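
Since the partition number and the total are easy to mix up, a small sanity check before starting the build can help. This is only a sketch: it assumes partitions are numbered from 1 up to the total, and the helper names `is_positive_int` and `check_partition` are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: validate a partition number against the total.
# Assumption: valid partition numbers are 1..TOTAL.

# Succeeds only if the argument is a whole number >= 1.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge 1 ]
}

# check_partition NUMBER TOTAL: succeeds only if 1 <= NUMBER <= TOTAL.
check_partition() {
  is_positive_int "$1" && is_positive_int "$2" && [ "$1" -le "$2" ]
}

# Hypothetical usage at the start of a task:
#   check_partition "$TLB_PARTITION_NUMBER" "$TLB_TOTAL_PARTITIONS" || exit 1
```

The check catches a swapped pair in most instances (for example number 10 of 3), though not for the instance whose index equals the count.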

/ Fredrik

Liping Guo

Nov 3, 2015, 8:42:45 AM
to go-cd
Hi Fredrik,

From your email, it looks like the TLB_BASE_URL environment variable needs to be provided. Does that mean that if I use TLB with Go, I still need to install a TLB server somewhere and activate it?

If not, do I need to add the TLB-related jars (such as tlb-java-0.3.2.jar) to the Go classpath?

Thanks,

Liping

Fredrik Wendt

Nov 3, 2015, 5:37:50 PM
to go...@googlegroups.com
Yes, you will have to handle the TLB server on your own if you want one.

/ Fredrik

Liping Guo

Nov 4, 2015, 9:25:01 AM
to go-cd
Srinivas,

I am a newbie to TLB.

I am trying to use TLB with our Go CD setup. The TLB doc posted here: http://test-load-balancer.github.io/doc-0_3_2/concepts.html#implementations_of_tlb_server states:
  "TLB has inbuilt support for Go, which means TLB can balance against Go just like it balances against the TLB-Server. Running against Go obviously means the tests are run as part of a Go-Task, which will run on a Go-Agent. Additionally, because TLB is environment aware, it can implicit a few things while running against Go server. It deduces equivalent of things like job-name[4], version[5] and total partitions[6] from the way jobs are configured under stage and pipeline. To make TLB work with Go, Server needs to use GoServer[3].
    In this case, you do not need to run a separate process(TLB Server) to act as server, because Go-server plays that role. This does not need any change in the go-server or go-configuration apart from the naming convention your Go job-names need to follow. The convention is that they need to be of the form "<my-job-name>-X"(where X is a natural number 1..n, when you want to make n partitions), or "<my-job-name>-<UUID>"(where each such job will be made to execute only one partition)."

From the above statements, it looks like we don't need to run a TLB server if we are using the Go server, because the Go server plays that role. If that's true, does it mean we need to set TLB_BASE_URL to http://go-server:7019 or http://localhost:7019? And how does the Go server enable the TLB balancer if Go CD isn't bundled with the TLB jars? Going by the sample (http://www.go.cd/documentation/user/current/configuration/quick_pipeline_setup.html ), will setting up the following environment variables be enough to kick off the TLB balancer? Is there anything else we should do, such as adding the TLB jars to the Go server classpath or adding a TLB plugin to the Maven pom file?
export TLB_BASE_URL=http://tlbserver:7019
export TLB_PARTITION_NUMBER=$GO_JOB_RUN_INDEX
export TLB_TOTAL_PARTITIONS=$GO_JOB_RUN_COUNT
export TLB_JOB_NAME=${GO_PIPELINE_NAME}-${GO_STAGE_NAME}-${GO_JOB_NAME}
export TLB_JOB_VERSION=${GO_PIPELINE_COUNTER}-${GO_STAGE_COUNTER}

I noticed that for JUnit we need to set the mvn option -Drun.tests.using.tlb=true. My question is: how does the Go server recognize this option and communicate with the TLB balancer?

Thanks for your help.

Liping



srinivas upadhya

Nov 4, 2015, 11:34:02 AM
to go...@googlegroups.com
The TLB documentation is pretty old and hasn't been updated for some time. You will need to set up TLB yourself, which is quite simple. Refer to: http://www.go.cd/2014/10/09/Distrubuted-Test-Execution.html. Let me know if you have more questions.


Liping Guo

Nov 4, 2015, 5:15:57 PM
to go-cd
Thank you Srinivas.

I configured a pipeline on my Go server (installed on my local machine) by following the instructions posted at http://www.go.cd/2014/10/09/Distrubuted-Test-Execution.html and modified my JUnit pom.xml following the sample here: https://github.com/test-load-balancer/sample_projects/blob/master/maven_junit/pom.xml . But when the Go job kicked off using the following config, somehow '-Drun.tests.using.tlb=true' wasn't recognized.
<tasks>
  <exec command="mvn" workingdir="efm-sio-acceptance-tests\acceptance-tests">
    <arg>clean</arg>
    <arg>install</arg>
    <arg>-DskipTests=true</arg>
    <arg>-Drun.tests.using.tlb=true</arg>
    <runif status="any" />
  </exec>
</tasks>

Here is the maven antrun plugin:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <version>1.7</version>
  <executions>
    <execution>
      <id>ant.test.tlb</id>
      <phase>test</phase>
      <goals>
        <goal>run</goal>
      </goals>
      <configuration>
        <tasks if="run.tests.using.tlb">
          <echo message="TLB_JOB_NAME = ${TLB_JOB_NAME}"/>
          <echo message="TLB_JOB_VERSION = ${TLB_JOB_VERSION}"/>
          <echo message="TLB_PARTITION_NUMBER = ${TLB_PARTITION_NUMBER}"/>
          <echo message="TLB_TOTAL_PARTITIONS = ${TLB_TOTAL_PARTITIONS}"/>
          <property name="src.dir" location="src"/>
          <property name="test.dir" location="test"/>
          <property name="lib.dir" location="lib"/>
          <property name="test.lib.dir" location="lib"/>
          <property name="target.dir" location="target"/>
          <property name="classes.dir" location="${target.dir}/classes"/>
          <property name="test-classes.dir" location="${target.dir}/test-classes"/>
          <property name="reports.dir" location="${target.dir}/reports"/>

          <property name="tlb.dist.dir" location="${TLB_CLASSPATH}"/><!-- used in packaged distribution(s) -->

          <mkdir dir="${reports.dir}"/>
          <mkdir dir="${test-classes.dir}"/>
          <mkdir dir="${reports.dir}"/>

          <path id="src.classpath">
            <pathelement path="${classes.dir}"/>
          </path>

          <path id="test.classpath">
            <path refid="src.classpath"/>
            <pathelement path="${test-classes.dir}"/>
            <fileset dir="${test.lib.dir}" includes="*.jar"/>
          </path>

          <path id="maven.test.classpath">
            <path refid="test.classpath"/>
            <fileset dir="${tlb.dist.dir}/lib" includes="*.jar" erroronmissingdir="false"/>
            <fileset dir="${tlb.dist.dir}" includes="tlb-java*.jar" erroronmissingdir="false"/>
          </path>

          <!-- run tests through TLB -->
          <path id="classpath.for.tests">
            <path refid="maven.test.classpath"/>
          </path>

          <typedef name="load-balanced-fileset" classname="tlb.ant.LoadBalancedFileSet" classpathref="classpath.for.tests"/>

          <junit failureproperty="test.failed" printsummary="yes" haltonfailure="false" haltonerror="false" showoutput="true" fork="true">
            <classpath refid="classpath.for.tests"/>

            <batchtest todir="${reports.dir}">
              <load-balanced-fileset dir="${test-classes.dir}" includes="**/*Test.class" />
              <formatter classname="tlb.ant.JunitDataRecorder"/>
              <formatter type="xml"/>
            </batchtest>
          </junit>
          <fail if="test.failed"/>
        </tasks>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>org.apache.ant</groupId>
      <artifactId>ant-junit</artifactId>
      <version>1.9.2</version>
    </dependency>
  </dependencies>
</plugin>

When I ran it through the TLB server, the run.tests.using.tlb option was recognized, but I got another error:
Caused by: java.lang.IllegalStateException: None of [tlb.splitter.TimeBasedTestSplitter, tlb.splitter.CountBasedTestSplitter] could successfully split the test suites.
	at tlb.splitter.DefaultingTestSplitter.filterSuites(DefaultingTestSplitter.java:45)
	at tlb.ant.LoadBalancedFileSet.iterator(LoadBalancedFileSet.java:51)
	at org.apache.tools.ant.types.resources.Resources$MyCollection$MyIterator.hasNext(Resources.java:97)
	at org.apache.tools.ant.util.CollectionUtils.asCollection(CollectionUtils.java:218)
	at org.apache.tools.ant.types.resources.Resources$MyCollection.getCache(Resources.java:83)
	at org.apache.tools.ant.types.resources.Resources$MyCollection.iterator(Resources.java:78)
	at org.apache.tools.ant.types.resources.Resources.iterator(Resources.java:170)
	at org.apache.tools.ant.taskdefs.optional.junit.BatchTest.getFilenames(BatchTest.java:147)
	at org.apache.tools.ant.taskdefs.optional.junit.BatchTest.createAllJUnitTest(BatchTest.java:126)
	at org.apache.tools.ant.taskdefs.optional.junit.BatchTest.elements(BatchTest.java:102)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.getIndividualTests(JUnitTask.java:1474)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.execute(JUnitTask.java:804)
	at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
	at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
	... 34 more

I don't know what I missed. Your help is highly appreciated.

By the way, I am using Java 1.8.

Thanks,

Liping

Liping Guo

Nov 4, 2015, 7:57:21 PM
to go-cd
Srinivas,

Just want to double check: if I use the Go server, do I still need to run a separate TLB server?



Liping Guo

Nov 4, 2015, 8:57:09 PM
to go-cd
Srinivas,

Just want to double check: if I use the Go server, do I still need to run a separate TLB server?


Ketan Padegaonkar

Nov 4, 2015, 9:20:06 PM
to go-cd

Yes, you do need to run a TLB server. The Go team runs a TLB server :)

Here is our pom.xml containing the TLB setup: https://github.com/gocd/gocd/blob/master/common/pom.xml

You will find the build and environment variables here: https://build.go.cd/go/tab/build/detail/build-linux/640/build-non-server/1/common-split-runInstance-1#tab-console (use username: view, password: password to log in). Some of the TLB environment vars are optional.


Liping Guo

Nov 6, 2015, 11:28:17 AM
to go-cd
Thank you Ketan.

I am able to run test cases with TLB enabled on Go CD. One more question: if a Maven project has multiple Maven sub-projects, do I only need to modify the parent pom file to add maven-antrun-plugin to enable TLB, or do I need to modify the pom files of all the sub-projects as well?

Thanks,

Liping