best way to parallelize a test suite.

Eric Hexter

Nov 24, 2008, 5:08:54 PM

to gallio-dev

I want to look into parallelizing our UI tests. We are using WatiN to
execute them, and I would like a way to execute multiple
sets of tests concurrently.
My questions are:

1. Is this a bad idea? Can Gallio support this type of test runner?

2. If this is not a bad idea, where should I start looking at
implementing a multithreaded test runner? I see the aggregate test
driver, but I am new to Gallio and the domain model, so at first pass it
is not clear where I should start.

Thanks,
Eric

Vladimir Okhotnikov

Nov 25, 2008, 4:26:27 AM

to galli...@googlegroups.com

Hi

IIRC, WatiN requires the tests to be run in a single-threaded
apartment (STA), because it interacts with the non-thread-safe IE COM
component, and I assume this means you're out of luck with
parallelizing it. More info is at
http://watin.sourceforge.net/apartmentstateinfo.html

Eric Hexter

Nov 25, 2008, 8:55:23 AM

to galli...@googlegroups.com

OK, good point. A multi-threaded test runner would give me zero benefit because of what I am trying to test.

So it looks like I would need to create a test runner that spins up a new process rather than a new worker thread. The biggest issue to overcome would then be creating a deterministic way to split up the tests and then delegating the test operations to multiple worker processes. This does sound like a new TestDriver implementation, if not two.
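
One way to make the split deterministic, sketched under the assumption that tests (or fixtures) can be identified by name: hash each name with a stable function and take the result modulo the number of worker processes. TestPartitioner and its methods are hypothetical names, not Gallio APIs, and FNV-1a is chosen here only because string.GetHashCode is not guaranteed to give the same value in different processes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of Gallio.
public static class TestPartitioner
{
    // FNV-1a over the UTF-8 bytes of the name: the same name always
    // yields the same hash, whichever process computes it.
    private static uint StableHash(string text)
    {
        uint hash = 2166136261;
        foreach (byte b in Encoding.UTF8.GetBytes(text))
        {
            unchecked { hash = (hash ^ b) * 16777619; }
        }
        return hash;
    }

    // Assigns every test name to exactly one of workerCount buckets.
    public static List<string>[] Partition(IEnumerable<string> testNames, int workerCount)
    {
        var buckets = new List<string>[workerCount];
        for (int i = 0; i < workerCount; i++)
            buckets[i] = new List<string>();

        foreach (string name in testNames)
        {
            int index = (int)(StableHash(name) % (uint)workerCount);
            buckets[index].Add(name);
        }
        return buckets;
    }
}
```

Each worker process could then be handed one bucket. Hashing does not balance run time, so a real scheduler would probably also want timing data from earlier runs.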

Does anyone else have a recommendation?

Jeff Brown

Nov 25, 2008, 2:05:26 PM

to galli...@googlegroups.com

This is actually something on the v3.1 roadmap. I've been working on a blog post to describe what's actually involved.

What we do at Yellowpages.com is to reframe the problem. Rather than putting Gallio itself in charge of a distributed network of test machines (which is possible right now, but really will be best left until we redesign a few bits for v3.1) we use some commercial job scheduling software (ActiveBatch) to run multiple instances of Gallio. The test suite is partitioned into small units that we can run on a schedule.

The main reason we don't put a single Gallio process in charge of the whole thing is that web integration tests can run for a very long time. So it's helpful to think of them as independent (but perhaps concurrent) test runs. Also, because these kinds of tests tend to be somewhat brittle, building up granular test suites makes it significantly easier to reschedule and rerun just a few tests here and there.

You might try doing something like that just for now. In the end, you may find, like us, that having control over groups of tests is beneficial. For example, using ActiveBatch we can declare mutual exclusion rules for concurrent test runs. Defining the same rule in Gallio right now would be tricky.

For v3.1 I'm thinking of adopting a lightweight distributed message bus architecture as the backbone of the Gallio communications architecture. Instead of having fairly large "TestPackages" with lots of tests to run all at once, we'll break them down into smaller units that can be individually controlled. One of the tricky bits will be knitting together aggregate reports from multiple concurrent test runs.

Finally, this is where Archimedes will fit in. Archimedes will be a test case management application woven into the test communication fabric.

Jeff.

P.S. Another tool to look at is PNUnit.

Jeff Brown

Nov 25, 2008, 2:13:42 PM

to galli...@googlegroups.com

Actually, with WatiN in particular you are likely to run into issues running
tests concurrently on the same machine (at least in the same user context)
because IE instances share their cookies and temporary internet files
caches. That can cause all sorts of problems if you have one test running
"logged in" concurrently with another that was supposed to be "logged out".

As a result, to parallelize WatiN tests, it's preferable to run batches of
them on separate (virtual) machines.

Jeff.

Eric Hexter

Nov 25, 2008, 2:31:21 PM

to galli...@googlegroups.com

I can see the shared cookie problem. Luckily we are testing a web application and not a website, so we are not using permanent cookies. I think we will be OK running multiple instances on a single machine, for now.

I did put together a spike, and I am able to create my own thread pool of ApartmentState.STA threads. I am able to spin up 8 concurrent WatiN instances, so I think the threading issue is not as big a problem as I initially thought.
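
For illustration, a minimal sketch of what such a spike might look like. It assumes a hypothetical StaWorkerPool helper (not part of Gallio or WatiN) that drains a queue of work items on dedicated threads. TrySetApartmentState is called before each thread starts; on platforms without COM apartments it returns false and the thread simply runs as it is.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper for illustration; not part of Gallio or WatiN.
public static class StaWorkerPool
{
    // Runs every work item on one of workerCount dedicated threads and
    // returns once all items have finished.
    public static void RunAll(IEnumerable<Action> workItems, int workerCount)
    {
        var queue = new Queue<Action>(workItems);
        var workers = new List<Thread>();

        for (int i = 0; i < workerCount; i++)
        {
            var worker = new Thread(() =>
            {
                while (true)
                {
                    Action item;
                    lock (queue)
                    {
                        if (queue.Count == 0) return; // nothing left to do
                        item = queue.Dequeue();
                    }
                    item(); // run outside the lock so items overlap
                }
            });

            // The apartment state must be requested before Start().
            worker.TrySetApartmentState(ApartmentState.STA);
            workers.Add(worker);
            worker.Start();
        }

        foreach (Thread t in workers)
            t.Join();
    }
}
```

In a setup like the one described, each work item would own its own browser instance, for example one fixture's worth of WatiN tests.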

I like the idea of using async messaging, à la NServiceBus. That could easily scale to multi-machine coordination and reporting.

The biggest problem that I wanted to solve was dealing with the manual categorization and optimization of the test groups. Our dev team is creating the tests along with our testers, and we run the full regression suite after every check-in as a gate before deploying to our acceptance environment. I will take a look at PNUnit.

That being said, if I were to prototype this, would I do it through a custom ITestDriver?

TIA

Eric

Jeff Brown

Nov 25, 2008, 4:44:02 PM

to galli...@googlegroups.com

Hmm.

Actually Gallio is internally designed to handle multiple concurrent tests throughout most of its pipeline... with one exception.

ITestCommand describes a test that has been selected for execution. The commands are sorted to satisfy test dependencies and ordering constraints (where applicable). They are also arranged hierarchically to capture containment relationships. However, they have no concept of concurrency.

Concurrency could either be handled by Gallio at the platform level or by individual test frameworks. The latter is significantly easier for now but we'll have to revisit platform level concerns later (particularly for distributed test execution).

So the changes required are along these lines:

1. Define some concept of a parallelizable test in MbUnit. This could take the form of a "test isolation" attribute.

[Isolation(IsolationType.AppDomain)] -- default: at most one instance of the test can run in a given AppDomain

[Isolation(IsolationType.Thread)] -- multiple instances of the test can run, assuming they run on different threads

Could also define concepts of mutual exclusion, shared resource sets, etc.

The simplest hint would probably be something like:

[Parallelizable]

2. Extend the PatternTestController to recognize the concurrency notations and run the tests in parallel with appropriate synchronization in place.

Jeff.

Jeff Brown

Nov 26, 2008, 3:53:17 AM

to galli...@googlegroups.com

OK, just for kicks I have added an experimental attribute in MbUnit v3 called [Parallelizable] that you can apply at the fixture or test level as you wish. Each test to be parallelized must sport this attribute.

Parallelizable tests will run in parallel assuming there are no other ordering constraints or dependencies preventing them from doing so. Instances of data-driven tests will not run in parallel with each other in the current implementation.

The number of threads used is either 2 or the number of processors you have, whichever is greater. You can change this by setting the value PatternTestGlobals.DegreeOfParallelism to something else. We'll probably need a new attribute to make this more convenient, I'm just not sure at what scope the degree of parallelism should be specified.

Degree of parallelism is currently enforced locally only. That means if you have several fixtures and tests with the [Parallelizable] attribute, there may actually end up being more than DegreeOfParallelism tests running concurrently. For example, you could have 4 parallelizable fixtures, each with 4 parallelizable tests running concurrently, for a total of up to 16 concurrently executing tests despite DegreeOfParallelism being only 4.

Note that I consider this support experimental at best. The API may yet evolve.

Please let me know what you think!

Example 1: Test-level parallelism.

public class Fixture
{
    [Test, Parallelizable]
    public void One() { ... }

    [Test, Parallelizable]
    public void Two() { ... }
}

Example 2: Fixture-level parallelism.

[Parallelizable]
public class FixtureOne { ... }

[Parallelizable]
public class FixtureTwo { ... }

Jeff.

Jeff Brown

Nov 26, 2008, 3:55:37 AM

to Jeff Brown, galli...@googlegroups.com

Forgot to mention: this will be in build 561 or newer.

Download: http://ccnet.gallio.org/Distributables/

ARKBAN

Nov 26, 2008, 7:49:07 AM

to galli...@googlegroups.com

Does this use the Parallel FX Library (http://en.wikipedia.org/wiki/Parallel_FX_Library) or does it use separate Processes to achieve the parallelization? (Or something else I'm unaware of?)

I think the idea is great, I can see this greatly improving performance in certain domains.

ARKBAN

Eric Hexter

Nov 26, 2008, 9:18:38 AM

to galli...@googlegroups.com

I will pull down the bits and give it a try.

If a fixture is marked for parallelism, it sounds like the tests within the fixture can run in parallel?

I can see the need for fixtures to run in parallel, but not the individual tests within a fixture. For example, we spin up a WatiN instance per fixture. So we would not want to have xxx tests run concurrently, but I would like to run 8 fixtures concurrently.

Is the threading model still STA for each thread? I hope so.

Jeff Brown

Nov 26, 2008, 1:37:43 PM

to galli...@googlegroups.com

If you mark a fixture (or test) as parallelizable, then it is only parallelized with respect to its siblings. So in fact, we have the behavior you desire.

The threading model is STA by default. It can be overridden on a per-fixture or test basis though by adding an [ApartmentState] attribute. You probably won't need to worry about that.

Jeff.

Eric Hexter

Dec 4, 2008, 12:52:38 AM

to galli...@googlegroups.com

Awesome stuff. I have had a chance to test out the parallel execution. It is working pretty well. I observed two behaviors which were unexpected.

The first is that there seems to be a race condition that causes the number of parallel tests to decrease until a test completes, after which the number of parallel tests picks back up to the expected number. I submitted an issue for this, along with a patch to correct the problem (http://code.google.com/p/mb-unit/issues/detail?id=355). I used a semaphore to handle the wait rather than manually comparing the number of tests running in parallel. This patch corrected the behavior that I witnessed.
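
The semaphore idea can be sketched as follows. This is not the code from the patch; SemaphoreThrottle is a hypothetical helper that only illustrates how a counting semaphore (System.Threading.Semaphore) caps the number of running tests and lets the scheduler resume as soon as any one of them finishes.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper for illustration; not the code from the patch.
public static class SemaphoreThrottle
{
    // Runs all actions, at most degreeOfParallelism at a time, and
    // returns the highest number observed running at once.
    public static int RunAll(IList<Action> actions, int degreeOfParallelism)
    {
        int running = 0, maxRunning = 0;
        var stats = new object();
        var threads = new List<Thread>();

        using (var slots = new Semaphore(degreeOfParallelism, degreeOfParallelism))
        {
            foreach (Action action in actions)
            {
                Action current = action;
                slots.WaitOne(); // blocks until some slot is free

                var thread = new Thread(() =>
                {
                    lock (stats)
                    {
                        running++;
                        if (running > maxRunning) maxRunning = running;
                    }
                    try
                    {
                        current();
                    }
                    finally
                    {
                        lock (stats) { running--; }
                        slots.Release(); // frees the slot the moment this test ends
                    }
                });
                threads.Add(thread);
                thread.Start();
            }

            foreach (Thread t in threads)
                t.Join();
        }
        return maxRunning;
    }
}
```

Because the scheduling loop blocks inside WaitOne, it wakes as soon as any slot is released, so there is no separately maintained counter that can fall out of step with the tests actually running.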

The second behavior is that an extra thread spun up and executed in parallel. I believe this may have to do with marking some tests as parallelizable and some test fixtures as parallelizable. It will take some more work to dig in and understand the condition that is causing this. I believe this is a scope issue with the semaphore/AutoResetEvent. That being said, I am not clear on what an appropriate container for a more global instance should be. This is a massive code base and hard to grok.

That being said, I am able to move on to some more pressing issues in the performance of WatiN now that I can run the fixtures in parallel! I think this feature is great.

If you need to know more about the patch or have any questions.... you know where to find me.

Thanks,

Eric

Jeff Brown

Dec 4, 2008, 4:59:52 AM

to galli...@googlegroups.com

Ahh! That makes more sense. I forgot that the .Net framework actually has a counting semaphore kicking around. Good news. :-)

The scoping of the semaphore is a little tricky because we have to take re-entrance into account. Ideally it would be good to make it "global" because that gives more accurate control over the true degree of parallelism. (No more of this multiplicative effect from nesting parallelizable tests.)

Suppose we have the following chain of fixtures:

[Parallelizable]
public class Fixture
{
    [Parallelizable]
    public class NestedFixture
    {
        [Parallelizable]
        public void Test()
        {
        }
    }
}

Suppose DegreeOfParallelism is 2 and we have a "global" counting semaphore limiting the number of parallel tests in progress.

Ok, so first we enter "Fixture", the "global" counting semaphore decrements to 1 since now we have one test running.

Then we enter "NestedFixture". Ok, what should happen? We don't want to decrement the "global" counting semaphore again because then it will hit 0 and we'll have problems if we try to apply the same policy again to run "Test". (deadlock)

Ok, so there are a couple of ways to solve this. We could imagine a baton passing scheme or something. However I think the simplest is the following.

1. Initialize the semaphore with the count equal to DegreeOfParallelism - 1.

2. When scheduling parallelizable children of a test, assume that there is always one "free ticket" available. It represents the CPU resource currently being used by the parent. So when scheduling each parallelizable child test do the following in a loop:

2a. If the "free ticket" is available, acquire it, run the next parallelizable child test in the background, then release the "free ticket".

2b. Otherwise, acquire the semaphore, run the next parallelizable child test in the background, then release the semaphore.

The idea behind the "free ticket" is to avoid starvation of the parent test. If we were not concerned about starvation, then a simpler protocol would be for the parent test to call Release on the semaphore before scheduling any of its parallelizable children, then call Acquire again after finishing them. The problem here is that another test running concurrently might Acquire the semaphore in the middle and cause the parent test to wait indefinitely for the opportunity to make progress on its parallelizable children or to finish up.
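
A minimal sketch of the simpler release/re-acquire protocol described above, assuming a hypothetical Node tree in which every child is parallelizable (Node and YieldingScheduler are illustrative names, not Gallio types). Acquire corresponds to Semaphore.WaitOne in .NET. The final WaitOne in RunHoldingSlot is exactly where the parent can be starved by unrelated tests.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical test tree node; not a Gallio type.
public sealed class Node
{
    public readonly List<Node> Children = new List<Node>();
    public readonly Action Body;
    public Node(Action body) { Body = body; }
}

// Hypothetical scheduler illustrating the release/re-acquire protocol.
public sealed class YieldingScheduler
{
    private readonly Semaphore slots;

    public YieldingScheduler(int degreeOfParallelism)
    {
        slots = new Semaphore(degreeOfParallelism, degreeOfParallelism);
    }

    public void Run(Node root)
    {
        slots.WaitOne();
        try { RunHoldingSlot(root); }
        finally { slots.Release(); }
    }

    // Precondition: the calling thread holds exactly one slot.
    private void RunHoldingSlot(Node node)
    {
        node.Body();
        if (node.Children.Count == 0) return;

        slots.Release(); // lend our slot out while the children run
        try
        {
            var threads = new List<Thread>();
            foreach (Node child in node.Children)
            {
                Node current = child;
                var thread = new Thread(() =>
                {
                    slots.WaitOne();
                    try { RunHoldingSlot(current); }
                    finally { slots.Release(); }
                });
                threads.Add(thread);
                thread.Start();
            }
            foreach (Thread t in threads)
                t.Join();
        }
        finally
        {
            slots.WaitOne(); // may queue behind other tests: the starvation risk
        }
    }
}
```

No thread ever waits for its children while holding a slot, so nesting cannot deadlock, and the number of bodies running at once never exceeds the semaphore's count.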

You should be able to make the Semaphore a member of the PatternTestExecutor class so that it can be shared across all tests and the degree of parallelism can then be tracked more accurately.

How's that sound?

Jeff.

Jeff Brown

Dec 4, 2008, 5:41:29 AM

to galli...@googlegroups.com

Actually on reflection the "free ticket" isn't so hot either since in my earlier proposal the semaphore may be acquired by a test even though the "free ticket" has just been released.

Instead, we should probably just count the tests and release the semaphore eagerly.

Let's see... just thinking out loud. (probably still room for improvement)

let global-pool-count = DegreeOfParallelism - 1
let global-signal be a monitor object.

function run-test(test)
sort and partition test into parallelizable groups
for each partition in order
    run-parallelizable-tests(partition.parallelizable-tests)
    run-sequential-tests(partition.sequential-tests)
end
end

function run-sequential-tests(tests)
for each test in tests
    run-test(test)
end
end

function run-parallelizable-tests(tests)
let local-running-tests = 0

for each test in tests
    enter global-signal monitor
      loop
        if local-running-tests = 0
          then exit loop
        end

        if global-pool-count = 0
          then wait until global-signal is pulsed and restart loop
          else decrement global-pool-count and exit loop
        end
      end

      increment local-running-tests
    exit global-signal monitor

    fork
      run-test(test)

      enter global-signal monitor
        decrement local-running-tests

        if local-running-tests > 0
          then increment global-pool-count
        end

        pulse all global-signal
      exit global-signal monitor
    end
end

join all
end

Note that I wrote the above pseudo-code to use a "monitor" object directly instead of a counting semaphore because of the requirement that we wait until local-running-tests becomes 0 or global-pool-count becomes non-zero, whichever comes first.

C# does not provide any language facilities for waiting on multiple conditions simultaneously. Using a single shared monitor object here, while not as efficient as other possibilities (some threads may be woken unnecessarily), is easy to implement correctly.
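
A rough C# rendering of the pseudo-code, using Monitor.Wait and Monitor.PulseAll on a single shared lock object. TestNode and MonitorScheduler are hypothetical names rather than Gallio types, and the sketch simplifies partitioning to "parallelizable children first, then the rest". One point is handled explicitly: a finishing child returns a slot to the pool only while siblings are still running, because the first running child rides on the parent's own slot and never took one from the pool.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical test tree node; not a Gallio type.
public sealed class TestNode
{
    public readonly string Name;
    public readonly bool IsParallelizable;
    public readonly List<TestNode> Children = new List<TestNode>();

    public TestNode(string name, bool isParallelizable)
    {
        Name = name;
        IsParallelizable = isParallelizable;
    }
}

public sealed class MonitorScheduler
{
    private readonly object globalSignal = new object();
    private readonly Action<TestNode> body;
    private int globalPoolCount; // guarded by globalSignal

    public MonitorScheduler(int degreeOfParallelism, Action<TestNode> body)
    {
        globalPoolCount = degreeOfParallelism - 1;
        this.body = body;
    }

    public void RunTest(TestNode test)
    {
        body(test);
        RunParallelizableTests(test.Children.FindAll(c => c.IsParallelizable));
        foreach (TestNode child in test.Children.FindAll(c => !c.IsParallelizable))
            RunTest(child);
    }

    private void RunParallelizableTests(List<TestNode> tests)
    {
        int localRunningTests = 0; // guarded by globalSignal
        var threads = new List<Thread>();

        foreach (TestNode test in tests)
        {
            lock (globalSignal)
            {
                // Proceed when no sibling is running (use the parent's own
                // slot) or when the pool has a slot, whichever comes first.
                while (localRunningTests != 0 && globalPoolCount == 0)
                    Monitor.Wait(globalSignal);
                if (localRunningTests != 0)
                    globalPoolCount--;
                localRunningTests++;
            }

            TestNode current = test;
            var thread = new Thread(() =>
            {
                try { RunTest(current); }
                finally
                {
                    lock (globalSignal)
                    {
                        localRunningTests--;
                        if (localRunningTests != 0)
                            globalPoolCount++; // hand a pooled slot back
                        Monitor.PulseAll(globalSignal);
                    }
                }
            });
            threads.Add(thread);
            thread.Start();
        }

        foreach (Thread t in threads)
            t.Join();
    }
}
```

With DegreeOfParallelism set to 2 and the Fixture/NestedFixture/Test chain from the earlier message, the nested levels reuse the parent's slot instead of consuming the pool, so the chain does not deadlock.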

Jeff.
