If you have more than one machine you can use to run a test program,
you might want to run the test functions in parallel and get the
result faster. This technique is commonly called sharding, where each
machine is called a shard.
We propose to extend Google Test to support sharding. Here's the
plan:
- The environment variable GTEST_TOTAL_SHARDS will be set by the test
runner to define the total number of shards in use. It must be a
positive integer.
- The environment variable GTEST_SHARD_INDEX will be set by the test
runner to define the index of this shard. Each shard will be
assigned a unique index, which must be in the range [0,
GTEST_TOTAL_SHARDS - 1].
- A test program will determine which test functions it should run
based on the above environment variables. It will try to balance the
load on each shard, but it's not guaranteed. If these variables are
not set, the test program will behave as before and run all test
functions it has.
- Running the same test program with the same total number of shards
and the same shard index will always execute the same set of test
functions.
- Across all the shards, each test function will be run exactly once.
- If no test functions are run on a given shard (for example, if there
are more shards than test functions), the test program will exit
successfully.
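To make the rules above concrete, here is a minimal sketch of how a test
program could pick its share of the test functions. The round-robin rule and
the helper names (EnvToLong, SelectTestsForShard) are assumptions made for
illustration only; the plan promises a deterministic, non-overlapping split,
not this particular formula.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Reads an integer from the environment variable `name`. Returns `fallback`
// when the variable is unset, empty, or not fully parseable as a decimal
// integer.
long EnvToLong(const char* name, long fallback) {
  const char* value = std::getenv(name);
  if (value == nullptr || *value == '\0') return fallback;
  char* end = nullptr;
  const long parsed = std::strtol(value, &end, 10);
  return (*end == '\0') ? parsed : fallback;
}

// ASSUMPTION: tests are dealt out round-robin by their position in
// `all_tests`. Any rule works as long as it is deterministic and every test
// lands on exactly one shard.
std::vector<std::string> SelectTestsForShard(
    const std::vector<std::string>& all_tests, long total_shards,
    long shard_index) {
  assert(total_shards >= 1);
  assert(shard_index >= 0 && shard_index < total_shards);
  std::vector<std::string> selected;
  for (std::size_t i = 0; i < all_tests.size(); ++i) {
    if (static_cast<long>(i) % total_shards == shard_index) {
      selected.push_back(all_tests[i]);
    }
  }
  return selected;  // May be empty; the program should still exit with 0.
}
```

A test program would call
SelectTestsForShard(tests, EnvToLong("GTEST_TOTAL_SHARDS", 1),
EnvToLong("GTEST_SHARD_INDEX", 0)). With both variables unset this degenerates
to one shard with index 0, so every test function runs, as before.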
Also, your project may have tests that were written without Google
Test and thus don't understand this protocol. In order for your test
runner to figure out which test supports sharding, it can set the
environment variable GTEST_SHARD_STATUS_FILE to a non-existent file
path. If a test program supports sharding, it must create this file
to acknowledge the fact; otherwise it must not create it. (The actual
contents of the file are not important at this time, although we may
put some useful information in it in the future.)
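The handshake could look roughly like the sketch below. All four function
names are hypothetical and only illustrate the two sides of the protocol: the
runner clears the path and later checks for the file, and a sharding-aware
test program creates it at startup.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <string>

// Runner side: make sure the path does not exist before the test starts.
void PrepareStatusPath(const std::string& path) {
  assert(!path.empty());
  std::remove(path.c_str());  // Result ignored; the file may not exist yet.
}

// Runner side: after the test exits, an existing file means the test
// program understands the sharding protocol.
bool StatusFileExists(const std::string& path) {
  return std::ifstream(path.c_str()).good();
}

// Test-program side: creates an empty file at `path`. The contents are left
// empty because the proposal does not define any yet.
bool WriteShardStatusFile(const std::string& path) {
  std::ofstream file(path.c_str());
  return file.is_open();
}

// Test-program side: call once at startup. Does nothing when the runner did
// not set GTEST_SHARD_STATUS_FILE.
bool AcknowledgeShardingIfRequested() {
  const char* path = std::getenv("GTEST_SHARD_STATUS_FILE");
  if (path == nullptr || *path == '\0') return false;
  return WriteShardStatusFile(path);
}
```

If StatusFileExists() is still false after the test program has finished, the
runner should assume the program ignored the shard variables and ran every
test function, so it should not launch further shards of that program.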
Please let us know if you have concerns with this plan or if you have
suggestions on making it better.
Thanks,
--
Zhanyong
Suppose you have a test program foo_test that has the following 5 test
functions:
TEST(A, V)
TEST(A, W)
TEST(B, X)
TEST(B, Y)
TEST(B, Z)
and you have 3 machines at your disposal. To run the test in
parallel, you would set GTEST_TOTAL_SHARDS to 3 on all machines, and
set GTEST_SHARD_INDEX to 0, 1, and 2 on the machines respectively.
Then you would run the same foo_test on each machine.
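As a rough illustration of the runner's side, the sketch below builds one
POSIX-shell command line per shard. ShardCommandLines is a hypothetical
helper; how each command actually reaches its machine (ssh, a job scheduler,
and so on) is left to the test runner.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns one command line per shard, each setting the two sharding
// variables for that invocation only (POSIX shell "VAR=value command" form).
std::vector<std::string> ShardCommandLines(const std::string& test_binary,
                                           int total_shards) {
  assert(total_shards >= 1);
  std::vector<std::string> commands;
  for (int index = 0; index < total_shards; ++index) {
    commands.push_back("GTEST_TOTAL_SHARDS=" + std::to_string(total_shards) +
                       " GTEST_SHARD_INDEX=" + std::to_string(index) + " " +
                       test_binary);
  }
  return commands;
}
```

For example, ShardCommandLines("./foo_test", 3) yields three commands that
differ only in GTEST_SHARD_INDEX, which takes the values 0, 1, and 2.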
Depending on how we implement the sharding, the actual distribution of
the test functions may be different, but here's one possible scenario:
Machine #0 will run A.V and B.X.
Machine #1 will run A.W and B.Y.
Machine #2 will run B.Z.
Your test runner will wait for the machines to be done and then check
if any of them has reported a failure.
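Whatever distribution we end up with, it has to cover every test function
exactly once. A small checker along the following lines can confirm that for
any proposed split. IsValidPartition and ExampleUsage are hypothetical names
used only in this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Shard = std::vector<std::string>;

// Returns true if every name in `all_tests` appears exactly once across
// `shards` and no shard lists a name that is missing from `all_tests`.
bool IsValidPartition(const std::vector<std::string>& all_tests,
                      const std::vector<Shard>& shards) {
  std::map<std::string, int> run_count;
  for (const auto& name : all_tests) run_count[name] = 0;
  for (const auto& shard : shards) {
    for (const auto& name : shard) {
      auto it = run_count.find(name);
      if (it == run_count.end()) return false;  // Unknown test function.
      ++it->second;
    }
  }
  for (const auto& entry : run_count) {
    if (entry.second != 1) return false;  // Skipped or run more than once.
  }
  return true;
}

// One valid and one invalid split of the five foo_test functions.
void ExampleUsage() {
  const std::vector<std::string> tests = {"A.V", "A.W", "B.X", "B.Y", "B.Z"};
  const std::vector<Shard> valid = {Shard{"A.V", "B.X"}, Shard{"A.W", "B.Y"},
                                    Shard{"B.Z"}};
  const std::vector<Shard> invalid = {Shard{"A.V", "B.X"}, Shard{"A.W", "B.Y"},
                                      Shard{"B.X"}};
  assert(IsValidPartition(tests, valid));
  assert(!IsValidPartition(tests, invalid));  // B.X runs twice, B.Z never.
}
```

A split in which some test function appears on two machines, or on none, is
rejected.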
Cheers,
--
Zhanyong
If no one objects, we'll be implementing this soon. Please speak up
now if you have concerns or suggestions. Thanks,
--
Zhanyong
On Dec 29, 2008, at 11:22 AM, Zhanyong Wan (λx.x x) <w...@google.com>
wrote:
>
> Hello,
>
> If no one objects, we'll be implementing this soon. Please speak up
> now if you have concerns or suggestions. Thanks,
>
> On Tue, Dec 23, 2008 at 9:05 AM, Zhanyong Wan (λx.x x) <wan@google.c
We are not.
We are extending gtest to *allow* test runners to take advantage of
multiple machines. It's up to the test runners to actually tap this
ability. Existing test runners won't see any difference after this
change.
--
Zhanyong
I haven't spent much time on this, but I don't see why we cannot
easily extend the protocol if we want to in the future. Therefore I'm
not worried here.
> I'm thinking of a scenario where a test case
> has been unchanged for a while, with certain tests running consistently on
> certain shards (per your intended stability guarantee). Then, a new test is
> added. Will that change the distribution of the tests among shards? It
Quite likely.
> seems that it would be advantageous to provide (eventually) a way to
> "shuffle" the tests to expose any unintended interdependencies on colocation
> or execution sequence.
I agree.
> I understand that such enhancements would add
> complexity, so it would also be desirable to allow these features to be
> enabled selectively; hence my "strategy" design pattern suggestion.
Yes, these features have to be enabled selectively, as many existing
test programs may have interdependencies between the test functions
and you don't want them to suddenly break. Google Test will provide
the means for a test runner to do that if it cares to, but part of the
job needs to be done in the test runner (e.g. the test runner needs to
provide a way for the user to mark certain tests as
sharding-unfriendly).
> Also, I don't see an explanation of the "test runner" aspect of this
> framework in the public docs. The wikis that I can see talk about building
> an executable, but not about maintaining a farm of servers, a common file
> system, consolidating the test results, or any of the other concerns that
> might come with this feature. Maybe you intend this support to be
> complementary to a test runner system that would actually deal with all of
> these issues.
Correct.
> Is that coming soon?
We will try to take advantage of this inside Google. It's up to other
test runners to decide whether they want to do it.
> Do we need to reference an existing
> system that would complement this feature?
That would be the test runner we use inside Google. External users
will need to contribute some code to their favorite test runners if
they want to take advantage of this.
> Also, one minor nit: The "sharding" term does not enjoy such widespread
> usage outside of Google. Maybe we could say that this proposal concerns
> "test concurrency", whereas today we support "test program concurrency." Do
> you think it makes sense to adopt more widely used terminology?
I'll leave this question to Eric. Thanks,
--
Zhanyong
Another nice thing about the term "shard" is that it's both a noun (a
machine that gets one piece of the work) and a verb (to distribute
work across multiple machines). If we use something like "test
concurrency", we'll need to figure out a different word to call one
machine in the pool.
Unless someone has a better idea, I'll make the call and stick with
"shard". Thanks,
--
Zhanyong
Was this ever implemented? We have a large test we'd like to parallelize using something like this, by making it modulo-select parts of input depending on shard number.