PSA: give feedback about a new test launcher


Paweł Hajdan, Jr.

Sep 11, 2013, 2:03:16 PM
to chromium-dev
Do you run Chrome unit tests? base_unittests, net_unittests, unit_tests and so on?

Note that this doesn't apply to browser tests like content_browsertests, browser_tests, and interactive_ui_tests. Those will come next and will have a slightly different launcher, although with a similar look & feel.

Please try passing the --brave-new-test-launcher flag to your unit test binary. You should see something like this:

$ out/Debug/base_unittests --brave-new-test-launcher
Starting tests (using 32 parallel jobs)...
IMPORTANT DEBUGGING NOTE: batches of tests are run inside their
own process. For debugging a test inside a debugger, use the
--gtest_filter=<your_test_name> flag along with
--single-process-tests.
[1/1187] AtExitTest.LIFOOrder (0 ms)
[2/1187] AtExitTest.Param (0 ms)
[3/1187] AtExitTest.Task (0 ms)
[4/1187] AtomicOpsTest.Inc (0 ms)
[5/1187] AtomicOpsTest.CompareAndSwap (0 ms)
[6/1187] AtomicOpsTest.Exchange (0 ms)
[7/1187] AtomicOpsTest.IncrementBounds (0 ms)
[8/1187] AtomicOpsTest.Store (0 ms)
[9/1187] AtomicOpsTest.Load (0 ms)
[10/1187] BarrierClosureTest.RunImmediatelyForZeroClosures (0 ms)

If your unit test binary doesn't seem to change behavior when that flag is passed, and you're interested in the new launcher, just let me know and I can get it added.

Note how it runs jobs in parallel. You can control that via the --test-launcher-jobs flag, e.g. --test-launcher-jobs=1.

When individual tests crash or time out, the launcher doesn't exit and runs all remaining tests.

I'd like this launcher to be rock-solid. If you see anything suspicious - a crash, a hang, leftover processes - please let me know. It's a bug.

I'd also like it to be developer-friendly. Please let me know if there is some functionality you'd like to see, some little tweak that would make things easier/better for you, and so on.

Don't hesitate to just say it works well for you - all feedback is very welcome. I intend this to eventually become the default.

Paweł

Scott Graham

Sep 11, 2013, 2:22:30 PM
to Paweł Hajdan, Jr., chromium-dev
Exciting, much faster! 1m04s -> 14s for base_unittests on Windows.

FWIW, it seems to cause TimeTicks.Drift to fail vs. sequential running, but I'm not sure that test is so useful anyway. I also got a crash on my first run that I didn't capture, but unfortunately I wasn't able to repro it on subsequent runs.

 


--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

Marc-Antoine Ruel

Sep 11, 2013, 3:30:10 PM
to Scott Graham, Paweł Hajdan, Jr., chromium-dev
Paweł didn't mention it explicitly, but the main goal is to get rid of run_test_cases.py, so there's one less Python process in the food chain. This is exciting because running tests is on a time-critical path, and Python is not the fastest thing on Windows - or anywhere, really.

Also, this removes the need for special incantations to run tests fast locally once this becomes the default. All those test-case-interference fixes are paying off permanently.

In addition, having parallel execution be native is much better than bolting it on the way run_test_cases.py currently does. In the end, the script tools/sharding_supervisor/sharding_supervisor.py will go away as well.

M-A


2013/9/11 Scott Graham <sco...@chromium.org>

Ilya Sherman

Sep 12, 2013, 12:55:09 AM
to Paweł Hajdan, chromium-dev
Pretty cool -- thanks for working on this, Paweł!

My one big concern is that this seems to swallow all DLOG-type output from test runs, which makes it much harder to debug tests by adding logging.  I've found that when running tests locally, I can pass "--single-process-tests" to restore the old behavior, but I'm worried that I'll have no way to recover logged output from test runs on the build bots.

On Wed, Sep 11, 2013 at 11:03 AM, Paweł Hajdan, Jr. <phajd...@chromium.org> wrote:


Maciej Pawlowski

Sep 12, 2013, 3:33:08 AM
to chromi...@chromium.org
On 2013-09-11 20:03, Paweł Hajdan, Jr. wrote:
>
> Note how it runs jobs in parallel. You can control that via
> --test-launcher-jobs flag, e.g. --test-launcher-jobs=1.
>

I know that many unit tests depend on global variables. How does that
work in conjunction with parallel execution?

--
BR
Maciej Pawlowski
Opera Desktop Wroclaw

Alexander Potapenko

Sep 12, 2013, 8:22:59 AM
to Paweł Hajdan, Jr., chromium-dev
The output of out/Release/base_unittests --brave-new-test-launcher
differs for me on Linux and OS X (both x64).
On Linux it looks exactly like you've posted:

$ out/Release/base_unittests --brave-new-test-launcher
Starting tests (using 32 parallel jobs)...
...
[1/1155] AtomicOpsTest.CompareAndSwap (0 ms)
[2/1155] AtomicOpsTest.Exchange (0 ms)
...

On Mac it splits the list of tests into shards and prints the gtest
output of each shard:
$ time out/Release/base_unittests --brave-new-test-launcher
Starting tests...
...
Note: Google Test filter =
HistogramDeathTest.BadRangesTest:AsyncSocketIoHandlerTest.AsynchronousReadWithMessageLoop:AsyncSocketIoHandlerTest.SynchronousReadWithMessageLoop:AsyncSocketIoHandlerTest.ReadFromCallback:AsyncSocketIoHandlerTest.ReadThenClose:AtExitTest.Basic:AtExitTest.LIFOOrder:AtExitTest.Param:AtExitTest.Task:AtomicOpsTest.Inc
[==========] Running 10 tests from 4 test cases.
...
[ PASSED ] 10 tests.

YOU HAVE 21 DISABLED TESTS

Note: Google Test filter =
AtomicOpsTest.CompareAndSwap:AtomicOpsTest.Exchange:AtomicOpsTest.IncrementBounds:AtomicOpsTest.Store:AtomicOpsTest.Load:BarrierClosureTest.RunImmediatelyForZeroClosures:BarrierClosureTest.RunAfterNumClosures:Base64Test.Basic:BindTest.ArityTest:BindTest.CurryingTest
[==========] Running 10 tests from 4 test cases.
... (and so on)


Moreover, on Mac base_unittests seem to run faster without
--brave-new-test-launcher:

real 4m37.801s
user 0m13.012s
sys 3m55.422s

vs.

real 5m59.098s
user 0m20.268s
sys 4m57.302s



--
Alexander Potapenko
Software Engineer
Google Moscow

Sergey Matveev

Sep 12, 2013, 10:50:44 AM
to Alexander Potapenko, Paweł Hajdan, Jr., chromium-dev
This is great news. When deploying LeakSanitizer, I had to configure the bots to run each test case in a dedicated process. With the new launcher, it looks like we should be able to do that without wasting time on loading and unloading the binary. This should speed things up significantly.

However, when I tried using this launcher with LeakSanitizer, there were some issues: 
- the launcher seems to ignore LSan errors. I have confirmed that LSan is invoked correctly for each subprocess. However, its design is such that it will signal an error with a non-zero exit code *after* all tests have passed. It looks like the launcher isn't designed to handle that situation.
- there are memory leaks in base::LaunchUnitTests :) I will file a separate bug.

Also, we will probably need the equivalent of run_test_cases.py's '--verbose' flag.

Sergey

Jói Sigurðsson

Sep 12, 2013, 11:56:19 AM
to earthdok, Alexander Potapenko, Paweł Hajdan, Jr., chromium-dev
This is very cool. It works well for media_unittests and components_unittests; tested on Mac and Linux, it runs the tests in about a third of the time.

Cheers,
Jói

Paweł Hajdan, Jr.

Sep 12, 2013, 12:50:26 PM
to mpawl...@opera.com, chromium-dev
On Thu, Sep 12, 2013 at 12:33 AM, Maciej Pawlowski <mpawl...@opera.com> wrote:
I know that many unit tests depend on global variables. How does that work in conjunction with parallel execution?

This is easy to answer. It uses multiple processes for parallel execution, not different threads. This also helps with e.g. crashes - otherwise one crashing test would take down the entire binary, which I'd like to avoid.

Note that depending on global resources is still troublesome - I've noticed reports of tests failing when run in parallel, so I'll most probably have to add a serial retry of failed tests, similar to run_test_cases.py's.

I'm still pondering a slightly different approach, which would be to have a special mode that attempts to detect tests that fail when run in parallel. Then we'd have a list of tests that wouldn't even be attempted in parallel mode, and would instead be run serially at the end. This has the advantage of producing a list of tests to be fixed, and hopefully preventing that list from growing further.
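[Editor's note: the process-per-batch design described above can be illustrated with a minimal sketch. This is not the actual launcher code (which lives in C++ in base/), just a toy Python model of the key property: each test runs in its own child process, so a crash in one test cannot take down the run, and the remaining tests still execute. The stand-in "tests" here are hypothetical.]

```python
# Toy model of a crash-resilient parallel test runner: every test is a
# separate child process, run by a small worker pool.
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_test(cmd):
    """Run one test command in its own process; report pass/fail.

    A crash (non-zero or signal exit) or a timeout just marks this
    test failed - the runner itself survives and keeps going.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=30)
        return proc.returncode == 0
    except subprocess.TimeoutExpired:
        return False

# Hypothetical stand-ins for test invocations; a real launcher would
# invoke the test binary with --gtest_filter=<test_name> instead.
tests = [
    [sys.executable, "-c", "print('ok')"],           # passes
    [sys.executable, "-c", "import os; os.abort()"], # crashes hard
]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_test, tests))

print(results)
```

The crashing second test yields a failure result, but the run completes and reports all tests, which is exactly the behavior the launcher promises.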

Paweł 

Paweł Hajdan, Jr.

Sep 12, 2013, 1:01:27 PM
to Ilya Sherman, chromium-dev
Hey Ilya - I very much share your concern. It's not only useful for debugging, either - chromium-build-logs.appspot.com will need it. In fact, its usefulness is slightly reduced right now because run_test_cases.py swallows the output of successful tests on the bots.

For buildbots I intend the launcher to create a special JSON-formatted file with much more detailed information than gtest's XML output or, say, stdout. This would include distinguishing between failures, crashes, and timeouts; the full output of each process plus an output snippet for each test; and metadata about retries (for each test, whether it's a first try or a retry, etc. - otherwise multiple executions of the same test in the same run can be very confusing).
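[Editor's note: to make the idea concrete, here is a sketch of what one entry in such a JSON results file might contain. Every field name below is a guess based on the description above, not the actual format.]

```python
# Build and serialize one hypothetical per-test result record with the
# kinds of fields Paweł describes: status kind, output snippet, and
# retry metadata.
import json

test_result = {
    "test_name": "AtExitTest.LIFOOrder",
    "status": "SUCCESS",         # vs. "FAILURE", "CRASH", "TIMEOUT"
    "elapsed_time_ms": 0,
    "output_snippet": "[       OK ] AtExitTest.LIFOOrder (0 ms)",
    "try_number": 0,             # 0 = first try, >0 = retry
}

serialized = json.dumps(test_result, indent=2)
print(serialized)
```

Keeping crash/timeout status distinct from plain failure, and tagging each record with its try number, is what lets tooling tell a flaky-but-retried test apart from a consistently broken one.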

For local execution I'm thinking about a flag - tentative name --test-launcher-show-test-output or maybe just --test-launcher-verbose (I'd like to have all options start with --test-launcher, but any naming suggestions are welcome).

Paweł

Ilya Sherman

Sep 12, 2013, 6:23:02 PM
to Paweł Hajdan, Jr., chromium-dev
Sounds good!  As long as there's a way to get at all the output, I'm happy :)

Thiago Farina

Sep 12, 2013, 9:25:43 PM
to Paweł Hajdan, Jr., chromium-dev
On Wed, Sep 11, 2013 at 3:03 PM, Paweł Hajdan, Jr.
<phajd...@chromium.org> wrote:
> Do you run Chrome unit tests? base_unittests, net_unittests, unit_tests and
> so on?
>
> Note this doesn't apply to browser tests like content_browsertests,
> browser_tests and interactive_ui_tests. These will come next and will have a
> slightly different launcher, although with a similar look&feel.
>
> Please try passing --brave-new-test-launcher flag to your unit test binary.
> You should see something like this:
>
Nice, I like that the output matches the ninja output (maybe that was
intentional?! hum?) ;)

$ content_unittests --brave-new-test-launcher
Starting tests (using 4 parallel jobs)...
...
[1833/1833] WebRTCAudioDeviceTest.PlayLocalFile (2061 ms)
1833 tests run
Tests took 65 seconds.

--
Thiago

Maciej Pawlowski

Sep 13, 2013, 2:54:33 AM
to chromium-dev
Running them in processes instead of threads indeed explains how globals are handled. However, some tests may use files on the filesystem or some other cross-process resource, which might explain why some tests still fail. I agree with the idea of keeping a list of such broken tests; I think it's nicer than serial retries. Retries hide the problem instead of solving it.
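[Editor's note: the usual fix for the filesystem case mentioned above is to give each test its own unique scratch directory instead of a hard-coded shared path; Chromium's base::ScopedTempDir serves this purpose in C++. Here is a minimal Python sketch of the same idea.]

```python
# Each test gets a freshly created, uniquely named temp directory, so
# two tests running in parallel can never collide on file paths. The
# directory is deleted when the test finishes.
import os
import shutil
import tempfile

class ScopedTempDir:
    """Create a unique directory on entry; delete it on exit."""

    def __enter__(self):
        self.path = tempfile.mkdtemp(prefix="test_")
        return self.path

    def __exit__(self, *exc):
        shutil.rmtree(self.path, ignore_errors=True)

with ScopedTempDir() as d1, ScopedTempDir() as d2:
    # Simulate two parallel tests: their scratch paths always differ.
    assert d1 != d2
    with open(os.path.join(d1, "data.txt"), "w") as f:
        f.write("hello")
```

A test written this way is safe under parallel execution regardless of how many copies run at once, which is exactly the property the list-of-broken-tests approach is trying to enforce.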

Marc-Antoine Ruel

Sep 13, 2013, 11:23:22 AM
to mpawl...@opera.com, chromium-dev
2013/9/13 Maciej Pawlowski <mpawl...@opera.com>
Using shared global resources exclusively outside of interactive_ui_tests is wrong, and I've fixed the vast majority of these cases (or filed bug reports to have the test cases fixed). Note that what Paweł is doing is not new behavior; it's replacing one tool (run_test_cases.py) with another (in C++).

Parallel testing has been in use for years already. See my previous emails on this mailing list for more details.

M-A

Paweł Hajdan, Jr.

Sep 13, 2013, 6:57:22 PM
to Alexander Potapenko, chromium-dev
On Thu, Sep 12, 2013 at 5:22 AM, Alexander Potapenko <gli...@chromium.org> wrote:
The output of out/Release/base_unittests --brave-new-test-launcher
differs for me on Linux and OS X (both x64).
On Linux it looks exactly like you've posted:

$ out/Release/base_unittests --brave-new-test-launcher
Starting tests (using 32 parallel jobs)...
...
[1/1155] AtomicOpsTest.CompareAndSwap (0 ms)
[2/1155] AtomicOpsTest.Exchange (0 ms)
...

On Mac it splits the list of tests into shards and prints the gtest
output of each shard:

Please make sure to sync to ToT. I have a success report that it works there as expected on Mac.

You're probably just running a slightly earlier version of the code.
 
Moreover, on Mac base_unittests seem to run faster without
--brave-new-test-launcher:

real 4m37.801s
user 0m13.012s
sys 3m55.422s

vs.

real 5m59.098s
user 0m20.268s
sys 4m57.302s

In the report I got, base_unittests took 28 seconds in Debug mode. Are you running this with some memory tool? Given that we launch more processes, this would increase the overhead of instrumenting each process. It could be worthwhile to make some adjustments for the memory tools - ideas and feedback welcome. Again, it probably helps to take Python out of the picture, and if the memory tool works at compile time, like ASan, we can easily (and transparently) take that into account in the launcher.

Paweł