[Proposal] Allow ExUnit to run asynchronously by test cases (now it's by modules)


Yiming Chen

Aug 31, 2023, 10:57:31 AM
to elixir-lang-core
Currently, ExUnit's `async: true` option still runs the test cases within a module serially;
only the module as a whole runs concurrently with other `async: true` modules.

I propose adding an option for ExUnit to run individual test cases asynchronously.

# Background
1. Async by module was a surprise

My initial understanding of `async: true` was async by cases, not by modules. It's a bit surprising that the behavior is the latter.

2. Async by module would behave more like synchronous tests as a module gets more test cases

As we grow our libraries/apps, a test module will have more and more test cases.
It's tedious to break them into separate modules to speed up the test suite run.
And breaking them into modules has the cost of making related tests further from each other.
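
A rough way to see this effect is a small scheduling estimate. The sketch below is not ExUnit code: the `AsyncEstimate` module, its function names, and the durations are all made up for illustration. It greedily hands each unit of work to the least-loaded worker and reports the estimated wall time, ignoring any scheduling overhead.

```elixir
defmodule AsyncEstimate do
  # Hypothetical helper, not part of ExUnit: estimates wall-clock time when
  # `workers` run work units concurrently, each unit going to the worker
  # that is currently least loaded (greedy list scheduling).
  def makespan(durations, workers) when workers > 0 do
    initial = List.duplicate(0, workers)

    durations
    |> Enum.reduce(initial, fn duration, loads ->
      [least | rest] = Enum.sort(loads)
      [least + duration | rest]
    end)
    |> Enum.max()
  end

  # Async by module: a whole module is one unit of work.
  def per_module(modules, workers) do
    modules |> Enum.map(&Enum.sum/1) |> makespan(workers)
  end

  # Async by test case: every test is its own unit of work.
  def per_test(modules, workers) do
    modules |> List.flatten() |> makespan(workers)
  end
end

# Made-up durations in milliseconds: one large module and two small ones.
modules = [[100, 100, 100, 100, 100, 100], [100], [100]]

IO.inspect(AsyncEstimate.per_module(modules, 4), label: "per module")
IO.inspect(AsyncEstimate.per_test(modules, 4), label: "per test")
```

With these made-up numbers and 4 workers, the estimate is 600 ms when scheduling by module and 200 ms when scheduling by test case, because the large module pins one worker while the others sit idle.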

# Benefits
1. Speed up test suite runs for libraries and apps almost effortlessly
2. More accurate `async: xxx seconds, sync: yyy seconds` metrics

# Caveats

1. Async by test cases may not run faster than async by modules:
    - managing these test cases has a cost on its own
    - communicating these test cases between ExUnit Server and Runner has costs as well
2. Backward compatibility with the current `async: true` behavior:

    Some libs or apps may rely on the async-by-module behavior.
    We should keep `async: true` behaving as it does today by default,
    and make async by test cases an easy opt-in feature.

3. Async by test cases may complicate the ExUnit implementation even further

# Potential solution

I looked into the current ExUnit implementation a little bit.
I think `async by test cases` is doable, but I don't have a concrete solution yet.

An initial idea is to:
1. instead of saving modules in ExUnit Server, we save test cases (mfa) in ExUnit Server
2. when Runner asks for more async tests, ExUnit Server returns test cases (and also modules) for Runner

This seems to be a huge change,
so I'd like to know if this feature is desirable/feasible from the core team's PoV before I dig more into it.


José Valim

Sep 3, 2023, 10:48:11 AM
to elixir-l...@googlegroups.com
Yes, it is desirable and it has come up in the past: https://github.com/elixir-lang/elixir/pull/11949#issuecomment-1177262901

Although I think `async: :per_module` is what most people want, since the tests in the same module tend to access the same resource, an opt-in for running per test would be welcome.


Yiming Chen

Sep 4, 2023, 8:53:01 AM
to elixir-lang-core
Thanks for confirming and referencing the past discussion!

100% agree that `async: :per_test | :per_module | true | false` is the way to go 👍
Most people can then switch to it with a global search & replace.
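
To make the four values concrete, here is a minimal sketch of how they might be normalized internally. `AsyncOption` is a hypothetical module that does not exist in ExUnit, and mapping `true` to `:per_module` and `false` to `:sync` is an assumption that simply keeps today's behavior as the default.

```elixir
defmodule AsyncOption do
  # Hypothetical normalizer for the proposed option values; ExUnit does not
  # define this module. `true` keeps today's behavior (async by module).
  def normalize(true), do: :per_module
  def normalize(:per_module), do: :per_module
  def normalize(:per_test), do: :per_test
  def normalize(false), do: :sync

  def normalize(other) do
    raise ArgumentError,
          "expected async: to be true, false, :per_module or :per_test, " <>
            "got: #{inspect(other)}"
  end
end

IO.inspect(Enum.map([true, false, :per_module, :per_test], &AsyncOption.normalize/1))
```

Under this assumption, a module would opt in by writing `use ExUnit.Case, async: :per_test`, which is the proposed spelling and not something ExUnit accepts as of this thread.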

I looked into the current ExUnit implementation and came up with several design directions.

## Current implementation:
  1. register test modules
    1. ExUnit.Case calls ExUnit.Server.add_sync_module in after_compile
    2. ExUnit.Server.add_async_module
  2. mix test
    1. Mix.Tasks.Test
      match test files
    2. Mix.Compilers.Test.require_and_run/4
      require and run matched test files
  3. run tests
    1. ExUnit.run/0
    2. ExUnit.Runner.run/2
    3. ExUnit.Runner.async_loop/3
    4. ExUnit.Server.take_async_modules/1
    5. ExUnit.Runner.spawn_modules/2
    6. spawn_monitor -> ExUnit.Runner.run_module/2
    7. ExUnit.EventManager
      1. module_started
      2. test_started
      3. test_finished
      4. module_finished
    8. ExUnit.Runner.run_module/3
      1. run_setup_all
      2. spawn_monitor -> module.__ex_unit__(:setup_all, test_module.tags)
    9. ExUnit.Runner.run_tests/3
    10. ExUnit.Runner.run_test/3
    11. ExUnit.Runner.spawn_test/3
      1. spawn_test_monitor/4
      2. receive_test_reply/4
      3. exec_on_exit/3
## Possible design directions:
  1. registration
    1. save async test cases separately from async modules
    2. save async_per_test modules & async_per_module modules
  2. taking
    1. return async test cases from ExUnit.Server.take_async_per_test_cases/1
    2. return async_per_test modules from ExUnit.Server.take_async_modules(:per_test, count)
      return async_per_module modules from ExUnit.Server.take_async_modules(:per_module, count)
    3. return both modules from ExUnit.Server.take_async_modules/1
  3. running
    1. spawn_tests/3 before spawn_modules/2
    2. spawn_tests/3 inside spawn_modules/2 if it's a per_test async module

I'd prefer 1.2 + 2.2 + 3.1:

  • save async_per_test modules & async_per_module modules
  • return async_per_test modules from ExUnit.Server.take_async_modules(:per_test, count)
  • spawn_tests/3 before spawn_modules/2
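
As a rough illustration of the registration and taking parts (1.2 + 2.2), here is a sketch of the state the server could keep. `AsyncQueues` is a hypothetical stand-in, not the real ExUnit.Server, and the module names passed to it are made up.

```elixir
defmodule AsyncQueues do
  # Hypothetical stand-in for the state ExUnit.Server would keep under
  # options 1.2 + 2.2: one list of modules per async mode.
  def new, do: %{per_test: [], per_module: []}

  # Registration (1.2): record a module under its async mode.
  def add(state, mode, module) when mode in [:per_test, :per_module] do
    Map.update!(state, mode, &(&1 ++ [module]))
  end

  # Taking (2.2): hand out up to `count` modules of the given mode and
  # return them together with the remaining state.
  def take(state, mode, count) when mode in [:per_test, :per_module] do
    {taken, rest} = state |> Map.fetch!(mode) |> Enum.split(count)
    {taken, Map.put(state, mode, rest)}
  end
end

# FastTest, RepoTest and ParserTest are made-up module names.
state =
  AsyncQueues.new()
  |> AsyncQueues.add(:per_test, FastTest)
  |> AsyncQueues.add(:per_module, RepoTest)
  |> AsyncQueues.add(:per_test, ParserTest)

{taken, state} = AsyncQueues.take(state, :per_test, 1)
IO.inspect(taken, label: "taken")
IO.inspect(state.per_test, label: "remaining per_test")
```

A runner following 3.1 would presumably drain the `:per_test` list first and only then ask for `:per_module` modules.
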
What do you think?


José Valim

Sep 4, 2023, 9:08:47 AM
to elixir-l...@googlegroups.com
The big question is which properties we want to exhibit.

For example, do we want async tests and async modules to run at the same time? Or even async tests from different modules together? If the answer is no, then we can potentially leave parallelism on the table, because we may be waiting for one test to finish in one module while N other tests from another module could run.

If we say yes, the downside is that we could potentially use too many resources (not CPU wise but stuff like database connections), as in the worst case scenario we will run M * T processes at once (M = Modules, T = Tests). We could try to introduce coordination between M * T to adhere to a limit of max_cases, but that will likely be too complex.
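
For illustration only, the simplest form of such a limit is a single pool shared by all tests. The sketch below is not ExUnit's runner: `BoundedRun` is a hypothetical module, and each "test" is just a zero-arity function. It deliberately ignores `setup_all`, per-module processes, and module lifecycle events, which is where the real coordination cost would be.

```elixir
defmodule BoundedRun do
  # Hypothetical sketch, not ExUnit's runner: flattens {module, tests} pairs
  # into one stream of test functions and runs at most `max_cases` at once.
  def run(modules, max_cases) do
    modules
    |> Enum.flat_map(fn {module, tests} ->
      Enum.map(tests, fn {name, fun} -> {module, name, fun} end)
    end)
    |> Task.async_stream(
      fn {module, name, fun} -> {module, name, fun.()} end,
      max_concurrency: max_cases,
      ordered: false,
      timeout: :infinity
    )
    |> Enum.map(fn {:ok, result} -> result end)
  end
end

# Made-up modules and tests; each "test" is just a zero-arity function.
modules = [
  {AlphaTest, [one: fn -> :ok end, two: fn -> :ok end]},
  {BetaTest, [three: fn -> :ok end]}
]

modules |> BoundedRun.run(2) |> Enum.sort() |> IO.inspect()
```

Here `max_concurrency` caps the number of running tests regardless of how many modules they come from, so the M * T worst case never materializes in this simplified model.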

Yiming Chen

Sep 4, 2023, 12:12:37 PM
to elixir-lang-core
I think the answer is clearly no for now.

1. Being able to run test cases in parallel is a big win already.
I assume most resource-light code bases (like stateless libraries) can just switch to `async: :per_test` for all their tests, without worrying about resource utilization.

2. We can always add more parallelism later (after adding `async: :per_test` support this time),
with further optimizations such as a `max_cases` limit.

So, I assume `3.1 spawn_tests/3 before spawn_modules/2` is the way to go now?
What do you think?

José Valim

Sep 4, 2023, 1:21:01 PM
to elixir-l...@googlegroups.com
If we want to add more parallelism later, then it is worth discussing future developments of the API now, so we don't put ourselves into a corner. If the answer is no, it is clear we are only parallelizing tests or parallelizing modules, but never both. And if we want to do both in the future, we should maybe consider the API now.

Yiming Chen

Oct 1, 2023, 7:46:24 AM
to elixir-lang-core
Sorry for the late response.
I'll look into the API options again and post the results here.

Michal Śledź

Jan 4, 2024, 2:07:31 PM
to elixir-lang-core
That would be awesome for tests with a lot of `refute_receive` assertions  

Yiming Chen

Jan 24, 2024, 9:33:35 PM
to elixir-lang-core
Before proposing more API options, let me first try to hack together a working version of case-level async ExUnit,
so that I can benchmark the performance difference between the case-level and module-level async runners.
We can then decide whether it's worth pursuing case-level async further.


Yiming Chen

Jan 25, 2024, 7:28:07 AM
to elixir-lang-core
I got a PoC working and opened a PR for it:

But the benchmark results against real-world projects were not promising.
I'm not sure whether we should continue increasing the granularity of ExUnit async.