Hello Ogi,
I'm Kenichi from the CuPy team. Thank you for using CuPy and congratulations on the v0.12 release including GPU support!
As you can see, our primary CI pipeline runs on Jenkins. The master node (Jenkins Web UI) is on AWS EC2, and the worker (a GPU node that runs the tests) is on-premises.
I don't think this repo is useful for downstream projects (it's a bit complicated as we have to test against various CUDA versions and library versions), but in summary, `./run_test.py --test cupy-py3` builds a Docker image containing all dependencies (including the CUDA Toolkit), then builds CuPy and runs its unit tests inside the container to isolate the test environment.
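To give a rough idea of the build-then-test-in-a-container pattern, here is a minimal sketch in Python. It is not how `run_test.py` is actually implemented; the function names, image tag, directory, and test command are all hypothetical, and it assumes a Docker installation where `docker run --gpus all` works (i.e., the NVIDIA Container Toolkit is set up on the worker).

```python
import shlex
import subprocess


def docker_build_cmd(image_tag, context_dir):
    """Command that builds an image from the Dockerfile in context_dir."""
    return ["docker", "build", "-t", image_tag, context_dir]


def docker_run_cmd(image_tag, test_command, gpus="all"):
    """Command that runs test_command in a throwaway container with GPU access."""
    return [
        "docker", "run", "--rm",   # --rm: discard the container after the run
        "--gpus", gpus,            # assumption: NVIDIA Container Toolkit is installed
        image_tag,
        "bash", "-c", test_command,
    ]


def run_isolated_test(image_tag, context_dir, test_command, dry_run=True):
    """Build the image, then run the tests inside it.

    With dry_run=True the commands are only printed, not executed.
    """
    commands = [
        docker_build_cmd(image_tag, context_dir),
        docker_run_cmd(image_tag, test_command),
    ]
    for cmd in commands:
        print("+ " + shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raise if a step fails
    return commands
```

For example, `run_isolated_test("myproj-test:cuda", "./docker", "pip install -e . && pytest tests")` prints the two commands without running them; pass `dry_run=False` on a GPU worker to execute them. Since every run starts from a fresh container, one test run cannot leave state behind for the next.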
Speaking of Jenkins,
Pros are:
- Flexible; you can do almost everything you want
- Plenty of resources and plug-ins, e.g., GitHub integration to trigger tests via a test phrase ("Jenkins, test this please"), or launching worker nodes in the cloud (AWS/GCP/Azure) only when needed (this can reduce cost if tests are not run very frequently)
Cons are:
- Not a SaaS solution; you need to manage the master and worker servers yourself.
- No per-test isolation support; you need to set it up yourself (e.g., with Docker as described above)
Other choices are (although we don't have much experience):
- GitHub Actions with self-hosted GPU runners, as you mentioned. GitHub Actions workflows are often targeted by attackers, but I think the risk can be reduced by configuring the workflow to trigger only on a test phrase.
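As an illustration of test-phrase gating, below is a minimal sketch of the check a webhook handler could perform on a GitHub `issue_comment` event before starting a GPU job. The phrase, the set of trusted associations, and the function name are assumptions for this example; only the payload fields (`action`, `issue.pull_request`, `comment.body`, `comment.author_association`) come from GitHub's webhook payload.

```python
# Hypothetical trigger phrase; pick whatever your project prefers.
TRIGGER_PHRASE = "/test gpu"

# Assumption: only these commenters may start a GPU run.
TRUSTED_ASSOCIATIONS = {"OWNER", "MEMBER", "COLLABORATOR"}


def should_trigger_gpu_test(payload):
    """Return True if an issue_comment webhook payload should start a GPU test."""
    if payload.get("action") != "created":
        return False  # ignore edited or deleted comments

    issue = payload.get("issue") or {}
    if "pull_request" not in issue:
        return False  # the comment is on a plain issue, not a pull request

    comment = payload.get("comment") or {}
    if comment.get("author_association") not in TRUSTED_ASSOCIATIONS:
        return False  # untrusted commenter; do not run their request on the GPU node

    body = comment.get("body") or ""
    return TRIGGER_PHRASE in body.lower()
```

Checking who wrote the comment, and not only what it says, is the important part: otherwise the author of a malicious pull request could simply post the phrase themselves.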
Finally, unfortunately, there's no service offering test infrastructure with CUDA for free, AFAIK. You may need to change the development flow (e.g., use a test phrase to trigger GPU tests instead of running them automatically for all pull requests) to reduce the number of test runs.
This will also reduce the security risk of malicious pull-request attacks like:
Hope this helps!
Thanks,
Kenichi