On 11/18/20 6:33 PM, David A. Wheeler wrote:
> Unfortunately Travis has become increasingly unreliable... I propose switching to something else. The “obvious” answer is GitHub actions.
On Nov 19, 2020, at 1:47 AM, Jim Kingdon <
kin...@panix.com> wrote:
>> All sounds good, thanks for looking into this.
>> I'd say that running everything on every pull request would be my preference, and I don't think it takes that long to run (at least, it hasn't on travis). Nightly is an option if we have to, but if the nightly run fails there is a debugging step of figuring out which of the pull requests merged that day caused the problem.
Good news, the switch from Travis CI to GitHub Actions is implemented in this pull request:
https://github.com/metamath/set.mm/pull/1879
I’m waiting for Norm to approve it & merge it; I don’t know any reason he’d object.
This change implements all the same checks as the Travis CI pipeline (it’s basically the same code).
I did the initial conversion; Mario then sped up the smm3 (Rust) process by adding a cache.
Just like the Travis version, all verifiers run in parallel. A sample run’s timing was:
* 47 seconds (checkmm, C++)
* 69 seconds (metamath.exe, C, does extra checks)
* 14 seconds (smm3, Rust *when cached*). Note: it's 111 seconds without a cache
* 102 seconds (mmj2, Java, does extra checks)
* 89 seconds (mmverify.py, Python,
set.mm body)
* 73 seconds (mmverify.py, Python,
set.mm mathboxes)
* 16 seconds (mmverify.py, Python,
iset.mm)
Since they’re parallel, verification by *all* them is the same as the slowest one, plus sometimes a few seconds to wait for spare CPUs. So it takes less than 2 minutes to complete.
A few comments about these numbers are probably in order. The Rust one is cached in these numbers (14 seconds). Without caching Rust is the *slowest* because of compilation time; one sample was 111 seconds per <
https://github.com/metamath/set.mm/runs/1428848217> We can shave off 30 seconds in metamath.exe by caching its compiled result, and 7 seconds by checkmm by caching its compiled result. We’ll probably do that at some point, though there’s no rush.
I agree with Jim Kingdom & think there's no need to move anything to a "nightly" run:
* These all run in parallel. The longest one (mmj2) has a sample time of 1 minute 42 seconds. I think that's fine to keep running all in parallel, we aren't in *that* much of a rush to merge things. I don't think there's a problem if it takes less than 2 minutes to run all the checks.
* The longest-running one (mmj2) does added checks (for our definition conventions) that most other verifiers don't do. So it's not true that omitting mmj2 will have identical results.
* This doesn't really impact the environment. We're talking seconds, and a lot of this is waiting (e.g., for checkouts & downloads). It'd be impractical to try to measure the "extra" energy used. Adding the two caches would be a nice touch & reduce CPU time some, and that's easy to do, so we should do that eventually. But I don't see a reason to go beyond that.
That said, if we *do* want to move to nightly checks, we’re now better positioned to do it.
--- David A. Wheeler