Migration from Travis to GitHub Actions is now complete-ish

Oscar Benjamin

Dec 11, 2020, 7:14:37 PM
to sympy
Hi all,

We have just completed a migration from using Travis as our CI system
to using GitHub Actions. This was discussed in several issues and pull
requests, but the main motivation is here:
https://github.com/sympy/sympy/issues/20374

I don't know how long SymPy has been using Travis, but they have given
us a fantastic service (free of charge) for many years. We should
definitely be thankful to them for the huge expense that they have
borne in providing the CPU cycles that we have depended on for testing
the correctness of any proposed changes to the main SymPy codebase.
Travis' model for providing services to open source projects has
changed, though, and it does not look like the new version of their
service would be usable for SymPy.

Although the main changes in Travis' service were due to kick in at
the end of 2020, it seems that they began winding down their provision
of the old service in advance of that, which meant that we needed to
switch ASAP. Slowdowns in CI contributed to the delay of the 1.7
release and then made it difficult to keep contributions to SymPy
ticking over after the release.

The situation became urgent and there wasn't much time to discuss
possible alternatives to Travis, but GitHub Actions seemed an obvious
choice, so in the little time I had I built a new CI config for Actions
here:
https://github.com/sympy/sympy/blob/8b2b7e4c616677e054d01e997ab940b3150aa89d/.github/workflows/runtests.yml

Today I have disabled Travis from running on pull requests (it will
still run on the master branch after a PR is merged). I have also made
the Actions jobs "required" so that a PR cannot be merged unless it
passes the tests on GitHub Actions. That mostly completes the
migration to Actions but I'm sure that there will be more teething
problems or things that I've missed.

For a while now it may be necessary to close and reopen PRs when
reviewing to make sure they run under the new CI. Any PR that
previously failed on Travis will still show with a red "fail" cross
even if subsequent changes have fixed any errors (Travis will not run
again after changes now). The PR will be mergeable if the tests have
passed under Actions but it will still show as having "failed" in
Travis. This also applies to any of the most recently pushed PRs for
which I cancelled the Travis build (GitHub shows a cancelled Travis
build as having "failed").

In the short term when reviewing a PR:
1) Close and open to rerun the tests under the new CI
2) Ignore any report of failure from Travis
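
As a rough illustration of the second point, the check a reviewer is doing by eye could be sketched as below. The `(context, state)` pairs, the check names and the "travis" substring test are assumptions made for the example; they are not taken from SymPy's tooling or from the GitHub API.

```python
# Hypothetical sketch: decide whether a PR's checks are green while
# ignoring any stale Travis result. All check names here are made up.

def is_mergeable(statuses):
    """Return True if every non-Travis check has succeeded.

    `statuses` is a list of (context, state) pairs, for example
    ("continuous-integration/travis-ci/pr", "failure").
    """
    relevant = [
        (context, state)
        for context, state in statuses
        # Assumption: Travis checks can be recognised by their name.
        if "travis" not in context.lower()
    ]
    # No remaining checks means the new CI has not run on this PR yet,
    # which is the case where closing and reopening the PR is needed.
    if not relevant:
        return False
    return all(state == "success" for _, state in relevant)


statuses = [
    ("continuous-integration/travis-ci/pr", "failure"),  # stale, ignored
    ("runtests / code-quality", "success"),
    ("runtests / tests", "success"),
]
print(is_mergeable(statuses))
```

A PR whose only reported status is the old Travis failure comes back as not mergeable here, which corresponds to step 1: the tests still need to be rerun under Actions.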

Also could reviewers please pay careful attention to Actions and the
output of the different test jobs for a while?

It is very likely that I have overlooked something in the migration so
that the codebase is not as rigorously tested as it was before and
some things that should fail might pass. This is why changes to CI are
risky and need careful review. There hasn't been as much time as I
would like to test out the new CI in parallel with the old.

Finally having worked with both the Travis and Actions CI systems I
can say that I think this is an improvement in the long term. The
config format for Actions is significantly better (I hated editing the
.travis.yml and test_travis.sh files!) but also right now Actions are
giving us much more computing power than Travis ever did. We should
still focus on reducing the time taken to run the tests but it's good
that we now have a system that has more capacity to run our extremely
slow test suite.

Oscar

Aaron Meurer

Dec 11, 2020, 8:03:38 PM
to sympy
Thank you so much for getting this working, Oscar. The effective
removal of the free tier by Travis was unexpected, so we had
relatively little time to prepare and move to another system.
Indeed, if other CI systems seem better at some point in the future,
we can investigate them (especially if someone is willing to write the
build configuration file for it).

>
> Today I have disabled Travis from running on pull requests (it will
> still run on the master branch after a PR is merged). I have also made
> the Actions jobs "required" so that a PR cannot be merged unless it
> passes the tests on GitHub Actions. That mostly completes the
> migration to Actions but I'm sure that there will be more teething
> problems or things that I've missed.
>
> For a while now it may be necessary to close and reopen PRs when
> reviewing to make sure they run under the new CI. Any PR that
> previously failed on Travis will still show with a red "fail" cross
> even if subsequent changes have fixed any errors (Travis will not run
> again after changes now). The PR will be mergeable if the tests have
> passed under Actions but it will still show as having "failed" in
> Travis. This also applies to any of the most recently pushed PRs for
> which I cancelled the Travis build (GitHub shows a cancelled Travis
> build as having "failed").

I would also add that if a PR is obviously in need of changes, then
it's not necessary to restart the tests, since any new commits that
are pushed will automatically start the tests again anyway.

>
> In the short term when reviewing a PR:
> 1) Close and open to rerun the tests under the new CI
> 2) Ignore any report of failure from Travis
>
> Also could reviewers please pay careful attention to Actions and the
> output of the different test jobs for a while?
>
> It is very likely that I have overlooked something in the migration so
> that the codebase is not as rigorously tested as it was before and
> some things that should fail might pass. This is why changes to CI are
> risky and need careful review. There hasn't been as much time as I
> would like to test out the new CI in parallel with the old.
>
> Finally having worked with both the Travis and Actions CI systems I
> can say that I think this is an improvement in the long term. The
> config format for Actions is significantly better (I hated editing the
> .travis.yml and test_travis.sh files!) but also right now Actions are
> giving us much more computing power than Travis ever did. We should
> still focus on reducing the time taken to run the tests but it's good
> that we now have a system that has more capacity to run our extremely
> slow test suite.

One downside to Actions compared to Travis that people should be aware
of is that the logs for Actions builds are removed after 90 days.
Thus, you shouldn't link to a GitHub Actions log in an issue. If you
need to reference a build log, you should copy the text of the log
into the relevant issue, or into a gist if it is long.


Aaron Meurer


Jason Moore

Dec 12, 2020, 4:30:46 AM
to sympy
Thank you all for doing this. We're all seeing a future of tons of work across thousands of repositories to migrate from Travis. SymPy is a nice early case that we can look to and mimic. These things are always quite painful and we appreciate Oscar (and others who likely helped) for bearing that burden.

One question: what was the solution for pushing the docs via doctr?

Jason

Oscar Benjamin

Dec 12, 2020, 8:19:00 AM
to sympy
We haven't migrated the doctr part. I assume that for now at least it
will still work from Travis since Travis is still running on the
master branch (the master build is what pushes the docs). I think
Aaron has done some work in migrating doctr to Actions as well.

That's a good point, though. I should have mentioned that some things
are missing from Actions compared to what we had on Travis:

1. There are no tests under pypy.
2. We don't have coverage measurement any more.
3. The Tensorflow 1.x tests don't run any more.
4. The benchmarks aren't run on Actions.

Actions is perfectly capable of doing each of these things. It's just
a case of someone taking the time to add them to the CI config and
test them out to make sure that they are working. It's also worth
considering how useful these things are since we burn a lot of CPU
time on each of them for every push to every PR:

I haven't seen anything useful from the pypy tests for a long time.

Coverage measurement is good but the implementation we had with
codecov was not so good and regularly reported meaningless coverage
changes. We should try to get something better if we are going to
bother running coverage tests.

I don't know if the Tensorflow 1.x tests are needed but I assume
they're not super important. We still have tests for Tensorflow 2.3.

The benchmarks were useful but the benchmarks job didn't actually fail
the build if there was a slowdown. It was useful to be able to look at
the results but you had to go out of your way to see them and I think
that most contributors didn't know they were there. If we can make
them more noticeable that would be good.
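
To make the idea of a benchmarks job that actually fails on a slowdown concrete, here is a minimal sketch. The benchmark names, the timing dictionaries and the 1.5x threshold are all hypothetical; the old Travis job did not work this way, and a real job would take its timings from whatever benchmark runner is in use.

```python
# Hypothetical sketch of a benchmark check that can fail a CI job.

def find_slowdowns(baseline, current, threshold=1.5):
    """Return {name: ratio} for benchmarks slower than `threshold` times baseline.

    `baseline` and `current` map benchmark names to times in seconds.
    Benchmarks missing from `current` (or with a non-positive baseline)
    are skipped rather than reported.
    """
    slowdowns = {}
    for name, old_time in baseline.items():
        new_time = current.get(name)
        if new_time is None or old_time <= 0:
            continue
        ratio = new_time / old_time
        if ratio > threshold:
            slowdowns[name] = ratio
    return slowdowns


def main(baseline, current):
    """Print any slowdowns and return a process exit status."""
    slowdowns = find_slowdowns(baseline, current)
    for name, ratio in sorted(slowdowns.items()):
        print(f"SLOWDOWN {name}: {ratio:.2f}x the time on master")
    # A nonzero exit status is what makes a CI job show as failed.
    return 1 if slowdowns else 0


# Made-up timings purely for illustration.
master = {"integrate": 2.0, "solve": 1.0}
pr = {"integrate": 2.1, "solve": 3.0}
exit_code = main(master, pr)
print("exit status:", exit_code)
```

In a real job the script would end by passing that status to `sys.exit()`, so that a slowdown shows up as a red cross on the PR rather than in a log that nobody opens. Timings on shared CI machines are noisy, so a threshold like this is probably only meaningful on consistent hardware.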

Actions allows self-hosted runners so if we had a dedicated benchmarks
machine we could use that to run them on consistent hardware. Perhaps
with that we could run benchmarks for each PR and also for master and
have the results for master available online somewhere.

Oscar