Enabling precommit.ci to fix common PR problems automatically

135 views
Skip to first unread message

Oscar Benjamin

unread,
Mar 28, 2023, 9:41:44 AM3/28/23
to sympy
Hi all,

There are two open PRs discussing the potential use of pre-commit and
pre-commit.ci in SymPy:

https://github.com/sympy/sympy/pull/24908
https://github.com/sympy/sympy/pull/24941

I want to know what others think specifically about enabling
pre-commit.ci on the sympy repo or otherwise encouraging use of
pre-commit for contributors.

I'll explain below but you can also read about pre-commit and
pre-commit.ci here:

https://pre-commit.com/
https://pre-commit.ci/

The pre-commit tool is something that can be installed locally and can
be used directly or as a git hook so that when making a git commit
some quick checks can run on the code. The PR gh-24908 would add some
configuration for this so that a contributor can either run some
checks before committing or can install pre-commit as a git hook so
that git commit automatically runs the checks. The configuration in
gh-24908 means that pre-commit runs flake8 and ruff but specifically
only on the files that are being changed in the commit which is
convenient because it is much faster than checking the whole codebase.

To be clear adding the pre-commit config to the sympy repo does not
make it mandatory for all contributors to use the git hook. However it
could be something that is "recommended" as it will quickly show up
some common problems that would otherwise fail the checks in CI after
opening a PR or after pushing to a PR.

What is also discussed in those PRs is adding pre-commit.ci to the
sympy repo which is something different from just adding a pre-commit
configuration that contributors can choose to use or not. The
difference is that pre-commit.ci is a GitHub bot that will run the
pre-commit hooks on all pull requests and can often fix the problems
automatically by pushing a new commit to the PR.

Currently if someone pushes a PR that has simple problems like
trailing whitespace or unnecessary imports then the flake8 or quality
checks in CI will report an error asking the contributor to fix those
problems. With pre-commit.ci we could make it so that those problems
are just fixed automatically without the contributor needing to do
anything.

Both trailing whitespace and unnecessary imports are automatically
fixable e.g. there is already a bin/strip_whitespace script and ruff
can fix the imports with:

ruff check --select F401 --fix sympy

Obviously other things could be fixed automatically but these are the
two that I see most often where someone pushes and then needs to push
a followup fixing commit after seeing CI checks fail. If precommit.ci
was used there would be no need to push a follow up commit because the
bot would just do it automatically.

On the other hand if someone uses the pre-commit hook locally then
that could fix these things automatically before pushing and there
wouldn't be any need for the bot to fix them in CI. The advantage of
the CI bot would be that it could apply simple fixes for someone who
does not use the git hook and didn't check pre-commit before pushing.

To be clear there would not be any requirement for any individual
contributor to use pre-commit locally. However if pre-commit.ci runs
on PRs then that is obviously not optional and there would be a bot
pushing fix commits to PRs.

Does anyone have any thoughts on enabling pre-commit.ci or otherwise
encouraging contributors to use pre-commit?

--
Oscar

Jason Moore

unread,
Mar 28, 2023, 11:50:25 AM3/28/23
to sy...@googlegroups.com
I personally would find a bot adding commits to my work a bit intrusive. If the bot posted a comment to the issue telling me what to fix, that would be preferable. Right now we have to make a few clicks to see why the linter failed.

Conda forge has a bot that will add commits to your branch, but only if you explicitly ask it to. If we had some bot commands like '@sympy/bot please fix flake8 issues' then that would run the fix and add the commit, but it is the author's choice to do so.

Jason

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAHVvXxSmJ2aGsiHf6%2BNFLaBL0xkUeH0_CJ86uqQR-py6uCZnxg%40mail.gmail.com.

Tirthankar Mazumder

unread,
Mar 28, 2023, 12:12:59 PM3/28/23
to sympy
Adding a bit to this, the Rust repository does something similar to the pre-commit hook thing: When you set up your dev environment, you can optionally run ./x.py setup which optionally installs a pre-commit hook (with the user's permission, of course) for running tidy, their internal tool for ensuring code quality.

Oscar Benjamin

unread,
Mar 28, 2023, 1:09:39 PM3/28/23
to sy...@googlegroups.com
On Tue, 28 Mar 2023 at 17:13, Tirthankar Mazumder
<greenw...@gmail.com> wrote:
>
> Adding a bit to this, the Rust repository does something similar to the pre-commit hook thing: When you set up your dev environment, you can optionally run ./x.py setup which optionally installs a pre-commit hook (with the user's permission, of course) for running tidy, their internal tool for ensuring code quality.

Yes, lots of different repos do something similar. I don't like the
idea of actually using a git hook myself that prevents git commit from
working if the checks don't pass. Being able to manually run some
quick checks is useful though and providing that in a preconfigured
way makes it easier for new contributors to get started.

With https://github.com/sympy/sympy/pull/24908 you can pip install
pre-commit and then any time you want to run the precommit checks you
just run:

$ pre-commit
flake8...................................................................Passed
ruff.....................................................................Passed

Here that runs in under a second. A few other checks could also be
added without slowing it down too much.

> On Tuesday, March 28, 2023 at 9:20:25 PM UTC+5:30 moore...@gmail.com wrote:
>>
>> I personally would find a bot adding commits to my work a bit intrusive. If the bot posted a comment to the issue telling me what to fix, that would be preferable. Right now we have to make a few clicks to see why the linter failed.
>>
>> Conda forge has a bot that will add commits to your branch, but only if you explicitly ask it to. If we had some bot commands like '@sympy/bot please fix flake8 issues' then that would run the fix and add the commit, but it is the author's choice to do so.

I don't think anything exists that works like that. Of course if the
fixes can be applied automatically then you could just run pre-commit
to apply them locally.

--
Oscar

Aaron Meurer

unread,
Mar 28, 2023, 5:53:22 PM3/28/23
to sy...@googlegroups.com
That's what I was worried about too. If the bot pushes a commit, then the PR author won't be able to push any additional commits unless they either pull first or force push. Personally I would find that a little surprising, and might not even notice it when I do "git push". Plus I feel like this would push people into the bad habit of always force pushing.

Aaron Meurer

Oscar Benjamin

unread,
Mar 31, 2023, 10:20:57 AM3/31/23
to sy...@googlegroups.com
Okay, so it seems people are not keen on integrating the pre-commit.ci bot.

We can of course continue to make a pre-commit config that any
contributor wants to use.

Another question I have is what people think about using dependabot.
We regularly have problems where new versions of SymPy's dependencies
cause breakage in CI. This can happen because of things like:

- New Sphinx versions. Basically every significant release of Sphinx
seems to break some part of building SymPy's docs.
- New numpy/scipy etc versions. It is common that these will introduce
things like deprecation warnings and also various things will break
because of the removal of functions etc.
- New versions of linters like flake8 and its plugins or mypy.

Whenever this happens and CI gets broken it is both disruptive and
confusing for any contributors whose PRs will fail CI checks because
of problems that are unrelated to the changes they have made.

A common way to prevent CI from breaking is to pin dependencies of the
things that are used in CI but the difficulty with that is the need to
keep the pinned versions updated as new releases are made. A widely
used solution for this is dependabot which is a bot that can
automatically open PRs to update pinned versions. Depending on how
many versions are pinned and how often the pinned things make new
releases there would be some number of PRs from dependabot updating
versions one dependency at a time (dependabot does not have a way to
do several dependencies in a single PR).

The advantage of using dependabot is that if a new version of some
dependency causes CI to break then it will only break in dependabot's
PR that tries to bring in the update. Otherwise we can keep things
updated by updating the versions only when CI passes.

The dependabot update PRs can be annoying but at the same time they
are very simple PRs that are easy to review and merge if CI passes.
When CI fails they would often also be easy problems to fix and would
often make suitable "easy to fix" issues (we can afford to do this if
there is no need to rush in a fix because of broken CI).

Does anyone have any views on enabling dependabot on the SymPy repo
(or some similar alternative)?

--
Oscar
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAKgW%3D6%2BUYTFZWf%3DHJJZPx0UukKx1PgnTtY8%3DcK7MwJOTOmD1Tw%40mail.gmail.com.

Chris Smith

unread,
Mar 31, 2023, 11:24:50 AM3/31/23
to sympy
I like the idea of the following

With https://github.com/sympy/sympy/pull/24908 you can pip install
pre-commit and then any time you want to run the precommit checks you
just run:

$ pre-commit
flake8...................................................................Passed
ruff.....................................................................Passed

I especially like that they will only check changed files.

/c

Aaron Meurer

unread,
Mar 31, 2023, 4:44:03 PM3/31/23
to sy...@googlegroups.com
I only ever hear bad things about dependabot. I don't have any
experience with it myself, but I would be cautious about using it.
Maybe try reaching out to other communities that have tried it to see
what their experiences have been.

Aaron Meurer
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAHVvXxR5CSvWrLEEgz827gPrQJj1soVWkOE%3Djo87FfxiOY_MkQ%40mail.gmail.com.

Jason Moore

unread,
Apr 1, 2023, 12:33:05 AM4/1/23
to sy...@googlegroups.com
When the # of dependencies is large, dependabot is a very annoying feature. I contributed to a Javascript lib and the dependabot floods your inbox and notifications with useless PRs. It may be ok for us, since it is only checking a handful of dependencies and those don't change too often.

Jason

Aaron Meurer

unread,
Apr 1, 2023, 1:36:42 AM4/1/23
to sy...@googlegroups.com
On Fri, Mar 31, 2023 at 10:33 PM Jason Moore <moore...@gmail.com> wrote:
>
> When the # of dependencies is large, dependabot is a very annoying feature. I contributed to a Javascript lib and the dependabot floods your inbox and notifications with useless PRs. It may be ok for us, since it is only checking a handful of dependencies and those don't change too often.

We actually have quite a few, assuming we were to pin all of them

https://github.com/sympy/sympy/blob/master/.github/workflows/runtests.yml#L201-L203
(there's a few others in this file too, search for "install")
https://github.com/sympy/sympy/blob/master/doc/requirements.txt

I don't know if there's a tool that lets you easily see how often
these are updated but my guess would be 1-5 updates a week.

Aaron Meurer
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAP7f1Ai6hthbys7cGx9UyXzVSEj%2BZQmx7KGx3SMdj2Sc4FSKWQ%40mail.gmail.com.

Oscar Benjamin

unread,
Apr 1, 2023, 7:04:40 AM4/1/23
to sy...@googlegroups.com
On Sat, 1 Apr 2023 at 06:36, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Fri, Mar 31, 2023 at 10:33 PM Jason Moore <moore...@gmail.com> wrote:
> >
> > When the # of dependencies is large, dependabot is a very annoying feature. I contributed to a Javascript lib and the dependabot floods your inbox and notifications with useless PRs. It may be ok for us, since it is only checking a handful of dependencies and those don't change too often.
>
> We actually have quite a few, assuming we were to pin all of them
>
> https://github.com/sympy/sympy/blob/master/.github/workflows/runtests.yml#L201-L203
> (there's a few others in this file too, search for "install")
> https://github.com/sympy/sympy/blob/master/doc/requirements.txt
>
> I don't know if there's a tool that lets you easily see how often
> these are updated but my guess would be 1-5 updates a week.

Yes, but we could set dependabot to just run once a month. We would
get a small flurry of updates. Most could just get immediately merged.

What would be nie is if there was an alternative to dependabot that
could batch all the different dependency updates into a single PR or
perhaps a PR for say all doc dependencies so that you know that to
review you just need to check the docs build.

This tool can be used to update a whole requirements.txt file in one go:
https://pypi.org/project/pip-upgrader/

It could probably be configured to run say once a month and open a PR.
I think that making a full bot to do this is a bunch of work though so
it is better if there is a ready made action that we can use.

My suggestion is just that we try using dependabot for some things and
see how it pans out.

--
Oscar

Aaron Meurer

unread,
Apr 1, 2023, 9:19:59 PM4/1/23
to sy...@googlegroups.com
On Sat, Apr 1, 2023 at 5:04 AM Oscar Benjamin
<oscar.j....@gmail.com> wrote:
>
> On Sat, 1 Apr 2023 at 06:36, Aaron Meurer <asme...@gmail.com> wrote:
> >
> > On Fri, Mar 31, 2023 at 10:33 PM Jason Moore <moore...@gmail.com> wrote:
> > >
> > > When the # of dependencies is large, dependabot is a very annoying feature. I contributed to a Javascript lib and the dependabot floods your inbox and notifications with useless PRs. It may be ok for us, since it is only checking a handful of dependencies and those don't change too often.
> >
> > We actually have quite a few, assuming we were to pin all of them
> >
> > https://github.com/sympy/sympy/blob/master/.github/workflows/runtests.yml#L201-L203
> > (there's a few others in this file too, search for "install")
> > https://github.com/sympy/sympy/blob/master/doc/requirements.txt
> >
> > I don't know if there's a tool that lets you easily see how often
> > these are updated but my guess would be 1-5 updates a week.
>
> Yes, but we could set dependabot to just run once a month. We would
> get a small flurry of updates. Most could just get immediately merged.
>
> What would be nie is if there was an alternative to dependabot that
> could batch all the different dependency updates into a single PR or
> perhaps a PR for say all doc dependencies so that you know that to
> review you just need to check the docs build.

It seems that you can configure some of these things:

https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#open-pull-requests-limit
https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#scheduleinterval

Although it doesn't seem like it can do the thing you suggest where it
just opens one PR then pushes new stuff to that same PR.

Aaron Meurer

>
> This tool can be used to update a whole requirements.txt file in one go:
> https://pypi.org/project/pip-upgrader/
>
> It could probably be configured to run say once a month and open a PR.
> I think that making a full bot to do this is a bunch of work though so
> it is better if there is a ready made action that we can use.
>
> My suggestion is just that we try using dependabot for some things and
> see how it pans out.
>
> --
> Oscar
>
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAHVvXxQS-fNsxeQVsKkMvv3PjFC7f5z2W7z%2BYX20x_ZSsCd_oA%40mail.gmail.com.

Oscar Benjamin

unread,
Apr 2, 2023, 12:29:13 PM4/2/23
to sy...@googlegroups.com
On Sun, 2 Apr 2023 at 02:19, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Sat, Apr 1, 2023 at 5:04 AM Oscar Benjamin
> <oscar.j....@gmail.com> wrote:
> >
> > On Sat, 1 Apr 2023 at 06:36, Aaron Meurer <asme...@gmail.com> wrote:
> > >
> > > On Fri, Mar 31, 2023 at 10:33 PM Jason Moore <moore...@gmail.com> wrote:
> > > >
> > > > When the # of dependencies is large, dependabot is a very annoying feature. I contributed to a Javascript lib and the dependabot floods your inbox and notifications with useless PRs. It may be ok for us, since it is only checking a handful of dependencies and those don't change too often.
> > >
> > > We actually have quite a few, assuming we were to pin all of them
> > >
> > > https://github.com/sympy/sympy/blob/master/.github/workflows/runtests.yml#L201-L203
> > > (there's a few others in this file too, search for "install")
> > > https://github.com/sympy/sympy/blob/master/doc/requirements.txt
> > >
> > > I don't know if there's a tool that lets you easily see how often
> > > these are updated but my guess would be 1-5 updates a week.
> >
> > Yes, but we could set dependabot to just run once a month. We would
> > get a small flurry of updates. Most could just get immediately merged.
> >
> > What would be nie is if there was an alternative to dependabot that
> > could batch all the different dependency updates into a single PR or
> > perhaps a PR for say all doc dependencies so that you know that to
> > review you just need to check the docs build.
>
> It seems that you can configure some of these things:
>
> https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#open-pull-requests-limit
> https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#scheduleinterval
>
> Although it doesn't seem like it can do the thing you suggest where it
> just opens one PR then pushes new stuff to that same PR.

No, there is an open issue for it:
https://github.com/dependabot/dependabot-core/issues/1190

I have figured out though a way that I could automate updates for
requirements.txt files using the create-pull-request action and
pip-upgrader:
https://github.com/marketplace/actions/create-pull-request
https://pypi.org/project/pip-upgrader/

With that I can make something that creates pull requests with grouped
updates. My suggestion would be to have it run once per month and open
3 PRs:

- Doc build dependencies
- Test dependencies
- Linter dependencies

We have doc/requirements.txt which could be updated with pip-upgrader.
A PR for that could be reviewed by checking the docs build (including
looking at the generated docs).

We could have a .github/pip-constraints.txt to pin versions for the
dependencies used in CI test jobs. That could be upgraded in a
separate PR. In this PR any CI failure indicates some sort of bug
either in upstream dependency or that something should be fixed in
sympy to work with the latest version of the dependency.

If all linting is done by pre-commit then we can use pre-commit
autoupdate to update the versions in the pre-commit config. That could
be a third PR. Here if CI checks fail it means that we should either
fix the lint errors or possibly adjust the lint config depending on
what changes would be needed.

Are there any other places we might want to pin versions and keep them
up to date automatically?

Actually there is one which is the actions workflow files themselves
(the versions of the actions that are used). I haven't found something
that can update those offline but we could use dependabot for that or
otherwise there is an action that can do it:
https://github.com/marketplace/actions/github-actions-version-updater

If we go with this idea then we can have at most 3 PRs per month plus
whatever is needed for the actions workflow files.

Does that seem reasonable?

--
Oscar

Aaron Meurer

unread,
Apr 4, 2023, 3:33:50 AM4/4/23
to sy...@googlegroups.com
On Sun, Apr 2, 2023 at 10:29 AM Oscar Benjamin
I guess we can try it out and see how it works.

Aaron Meurer

>
> --
> Oscar
>
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAHVvXxQhb9fVHi-GR6Zg3afApuwu2cPXfT_YzZAxWK9Pcq%2BzbA%40mail.gmail.com.

Oscar Benjamin

unread,
Apr 4, 2023, 6:03:52 AM4/4/23
to sy...@googlegroups.com
On Tue, 4 Apr 2023 at 08:33, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Sun, Apr 2, 2023 at 10:29 AM Oscar Benjamin
> <oscar.j....@gmail.com> wrote:
> >
> >
> > I have figured out though a way that I could automate updates for
> > requirements.txt files using the create-pull-request action and
> > pip-upgrader:
> > https://github.com/marketplace/actions/create-pull-request
> > https://pypi.org/project/pip-upgrader/
> >
> > With that I can make something that creates pull requests with grouped
> > updates. My suggestion would be to have it run once per month and open
> > 3 PRs:
>
> I guess we can try it out and see how it works.
>
> Aaron Meurer
>

Okay, I'll set something up.

--
Oscar
Reply all
Reply to author
Forward
0 new messages