Discussion: Approving changes to the set.mm database

119 views
Skip to first unread message

David A. Wheeler

unread,
Dec 13, 2021, 1:20:27 PM12/13/21
to Metamath Mailing List
Sadly, Norm is no longer with us. We're going to have to work out next steps.

Norm typically made the final decision of what to accept or not accept into the set.mm database.
We're going to have to find another process.

I would suggest, for now, that proposed changes continue to be added as GitHub pull requests.
Obviously any change *must* pass the automated checks to be considered for merging.
Others can "approve" by going to the "Files Changed" tab and click on "review changes".
If you approve, click on "approve" and then "submit review". Or you could just say "+1" as a comment.
If after several days (I suggest 5) there's another approver who has *previously* had a change approved,
and no disagreement, I suggest merging it. If there's disagreement, discuss in the comments to
try towards resolution (& if it's more contentious, also involve the mailing list).

We don't have to use this process. I'm just suggesting *a* process we can use.
If anyone would prefer another process, please speak up.

--- David A. Wheeler

Scott Fenton

unread,
Dec 13, 2021, 1:29:41 PM12/13/21
to meta...@googlegroups.com
That seems reasonable for contributions to the main sections of the databases. I'd propose keeping a shorter length for mathbox submissions however. Personally, I'd like to see my own section of set.mm update more than weekly.

-Scott

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/metamath/23A8AB6C-886F-4A1C-BBAC-76F36FCEC240%40dwheeler.com.

David A. Wheeler

unread,
Dec 13, 2021, 1:53:47 PM12/13/21
to Metamath Mailing List


> On Dec 13, 2021, at 1:29 PM, Scott Fenton <sct...@gmail.com> wrote:
>
> That seems reasonable for contributions to the main sections of the databases. I'd propose keeping a shorter length for mathbox submissions however. Personally, I'd like to see my own section of set.mm update more than weekly.

Oh! Good point! My proposed rules were intended to apply to the "main" part of set.mm.

I think people should be able to just update their own mathboxes as long as it passes the automated checks
and in general doesn't interfere with anyone else's work. I think it'd be good to encourage
giving a day or two in that case, not because it's *necessary* but because others might have useful feedback.
But I don't think a delay needs to be a hard & fast rule.

Updating *others'* mathboxes should generally be coordinated with others.

--- David A. Wheeler

Jim Kingdon

unread,
Dec 13, 2021, 2:14:34 PM12/13/21
to David A. Wheeler, Metamath Mailing List
Seems like a good place to start.

The only real caveat I'd add is that one of the subtle things which Norm was doing which might not even have been fully noticeable is that he had a bias towards "merge this now" which I think generally served us well. Now sometimes that was combined with "please make a follow up to fix x, y, and z", or applied more to uncontroversial changes than to everything, or had some details like "we can rename it later if we find a better name" or the like.

But that is a subtlety which doesn't really contradict anything being suggested below. Just something to keep in mind if our pull request queue seems like it is growing and pull requests linger for a long time without being either merged or closed.

Glauco

unread,
Dec 13, 2021, 3:02:43 PM12/13/21
to Metamath
A small risk I can see (I hope I'm wrong) is that we would end up with a pretty long list of pending PRs and the n_th PR in the queue could be merged with the repository when it was submitted, but it will often conflict with one of the PRs ahead in the queue.

Is this something to be worried about?

Glauco

Scott Fenton

unread,
Dec 13, 2021, 3:50:34 PM12/13/21
to meta...@googlegroups.com
I've seen that happen before. The primary place we should worry about is the date at the top of the file. If a conflict does arise, the affected PR should just pull and merge and resolve the conflict. They can then push and the PR should be good to go.

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.

Jim Kingdon

unread,
Dec 13, 2021, 3:58:53 PM12/13/21
to meta...@googlegroups.com

It is a fair question and to a fair extent I think we can rely on the judgement of the person merging it about how long to wait (subject to the more black and white rules about passing the automated checks and - I suppose this is black and white except for mathboxes - having at least one approval). That is, I think many of us have at least some feel for whether there needs to be more discussion or there's a potential for controversy.

We can also see whether merge conflicts of this sort end up being frequent or rare. A bit hard for me to predict in advance, as it depends on how often people change the same parts of (well, set.mm especially, the other files to a smaller extent).

David A. Wheeler

unread,
Dec 13, 2021, 5:54:22 PM12/13/21
to Metamath Mailing List


> On Dec 13, 2021, at 3:58 PM, Jim Kingdon <kin...@panix.com> wrote:
>
> It is a fair question and to a fair extent I think we can rely on the judgement of the person merging it about how long to wait (subject to the more black and white rules about passing the automated checks and - I suppose this is black and white except for mathboxes - having at least one approval). That is, I think many of us have at least some feel for whether there needs to be more discussion or there's a potential for controversy.

I just wanted to get *some* rule on the table so that we aren't stuck forever waiting for a merge.
As long as the automated tests pass, and it's either in that person's mathbox or someone else approves it,
it could probably just get merged immediately. I'd suggest at least 24 hours after posting unless it's
something like an obvious typo/error, but that wouldn't be *necessary*.

It's a pain, but changes *can* be reverted.

> We can also see whether merge conflicts of this sort end up being frequent or rare. A bit hard for me to predict in advance, as it depends on how often people change the same parts of (well, set.mm especially, the other files to a smaller extent).

I suspect that varies. But long delays for merging increase this risk, so it's good to
merge instead of letting pull requests linger. Which is why I think it's important to
create *some* rule for merging - otherwise it'll be easy for proposed changes to linger.

--- David A. Wheeler

Thierry Arnoux

unread,
Dec 13, 2021, 10:04:25 PM12/13/21
to meta...@googlegroups.com, David A. Wheeler
I've created a "mathbox only" label in Github.

It could be used to mark such PR's with only changes in one's own
mathbox and ease reviewing a bit.

_
Thierry

Mario Carneiro

unread,
Dec 13, 2021, 10:34:00 PM12/13/21
to metamath, David A. Wheeler
I think that a 5 day waiting period is too long, and indeed I don't think there should be any time-based review barrier. For contributions to main set.mm, I would propose that we have a simple system where at least 2 people sign off on each PR: the author and at least one maintainer/reviewer (not equal to the author). CI is quick so we can make it a requirement for merging.

I think we should have designated maintainers for parts of set.mm. Any changes to that section should be reviewed by at least one designated maintainer. There aren't too many daily active contributors so possibly we can just all be maintainers of everything to start, but some people have their own pet sections and we should make an attempt to write down everyone's area of expertise so that we know who to ping.

For mathbox-only changes, I would stick to the 1 reviewer rule but most likely only a cursory examination is required.

Mario

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.

Mario Carneiro

unread,
Dec 13, 2021, 10:37:48 PM12/13/21
to metamath, David A. Wheeler
Regarding the top date, I have said this before and I'll say it again: we should just abandon it. It is a completely unnecessary merge hazard and I don't know any other git-based project that checks in a date that changes on every commit. (A version number might make more sense, but I don't think that set.mm needs version numbers.) Git is perfectly capable of tracking the commit date in a way that does not create rampant merge conflicts.

David A. Wheeler

unread,
Dec 14, 2021, 12:28:51 PM12/14/21
to Metamath Mailing List

> On Dec 13, 2021, at 10:33 PM, Mario Carneiro <di....@gmail.com> wrote:
>
> I think that a 5 day waiting period is too long, and indeed I don't think there should be any time-based review barrier. For contributions to main set.mm, I would propose that we have a simple system where at least 2 people sign off on each PR: the author and at least one maintainer/reviewer (not equal to the author). CI is quick so we can make it a requirement for merging.

I'm fine with omitting a waiting period! I suggested the "5 days" to increase the likelihood of people agreeing to *a* merge criterion.

But if "another maintainer approved" is generally acceptable as a merge criterion, that's even better.

So who does the merge? I guess any maintainer can trigger the merge once the conditions are met?

> I think we should have designated maintainers for parts of set.mm. Any changes to that section should be reviewed by at least one designated maintainer. There aren't too many daily active contributors so possibly we can just all be maintainers of everything to start, but some people have their own pet sections and we should make an attempt to write down everyone's area of expertise so that we know who to ping.

Sounds good. I'll call those "area maintainers". Let's start with the empty set.

If anyone has a "pet section" or "pet database"
they want to review all changes to, please post here on the mailing list.
If there's general agreement, we'll add them to an "area maintainers"
list in the set.mm README. The rule would be, all changes to the area (database or database section)
would require a review from one of those area maintainers before being accepted.

> For mathbox-only changes, I would stick to the 1 reviewer rule but most likely only a cursory examination is required.

Fair enough. I imagine they'll mainly be looking for statements that might interfere with others.

If *every* pull request requires review & approval by someone else, we could even enforce that via GitHub rule.
That might be a good thing. One of the main advantages of set.mm is how rigorously verified it is
(every change is verified by multiple independent verifiers, etc.). Having enforced review could help
us convince *others* that we take this database seriously. We don't need to turn that on now,
but it might be worth thinking about.

> Regarding the top date, I have said this before and I'll say it again: we should just abandon it. It is a completely unnecessary merge hazard and I don't know any other git-based project that checks in a date that changes on every commit. (A version number might make more sense, but I don't think that set.mm needs version numbers.) Git is perfectly capable of tracking the commit date in a way that does not create rampant merge conflicts.

I'm fine with removing the top date, let's do it. I think it's a vestige of the time before this was managed by git.

--- David A. Wheeler

Mario Carneiro

unread,
Dec 14, 2021, 12:47:12 PM12/14/21
to metamath
On Tue, Dec 14, 2021 at 12:28 PM David A. Wheeler <dwhe...@dwheeler.com> wrote:

> On Dec 13, 2021, at 10:33 PM, Mario Carneiro <di....@gmail.com> wrote:
>
> I think that a 5 day waiting period is too long, and indeed I don't think there should be any time-based review barrier. For contributions to main set.mm, I would propose that we have a simple system where at least 2 people sign off on each PR: the author and at least one maintainer/reviewer (not equal to the author). CI is quick so we can make it a requirement for merging.

I'm fine with omitting a waiting period! I suggested the "5 days" to increase the likelihood of people agreeing to *a* merge criterion.

But if "another maintainer approved" is generally acceptable as a merge criterion, that's even better.

So who does the merge? I guess any maintainer can trigger the merge once the conditions are met?

Yes. Obviously it has to be someone with write permissions, but I think most of our contributors have such permission on set.mm already.

> I think we should have designated maintainers for parts of set.mm. Any changes to that section should be reviewed by at least one designated maintainer. There aren't too many daily active contributors so possibly we can just all be maintainers of everything to start, but some people have their own pet sections and we should make an attempt to write down everyone's area of expertise so that we know who to ping.

Sounds good. I'll call those "area maintainers". Let's start with the empty set.

If anyone has a "pet section" or "pet database"
they want to review all changes to, please post here on the mailing list.
If there's general agreement, we'll add them to an "area maintainers"
list in the set.mm README. The rule would be, all changes to the area (database or database section)
would require a review from one of those area maintainers before being accepted.

Personally, I would put my area of expertise as overall set.mm cohesion and design, although there are also plenty of individual sections that I have contributed to. Of late I haven't had much time to devote to PR reviews though, so it will need more than just me if we want to keep the queue flowing. I think Jim Kingdon should take point on iset.mm maintenance, and others should speak up if they want to participate more in maintenance.

> For mathbox-only changes, I would stick to the 1 reviewer rule but most likely only a cursory examination is required.

Fair enough. I imagine they'll mainly be looking for statements that might interfere with others.

If *every* pull request requires review & approval by someone else, we could even enforce that via GitHub rule.
That might be a good thing. One of the main advantages of set.mm is how rigorously verified it is
(every change is verified by multiple independent verifiers, etc.). Having enforced review could help
us convince *others* that we take this database seriously. We don't need to turn that on now,
but it might be worth thinking about.

It's a possibility, but I don't think we need to make the structures unnecessarily rigid. Most of our contributors are able to work well together without a heavy coordination framework and I'd like to keep it that way, at least until experience suggests that more structure will help more than hinder. Which is to say, I think we can function on the honor system for now.

One thing that I think we should stick to is one commit per PR. We had a similar discussion for metamath-knife but for set.mm it's important to keep PR sizes down so that people can reasonably review them, and squash merges keep things organized. This is the mechanism we've been using in mathlib and it works well to avoid merge conflicts, especially when things scale to many concurrent PRs.

Mario

Mario Carneiro

unread,
Dec 14, 2021, 12:51:13 PM12/14/21
to metamath
On Tue, Dec 14, 2021 at 12:47 PM Mario Carneiro <di....@gmail.com> wrote:
One thing that I think we should stick to is one commit per PR. We had a similar discussion for metamath-knife but for set.mm it's important to keep PR sizes down so that people can reasonably review them, and squash merges keep things organized. This is the mechanism we've been using in mathlib and it works well to avoid merge conflicts, especially when things scale to many concurrent PRs.

For big PRs that benefit from multiple commits, I think the commits should be consciously organized, squashing away bugfix commits and rebasing (not merging) on master. Such PRs are merged using the "rebase" option instead of "squash", but "squash" should be the default (and "merge" should not be used).

David A. Wheeler

unread,
Dec 14, 2021, 1:32:32 PM12/14/21
to Metamath Mailing List


> On Dec 14, 2021, at 12:51 PM, Mario Carneiro <di....@gmail.com> wrote:
>
>
>
> On Tue, Dec 14, 2021 at 12:47 PM Mario Carneiro <di....@gmail.com> wrote:
> One thing that I think we should stick to is one commit per PR. We had a similar discussion for metamath-knife but for set.mm it's important to keep PR sizes down so that people can reasonably review them, and squash merges keep things organized. This is the mechanism we've been using in mathlib and it works well to avoid merge conflicts, especially when things scale to many concurrent PRs.

I think multiple *commits* per PR is fine, especially if the later commits are small changes that fix the initial commit based on feedback. Then it's clear that comments were acted on.

I think the more important rule is that every logical change should be a separate PR.
Some PRs will be accepted, others not, and we don't want to lose good idea #1 if idea #2 is rejected.


> For big PRs that benefit from multiple commits, I think the commits should be consciously organized, squashing away bugfix commits and rebasing (not merging) on master. Such PRs are merged using the "rebase" option instead of "squash", but "squash" should be the default (and "merge" should not be used).

Squashing & rebasing instead of merging is the latest fashion. I'm not a fan of these bell bottomed pants.
Squashing eliminates the history of what actually happened, in particular, it's hard to tell
if an earlier comment was acted on if you eliminate the history through squashing.
You also can't tell what "really happened".

Git handles changes just fine without rebasing or squashing. It's definitely not a hill to die on,
but I prefer avoiding both.

--- David A. Wheeler

vvs

unread,
Dec 14, 2021, 2:36:34 PM12/14/21
to Metamath
Squashing & rebasing instead of merging is the latest fashion. I'm not a fan of these bell bottomed pants.
Squashing eliminates the history of what actually happened, in particular, it's hard to tell
if an earlier comment was acted on if you eliminate the history through squashing.
You also can't tell what "really happened".
Still, rebasing might be a good option for a single maintainer who feels strongly about what should be included in the history of their project. After approving changes they are no more interested in all the clutter and dirty details.

But for most people it doesn't matter what a particular git workflow is as long as it establishes some order.

Mario Carneiro

unread,
Dec 14, 2021, 2:44:04 PM12/14/21
to metamath
On Tue, Dec 14, 2021 at 1:32 PM David A. Wheeler <dwhe...@dwheeler.com> wrote:


> On Dec 14, 2021, at 12:51 PM, Mario Carneiro <di....@gmail.com> wrote:
>
>
>
> On Tue, Dec 14, 2021 at 12:47 PM Mario Carneiro <di....@gmail.com> wrote:
> One thing that I think we should stick to is one commit per PR. We had a similar discussion for metamath-knife but for set.mm it's important to keep PR sizes down so that people can reasonably review them, and squash merges keep things organized. This is the mechanism we've been using in mathlib and it works well to avoid merge conflicts, especially when things scale to many concurrent PRs.

I think multiple *commits* per PR is fine, especially if the later commits are small changes that fix the initial commit based on feedback. Then it's clear that comments were acted on.

This is what the *PR history* is for. These commits don't need to make it to master.

> For big PRs that benefit from multiple commits, I think the commits should be consciously organized, squashing away bugfix commits and rebasing (not merging) on master. Such PRs are merged using the "rebase" option instead of "squash", but "squash" should be the default (and "merge" should not be used).

Squashing & rebasing instead of merging is the latest fashion. I'm not a fan of these bell bottomed pants.
Squashing eliminates the history of what actually happened, in particular, it's hard to tell
if an earlier comment was acted on if you eliminate the history through squashing.
You also can't tell what "really happened".

The history of a PR is less important when you are looking at master. When you use squash merge, the squashed commit contains a reference to the PR page, where you can see the blow by blow (and the associated discussion!) if you want to.
 
Git handles changes just fine without rebasing or squashing. It's definitely not a hill to die on,
but I prefer avoiding both.

Git can handle it, but it's a lot more confusing for people looking at the history; things like HEAD~3 go who knows where and it's difficult to see at a glance what's going on when half of the commits are merges that clutter up the graph.

vvs

unread,
Dec 14, 2021, 3:28:44 PM12/14/21
to Metamath
Git can handle it, but it's a lot more confusing for people looking at the history; things like HEAD~3 go who knows where and it's difficult to see at a glance what's going on when half of the commits are merges that clutter up the graph.

I should add that by default `git log` shows commits not in topo order, so they might get ordered before commits that logically should go after it. That confused me a lot sometimes, especially after merging a big PR with lots of commits and a long discussion.

But yes, that's certainly "not a hill to die on" and Lean and Metamath communities are very different.

Jim Kingdon

unread,
Dec 14, 2021, 3:43:50 PM12/14/21
to meta...@googlegroups.com

On 12/14/21 9:47 AM, Mario Carneiro wrote:

Of late I haven't had much time to devote to PR reviews though, so it will need more than just me if we want to keep the queue flowing. I think Jim Kingdon should take point on iset.mm maintenance, and others should speak up if they want to participate more in maintenance.

I don't object to being listed for iset.mm but I don't think I should be the only one who can approve pull requests which touch it. For all the usual "avoid single points of failure" reasons.

It's a possibility, but I don't think we need to make the structures unnecessarily rigid. Most of our contributors are able to work well together without a heavy coordination framework and I'd like to keep it that way, at least until experience suggests that more structure will help more than hinder.
I know this was written in a slightly different context, but that's roughly how I feel about whether I am "iset.mm maintainer" or co-maintainer or what. The number of people who have attempted to contribute to iset.mm is not large and I'd only be excited about formalized maintainer structures if we thought it would encourage rather than discourage contributions (especially from people who have not yet contributed to iset.mm).

Scott Fenton

unread,
Dec 14, 2021, 4:24:05 PM12/14/21
to meta...@googlegroups.com
Personally, I like seeing all the commits in my git log history. Using bisect I can generally piece together what happened when. I often use the PR history, but I'd hate to not have a local copy of the history to refer to.

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.

Scott Fenton

unread,
Dec 14, 2021, 4:25:28 PM12/14/21
to meta...@googlegroups.com
I feel likewise about nf.mm. I occasionally try to sync it up to set.mm and I make the odd new contribution once in a while, so I guess I'm the maintainer. I'll gladly review any changes to it, but I don't really see anyone contributing to it in a big way besides me.

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.

vvs

unread,
Dec 14, 2021, 4:43:13 PM12/14/21
to Metamath
Personally, I like seeing all the commits in my git log history. Using bisect I can generally piece together what happened when. I often use the PR history, but I'd hate to not have a local copy of the history to refer to.

I'd say that it would be made simpler by a more simple history. And vice versa: if there are lots of intertwined commits bisecting will be more difficult.

But it shows that it's more important to have simple PRs in a first place rather then relying on squashing it afterwards.

Alexander van der Vekens

unread,
Dec 15, 2021, 5:19:07 AM12/15/21
to Metamath
In general, providing and handling PRs should be as simple as possible. I am not very experienced with Git, and I was happy that everything which was essential needed worked: As long as there were no merge conflicts, I just pushed my changes to my forked repository using  GitHib Desktop, and a PR was created automatically by Github. If there were merge conflicts, I updated my forked repository by pulling the latest version of  set.mm.git develop, resolving the merge conflicts, and pushed again. There should be not (much) more I have to do.

About the reviews and maintainers: we must distinguish between reviewers and persons performing the merging: Since it seems that I have only restricted permissions (no write permissions?), I can only be a reviewer and could approve PRs. Two approvals should be sufficient for changes of the main part (except moving theorems from mathboxes only, see below), and no approval for changes in the own mathbox (the person performing the merging should have a quick look at it). Changing other mathboxes are sometimes required (e.g. if theorems are moved to the main part, or maintenace actions are performed, e.g. proof minimizations) - for this, maybe only one approval (not necessarily by the mathbox owner(s)) should be sufficient.

I could volunteer for being a maintainer for set.mm, Parts 2,5,6,7,8,10,11,14,16.

Benoit

unread,
Dec 20, 2021, 8:35:05 AM12/20/21
to Metamath
I volunteer for being a maintainer (for all parts of all databases --- I'm not sure differentiating by parts is useful at this stage: a maintainer should abstain from approving something he does not understand, and I certainly will). I think I have no GitHub permissions, so you may open them when you have time. Thanks.
Benoît

Mario Carneiro

unread,
Dec 20, 2021, 4:49:06 PM12/20/21
to metamath
Benoit, I've sent you an invite to the set.mm team, so you should be able to review and merge PRs now.

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages