A monster theorem


Giovanni Mascellani

Nov 4, 2019, 10:38:21 AM
to metamath
Hi,

playing with my code to build Metamath proofs from ITP programs, I
generated this proof of the theorem asserting that each group where all
elements have order 2 is commutative:


https://github.com/giomasce/set.mm/commit/6b64e8068b1c1c82d2fb714c42ae2615bee80094

I thought anyone having a fetish for monster Metamath proofs might like
it. You can also find the HTML rendering here (25 MB, be careful!):

https://people.debian.org/~gio/monsterord2com.html

This is the description, with some additional explanation on why it is
so big and weird:

---
A monster theorem, proving that a group where every element has order 2
(i.e., every element multiplied by itself is the identity) is commutative.

The proof is automatically generated: the original problem is fed as a
First-Order Formula to Prover9, whose proof is processed with GAPT to
convert it to Natural Deduction style. The program mmpp is finally used
to express the proof in the Metamath language. Since the original
theorem is not completely trivial itself and each conversion step adds a
good amount of bookkeeping code, the final result is huge and hardly
readable by any human being.

The statement itself is littered with artifacts due to the automatic
conversion from a First Order Formula. The class 𝐴 is the group
identity and the class 𝐵 (depending on variables 𝑡 and 𝑒) is the
group operation. Explicit substitution is used both in the statement and
in the proof in order to bridge the gap between FOF and Metamath's
treatment of predicate and functor arguments: while in First Order
Formulae arguments are explicit and order-based, in Metamath they are
implicit and name-based. A conspicuous amount of bookkeeping must be put
in place to handle this discrepancy.
---

BTW, the proof generated by mmpp is uncompressed. Compressing it with
metamath.exe took a good few minutes.

Giovanni.
--
Giovanni Mascellani <g.masc...@gmail.com>
Postdoc researcher - Université Libre de Bruxelles


David A. Wheeler

Nov 4, 2019, 12:21:26 PM
to metamath
David A. Wheeler asked Giovanni Mascellani about the "monster theorem":
> > The final expression includes "∧ ⊤" - I would not have expected that.
> > Should that be removed?

On Mon, 4 Nov 2019 17:42:39 +0100, Giovanni Mascellani wrote:
> It is currently an artifact of the automatic translation from ND. Each
> clause converted from ND must have the form "( ph -> ps )", even when
> its context is empty, because otherwise I have to branch over the two
> possible cases (there is or there is not a context) each time I want to
> use or produce a clause. So when the context is empty I need to put
> something, which is, of course, "T.". Adding to context is obtained by
> replacing "ph" with "( ch /\ ph )", and again I do not want to make a
> special rule for when I am adding to an empty context. So, in general,
> the context containing ph, ps and ch will be
>
> ( ph /\ ( ps /\ ( ch /\ T. ) ) )
>
> This and many other artifacts, like the explicit substitution
> constructors, should eventually be removed (unless they are desired, in
> some situation) by the program _consuming_ this proof, which is still to
> be written.

--- David A. Wheeler

David A. Wheeler

Nov 4, 2019, 12:52:17 PM
to metamath

> On Mon, 4 Nov 2019 17:42:39 +0100, Giovanni Mascellani wrote:
>> This and many other artifacts, like the explicit substitution
>> constructors, should eventually be removed (unless they are desired, in
>> some situation) by the program _consuming_ this proof, which is still to
>> be written.

There's no way to know for certain without trying, but I expect that an automated tool could simplify a number of these expressions. The result might be a smaller monster.


--- David A. Wheeler

Giovanni Mascellani

Nov 4, 2019, 3:22:43 PM
to metamath
[DAW, sorry for double sending, this is meant for the mailing list]

Il 04/11/19 18:52, David A. Wheeler ha scritto:
> There's no way to know for certain without trying, but I expect that
> an automated tool could simplify a number of these expressions. The
> result might be a smaller monster.

Looking quickly at the proof, it seems that a substantial number of
steps apply either mergeconj or selconj. The first is used to merge
two lists of conjuncts in the antecedent of an ND clause, and the
second to select one of the conjuncts in the antecedent of an ND
clause. Currently the Metamath proof synthesis algorithm is forced
to represent ND clauses invariably in this form:

( ( ph /\ ( ps /\ ( ch /\ T. ) ) ) -> th )

although in principle much more freedom could be used to represent the
antecedent. When two lists are merged, the result won't be in this
form, so mergeconj has to be applied repeatedly to fix it. Also, the ND
manipulation lemmata called by the proofs generated by my code expect
the conjuncts to be removed from the antecedent to appear in the first
(outermost) place(s). Therefore repeated application of selconj is
required to fix the shape of the antecedent before applying some ND
steps.
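To make the cost concrete, here is a small Python sketch (my own illustration, not mmpp code; the step counter is only an analogy for repeated selconj applications) of selecting a conjunct out of a rigid right-associated antecedent:

```python
TRUE = "T."

def to_antecedent(conjuncts):
    # Build the rigid right-associated list ( a /\ ( b /\ ( c /\ T. ) ) ),
    # modelled as nested pairs ending in the constant T.
    ant = TRUE
    for c in reversed(conjuncts):
        ant = (c, ant)
    return ant

def flatten(ant):
    # Recover the plain Python list of conjuncts.
    out = []
    while ant != TRUE:
        head, ant = ant
        out.append(head)
    return out

def select(ant, target):
    # selconj analogue: bring `target` to the front of the antecedent.
    # Each conjunct skipped over stands for (roughly) one more lemma
    # application in the generated proof.
    skipped, steps = [], 0
    while ant != TRUE:
        head, ant = ant
        if head == target:
            return to_antecedent([target] + skipped + flatten(ant)), steps
        skipped.append(head)
        steps += 1
    raise ValueError("conjunct not found")
```

Selecting "ch" out of the antecedent built from [ph, ps, ch] has to skip over two conjuncts, which is why deep antecedents force many selconj applications.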

Probably many such calls could be saved by allowing more freedom in how
the antecedent is represented. mergeconj could be dropped altogether and
fewer selconj applications might be required if the antecedent is
shallower. Unfortunately, this means complicating the code, because the
conjunct selection process needs to accept any tree of conjuncts instead
of a rigid right-associated list.

Alternatively I could try to follow the idea of Mario's style for ND. In
that case selecting conjuncts becomes free, because selecting hypotheses
in Metamath is already free. The problem I see is that some ND steps,
like ex (implication introduction) require a reified inference operator
(i.e., an implication), that is not available with Mario's style. It is
available with mine, though, which is the reason I am using it.
Therefore it seems that I would still need to use both my style and
Mario's in the proof, with a lot of additional complication, since I
would have to track which type each antecedent has, and with much less
advantage, because I would still have implications. I don't know if
there is an easy way out (other than using purely my style, with the
disadvantage of size).

Another frequently used lemma is csbcom2fi, which applies an explicit
substitution to another one. Substitutions are frequently used in ND
proofs, for example in quantifier introduction and elimination. Also,
when I convert a FOF predicate or functor to Metamath, I represent it
using one substitution for each of its arguments. For example, ph(x, y)
is represented as

[. x / w ]. [. y / z ]. ph

where w and z are two fresh variables chosen for ph. Furthermore, I
assume that w and z are the only free variables in ph. A similar thing
happens for functors, where in addition I also assume that it can be
proved that they are sets.

This is necessary because there is a mismatch between Metamath and
standard FOFs: in FOFs arguments are explicit and order-based, while in
Metamath they are implicit and name-based. As a result, it is not easy
in Metamath to distinguish the two FOFs ph(x,y) and ph(y,x), and even
less ph(x,y) and ph(A(y),y), unless one resorts to the trick I am using.
If we stipulate that the arguments to ph are internally called w and z
and the argument to A is internally called u, then we can write:

[. y / w ]. [. x / z ]. ph
[. [_ y / u ]_ A / w ]. [. y / z ]. ph

and everything is fine, except for the readability of the notation and
the fact that we have to commute a lot of explicit substitutions.
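Rendering an FOF predicate application as nested explicit substitutions is a purely syntactic transformation; a minimal Python sketch (my own illustration, not the mmpp internals):

```python
def csb(actual, formal, cls):
    # Class-level explicit substitution: [_ actual / formal ]_ cls
    return f"[_ {actual} / {formal} ]_ {cls}"

def pred(name, formals, actuals):
    # Render name(actuals) as one wff-level substitution per argument,
    # with the outermost substitution corresponding to the first argument.
    wff = name
    for formal, actual in zip(reversed(formals), reversed(actuals)):
        wff = f"[. {actual} / {formal} ]. {wff}"
    return wff
```

With the internal names w and z for the arguments of ph and u for the argument of A, this reproduces both notations above.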

I don't see an easy way out here. It might be possible to save a few
variables in proofs that do not require excessively complicated
substitutions, but that is difficult to generalize and it greatly
complicates the logic of the generator.

Mario Carneiro

Nov 4, 2019, 3:49:51 PM
to metamath
Maybe you could bring me up to speed on the details here, but as I see it, implication introduction is 'ex'. It is important for this that conjuncts be stored as a *left* associated list, possibly beginning with a T. if that makes things easier. Whenever you change the context, there is a cost associated with pulling any lemmas into the new context (including hypotheses) via adantr; you can either compute these lazily or precompute them and drop any that you don't end up needing.

If you are finding subproofs in the manner of a tableau calculus (working backwards from the goal), this will not be a problem, but if you are reasoning forward and taking context unions all the time, one inefficiency that you can end up with is that a subproof Gamma |- ph is derived, and then it is used to prove two other subproofs Delta1 |- ps and Delta2 |- ch, where Delta1 and Delta2 are incomparable strict supersets of Gamma. In this case, if you turn it into an ND proof, you will end up replaying the steps Delta1 |- ph and Delta2 |- ph, and some sharing is lost. I don't know how problematic this is, but if you look for shared subproofs you can in principle determine that Gamma |- ph is shared, and so prove Gamma |- ph directly and then do a context change (syl*anc) step to get to Delta1 |- ph and Delta2 |- ph.

> Another frequently used lemma is csbcom2fi, which applies an explicit
> substitution to another one. Substitutions are frequently used in ND
> proofs, for example in quantifier introduction and elimination. Also,
> when I convert a FOF predicate or functor to Metamath, I represent it
> using one substitution for each of its arguments. For example, ph(x, y)
> is represented as
>
>   [. x / w ]. [. y / z ]. ph
>
> where w and z are two fresh variables chosen for ph.

While this is an option, I think it would be much simpler in a set.mm based system to use x R y for representing binary relations without additional variables getting in the way. In a real FOL system, you would have a sort for terms and a constant for each primitive relation; set.mm only has the e. relation but you can use that as a prototypical relation on set.mm foundations. Using wff variables has the disadvantage that the binding variables become explicit, which complicates proofs with lots of unnecessary NF stuff. (Actually it doesn't need to, because you should be assuming that w and z are disjoint from everything other than ph.)
 
> Furthermore, I
> assume that w and z are the only free variables in ph. A similar thing
> happens for functors, where in addition I also assume that it can be
> proved that they are sets.

I formalize this requirement in MM0: it is not that w and z are the only free variables in ph, but rather that any free variables other than w and z are not mentioned anywhere in the theorem or proof (they are "parameters" to ph). In metamath that means putting a $d between w , z and all variables in the statement.
 
> This is necessary because there is a mismatch between Metamath and
> standard FOFs: in FOFs arguments are explicit and order-based, while in
> Metamath they are implicit and name-based. As a result, it is not easy
> in Metamath to distinguish the two FOFs ph(x,y) and ph(y,x), and even
> less ph(x,y) and ph(A(y),y), unless one resorts to the trick I am using.
> If we stipulate that the arguments to ph are internally called w and z
> and the argument to A is internally called u, then we can write:
>
>   [. y / w ]. [. x / z ]. ph
>   [. [_ y / u ]_ A / w ]. [. y / z ]. ph
>
> and everything is fine, except for the readability of the notation and
> the fact that we have to commute a lot of explicit substitutions.

Yes, I dealt with the same problem in peano.mm0. Originally I just had wffs and variables, but without a "class" like type it quickly becomes difficult to deal with all the "deferred binding" variables involved in relations. If you are okay with an axiomatic term constructor, you can just declare P(x,y), but if you want P to be a metavariable you need a sort for it. (Things become even more complicated if you want a sort for n-ary predicates but the universe of discourse doesn't support pairing...)

Mario

Giovanni Mascellani

Nov 4, 2019, 6:02:45 PM
to metamath
Hi Mario,

Il 04/11/19 21:49, Mario Carneiro ha scritto:
> Maybe you could bring me up to speed on the details here, but as I see
> it, implication introduction is 'ex'. It is important for this that
> conjuncts be stored as a *left* associated list, possibly beginning with
> a T. if that makes things easier. Whenever you change the context, there
> is a cost associated with pulling any lemmas into the new context
> (including hypotheses) via adantr; you can either compute these lazily
> or precompute them and drop any that you don't end up needing.

Hmm, there are different problems converging here and I am not sure we
are considering the same. Let me state more generally what I want to do.
I want to take an arbitrary ND proof (more specifically, ND proofs as
defined by GAPT[1], generated by GAPT itself munching resolutions found
by Prover9 or similar tools; however, I assume that the proof was
generated by a black box) and generate a Metamath proof of the same
statement.

[1] https://www.logic.at/gapt/api/gapt/proofs/nd/NDProof.html

Before even starting to convert the proof, I need to decide how to
convert statements, i.e., what an ND sequent should be mapped to in
Metamath. My current choice is to map, say, "ph, ps, ch |- th" to the
Metamath statement

|- ( ( ph /\ ( ps /\ ( ch /\ T. ) ) ) -> th )

I call this a "reified" ND inference, because the ND turnstile is mapped
to an element of the language, i.e., the implication. I don't know if
the word "reified" is logically/philosophically correct, I will be happy
to use a better one if anybody suggests it.
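The mapping itself is mechanical; a minimal Python sketch (my own illustration, not mmpp code) of the reified encoding:

```python
def reify(context, conclusion):
    # Map the ND sequent  ph, ps, ch |- th  to a single Metamath wff whose
    # antecedent is a rigid right-associated conjunct list ending in T.
    ant = "T."
    for hyp in reversed(context):
        ant = f"( {hyp} /\\ {ant} )"
    return f"|- ( {ant} -> {conclusion} )"
```

For the empty context this degenerates to |- ( T. -> th ), which is exactly why the "∧ ⊤" artifact shows up in the final statement.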

Using your style, I would choose a global context, say et, and map the
same sequent to

|- ( et -> ph )
& |- ( et -> ps )
& |- ( et -> ch )
=> |- ( et -> th )

Here the ND inference is not reified, because it is mapped to the "=>"
symbol which is not an element of the formal language.

These two Metamath notations are equivalent, as we know, but they are
not equivalently easy to work with when converting ND proofs, as I see
it. Let us consider implication introduction, informally stated as "If
from a context Gamma augmented with ph I can derive ps, then from Gamma
I can derive ph -> ps". In my style this is easily expressed as

|- ( ( ph /\ Gamma ) -> ps )
=> |- ( Gamma -> ( ph -> ps ) )

which is expcom. Choosing a different associativity direction it could
have been ex, but that's not the problem here.

If I used your style, I would have had to use something like:

( |- ( et -> ph ) & |- ( et -> Gamma ) => |- ( et -> ps ) )
=> ( |- ( et -> Gamma ) => |- ( et -> ( ph -> ps ) ) )

but this is clearly not expressible in Metamath, because "&" and "=>"
are not reified.

This means that I cannot recursively convert ND proofs to this style,
because in some cases the recursion terms do not fit the same shape of
the main one. Namely, only antecedents coming from the global context
can be represented in this style; local antecedents need to be reified,
so either I reified everything or I have to keep track of what is
reified and what is not and handle all the cases, which is inconvenient.

On top of that, taking for granted that I will be using the "reified"
style, there is some freedom in deciding whether to use a left- or
right-associated list, or a generic tree. Left vs right is not a big
deal, I'd say; in the worst case one has to prepare an additional layer
of lemmata to adjust things. Since I already use quite a few technical
lemmata[2] for building these proofs, it doesn't change much. I
discussed in my previous email the alternative between a fixed list and
a tree, but this difference is in terms of efficiency, while the
reified vs non-reified ND style is a more fundamental difference.

[2] http://us.metamath.org/mpeuni/mmtheorems289.html#mm28865s

> If you are finding subproofs in the manner of a tableau calculus
> (working backwards from the goal), this will not be a problem, but if
> you are reasoning forward and taking context unions all the time, one
> inefficiency that you can end up with is that a subproof Gamma |- ph is
> derived, and then it is used to prove two other subproofs Delta1 |- ps
> and Delta2 |- ch, where Delta1 and Delta2 are incomparable strict
> supersets of Gamma. In this case, if you turn it into an ND proof, you
> will end up replaying the steps Delta1 |- ph and Delta2 |- ph, and some
> sharing is lost. I don't know how problematic this is, but if you look
> for shared subproofs you can in principle determine that Gamma |- ph is
> shared, and so prove Gamma |- ph directly and then do a context change
> (syl*anc) step to get to Delta1 |- ph and Delta2 |- ph.

Yes, this might be a useful optimization (and maybe GAPT is already
doing it under the hood). It doesn't change what I discussed above, though.

> While this is an option, I think it would be much simpler in a set.mm
> based system to use x R y for representing binary
> relations without additional variables getting in the way.

Nice insight, but this works only for at-most-binary relations (and
similarly for functors). I would not like to lose generality with
respect to the possibility of translating arbitrary ND proofs to
Metamath, as is currently the case.

> Yes, I dealt with the same problem in peano.mm0. Originally I just had
> wffs and variables, but without a "class" like type it quickly becomes
> difficult to deal with all the "deferred binding" variables involved in
> relations. If you are okay with an axiomatic term constructor, you can
> just declare P(x,y), but if you want P to be a metavariable you need a
> sort for it. (Things become even more complicated if you want a sort for
> n-ary predicates but the universe of discourse doesn't support pairing...)

My target theory is always set.mm, so pairing is not a problem. At the
same time, I would like to avoid adding further sorts or axioms to it.

Thanks, Giovanni.

Mario Carneiro

Nov 4, 2019, 8:19:06 PM
to metamath
Aha, I see the confusion now. The style you present above is deduction style as it is understood across theorems. That is, when we have a theorem A , B |- C we split up the context on separate hypotheses and stick a new context variable on the left, as you've done. But inside a theorem this is not possible, as you've observed, so a slight variant of the strategy is used. In an ND proof where all hypotheses have been transformed as such to |- ( ph -> A ) & |- ( ph -> B ) => |- ( ph -> C ), all steps in the proof will contain Gamma, plus additional local assumptions, for example A, B, D, E |- F, and this is concretely translated to |- ( ( ( ph /\ D ) /\ E ) -> F ).

At a slightly higher level, we can still find the distributed context view of the statement, where we hold some context Delta fixed (in this case it is ( ( ph /\ D ) /\ E ) but in general it can be an extension of this), and then the step A, B, D, E |- F is identified (in context Delta) as the proof step |- ( Delta -> F ), where the four hypotheses are stored names of proof steps that prove |- ( Delta -> A ) , |- ( Delta -> B ) (by adantr from the original theorem hypotheses), and |- ( Delta -> D ) , |- ( Delta -> E ) (proven by simplr and simpr).

So although we can't generalize over statements like

    |- ( et -> ph )
 &  |- ( et -> ps )
 &  |- ( et -> ch )
 => |- ( et -> th )

in the middle of a proof, we can still identify the relevant substitution instance of this pattern as a part of the proof. The concrete encoding |- ( ( ( ph /\ D ) /\ E ) -> F ) is simply one way to make sure that A, B, D, E are all provable from the context; as you say right associating would work just as well, although metamath has a large number of theorems dealing with the left associated case (simpr, simplr, simpllr, simp-4r ... simp-11r selects hypotheses, and adantr, ad2antrr, ad3antrrr, ad4antr, ... ad10antr weakens a context), exactly for making this kind of reasoning efficient.
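A quick Python sketch (mine, not anyone's tool; the rule for the hyphenated lemma names is my reading of the set.mm naming convention mentioned above) of building a left-associated context and picking the selector lemma for a hypothesis:

```python
def left_context(hyps):
    # ( ( ph /\ D ) /\ E )-style left-associated context
    ctx = hyps[0]
    for h in hyps[1:]:
        ctx = f"( {ctx} /\\ {h} )"
    return ctx

def selector(depth):
    # Name of the set.mm lemma selecting the hypothesis `depth` places
    # from the right of a left-associated context (1 = rightmost).
    # The simp-4r-and-up pattern is an assumption about the naming scheme.
    if depth == 1:
        return "simpr"
    if depth == 2:
        return "simplr"
    if depth == 3:
        return "simpllr"
    return f"simp-{depth}r"
```

With this convention, selecting any hypothesis is a single lemma application rather than a chain of selconj-style rotations.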

For extremely large contexts, this approach is quadratic where log-linear is possible, so you would probably want to do something else in that case, but this isn't really a problem in set.mm because the number of variables is already capped.
 
> If I used your style, I would have had to use something like:
>
>     ( |- ( et -> ph ) & |- ( et -> Gamma ) => |- ( et -> ps ) )
>  => ( |- ( et -> Gamma ) => |- ( et -> ( ph -> ps ) ) )
>
> but this is clearly not expressible in Metamath, because "&" and "=>"
> are not reified.

It's actually 

  A. et'  ( |- ( et' -> ph ) & |- ( et' -> Gamma_1 ) & ... & |- ( et' -> Gamma_n ) => |- ( et' -> ps ) )
 => ( |- ( et -> Gamma_1 ) & ... & |- ( et -> Gamma_n ) => |- ( et -> ( ph -> ps ) ) )

making the quantifier over contexts et' explicit. But the big observation is that we only care about one substitution instance of this (depending on the context where the step applies); in this case a reasonable choice of context is ( et /\ ph ), in which case you get

( |- ( ( et /\ ph ) -> ph ) & |- ( ( et /\ ph ) -> Gamma_1 ) & ... & |- ( ( et /\ ph ) -> Gamma_n ) => |- ( ( et /\ ph ) -> ps ) )
 => ( |- ( et -> Gamma_1 ) & ... & |- ( et -> Gamma_n ) => |- ( et -> ( ph -> ps ) ) )

And now we eliminate the higher order arrow by observing that all the assumptions to the hypothesis are provable outright:

simpr: |- ( ( et /\ ph ) -> ph )
adantr: |- ( et -> Gamma_1 ) => |- ( ( et /\ ph ) -> Gamma_1 )
...
adantr: |- ( et -> Gamma_n ) => |- ( ( et /\ ph ) -> Gamma_n )

and the conclusion is provable from the consequent of the hypothesis:

ex: |- ( ( et /\ ph ) -> ps ) => |- ( et -> ( ph -> ps ) )

So the higher order inference decomposes into a set of theorem applications to *both* the conclusion (working backwards) and the hypotheses/subproofs (working forwards).
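The backward step and the forward lifts are both simple rewritings of the encoded statements; a Python sketch (my own illustration of the decomposition above, not an implementation):

```python
def ex(context, assumption, consequent):
    # Implication introduction (ex), working backwards on the conclusion:
    # from |- ( ( context /\ assumption ) -> consequent )
    # conclude |- ( context -> ( assumption -> consequent ) )
    return f"|- ( {context} -> ( {assumption} -> {consequent} ) )"

def adantr(context, assumption, lemma):
    # Weaken a lemma into the extended context, working forwards:
    # from |- ( context -> lemma )
    # conclude |- ( ( context /\ assumption ) -> lemma )
    return f"|- ( ( {context} /\\ {assumption} ) -> {lemma} )"
```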

> On top of that, taking for granted that I will be using the "reified"
> style, there is some freedom in deciding whether to use a left- or
> right-associated list, or a generic tree. Left vs right is not a big
> deal, I'd say; in the worst case one has to prepare an additional layer
> of lemmata to adjust things. Since I already use quite a few technical
> lemmata[2] for building these proofs, it doesn't change much. I
> discussed in my previous email the alternative between a fixed list and
> a tree, but this difference is in terms of efficiency, while the
> reified vs non-reified ND style is a more fundamental difference.

My point is that "ND style" incorporates *both* the reified and non-reified aspects. You can't prove theorems using only the non-reified approach, the context changing is an integral part of it. You probably are already aware of all this, but you might find my original talk on the ND style helpful: http://us.metamath.org/ocat/natded.pdf .
 
>> While this is an option, I think it would be much simpler in a set.mm
>> based system to use x R y for representing binary
>> relations without additional variables getting in the way.
>
> Nice insight, but this works only for at-most-binary relations (and
> similarly for functors). I would not like to lose generality with
> respect to the possibility of translating arbitrary ND proofs to
> Metamath, as is currently the case.

Assuming you have a pairing operator, you can represent an n-ary operation via ( F ` <. x , <. y , <. z , ... >. >. >. ) and an n-ary relation via <. x , <. y , <. z , ... >. >. >. e. R. The cases n = 1, 2, 3 have special support in set.mm (n = 3 an "ordered triple"), and for medium size tuples you can use the <" A B C ... D "> sequence (df-s1, df-s2, ..., df-s8), using the "concat" function to get more than 8 arguments. (These operators are intended for representing large sequences and trees as metamath terms with log-overhead.)
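The nesting is again purely syntactic; a quick Python sketch (mine) of the encodings described above:

```python
def nest(args):
    # Right-nested ordered pairs: <. x , <. y , z >. >.
    if len(args) == 1:
        return args[0]
    return f"<. {args[0]} , {nest(args[1:])} >."

def apply_op(func, args):
    # n-ary operation value: ( F ` <. x , <. y , z >. >. )
    return f"( {func} ` {nest(args)} )"

def relation(rel, args):
    # n-ary relation membership: <. x , <. y , z >. >. e. R
    return f"{nest(args)} e. {rel}"
```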
 
Mario
