Terry Tao's QED game

293 views
Skip to first unread message

Giovanni Mascellani

unread,
Jul 29, 2018, 4:28:17 AM7/29/18
to metamath
Dear Metamathers,

you might be interested in this recent post by Terence Tao:


https://terrytao.wordpress.com/2018/07/28/gamifying-propositional-logic-qed-an-interactive-textbook/

It describes a small Javascript game he created to show some features of
first order logic. There are some similarities with MPE (and
particularly the Metamath applet), although there are some differences
too (for example, Tao allows a full deduction theorem).

Giovanni.
--
Giovanni Mascellani <g.masc...@gmail.com>
Postdoc researcher - Université Libre de Bruxelles

signature.asc

Antony Bartlett

unread,
Aug 18, 2018, 4:50:56 AM8/18/18
to meta...@googlegroups.com
That's cool, I've had fun with QED, thanks.

I also like The Incredible Proof Machine.

http://incredible.pm/

But then I have posted about liking proof trees previously!  Here as in most cases (such as MetaMath's 'compressed' proof-format) Directed Acyclic Graphs are used instead of trees in order to eliminate unnecessary repetition, but sometimes (such as MetaMath's 'normal' proof-format) the stricter structure of a tree is used.  The MetaMath book doesn't talk directly about trees (except as the default output format of the 'show proof' command), instead it talks about the reverse-Polish notation of the 'normal' proof-format, and the stack used by the verifier.  But to me proof-tree is what underpins these, as the reverse-Polish notation of the 'normal' proof-format is just a convenient way of serialising a proof-tree, and the stack used by the verifier is just a convenient way of walking the proof-tree.

Because they are fun and easy to use I'm trying to create a bridge between QED and/or The Incredible Proof Machine and set.mm to aid my learning of set.mm.  I'm going to start with The Incredible Proof machine because it's immediately obvious which operations of Propositional Logic it regards as fundamental, and I've been able to map most of them to set.mm proofs, thus:

https://i.imgur.com/ihsjCVy.png

Without implication-introduction (which I'm missing) any example I give is going to be extremely trivial.  But here's my Incredible Proof Machine proof of its Session 2 exercise 8:

https://i.imgur.com/iJCeHhc.png

And the correspondences I've figured out allows me to port the proof into set.mm thus:

${

  session2ex8.1 $e |- ( ph -> ch ) $.  
  session2ex8.2 $e |- ( ps -> ch ) $.  
  session2ex8.3 $e |- ( ph /\ ps ) $.  

  session2ex8 $p |- ch $=
    wph wch wph wps session2ex8.3 simpli session2ex8.1 ax-mp $.

$}

Can I have some help figuring out figuring out how to do the remaining two operations in set.mm please?  Implication-introduction and disjunction-elimination.  The other operations all mapped to single theorems, I might be struggling to find these because they're more of a technique?

I've also thought about attacking the problem of learning set.mm Propositional Logic from QED or The Incredible Proof machine the other way around.  I could take the set.mm proof of, say, Peirce's Law and break it into smaller pieces until it starts to resemble my QED proof of Peirce's Law or my Incredible Proof Machine proof of Peirce's Law.

'peirce' uses 'simplim', 'id', and 'ja'
'simplim' uses 'pm2.21', and 'con1i'

So the first step would be to splice 'peirce' and 'simplim' together in order to express 'perice' in terms of 'pm2.21', 'con1i', 'id', and 'ja'.  I can just about manage to do that sort of thing manually, but is there code that automates this operation, please?

  peirce.splice1 $p |- ( ( ( ph -> ps ) -> ph ) -> ph ) $=
    wph wps wi wph wph wph wph wps wi wph wps pm2.21 con1i wph id ja $.

    Best regards,

        Antony

Mario Carneiro

unread,
Aug 18, 2018, 5:54:16 AM8/18/18
to metamath
> Can I have some help figuring out figuring out how to do the remaining two operations in set.mm please?  Implication-introduction and disjunction-elimination.  The other operations all mapped to single theorems, I might be struggling to find these because they're more of a technique?

You should take a look at http://us.metamath.org/mpeuni/natded.html, which explains how the natural deduction style of proof maps over to the hilbert style used by metamath.

Mario

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.
To post to this group, send email to meta...@googlegroups.com.
Visit this group at https://groups.google.com/group/metamath.
For more options, visit https://groups.google.com/d/optout.

Norman Megill

unread,
Aug 18, 2018, 7:05:39 AM8/18/18
to Metamath
On Saturday, August 18, 2018 at 4:50:56 AM UTC-4, Antony Bartlett wrote:
...


I've also thought about attacking the problem of learning set.mm Propositional Logic from QED or The Incredible Proof machine the other way around.  I could take the set.mm proof of, say, Peirce's Law and break it into smaller pieces until it starts to resemble my QED proof of Peirce's Law or my Incredible Proof Machine proof of Peirce's Law.

'peirce' uses 'simplim', 'id', and 'ja'
'simplim' uses 'pm2.21', and 'con1i'

So the first step would be to splice 'peirce' and 'simplim' together in order to express 'perice' in terms of 'pm2.21', 'con1i', 'id', and 'ja'.  I can just about manage to do that sort of thing manually, but is there code that automates this operation, please?

  peirce.splice1 $p |- ( ( ( ph -> ps ) -> ph ) -> ph ) $=
    wph wps wi wph wph wph wph wps wi wph wps pm2.21 con1i wph id ja $.

metamath.exe has an "expand" command.  Here is your example:

MM> prove peirce
Entering the Proof Assistant.  HELP PROOF_ASSISTANT for help, EXIT to exit.
You will be working on statement (from "SHOW STATEMENT peirce"):
655 peirce $p |- ( ( ( ph -> ps ) -> ph ) -> ph ) $= ... $.
Note:  The proof you are starting with is already complete.
MM-PA> show new_proof
 8   ja.1=simplim $p |- ( -. ( ph -> ps ) -> ph )
10   ja.2=id      $p |- ( ph -> ph )
11 peirce=ja    $p |- ( ( ( ph -> ps ) -> ph ) -> ph )
MM-PA> expand simplim
Proof of "peirce" increased from 32 to 41 bytes after expanding "simplim".
MM-PA> show new_proof
12     con1i.a=pm2.21 $p |- ( -. ph -> ( ph -> ps ) )
13   ja.1=con1i     $p |- ( -. ( ph -> ps ) -> ph )
15   ja.2=id        $p |- ( ph -> ph )
16 peirce=ja      $p |- ( ( ( ph -> ps ) -> ph ) -> ph )
MM-PA> sh show new_proof /normal
Proof of "peirce":
---------Clip out the proof below this line to put it in the source file:

    wph wps wi wph wph wph wph wps wi wph wps pm2.21 con1i wph id ja $.
---------The proof of "peirce" (64 bytes) ends above this line.
MM-PA>

Norm

Norman Megill

unread,
Aug 18, 2018, 1:47:11 PM8/18/18
to Metamath
Here's an experiment to take expansion to the extreme and obtain a proof directly from axioms.

MM> prove peirce
Entering the Proof Assistant.  HELP PROOF_ASSISTANT for help, EXIT to exit.
You will be working on statement (from "SHOW STATEMENT peirce"):
655 peirce $p |- ( ( ( ph -> ps ) -> ph ) -> ph ) $= ... $.
Note:  The proof you are starting with is already complete.
MM-PA> show new_proof
 8   ja.1=simplim $p |- ( -. ( ph -> ps ) -> ph )
10   ja.2=id      $p |- ( ph -> ph )
11 peirce=ja    $p |- ( ( ( ph -> ps ) -> ph ) -> ph )

For fun, let us expand everything back to axioms.  Warning: "expand *" is normally a dangerous command that will overflow memory (quadrillions of steps could result) except for very early proofs like the one for "peirce".

MM-PA> expand *
Proof of "peirce" increased from 32 to 52 bytes after expanding "ja".
Proof of "peirce" increased from 52 to 65 bytes after expanding "pm2.61d1".
...
Proof of "peirce" increased from 2060 to 2381 bytes after expanding "a1i".
MM-PA> show new_proof /reverse
8218 peirce=ax-mp $a |- ( ( ( ph -> ps ) -> ph ) -> ph )
8217   maj=ax-mp    $a |- ( ( ( ( ph -> ps ) -> ph ) -> ( -. ph -> ph ) ) -> (
                                               ( ( ph -> ps ) -> ph ) -> ph ) )
...

Only ax-1,2,3,mp are referenced in this proof, which has 8218 "normal" steps.  Next, let's minimize the proof:

MM-PA> minimize *
Bytes refer to compressed proof size, steps to uncompressed length.
Scanning forward through statements...
Proof of "peirce" decreased from 2381 to 1785 bytes using "dummylink".
...
Proof of "peirce" decreased from 227 to 31 bytes using "pm2.18d".
The forward scan results were used.
MM-PA> show new_proof
10   pm2.18d.1=jarl $p |- ( ( ( ph -> ps ) -> ph ) -> ( -. ph -> ph ) )
11 peirce=pm2.18d $p |- ( ( ( ph -> ps ) -> ph ) -> ph )


We have a proof of the same number of "normal" steps (11) as the orginal, but with completely different theorems referenced!

Now let's expand back to axioms again:

MM-PA> expand *
Proof of "peirce" increased from 31 to 48 bytes after expanding "jarl".
...
Proof of "peirce" increased from 557 to 616 bytes after expanding "a1i".
MM-PA> sh n/rev
1862 peirce=ax-mp $a |- ( ( ( ph -> ps ) -> ph ) -> ph )
...


What is interesting is that the second expansion to axioms resulted in only 1862 ("normal") steps, compared to 8218 steps for the first expansion.  So if our interest is a proof directly from axioms, the 2nd expansion would be better.  I added it to the end of  "shortest known proofs" challenge at http://us2.metamath.org:88/mmsolitaire/pmproofs.txt where it is 95 steps long in D notation (which doesn't need syntax-building steps).

Norm

Antony Bartlett

unread,
Aug 18, 2018, 3:57:04 PM8/18/18
to meta...@googlegroups.com
Dear Norm,

That's wonderful, thank you!

I've tried that on contraposition

  con3 $p |- ( ( ph -> ps ) -> ( -. ps -> -. ph ) ) $=
    ( wi id con3d ) ABCZABFDE $.

and I get 213 steps from the axioms.  So it might interest you that someone's managed it in 72 steps!


By the way this was where I first learnt about MetaMath.

> Warning: "expand *" is normally a dangerous command that will overflow memory (quadrillions of steps could result) except for very early proofs like the one for "peirce".

Yes, I'm not suprised, this is directly related to what I said earlier about Directed Acyclic Graphs being more efficient... the dependency relationship between theorems is a DAG, as is the compressed proof format... "expand *" forces some of that information into a tree structure which creates a *lot* of repetition.

    Best regards,

        Antony



--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+unsubscribe@googlegroups.com.

Norman Megill

unread,
Aug 18, 2018, 4:57:21 PM8/18/18
to Metamath
On Saturday, August 18, 2018 at 3:57:04 PM UTC-4, Antony Bartlett wrote:
Dear Norm,

That's wonderful, thank you!

I've tried that on contraposition

  con3 $p |- ( ( ph -> ps ) -> ( -. ps -> -. ph ) ) $=
    ( wi id con3d ) ABCZABFDE $.

and I get 213 steps from the axioms.

Well, 'expand *' is pretty dumb and by no means tries to find a short proof. I was surprised that the peirce proof was so short, which may have just been luck.
 

Actually con3 is theorem *2.16 on http://us.metamath.org/mmsolitaire/pmproofs.txt with a proof of 59 steps. Perhaps you can post this link for them?  (I don't have a stackexchange account.)  That page would provide plenty of challenges for them to beat!

Norm
 

Antony Bartlett

unread,
Aug 19, 2018, 2:10:17 PM8/19/18
to meta...@googlegroups.com
Dear Norm,

Thanks, yes I will post this solution (as a 'community wiki' so I don't take the credit for it) to stack exchange, with an invitation to the other solitaire challenges.  However, I'm afraid I'm having trouble getting it in a /lemmon style format (I'm confident I can modify it from there into the format prefered by the person asking the question).  I've tried adding a dPrintProof() call to drule.c and ended up with an unhandled exception I can't figure out, and I've tried manually entering the steps into the MetaMath Proof Assistant, and must have made a mistake because it's telling me '?Statement "ax-2" cannot be unified with step 22.'

I do have it verified in The Incredible Proof Machine, however


    Best regards,

        Antony


Benoit

unread,
Aug 19, 2018, 3:21:53 PM8/19/18
to Metamath
I translated the proof of *2.16 from pmproofs.txt to the metamath format.  It reads

  con3test $p |- ( ( ph -> ps ) -> ( -. ps -> -. ph ) ) $=
    ( wi wn ax-1 ax-3 ax-mp ax-2 ) ABCZABDZDZCZCZIJADZCZCZABKCZCZMQRKDZJCZQSJSC
    ZCZTSJESUAJCZCZUBTCSKUADZCZCZUDSUEDZSCZCZUGSUHESUIUFCZCZUJUGCUKULUEKFUKSEGS
    UIUFHGGSUFUCCZCZUGUDCUMUNJUAFUMSEGSUFUCHGGSUAJHGGKBFGQAEGABKHGILOCZCZMPCUOU
    PLNDZKCZCZUOLUQLCZCZUSLUQELUTURCZCZVAUSCVBVCUTUQACZCZVBUQLACZCZVEUQNLDZCZCZ
    VGUQVHDZUQCZCZVJUQVKEUQVLVICZCZVMVJCVNVOVHNFVNUQEGUQVLVIHGGUQVIVFCZCZVJVGCV
    PVQALFVPUQEGUQVIVFHGGUQLAHGUTVDURCCVEVBCUQAKHUTVDURHGGVBLEGLUTURHGGLUROCZCZ
    USUOCVRVSNJFVRLEGLUROHGGUOIEGILOHGG $.

From this, you can get different proof displays using the metamath commands.  Hope this is what you want.

Best,
Benoît

Antony Bartlett

unread,
Aug 19, 2018, 4:47:51 PM8/19/18
to meta...@googlegroups.com
OK, I've posted the contraposition (Principia Mathematica) proof to the Code Golf (StackExchange) question.


Thanks Norm for telling me about this very short proof.
Thanks Benoît for putting it in MetaMath format for me.
Thanks Mario for the natural deduction style of proof in metamath resource - I was struggling with it when Norm told me about the 59 step proof, and I'll no doubt go back to struggling with it in a day or two.

    Best regards,

        Antony


Norman Megill

unread,
Aug 21, 2018, 1:26:15 PM8/21/18
to Metamath
George Szpiro took the 95 step proof of pierce I mentioned

(((P -> Q) -> P) -> P); ! Peirce's axiom
DD2D1DD2DD2D13D2DD2D13DD2D11DD211DD211DD2D1D2DD2D1D2D2D1DD211DD2D11DD
2D13DD2D11DD211DD2D11DD211; ! 95 steps

and ran a "genetic" algorithm (that he had used in April and May to find 2 shorter proofs in pmproofs.txt).  I'll post his description of the algorithm in a separate post since it is rather long.

His algorithm found two different 43-step proofs of peirce:

DD2D1DD2DD2D13D2DD2D1311DD2D1DD22D1DD2D1311 – 43 steps

DD2D1DD2D1DD2DD2D13D2DD2D1311DD22D1DD2D1311 – also 43 steps


I think it is amazing that George's algorithm was able to find these possibly shortest proofs, particularly since I was very skeptical that it would find anything.

Separately, I tried to find a short proof by hand. and found yet another 43-step proof:

DD2D1DD2DD2D13D2DD2D1311DD2DD2D121D1DD2D131



If anyone is interested, here is how I found it by hand.

Facts I started from:

#1 From pmproofs.txt:
((~ P -> P) -> P); ! *2.18
DD2DD2D13D2DD2D1311; ! 19 steps

#2  From pmproofs.txt:
(~ P -> (P -> Q)); ! *2.21
DD2D131; ! 7 steps

#3 From the Metamath Solitaire page question 8:
From  (P -> Q) and (Q -> R) conclude (P -> R)
DD2D1 followed by 2 subproofs

#4 From some old notes:
From (P -> Q) infer (R -> Q) -> (R -> P)
DD2DD2D121D1 followed by 1 subproof


Using these facts, the following construction yields the proof.  To test the subproofs along the way, I used drule.c with one argument, e.g.  "./drule DD2DD2D121D1DD2D131" for step #5 below.

#5 From #4 applied to #2, we get (P -> (P -> Q)) -> (P -> ~ P)
DD2DD2D121D1 DD2D131
Note that drule.c returns the more general >>>PQR>~PR (Polish notation) of which #5 is a substitution instance.

#6 From #3 applied to #5 and #1, we conclude (((P -> Q) -> P) -> P)
DD2D1    DD2DD2D13D2DD2D1311    DD2DD2D121D1 DD2D131

So the final proof is:
DD2D1DD2DD2D13D2DD2D1311DD2DD2D121D1DD2D131  43 steps

Norm



On Saturday, August 18, 2018 at 1:47:11 PM UTC-4, Norman Megill wrote:
Here's an experiment to take expansion to the extreme and obtain a proof directly from axioms.

MM> prove peirce


.[rest omitted]

Norman Megill

unread,
Aug 21, 2018, 1:30:20 PM8/21/18
to Metamath

(As indicated in my previous post, here is George's description of his "genetic algorithm", copied from his email to me of 5/21/18. -Norm)


Below, I give a short summary and explanation of the algorithm.


   population = CreatePrototypes()

   population = InsertRussellWhitehead(population)  

    for generation in range(0,NrOfGenerations):

        population = Ranking(population)

        population = RemoveTwins(population)

        population = Mutations(population)

        population = FormCouples(population)

        population = Renewal(population)

        population = NextGen(population)

 

CreatePrototypes: 199 prototypes strings of the ‘symmetric tree’ form are created to form the population:

DDDDjkDjkDDjkDjkDDDjkDjkDDjkDjk.

j and k stand for axioms 1, 2 and 3. Each time a protoype is created, it is tested with drule to see whether it is a ‘legitimate’ string.

(It takes approximately 40,000 prototypes to obtain 200 legitimate prototypes of the above form; i..e, about 0.5 percent of the attempts are legitimate.)

 

InsertRussellWhitehead: the string from pmproofs that is thought to be the shortest, is entered into the population. E.g. Theorem 1:4: DD2D13D2D1D3DD2DD2D13DD2D1311

 

The theorem (i.e. the output of drule when it is run with RussellWhitehead as an input) is entered in reverse Polish notation as the ‘target’.

E.g.: Theorem 1.4: >>~PQ>~QP

 

Ranking: drule is run with all strings. (I use the word ‘string’ now. ‘Prototype’ is when a new string is created) The strings are ranked according to how much the results (the outputs of drule) overlap with the target. That is the ‘fitness’. (More on fitness below).

Of course, RussellWhitehead is ranked first because 100% of its output overlaps with the target.

 

RemoveTwins: after a few generations, it often happens that a large part of the population is identical to RussellWhitehead or to some other highly-ranked string. To prevent that the entire population becomes largely homogenous, only one copy of such strings are kept. The other copies are deleted and replaced with new prototypes. (Each new prototype ios tested, as usual, with drule to check that it is legitimate.) Thus, diversity of the population is guaranteed.

 

Mutations: a certain percentage of the j’s and k’s are randomly mutated into any other axiom. Again, all such mutated strings are tested with drule and if the string is ‘illegitimate’ another mutation is attempted. The three top-ranked strings are exempt from mutation, so they do not get lost inadvertently.

 

FormCouples:

The top-ranked quarter of the strings are the ‘dads’. Each dad randomly choses a ‘mom’ from the lower-ranked strings, with more weight being given to higher ranked moms.

 

Renewal: the strings ranked in the bottom quarter are discarded and replaced by new prototypes. (Tested with drule) This increases the population’s diversity.

 

NexGen: THIS IS THE CRUX OF THE ALGORITHM

Our population consists of 200 strings of the forms:

aaaaaaaaaaaaaaaaaaaaaaaa

BBBBBBBBBBBBBBB

Then, each pair of parents create two offspring through ‘crossover’. I.e., chunks are randomly chosen and exchanged:

 

aaaaBBBBaaaaaaaaa

BBBBBBBaaaaaaaaBBBB….

 

The chunks that are exchanged must have an equal number of D’s and axioms 1,2,3. The resulting offspring are tested with drule. If they are illegitimate, the crossover is run again, until the offspring are legitimate.

 

Then we go on to the next generation. I usually run between 500 and 200 generations. (about 30 seconds per generation).

 

A remark on fitness: the algorithm aims for two things: the overlap should eventually become 100% and the string should be as short as possible. I have tried various forms of fitness functions and what seems to work well is:

 

(Percentage of overlap) * (Length of overlap)^2

              (length of the drule-result)^0.33

 

Results:

So far, I only found two strings that are shorter than the ones listed on pmproofs (Meredith and 3.37). This may mean that the algorithm is not very effective in finding shorter strings or, what I suspect, the strings listed on pmproofs are, indeed the shortest ones.

 

What gives me confidence in the algorithm is that it found several strings that were as long as – but different from – the shortest string listed. For example, for theorem 4.73 it found three additional strings of length 113 which result in >P~>>Q~>Q~P~>~>RSR:

 

DD2DD2D1DD2D13DD2D1D2DD2DD2D13DD2D13111DD2D1D2DD2D13DD2D1D2DD2DD2D13DD2D131111D1D3DD2D1D3DD2DD2D13DD2D1311DD2D131 DD2DD2D1DD2D13DD2D1D2DD2DD2D13DD2D13111DD2D1D2DD2D13DD2D1D2DD2DD2D1DD2D1331111D1D3DD2D1D3DD2DD2D13DD2D1311DD2D131

DD2DD2D13DD2D1DD2D1D2DD2DD2D13DD2D13111DD2D1D2DD2D13DD2D1D2DD2DD2D1DD2D1331111D1D3DD2D1D3DD2DD2D13DD2D1311DD2D131

DD2DD2D1DD2D13DD2D1D2DD2DD2D13DD2D13111DD2D1D2DD2D13DD2D1D2DD2DD2D13DD2D131111D1D3DD2D1D3DD2DD2D1DD2D13311DD2D131

 

Reply all
Reply to author
Forward
0 new messages