Implementing a Computer Algebra System (CAS) over metamath


Olivier Binda

Mar 8, 2019, 4:34:02 AM3/8/19
to Metamath
Hello

I'm currently trying to implement a CAS over metamath.

So far, I have implemented (for decimals):
1) parsing (metamath string -> tree)
2) addition for NN0
3) multiplication for NN0
4) computing a simplified metamath statement with mixed addition/multiplication

Next, I would like to implement subtraction A - B, and I was wondering what the best approach would be:

A) convert A & B to binary representations bigIntA & bigIntB (BigInteger class),
compute bigIntResult = bigIntA - bigIntB,
convert bigIntResult to a metamath string result,
prove that B + result = A, and then that A - B = result

B) do no binary conversion, but implement it like we do in high school, which would require facts like
8 - 3 = 5
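Option B's digit-by-digit borrowing could be sketched like this (a Python illustration of the bookkeeping the proof engine would have to mirror; `subtract_digits` is a hypothetical name, not part of the project's API):

```python
def subtract_digits(a_digits, b_digits):
    # school-book subtraction with borrow; digits are base 10,
    # most-significant first, and we assume a >= b
    a_rev, b_rev = a_digits[::-1], b_digits[::-1]
    result, borrow = [], 0
    for i, d in enumerate(a_rev):
        s = d - borrow - (b_rev[i] if i < len(b_rev) else 0)
        borrow = 1 if s < 0 else 0       # each borrow is a fact like 12 - 7 = 5
        result.append(s + 10 * borrow)
    while len(result) > 1 and result[-1] == 0:  # strip leading zeros
        result.pop()
    return result[::-1]
```

Each loop iteration corresponds to one single-digit subtraction fact (plus a possible borrow) that the proof would have to discharge.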


The same holds for division too, I guess
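For option A, the only proof obligation left after the native computation is the addition B + result = A; a minimal Python sketch of that idea (hypothetical helper, assuming the statements have already been parsed to integers):

```python
def subtract_via_addition_proof(a, b):
    # option A: do the subtraction with native big-integer arithmetic
    # (the untrusted "oracle" step) ...
    result = a - b
    # ... then the only fact the proof engine must establish is an
    # addition, which is already implemented:
    assert b + result == a, "addition proof obligation"
    # from ( B + result ) = A one concludes ( A - B ) = result
    return result
```

The same shape works for division: compute q = a // b natively, then discharge the multiplication proof b * q = a (plus a remainder term in the general case).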

I am quite happy with the API I found for expressing computations (there are probably possible improvements still).
My code for multiplication is just:
fun Engine.reduceMultiplication(statement: Statement): Statement {
    val (a, op, b) = statement.children
    if (!op.equals(MM_MULTIPLICATION)) throw Exception("wrong operation : $op")
    return when {
        a.matchComplexMultiplication() -> statement.map(reduceMultiplication(a)).eq(Engine::reduceMultiplication)
        b.matchComplexMultiplication() -> statement.map(reduceMultiplication(b)).eq(Engine::reduceMultiplication)
        a.isZero() -> statement.eq(laMul02i, b) { statement0 }
        b.isZero() -> statement.eq(laMul01i, a) { statement0 }
        a.isOne() -> statement.eq(laMulid2i, b) { b }
        b.isOne() -> statement.eq(laMulid1i, a) { a }
        a.isDigit() && b.isDigit() -> statement.eq { multiplies2digits(a, b, statement) }
        a.isDigit() && b.hasDecimalConstructor() -> useTheorem(laMulcomi, a, b).asProvenEquality().eq(Engine::reduceMultiplication)
        a.hasDecimalConstructor() && b.hasDecimalConstructor() -> {
            val (b1, b2) = b.children
            /** b2 is supposed to be a digit */
            if (!b2.isDigit()) return statement
            val sA = "( ∨1c x. ∨2c )".with(a, b1)
            val sB = "( ∨1c x. ∨2c )".with(a, b2)
            useTheorem(neoremDec2Mul, a, b, "; ∨1c ∨2c".with(sA, statement0), sB, b1, b2).asProvenEquality()
                .map(reduceMultiplication(sA), reduceMultiplication(sB))
                .eq(Engine::reduceAddition)
        }
        a.hasDecimalConstructor() && b.isDigit() -> {
            val (a1, a2) = a.children
            /** a2 is supposed to be a digit */
            if (!a2.isDigit()) throw Exception("$a2 is not a digit")
            val a2TimesB = multiplies2digits(a2, b)
            if (!a2TimesB.hasDecimalConstructor()) {
                val upperResult = "( ∨1c x. ∨2c )".with(a1, b)
                useTheorem(neoremDecMul, a, b, upperResult, a2TimesB, a1, a2).asProvenEquality()
                    .map(reduceMultiplication(upperResult))
            } else {
                val (sA, sB) = a2TimesB.children
                val s1 = "( ∨1c x. ∨2c )".with(a1, b)
                val upperResult = "( ∨1c + ∨2c )".with(s1, sA)
                useTheorem(neoremDecMulCarry, a, b, upperResult, sB, a1, a2, sA).asProvenEquality()
                    .map(upperResult.map(reduceMultiplication(s1)).eq(Engine::reduceAddition))
            }
        }
        else -> statement
    }
}

private fun Engine.multiplies2digits(a: Statement, b: Statement, aTimesBOrNull: Statement? = null): Statement = when {
    a.isZero() -> useTheorem(laMul02i, b).asProvenEquality()
    b.isZero() -> useTheorem(laMul01i, a).asProvenEquality()
    a.isOne() -> useTheorem(laMulid2i, b).asProvenEquality()
    b.isOne() -> useTheorem(laMulid1i, a).asProvenEquality()
    else -> {
        // handle other digits
        val atimesB = aTimesBOrNull ?: "( ∨1c x. ∨2c )".with(a, b)
        val digitA = a.asInt()
        val digitB = b.asInt()
        val intResult = digitA * digitB
        // multiplies digits
        useTheorem(laMul(digitA, digitB))
        // build result
        val swap = digitA < digitB
        val a1TimesB1 = if (swap) "( ∨1c x. ∨2c )".with(b, a) else atimesB
        val result = if (intResult < 10) intResult.toString().toStatement() else "; ${intResult / 10} ${intResult % 10}".toStatement()
        // handle 10
        if (intResult == 10) useTheorem(laEqtri, a1TimesB1, result, useTheorem(laDec10).asProvenEqualityReversed() /* statement10 */)
        // handle commutativity
        if (swap) {
            useTheorem(laMulcomi, a, b)
            useTheorem(laEqtri, atimesB, result, a1TimesB1)
        }
        result
    }
}


Giovanni Mascellani

Mar 9, 2019, 12:13:30 PM3/9/19
to meta...@googlegroups.com
Hi Olivier,

On 08/03/19 10:34, Olivier Binda wrote:
> I'm currently trying to implement CAS over metamath

I might have lost track of a few emails: in which language is your code
written? How can I try it?

I like your project (which would also be a long-term goal of my own
project). I too would like to see Metamath usable by working
mathematicians without necessarily having a background in logic and
without having to deal with all the fine details that Metamath
currently requires the user to mind.

Thanks, Giovanni.
--
Giovanni Mascellani <g.masc...@gmail.com>
Postdoc researcher - Université Libre de Bruxelles


Olivier Binda

Mar 9, 2019, 2:51:00 PM3/9/19
to Metamath
Hello Giovanni

At the moment, it is a (closed-source) multi-platform project written in Kotlin that only lives on my computer.
All the code is pure Kotlin, which makes it work on Linux/Mac/Windows (native platform), iOS & Android, the JVM, and the browser (through the JavaScript target), with the exception of a metamath antlr4/hybrid parser (only available on the JVM platform).

But, I would just need to do what Norman suggested in his latest post (minimalist metamath) to implement a pure Kotlin parser for metamath (also, there is this repository that theoretically allows antlr4 to build a metamath parser for pure Kotlin, but I haven't been able to make it work so far: https://github.com/ftomassetti/antlr-kotlin)

So, at the moment, I'm only running my code on the JVM, but I aim to deliver it to all intended platforms.

I'm not against open-sourcing some of the code, but I won't make everything available publicly.
Concerning the result, I will probably make the binary I produce free to use (so that everyone can use maths/metamath).

But, to make it worth my time (I'll be doing it for years in my spare time, hoping to feed my family), there might be some kind of monetisation (no ads, no selling privacy data, no bad stuff), like giving 2€ to access some advanced (but not necessary for maths) features I would implement (like lessons to get better at maths, statistics, an RPG game using maths to kill monsters, who knows what ^^)

Also, at the moment, it is mostly experimentation (I'm a complete beginner at implementing a CAS and I have only been playing with metamath for a few months). BUT, as soon as I have something useful for you all, metamath power users, I'll implement a JVM app or a web app to make it available to you, so that you can reap the benefits of it (and also give me precious feedback and improvement suggestions) and build proofs faster and the easy way.

I'll be posting my progress as it happens. Yesterday, I managed to get this working:
assertEquals("; ; ; ; ; ; 1 0 5 4 4 4 3".toStatement(), engine.reduce("( ; 1 3 x. ( ; ; 1 2 1 + ( ( 7 x. ; 8 9 ) x. ( ; ; 1 2 3 + 7 ) ) ) )".toStatement()))

assertEquals("( ( 1 + ( ∨1c x. 2 ) ) + ( ( ∨1c x. ∨1c ) x. 3 ) )".toStatement(), engine.reduce("( ( 1 + ( ∨1c x. ( 1 + 1 ) ) ) + ( ( ∨1c x. ∨1c ) x. ( 1 + ( 1 + 1 ) ) ) )".toStatement()))

So, I only have addition and multiplication for now.

I stumbled on your repository https://github.com/giomasce/mmpp but it is GPL-2, so... I just looked at the readme.
(I coded for 2 years on my Nintendo DS in C++ in 2009, and I remember spending 2 days finding a single bug (memory leaks :/ );
to have gone so far with C++, you must have spent a lot of time on it, wow...)

Olivier Binda

Mar 13, 2019, 6:28:29 AM3/13/19
to Metamath

I have implemented subtraction (I chose option A: using help from binary computations) for positive integers
assertEquals("; ; ; ; ; ; 1 0 5 1 2 9 7".toStatement(), engine.reduce("( ; 1 3 x. ( ( ( 7 x. ; 8 9 ) x. ( ; ; 1 2 3 + 7 ) ) - ; ; 1 2 1 ) )".toStatement()))

and factorials yesterday.
But I hit a big hurdle (performance):

computing 30! takes 1 s but computing 60! takes 18 s :/

assertEquals("; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 8 3 2 0 9 8 7 1 1 2 7 4 1 3 9 0 1 4 4 2 7 6 3 4 1 1 8 3 2 2 3 3 6 4 3 8 0 7 5 4 1 7 2 6 0 6 3 6 1 2 4 5 9 5 2 4 4 9 2 7 7 6 9 6 4 0 9 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0".toStatement(), engine.reduceFactorial("( ! ` ; 6 0 )".toStatement()))
t=t.logTime()//18096 ms

I had previously foolishly tried to compute 300! and I had to reboot my computer, which had frozen.

Though I am using some tail-recursion optimizations, and doing it this way:
n! = (n-1)! * n = ((n-2)! * (n-1)) * n = (n-2)! * reduce((n-1)*n) = ...
my guess is that there are still a lot of single-digit multiplications and slow additions, and those pile up (I just observed the issue, I haven't computed the O() space/time complexity).

A multiplication between 2 numbers with digits in the hundreds should require quite some amount of work (including associativity, commutativity, additions, single-digit multiplications, ...).

Proving operations at the bare-metal (metamath) level is really resource intensive.

Norman Megill

Mar 13, 2019, 11:56:05 AM3/13/19
to Metamath
This looks impressive.  Not having studied the code, I'm somewhat uncertain about what is going on "in between".  Are you generating native metamath proofs for these statements?  If so, how big are the proofs?  Or how big would you estimate them to be?

From the point of view of archiving the results (in say set.mm), what matters is the size of the proofs and possibly (if they are extremely large) the run time of verifying them with a metamath proof verifier.  The time needed to generate the proofs is somewhat irrelevant, since it's a one-time activity that is done once and never needed again.  Of course, tools to help generate proofs more efficiently are welcome.

Obviously metamath is not well-suited to do routine arithmetic on large numbers efficiently, which has more to do with the nature of complete formal proofs from axioms than with metamath per se.  In principle we can call an external "oracle" (calculator) that returns the result of a large calculation, then insert it into the metamath database as a new axiom, but philosophically that is a little ugly and would be a last resort.

An interesting challenge is to figure out how to prove that the results of certain long calculations hold without doing the calculations in the proof itself.  A trivial example would be proving that 300! is an even number without expanding it.  A more sophisticated example is using a certificate to show a large number is prime.  Calculating the certificate can be done once outside of metamath (and in principle we don't care how long the calculation takes), but proving the validity of the certificate is a relatively short proof.  I think Mario was on the verge of doing this for some primes needed to prove Bertrand's postulate, but in the end direct proofs won out because the size of the numbers wasn't quite large enough to justify the overhead of using a certificate (Mario may wish to comment or correct me).

A paper written by Mario is "Arithmetic in Metamath Case Study: Bertrand's Postulate" (in case it wasn't mentioned before; I may have missed it):
http://arxiv.org/abs/1503.02349
http://us.metamath.org/downloads/bpos.pptx.pdf  (slides)
http://youtu.be/PF9cL3RABIw

Norm

Mario Carneiro

Mar 13, 2019, 1:19:55 PM3/13/19
to metamath
On Wed, Mar 13, 2019 at 11:56 AM Norman Megill <n...@alum.mit.edu> wrote:
This looks impressive.  Not having studied the code, I'm somewhat uncertain about what is going on "in between".  Are you generating native metamath proofs for these statements?  If so, how big are the proofs?  Or how big would you estimate them to be?

From the point of view of archiving the results (in say set.mm), what matters is the size of the proofs and possibly (if they are extremely large) the run time of verifying them with a metamath proof verifier.  The time needed to generate the proofs is somewhat irrelevant, since it's a one-time activity that is done once and never needed again.  Of course, tools to help generate proofs more efficiently are welcome.

Obviously metamath is not well-suited to do routine arithmetic on large numbers efficiently, which has more to do with the nature of complete formal proofs from axioms than with metamath per se.  In principle we can call an external "oracle" (calculator) that returns the result of a large calculation, then insert it into the metamath database as a new axiom, but philosophically that is a little ugly and would be a last resort.

This, by the way, is one of the planned applications of Metamath Zero (and surrounding projects). The idea is that you can verify a verifier and use it to extend the system you are running, so that you get ad hoc proof procedures with bare metal speed. I've claimed on several occasions that metamath should be able to "execute programs" by means of a proof trace, but the big overhead there is the amount of equality testing, since we are taking the verifier through the steps and also checking at each stage that the verifier is still on board. This checking is not required if we already trust the verifier, so the process can theoretically be sped up by just running the program and asserting the end result. My goal is to find ways to incorporate this into the proof process in a formally sound way.
 
An interesting challenge is to figure out how to prove that the results of certain long calculations hold without doing the calculations in the proof itself.  A trivial example would be proving that 300! is an even number without expanding it.  A more sophisticated example is using a certificate to show a large number is prime.  Calculating the certificate can done once outside of metamath (and in principle we don't care how long the calculation takes), but proving the validity of the certificate is a relatively short proof.  I think Mario was on the verge of doing this for some primes needed to prove Bertrand's postulate, but in the end direct proofs won out because the size of the numbers wasn't quite large enough to justify the overhead of using a certificate (Mario may wish to comment or correct me).

Actually, the proof in set.mm that 4001 is prime uses one of these "Pocklington certificates", although it's mostly for demonstration purposes at this point, because the proof of bpos doesn't use it, and also I think a direct proof by trial division would be shorter (4001 isn't large enough for the certificate to be worth it).

In the proof itself, though, there are some more applications of certificates at the smaller scale. For example, modular exponentiation is a major component of the calculation. In order to calculate a^n mod b where n is large, rather than doing n multiplications you can do log n multiplications using binary exponentiation, i.e. a^(2*n) = a^n * a^n and a^(2*n+1) = a^n * a^n * a, where the duplication of a^n (mod b) means we only have to calculate it once.
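The recursion Mario describes is the standard square-and-multiply loop; a self-contained Python sketch (not code from the thread), reducing mod b at every step so intermediate values stay small:

```python
def pow_mod(a, n, b):
    # binary exponentiation: a^(2k) = (a^k)^2, a^(2k+1) = (a^k)^2 * a;
    # each squared value is computed once and reused, giving O(log n)
    # multiplications instead of n
    result = 1
    base = a % b
    while n > 0:
        if n & 1:                        # current binary digit of n is 1
            result = (result * base) % b
        base = (base * base) % b         # square for the next binary digit
        n >>= 1
    return result
```

In a proof setting, each loop iteration corresponds to one or two multiplication lemmas plus a mod-reduction lemma, so the proof length tracks the number of multiplications.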

This is a well known technique, and most algorithms for doing exponentiation on the computer do just this. But what is less well known is that you can do even better than this for known values of n, using so called "addition chains". We view the problem as a way to get to n by adding 1's and previously computed values together. For example, with n=15, the binary method is essentially

3 = 1+1+1
7 = 3+3+1
15 = 7+7+1

for a total of 6 additions, while there is a shorter alternative

3 = 1+1+1
6 = 3+3
12 = 6+6
15 = 3+12

requiring only 5 additions. Finding optimal chains is quite hard and requires lots of brute-forcing, but the result is a shorter final proof. I needed an addition chain for n = 800 for the 4001 proof.
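The two chains above can be compared concretely. This sketch (hypothetical helper, not from the thread) evaluates a^15 mod b with the shorter 5-addition chain 2 = 1+1, 3 = 2+1, 6 = 3+3, 12 = 6+6, 15 = 3+12, doing one modular multiplication per addition in the chain:

```python
def pow_via_chain(a, chain_steps, b):
    # chain_steps: list of (i, j) index pairs into the table of powers
    # already computed; powers[k] = powers[i] * powers[j] mod b,
    # i.e. exponents add where powers multiply
    powers = [a % b]                     # powers[0] = a^1
    for i, j in chain_steps:
        powers.append(powers[i] * powers[j] % b)
    return powers[-1]

# exponents reached: 2, 3, 6, 12, 15 -- five multiplications total
CHAIN_15 = [(0, 0), (1, 0), (2, 2), (3, 3), (2, 4)]
```

Each pair is a free choice, which is exactly what makes finding the optimal chain a search problem.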


On Wed, Mar 13, 2019 at 6:28 AM Olivier Binda <olus....@gmail.com> wrote:

I have implemented subtraction (I chose option 1 : using help from binary computations) for positive integers 
assertEquals("; ; ; ; ; ; 1 0 5 1 2 9 7".toStatement(), engine.reduce("( ; 1 3 x. ( ( ( 7 x. ; 8 9 ) x. ( ; ; 1 2 3 + 7 ) ) - ; ; 1 2 1 ) )".toStatement()))

and factorials yesterday
But I hit a big hurdle (performance)

computing 30 ! takes 1s but computing 60 ! takes 18s :/

I might be wrong, but I think this might be a result of inefficiencies of your code rather than innate intractability of the metamath proof. It should be provable in a fraction of a second, even if you work it all out in full detail.

But like Norm says, you are profiling the wrong thing here. The question is how long is the metamath proof, and how long does it take to check that proof. The time it takes to generate the proof is irrelevant.
assertEquals("; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 8 3 2 0 9 8 7 1 1 2 7 4 1 3 9 0 1 4 4 2 7 6 3 4 1 1 8 3 2 2 3 3 6 4 3 8 0 7 5 4 1 7 2 6 0 6 3 6 1 2 4 5 9 5 2 4 4 9 2 7 7 6 9 6 4 0 9 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0".toStatement(), engine.reduceFactorial("( ! ` ; 6 0 )".toStatement()))
t=t.logTime()//18096 ms

I had previously foolishly tried to compute 300 ! and I had to reboot my computer that had frozen.

Though I am using some tail recursivity optimizations, and doing it this way
n! = (n-1)! * n = ((n-2)! * (n-1)) * n = (n-2)! * reduce((n-1)*n) = ...
My guess is that there are still a lot of single digits multiplications and slow additions and those pile up (I just observed the issue, I haven't computed the O() in space/time complexity)  

I looked up fast factorial algorithms, and it turns out that there are methods asymptotically faster than the naive one, and those translate to shorter metamath proofs. Unfortunately there is very little literature on IP bounds for many common computation problems, but in this case there is a pretty easy one to describe: split your number 60! into its prime factorization (you don't have to prove that the factors are prime, they just have to be small numbers) and multiply them together in a balanced way (so that each pair being multiplied together has roughly the same size). This keeps the resulting numbers small, so that most multiplications are small except the final one producing the actual target number.
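A sketch of that idea in Python (plain integers standing in for metamath statements; the helper names are mine, not Olivier's): collect the small prime factors of n! and multiply them back together pairwise, so both operands of almost every multiplication have roughly the same size.

```python
def prime_factors_of_factorial(n):
    # multiset of prime factors of n!, obtained by trial-dividing
    # each k <= n (no primality proofs needed, the factors are small)
    factors = []
    for k in range(2, n + 1):
        d, m = 2, k
        while d * d <= m:
            while m % d == 0:
                factors.append(d)
                m //= d
            d += 1
        if m > 1:
            factors.append(m)
    return sorted(factors)

def balanced_product(xs):
    # multiply in rounds, pairing neighbours, so intermediate results
    # stay small until the final multiplication produces the target
    while len(xs) > 1:
        it = iter(xs)
        xs = [a * b for a, b in zip(it, it)] + ([xs[-1]] if len(xs) % 2 else [])
    return xs[0]
```

The number of multiplications is the same as the naive product, but the operand sizes (and hence the arithmetic lemmas needed) are much smaller for most of them.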

For the algorithm you show above, you can slightly optimize it using the following lemma:

facp1i.1 $e |- ( M e. NN0 /\ ( ! ` M ) = A )
facp1i.2 $e |- ( M + 1 ) = N
facp1i.3 $e |- ( A x. N ) = B
facp1i $p |- ( N e. NN0 /\ ( ! ` N ) = B )

This is obviously just a weird way to write the defining equation of the factorial, but it is optimized for algorithmic use, where the "main spine" of the proof is a series of applications of this lemma, and at each step you only have two arithmetic subgoals, facp1i.2 and facp1i.3, both of which are dischargeable by the arithmetic algorithm implemented in mmj2. We could also easily augment this to apply some of the above-mentioned ideas about evenly splitting the division, by means of recursive subdivision:

facsplit1i.1 $e |- M e. NN0
facsplit1i.2 $e |- ( M + 1 ) = N
facsplit1i $p |- ( M e. ( 0 ... N ) /\ ( ( ! ` N ) / ( ! ` M ) ) = N )

facspliti.1 $e |-  ( L e. ( 0 ... M ) /\ ( ( ! ` M ) / ( ! ` L ) ) = A )
facspliti.2 $e |-  ( M e. ( 0 ... N ) /\ ( ( ! ` N ) / ( ! ` M ) ) = B )
facspliti.3 $e |- ( A x. B ) = C
facspliti $p |- ( L e. ( 0 ... N ) /\ ( ( ! ` N ) / ( ! ` L ) ) = C )

Here the goal is to calculate ( ( ! ` ; 6 0 ) / ( ! ` 0 ) ) = ... by applying facspliti repeatedly. Unlike the previous version, here the intermediate value M is a free choice, which allows you to split wherever you want to, in order to get the shortest proof. So I ask mathematica to give me the optimal splitting, and it gives me:

split[m_, n_] := MinimalBy[Range[m + 1, n - 1], Max[#!/m!, n!/#!] &][[1]]
tree[m_, n_] := If[n <= m + 1, n, With[{k = split[m, n]}, {tree[m, k], tree[k, n]}]]
tree[1, 60]

{{{{{{{{2, 3}, 4}, {5, 6}}, {{7, 8}, 9}}, {{{10, 11}, 12}, {13, 14}}}, {{{15, 16}, {17, 18}}, {{19, 20}, {21, 22}}}},
 {{{{23, 24}, {25, 26}}, {{27, 28}, 29}}, {{{30, 31}, {32, 33}}, {{34, 35}, 36}}}}, {{{{{37, 38}, 39}, {{40, 41}, 42}},
  {{{43, 44}, 45}, {{46, 47}, 48}}}, {{{{49, 50}, 51}, {{52, 53}, 54}}, {{{55, 56}, 57}, {{58, 59}, 60}}}}}
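For readers without Mathematica, the same splitting heuristic can be sketched in Python (exact integer arithmetic; `split` picks the first k minimizing the larger of the two partial products, mirroring `MinimalBy[...][[1]]`):

```python
from math import factorial

def split(m, n):
    # choose the split point k that balances the two partial
    # products k!/m! and n!/k!
    return min(range(m + 1, n),
               key=lambda k: max(factorial(k) // factorial(m),
                                 factorial(n) // factorial(k)))

def tree(m, n):
    # recursive subdivision of the product (m+1) * (m+2) * ... * n
    if n <= m + 1:
        return n
    k = split(m, n)
    return (tree(m, k), tree(k, n))
```

Calling `tree(1, 60)` reproduces the nested structure above, e.g. the innermost `tree(1, 4)` gives `((2, 3), 4)`, matching `{{2, 3}, 4}`.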

This still hasn't been optimized by rearranging factors, but it gives you a sense of what can be accomplished by additional up front work to make the final result as small as possible.

Mario Carneiro

Mario Carneiro

Mar 13, 2019, 2:40:29 PM3/13/19
to metamath
I tried doing this calculation (of 60!) with the arithmetic prover I wrote in lean and it takes about 1.2 s to generate and 0.4 s to verify. Lean is doing a heck of a lot more bookkeeping, so that validates my claim that this should be doable in a fraction of a second in metamath.

Mario

OlivierBinda

Mar 13, 2019, 3:45:48 PM3/13/19
to meta...@googlegroups.com


On 13/03/2019 at 19:40, Mario Carneiro wrote:
I tried doing this calculation (of 60!) with the arithmetic prover I wrote in lean and it takes about 1.2 s to generate

Thanks a lot, it really helps to have this.
It gives me hope that something better can be done, and it also gives me a realistic time to aim for
(within the bounds of what to expect for C++ vs the JVM).

and 0.4 s to verify. Lean is doing a heck of a lot more bookkeeping, so that validates my claim that this should be doable in a fraction of a second in metamath.
Ok. Point made :)

OlivierBinda

Mar 13, 2019, 4:19:53 PM3/13/19
to meta...@googlegroups.com


On 13/03/2019 at 16:56, Norman Megill wrote:
This looks impressive. 

Not that much. I have only been working with trees and parsers for about 2 weeks. There is not much of worth in what I have done yet.
I have not yet solved one non-trivial problem. Also, I am standing on the shoulders of giants, and they have done all the heavy lifting.

Not having studied the code, I'm somewhat uncertain about what is going on "in between".  Are you generating native metamath proofs for these statements?  If so, how big are the proofs?  Or how big would you estimate them to be?

At this time, I am only exploring what can be done with metamath and my code (I am researching APIs and methods to prove things in a simple/human way).

The code I use only generates the steps necessary to obtain a proof.
A month ago, I was using that to translate those steps into an uncompressed metamath proof that I would then check.

But I haven't rewritten that part yet (I am refactoring to use trees instead of strings). I'll do it soon though.

So, I only generate pre-metamath proofs; I don't check them yet.


From the point of view of archiving the results (in say set.mm), what matters is the size of the proofs and possibly (if they are extremely large) the run time of verifying them with a metamath proof verifier.  The time needed to generate the proofs is somewhat irrelevant, since it's a one-time activity that is done once and never needed again.  Of course, tools to help generate proofs more efficiently are welcome.

I understand what you are saying there.

As I would also like normal people (and not just math researchers) to use the code that I wrote, and
as they would expect computations to take less than 3 s, I need the time to generate the proofs to be as short as possible.

At this point, there are NO optimisations (neither mathematical nor in software) for speed or small proof size.

Obviously metamath is not well-suited to do routine arithmetic on large numbers efficiently, which has more to do with the nature of complete formal proofs from axioms than with metamath per se. 
Yes, I can see that now. ^^

In principle we can call an external "oracle" (calculator) that returns the result of a large calculation, then insert it into the metamath database as a new axiom, but philosophically that is a little ugly and would be a last resort.
It might be a solution for a math app, with decent computing time, for the general public.
But it wouldn't be acceptable to the professional mathematician community. So I guess it will be necessary to find other solutions.

It might be possible to quickly produce a special kind of proof, one that explicitly relies on some computations, which one could check once to promote the proof from "true if the computations hold" status to "metamath-true" status.

Also, I am really interested in what Mario Carneiro says there :

This, by the way, is one of the planned applications of Metamath Zero (and surrounding projects). The idea is that you can verify a verifier and use it to extend the system you are running, so that you get ad hoc proof procedures with bare metal speed. I've claimed on several occasions that metamath should be able to "execute programs" by means of a proof trace, but the big overhead there is the amount of equality testing, since we are taking the verifier through the steps and also checking at each stage that the verifier is still on board. This checking is not required if we already trust the verifier, so the process can theoretically be sped up by just running the program and asserting the end result. My goal is to find ways to incorporate this into the proof process in a formally sound way.

But, first I have to understand how it can be done and what would be needed.


An interesting challenge is to figure out how to prove that the results of certain long calculations hold without doing the calculations in the proof itself.  A trivial example would be proving that 300! is an even number without expanding it.  A more sophisticated example is using a certificate to show a large number is prime.  Calculating the certificate can done once outside of metamath (and in principle we don't care how long the calculation takes), but proving the validity of the certificate is a relatively short proof.  I think Mario was on the verge of doing this for some primes needed to prove Bertrand's postulate, but in the end direct proofs won out because the size of the numbers wasn't quite large enough to justify the overhead of using a certificate (Mario may wish to comment or correct me).

What you say is true. But this is math research (doing things the smarter way). At this point, I'm aiming to allow everyone to do basic maths, but with metamath proof-checked truth.

If I manage to build a decent tool achieving this purpose, what you say would be a good next step though :)



Mario Carneiro

Mar 13, 2019, 6:03:51 PM3/13/19
to OlivierBinda, metamath


On Wed, Mar 13, 2019 at 4:31 PM OlivierBinda <olus....@gmail.com> wrote:

On 13/03/2019 at 18:19, Mario Carneiro wrote:
> This, by the way, is one of the planned applications of Metamath Zero
> (and surrounding projects). The idea is that you can verify a verifier
> and use it to extend the system you are running, so that you get ad
> hoc proof procedures with bare metal speed. I've claimed on several
> occasions that metamath should be able to "execute programs" by means
> of a proof trace,
I saw that in a previous email, but I only had 5 minutes to ponder it and
failed to truly grasp what you said (mostly what should be done and how)

While you do a computation in software, you produce a log of the
operations (with trees) you perform, and you verify it later with a
metamath verifier, is that it?

That's what I would call "the traditional approach". It's how the arithmetic algorithm and other proof producing algorithms for metamath work. You run an algorithm that produces a proof, and you copy paste it into a metamath file and forget about the means of generation; thenceforth the checking is done by the verifier. (Because it will be generated only once and verified many times, there is an interest in keeping the proof size down here, even at the expense of lots of upfront computation.) This is what I explained in my last email.

Metamath zero is trying something different. It usually goes by the name "proof reflection" or "code extraction". You prove a program is correct, then you run the program and believe the result. This is what is done in some proof assistants like Isabelle and Coq. At the end you don't get a proof, at least not in the sense meant by metamath. The problem is that the checking of the proof dominates the runtime, and even if you are being clever there are still significant overheads associated with symbol manipulation of the sort done by metamath.

As mentioned, there are other proof assistants that have implemented this. However, these external programs are not available in the logic, or else they are in a "less trusted" area, because the compilation or translation of the program from the logic into the target language adds some possibility for error. My goal is to solve this problem from the bottom up, where the compilation itself is proven, so that the results coming from the executing proven-correct code are just as trustworthy as the verifier that performed the extraction.

If verifier A is used to define and model verifier B, that produces a program P, then verifier A is in a position to say what it means for P to be correct, and for B to compile programs correctly. Then the execution history of program P will relate to proofs in ZFC such that verifier A will be able to prove statements like "if program P returns true then ZFC |- 2 + 2 = 4" in ZFC. This means that if B has a built in rule saying "if P returns true then admit 2 + 2 = 4 as an axiom", then verifier A can prove that this axiom is conservative, so verifier B is still producing valid proofs. Finally, we can now use verifier B, which is able to run external programs to prove facts, with complete confidence. We have "amplified" verifier A, which has a slower proof method, without compromising correctness.

In fact, the proof by verifier A that verifier B is able to compile programs that act as certificates for theorems is constructive, so we even have a way to take the compiled program and the execution semantics of program P and turn it into a proof (assuming P isn't exploiting nondeterminism of the architecture) that would be acceptable to verifier A. That is, we can undo the speed-up process to produce a formal proof in the base language, say metamath. This proof might be intractably long, though.

But we can go even deeper. Suppose verifier B verifies another verifier C. Now B has an advantage over A, which is that it can extract programs that it can prove are correct. So if it can prove that C is a correct verifier, then it can run C to verify proofs and incorporate the results into B. So B is a "universal verifier": it can check proofs in languages other than itself. (There are Gödel issues that I am glossing over here, but that's the basic story.)

Mario

> but the big overhead there is the amount of equality testing, since we
> are taking the verifier through the steps and also checking at each
> stage that the verifier is still on board. This checking is not
> required if we already trust the verifier, so the process can
> theoretically be sped up by just running the program and asserting the
> end result. My goal is to find ways to incorporate this into the proof
> process in a formally sound way.

I am really really interested in what you say here.
But, as I have very little time each day to do metamath work, I don't
see/understand yet how this could be done.
I have been following from afar the discussions about Metamath zero
but I don't understand the deep meanings/ideas behind what is said (no
time to ponder :/).


I would really really love some theory/explanation/sketch about what
would be needed for a metamath (or metamath zero) system to be extendable
and to be extended and some hint about how to do it

If I understood it and were convinced that it is doable, I would
try to implement it.




Mario Carneiro

Mar 13, 2019, 10:58:12 PM3/13/19
to OlivierBinda, metamath
I revisited the arithmetic infrastructure when I realized that there is a way to make the proofs shorter without an essential change to the approach. Here is a proof of 1259lem1:

h50::1259prm.1          |- N = ; ; ; 1 2 5 9
51::1nn0                 |- 1 e. NN0
52::2nn0                |- 2 e. NN0
53:51,52:deccl         |- ; 1 2 e. NN0
54::5nn0                   |- 5 e. NN0
55:53,54:deccl        |- ; ; 1 2 5 e. NN0
56::9nn               |- 9 e. NN
57:55,56:decnncl     |- ; ; ; 1 2 5 9 e. NN
58:50,57:eqeltri    |- N e. NN
59::2nn             |- 2 e. NN
60::6nn0               |- 6 e. NN0
61:51,60:deccl      |- ; 1 6 e. NN0
62::0z              |- 0 e. ZZ
63::8nn0                |- 8 e. NN0
64:60,63:deccl      |- ; 6 8 e. NN0
65::3nn0                  |- 3 e. NN0
66:51,65:deccl       |- ; 1 3 e. NN0
67:66,60:deccl      |- ; ; 1 3 6 e. NN0
68:54,52:deccl        |- ; 5 2 e. NN0
69:68:nn0zi          |- ; 5 2 e. ZZ
70:52,63:nn0expcli   |- ( 2 ^ 8 ) e. NN0
71::eqid             |- ( ( 2 ^ 8 ) mod N ) = ( ( 2 ^ 8 ) mod N )
72:63:nn0cni          |- 8 e. CC
73::2cn                        |- 2 e. CC
74::8t2e16              |- ( 8 x. 2 ) = ; 1 6
75:72,73,74:mulcomli |- ( 2 x. 8 ) = ; 1 6
76::4nn0                    |- 4 e. NN0
77::7nn0                    |- 7 e. NN0
78::7p6e13                 |- ( 7 + 6 ) = ; 1 3
79:60,77,78:cdecaddr      |- calc ( 6 + 7 = ; 1 3 )
80:76,79:cdecadd2i       |- calc ( 6 + ; 4 7 = ; 4 ; 1 3 )
81::2p1e3                 |- ( 2 + 1 ) = 3
82:52,51,81:cdecaddl     |- calc ( 2 + 1 = 3 )
83::7p4e11                  |- ( 7 + 4 ) = ; 1 1
84:76,77,83:cdecaddr       |- calc ( 4 + 7 = ; 1 1 )
85:82,51,84:cdecadd2ci    |- calc ( 4 + ; 2 7 = ; 3 1 )
86::3p2e5                   |- ( 3 + 2 ) = 5
87:65,52,86:cdecaddl       |- calc ( 3 + 2 = 5 )
88::5p1e6                     |- ( 5 + 1 ) = 6
89:54,51,88:cdecaddl         |- calc ( 5 + 1 = 6 )
90:89:cdecm1a               |- calc ( 5 x. 1 + 1 = 6 )
91::5t2e10                   |- ( 5 x. 2 ) = 10
92::dec10                    |- 10 = ; 1 0
93:91,92:eqtri              |- ( 5 x. 2 ) = ; 1 0
94:54:cdecadd0l               |- calc ( 0 + 5 = 5 )
95:51,94:cdecaddi            |- calc ( ; 1 0 + 5 = ; 1 5 )
96:54,52,93,95:cdecma       |- calc ( 5 x. 2 + 5 = ; 1 5 )
97:90,54,96:cdecma2i       |- calc ( 5 x. ; 1 2 + 5 = ; 6 5 )
98::0nn0                      |- 0 e. NN0
99:73:addid2i                 |- ( 0 + 2 ) = 2
100:52,98,99:cdecaddr        |- calc ( 2 + 0 = 2 )
101:100:cdecm1a             |- calc ( 2 x. 1 + 0 = 2 )
102::2t2e4                    |- ( 2 x. 2 ) = 4
103::4p1e5                     |- ( 4 + 1 ) = 5
104:76,51,103:cdecaddl        |- calc ( 4 + 1 = 5 )
105:52,52,102,104:cdecma     |- calc ( 2 x. 2 + 1 = 5 )
106:105:decm0h              |- calc ( 2 x. 2 + 1 = ; 0 5 )
107:101,54,106:cdecma2i    |- calc ( 2 x. ; 1 2 + 1 = ; 2 5 )
108:87,97,54,107:cdecmac  |- calc ( ; 5 2 x. ; 1 2 + ; 3 1 = ; ; 6 5 5 )
109::5t5e25                 |- ( 5 x. 5 ) = ; 2 5
110::5p2e7                   |- ( 5 + 2 ) = 7
111:54,52,110:cdecaddl      |- calc ( 5 + 2 = 7 )
112:52,111:cdecaddi         |- calc ( ; 2 5 + 2 = ; 2 7 )
113:54,54,109,112:cdecma   |- calc ( 5 x. 5 + 2 = ; 2 7 )
114::1p1e2                   |- ( 1 + 1 ) = 2
115:51,51,114:cdecaddl      |- calc ( 1 + 1 = 2 )
116::3cn                       |- 3 e. CC
117:116:addid2i               |- ( 0 + 3 ) = 3
118:98,65,117:cdecaddl       |- calc ( 0 + 3 = 3 )
119:115,118:cdecadd         |- calc ( ; 1 0 + ; 1 3 = ; 2 3 )
120:52,54,93,119:cdecmar   |- calc ( 2 x. 5 + ; 1 3 = ; 2 3 )
121:113,65,120:cdecmai    |- calc ( ; 5 2 x. 5 + ; 1 3 = ; ; 2 7 3 )
122:85,108,65,121:cdecma2c
                         |- calc ( ; 5 2 x. ; ; 1 2 5 + ; 4 ; 1 3 = ; ; ; 6 5 5 3 )
123::9nn0                  |- 9 e. NN0
124::9t5e45                |- ( 9 x. 5 ) = ; 4 5
125:76,111:cdecaddi        |- calc ( ; 4 5 + 2 = ; 4 7 )
126:54,123,124,125:cdecmar
                          |- calc ( 5 x. 9 + 2 = ; 4 7 )
127::9t2e18                |- ( 9 x. 2 ) = ; 1 8
128::8p8e16                  |- ( 8 + 8 ) = ; 1 6
129:63,63,128:cdecaddl      |- calc ( 8 + 8 = ; 1 6 )
130:115,60,129:cdecaddci   |- calc ( ; 1 8 + 8 = ; 2 6 )
131:52,123,127,130:cdecmar
                          |- calc ( 2 x. 9 + 8 = ; 2 6 )
132:126,60,131:cdecmai   |- calc ( ; 5 2 x. 9 + 8 = ; ; 4 7 6 )
133:80,122,60,132:cdecma2c
                        |- calc ( ; 5 2 x. ; ; ; 1 2 5 9 + ; 6 8 = ; ; ; ; 6 5 5 3 6 )
134::eqid               |- ; 5 2 = ; 5 2
135::eqid               |- ; 6 8 = ; 6 8
136:133,134,50,135:decmeqii
                       |- calc ( ; 5 2 x. N + ; 6 8 = ; ; ; ; 6 5 5 3 6 )
137:136:decmi         |- ( ( ; 5 2 x. N ) + ; 6 8 ) = ; ; ; ; 6 5 5 3 6
138::2exp16           |- ( 2 ^ ; 1 6 ) = ; ; ; ; 6 5 5 3 6
139::eqid              |- ( 2 ^ 8 ) = ( 2 ^ 8 )
140::eqid              |- ( ( 2 ^ 8 ) x. ( 2 ^ 8 ) ) = ( ( 2 ^ 8 ) x. ( 2 ^ 8 ) )
141:52,63,75,139,140:numexp2x
                      |- ( 2 ^ ; 1 6 ) = ( ( 2 ^ 8 ) x. ( 2 ^ 8 ) )
142:137,138,141:3eqtr2i
                     |- ( ( ; 5 2 x. N ) + ; 6 8 ) = ( ( 2 ^ 8 ) x. ( 2 ^ 8 ) )
143:58,59,63,69,70,64,71,75,142:mod2xi
                    |- ( ( 2 ^ ; 1 6 ) mod N ) = ( ; 6 8 mod N )
144::6p1e7           |- ( 6 + 1 ) = 7
145::eqid            |- ; 1 6 = ; 1 6
146:51,60,144,145:decsuc
                    |- ( ; 1 6 + 1 ) = ; 1 7
147:67:nn0cni         |- ; ; 1 3 6 e. CC
148:147:addid2i      |- ( 0 + ; ; 1 3 6 ) = ; ; 1 3 6
149:58:nncni           |- N e. CC
150:149:mul02i        |- ( 0 x. N ) = 0
151:150:oveq1i       |- ( ( 0 x. N ) + ; ; 1 3 6 ) = ( 0 + ; ; 1 3 6 )
152::6t2e12             |- ( 6 x. 2 ) = ; 1 2
153:51,82:cdecaddi      |- calc ( ; 1 2 + 1 = ; 1 3 )
154:60,52,152,153:cdecma
                       |- calc ( 6 x. 2 + 1 = ; 1 3 )
155:63,52,74:cdecma0   |- calc ( 8 x. 2 + 0 = ; 1 6 )
156:154,60,155:cdecmai
                      |- calc ( ; 6 8 x. 2 + 0 = ; ; 1 3 6 )
157:156:cdecmuli     |- ( ; 6 8 x. 2 ) = ; ; 1 3 6
158:148,151,157:3eqtr4i
                    |- ( ( 0 x. N ) + ; ; 1 3 6 ) = ( ; 6 8 x. 2 )
qed:58,59,61,62,64,67,143,146,158:modxp1i
                   |- ( ( 2 ^ ; 1 7 ) mod N ) = ( ; ; 1 3 6 mod N )

Notably: there are about 25% fewer steps than (http://us2.metamath.org/mpeuni/1259lem1.html); and the actual "work" steps of the algorithm are now almost all the steps, rather than before, where the "work" steps (the dec* theorems with many hypotheses) were interspersed with lots of equality and closure steps, about 6 to 1 by looking at it. The definition of "calc" is very simple:

|- ( calc ( A + B = C ) <-> ( A e. NN0 /\ B e. NN0 /\ ( A + B ) = C ) )
|- ( calc ( A x. B + C = D ) <-> ( ( A e. NN0 /\ B e. NN0 /\ C e. NN0 ) /\ ( ( A x. B ) + C ) = D ) )

By bundling closure we don't have to worry about closure steps in the middle of the algorithm, so this actually achieves the (log10 n)^2 bound for multiplication (with a constant factor of 1). It turns out that having a definition is also important for minimizing syntax steps in the proof (my original version of this modification was to use the RHS of these rules directly, but it bloated the compressed proofs enough to negate any benefit). There are a lot of support theorems though (about the same as the dec* set of theorems), and in my tests it didn't perform very well on small proofs. I suspect the lack of leaf theorems like calc ( 2 + 2 = 4 ) was a major factor; we would have to redo the multiplication table with these calc things to save a step.
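To make the bundling concrete, here is a hedged Python sketch (the class and rule names are mine, loosely modeled on cdecaddi; this is not set.mm's encoding) of how a calc-style fact can carry its closure along, so composed steps never need separate `e. NN0` subproofs:

```python
# 'calc ( A + B = C )' as a Python value: the equation together with its
# NN0 closure, checked once at construction. Composition rules (modeled on
# cdecaddi; encoding is illustrative, not set.mm's) then never need
# separate closure subproofs.
class CalcAdd:
    def __init__(self, a, b, c):
        assert all(isinstance(x, int) and x >= 0 for x in (a, b, c))
        assert a + b == c
        self.a, self.b, self.c = a, b, c

def cdecaddi_like(hi, low):
    """From calc ( b + c = d ) with d a single digit, derive
    calc ( ; hi b + c = ; hi d ) in one step, closure carried along."""
    assert 0 <= low.c <= 9  # no carry in this rule
    return CalcAdd(10 * hi + low.a, low.b, 10 * hi + low.c)
```

For example, `cdecaddi_like(1, CalcAdd(5, 2, 7))` yields calc(15 + 2 = 17) in a single application, with no closure side conditions to discharge.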

On the bright side, while writing this email I accidentally deleted the whole thing, so we can consider some similar designs without being tied to my prototype. ;) Is it worth the refactor?

Olivier Binda

Mar 14, 2019, 5:31:03 PM3/14/19
to Metamath
Thanks for your answers. I understand a lot more now and you gave me a solution to a problem I had.

I was wondering how I could extend my system to allow users to write "tactics", and I realize now that I could:

A) Generate code, compile it, and dynamically integrate the binary result into my running app.
Once source code is generated, on a Linux/Mac/Windows box (but not on Android/iOS/a browser) it is possible to call an external compiler if one is set up.

All the heavy lifting would (again) be done by world experts (the people writing the Kotlin compiler). But it would make it possible to use a battle-tested and well-known, heavily used language to do tactics programming.

B) write an interpreter

C) Another possibility would be to integrate a scripting language, like Lua, but it would be cumbersome to have a bridge to communicate and translate objects to Lua.

D) then there is the possibility that you are researching (Metamath zero and a minimal footprint language, hopefully with a simple compiler)


I think that I'll first experiment with possibility A. ^^

Also, I am wondering
1) how it would be possible to speed up multiplications with a verifier B, whose code would be checked by a verifier A.
Basically, it would solve a lot of things to have multiplications as fast as a CPU can do them, but how can we trust those (at a metamath-like level)?
CPUs do things and have bugs; how can we be sure that they behave as they are modelled?

The only way I see is to trust the fast hardware multiplications of verifier B (despite gamma rays, non-ECC RAM, and all machine-related bugs), which was coded and proof-checked to follow high standards (it should still be better than human proof checking, though).

2) how would using a small-footprint language improve the trust in the verifier in a meaningful way?
(there is still the OS underneath, unless you build a very simple one that works on some generic motherboard, but this brings its own can of worms... isn't it overkill?)



Also, I implemented power and division for positive integers. Again in a dumb (but simple) manner.
I took note of the possibilities for speed-up that you mentioned.
assertEquals("; ; ; ; ; ; 1 1 1 3 7 4 9".toStatement(), engine.reduce("( ( ; 2 6 x. ( ( ( 7 x. ; 8 9 ) x. ( ; ; 1 2 3 + ( 7 x. ( ! ` 5 ) ) ) ) - ; ; 2 3 8 ) ) / ; 1 4 )".toStatement()))
t = t.logTime()
assertEquals("; ; ; ; ; ; 1 4 5 6 4 4 1".toStatement(), engine.reduce("( ( ( ; 2 6 + ( 2 ^ 3 ) ) x. ( ( ( 7 x. ; 8 9 ) x. ( ; ; 1 2 3 + ( 7 x. ( ! ` 5 ) ) ) ) - ; ; 2 3 8 ) ) / ; 1 4 )".toStatement()))

Sadly, I haven't understood the trace thing yet, but I remember that you were talking about it in your article about arithmetic, so I'll go read it once more.

Mario Carneiro

Mar 14, 2019, 9:40:31 PM3/14/19
to metamath
On Thu, Mar 14, 2019 at 5:31 PM Olivier Binda <olus....@gmail.com> wrote:
Thanks for your answers. I understand a lot more now and you gave me a solution to a problem I had.

I was wondering how I could extend my system to allow users to write "tactics", and I realize now that I could:

A) Generate code, compile it, and dynamically integrate the binary result into my running app.
Once source code is generated, on a Linux/Mac/Windows box (but not on Android/iOS/a browser) it is possible to call an external compiler if one is set up.

All the heavy lifting would (again) be done by world experts (the people writing the Kotlin compiler). But it would make it possible to use a battle-tested and well-known, heavily used language to do tactics programming.

B) write an interpreter

C) Another possibility would be to integrate a scripting language, like Lua, but it would be cumbersome to have a bridge to communicate and translate objects to Lua.

D) then there is the possibility that you are researching (Metamath zero and a minimal footprint language, hopefully with a simple compiler)


I think that I'll first experiment with possibility A. ^^

Also, I am wondering
1) how it would be possible to speed up multiplications with a verifier B, whose code would be checked by a verifier A.
Basically, it would solve a lot of things to have multiplications as fast as a CPU can do them, but how can we trust those (at a metamath-like level)?
CPUs do things and have bugs; how can we be sure that they behave as they are modelled?

This is definitely a concern. You are indeed trusting that the CPU's multiplication behaves the way it is modeled. The main purpose here is to get all our assumptions up front; depending on how deep you want to go, there will be some model of machine instructions and what they do, and you say to readers "I trust that the CPU you are running conforms to this specification". Our ability to get trust higher than that is limited by lots of real-world issues, like getting Intel to publish their chip designs so that people can review them for bugs. You can still do the usual things to gain confidence that this is correct, e.g. running self-tests, and also by being as vague as possible in the spec so that there is room for nondeterministic behavior of the hardware, and by relying on fewer features of the CPU (like sticking to basic add/mul etc.). This trust is modulated by the end user: in the same way that we have users grant an app certain permissions, we have the user accept certain properties of the system they are running.

It's important to realize that this is not a new property; in fact we trust the computer to behave as advertised in order to run *any software at all*. The important thing here is to get our assumptions up front, and use those assumptions to their maximum effect. In particular, if the machine instructions are buggy, what is to say that verifier A hasn't been misled due to misexecution of its own instructions? It doesn't have any fancy hardware acceleration but it's still running on hardware and just as susceptible to corruption.
 
2) how would using a small-footprint language improve the trust in the verifier in a meaningful way?
(there is still the OS underneath, unless you build a very simple one that works on some generic motherboard, but this brings its own can of worms... isn't it overkill?)

I've been talking with Giovanni Mascellani about exactly that: how do we avoid depending on the OS? Actually I think the story is pretty simple there: for the most part a verifier that runs on the bare metal doesn't look any different from one that runs in an OS, except in the input and output, so you can go whole hog with OS avoidance if you want, or not if you trust the OS not to corrupt your memory, and there isn't a lot of additional code needed either way.

"Isn't it overkill?" Yes, but I think we need to start at the bottom when building security. There are lots of reasons to believe that every piece of software you use has bugs in it, and those bugs can and often are exploited by malicious agents to make bad things happen. A security proof that builds on a buggy foundation is exploitable, so we have to rip out the foundation to the highest possible degree if we want to seal the leaks.

Also, I implemented power and division for positive integers. Again in a dumb (but simple) manner.
I took note of the possibilities for speed-up that you mentioned.
assertEquals("; ; ; ; ; ; 1 1 1 3 7 4 9".toStatement(), engine.reduce("( ( ; 2 6 x. ( ( ( 7 x. ; 8 9 ) x. ( ; ; 1 2 3 + ( 7 x. ( ! ` 5 ) ) ) ) - ; ; 2 3 8 ) ) / ; 1 4 )".toStatement()))
t = t.logTime()
assertEquals("; ; ; ; ; ; 1 4 5 6 4 4 1".toStatement(), engine.reduce("( ( ( ; 2 6 + ( 2 ^ 3 ) ) x. ( ( ( 7 x. ; 8 9 ) x. ( ; ; 1 2 3 + ( 7 x. ( ! ` 5 ) ) ) ) - ; ; 2 3 8 ) ) / ; 1 4 )".toStatement()))

Sadly, I haven't understood the trace thing yet, but I remember that you were talking about it in your article about arithmetic, so I'll go read it once more.

The basic idea is to write a program that produces a proof by recursion on the term. For example, here's how you add one to a decimal number:

add1(0) = add0l(1)
add1(n) if n < 10 = addsmall(n, 1)
add1(; a b) = if b < 9
  then addi(a, add1(b))
  else addci(add1(a), add9)

Here add0l(n) is the theorem that proves |- 0 + n = n, addsmall(a,b) is the proof of |- a + b = ... where 0 < b <= a < 10 are fixed numbers (these have names like 3p2e5 in metamath), addi(a, h) is the theorem decaddi, which proves |- ( ; a b + c ) = ; a d, where h is the subproof of |- ( b + c ) = d, and addci(h1, h2) is the theorem decaddci which takes a proof h1 |- ( a + 1 ) = d and h2 |- ( b + n ) = ; 1 c and produces a proof of |- ( ; a b + n ) = ; d c.

It's not hard to write these kinds of definitions by recursion on the term, and you can get arbitrarily complicated about getting fast and efficient proof procedures this way. You can call the resulting proof a "trace of execution" of the program, or you can call the program a recursive function to compute the proof. The structure of the proof closely mirrors the branching taken by the program, so if each step in the program comes with a proof then you get a trace of execution.
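As a concrete illustration, here is a hedged Python transcription of that add1 recursion: numbers are lists of decimal digits, and `trace` collects the lemma names the generated proof would use (the set.mm-style names are assumed for illustration):

```python
# A direct Python transcription of the add1 recursion above. Numbers are
# lists of decimal digits; `trace` collects the lemma names the generated
# proof would use (set.mm-style names, assumed for illustration).
def add1(ds, trace):
    if ds == [0]:
        trace.append("add0l")               # |- ( 0 + 1 ) = 1
        return [1]
    if len(ds) == 1:
        d = ds[0]
        if d < 9:
            trace.append(f"{d}p1e{d + 1}")  # digit fact, e.g. 3p1e4
            return [d + 1]
        trace.append("9p1e10")              # |- ( 9 + 1 ) = 10
        return [1, 0]
    head, last = ds[:-1], ds[-1]
    if last < 9:
        low = add1([last], trace)           # subproof |- ( b + 1 ) = c
        trace.append("decaddi")             # lift to |- ( ; a b + 1 ) = ; a c
        return head + low
    trace.append("9p1e10")                  # |- ( 9 + 1 ) = ; 1 0
    high = add1(head, trace)                # carry into the high digits
    trace.append("decaddci")                # combine the two subproofs
    return high + [0]
```

For example, `add1([1, 9, 9], [])` returns `[2, 0, 0]`, and the trace mirrors exactly the branches the proof generator took, which is what makes the proof a trace of the execution.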

In more complicated situations, there may be times when you don't want to trace the full execution, so that you get a more oracle like behavior. For example, if you want to prove that a linear program is infeasible (i.e. I have a bunch of inequalities like 3x+2y >= 5 and 6y-x <= 1 and want to prove false) it suffices to come up with a weighted sum of all the equations such that when you add them together you get an equation like 0 <= -3 which is clearly impossible. It can be difficult to find the weights, but they form a certificate for the problem in that once you have them you just plug them in and the theorem is proven. So here we don't want to incur the cost of running the LP in the verifier, so we run the LP externally and get the certificate, then encode the certificate in the proof for the verifier.
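For instance, here is a hedged Python sketch of checking such a certificate: given inequalities of the form sum_j coeffs[j] * x_j >= rhs and nonnegative weights, the weighted sum must cancel every variable while leaving a strictly positive right-hand side. The encoding and the toy system are mine (the example system in the text above is actually feasible, so a different one is used here):

```python
from fractions import Fraction

# Checking (not finding) an infeasibility certificate for a system of
# inequalities sum_j coeffs[j] * x_j >= rhs. The encoding and the toy
# example are mine, for illustration only.
def check_infeasible(rows, weights):
    """True if the nonnegative combination given by `weights` cancels
    every variable and leaves a strictly positive right-hand side,
    i.e. derives the impossible 0 >= (positive)."""
    if any(w < 0 for w in weights):
        return False
    nvars = len(rows[0][0])
    combined = [sum(Fraction(w) * Fraction(r[0][j])
                    for w, r in zip(weights, rows))
                for j in range(nvars)]
    rhs = sum(Fraction(w) * Fraction(r[1]) for w, r in zip(weights, rows))
    return all(c == 0 for c in combined) and rhs > 0

# Toy system: x >= 1 and -x >= 0 (i.e. x <= 0); weights (1, 1) yield 0 >= 1.
system = [([1], 1), ([-1], 0)]
```

Finding the weights takes a full LP solve, but checking them is a single dot product per variable, which is the whole appeal of the certificate.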

Mario


 

OlivierBinda

Mar 15, 2019, 5:15:49 AM3/15/19
to meta...@googlegroups.com

Wow. Thanks Mario. I really love your answers.

Thanks for the insight about trust and security. I had guessed why you wanted maximal security and why you wanted to go all the way to get it,
but hadn't grasped the reasoning behind it. It makes a lot more sense to me now, especially the verifier A proves verifier B thing and the trust contract with the user.

I have also understood the trace thing.
My code is actually implemented the way you describe it (with recursion and simple steps that have proofs).
I was anticipating that it would be possible to plug in arbitrarily smart/fast procedures this way, and I'm happy to see some confirmation of that in your answer (that is the way I wanted it to be; there should be better things down the road, performance-wise and mathematically :) )

I also got a bonus : I hadn't understood the certificate thing either but now I do (you are really great at teaching things actually).

With further thought, I realized that I could also do a better version of solution A for my problem:

I generate code, compile it on my computer with the Kotlin compiler, and statically generate a working solution for all platforms (instead of letting users do it, if their setup allows it).

So, I am really happy. I am still concerned about a few technical issues though : 

to create a higher-level verifier, I can generate code but I have to model it (it is my trust contract with users).
So I need a metamath-like language to model maths and a computational language to model a higher-level verifier/computations

a) Does it require modeling metamathematics? Arithmetic? A new language? Some new axioms in metamath?
b) metamath was avoiding a number model to stay simple; this violates that principle, right? Or is there a way to stay simple but do higher-level stuff?

I would really love to try building such a higher-level verifier (say, with a simple first aim) that does fast additions and/or fast multiplications,
and I would open-source its code in the public domain so it could be used by everyone.
 

Concerning your metamath zero project.

If I understand correctly, you just need to implement it for one platform, preferably a simple (not bloated with useless hardware), widely available and cheap platform.

Something like the Raspberry Pi Zero (for 5€), an Intel Compute Stick (70+€), or some ARM hardware

You don't need a whole OS, as you just need to input strings and do simple output (strings or booleans?)
But you still need to handle parts of the stuff associated with USB, memory, ...

A single-purpose minimalist bare-metal "OS" (more like malloc and friends) sounds doable; the meat of the thing
would indeed be the Metamath Zero language design and modeling verifiers and bootstrapping things

I guess that the professional mathematical community could welcome a 5€ device that plugs into a USB port and allows everyone to proof-check maths...

It sounds to me like a worthy goal that could bring a lot of value.

Mario Carneiro

Mar 15, 2019, 8:28:53 AM3/15/19
to metamath
On Fri, Mar 15, 2019 at 5:15 AM OlivierBinda <olus....@gmail.com> wrote:

Wow. Thanks Mario. I really love your answers.

Thanks for the insight about trust and security. I had guessed why you wanted maximal security and why you wanted to go all the way to get it,
but hadn't grasped the reasoning behind it. It makes a lot more sense to me now, especially the verifier A proves verifier B thing and the trust contract with the user.

I have also understood the trace thing.
My code is actually implemented the way you describe it (with recursion and simple steps that have proofs).
I was anticipating that it would be possible to plug in arbitrarily smart/fast procedures this way, and I'm happy to see some confirmation of that in your answer (that is the way I wanted it to be; there should be better things down the road, performance-wise and mathematically :) )

I also got a bonus : I hadn't understood the certificate thing either but now I do (you are really great at teaching things actually).

With further thought, I realized that I could also do a better version of solution A for my problem:

I generate code, compile it on my computer with the Kotlin compiler, and statically generate a working solution for all platforms (instead of letting users do it, if their setup allows it).

So, I am really happy. I am still concerned about a few technical issues though : 

to create a higher-level verifier, I can generate code but I have to model it (it is my trust contract with users).
So I need a metamath-like language to model maths and a computational language to model a higher-level verifier/computations

a) Does it require modeling metamathematics? Arithmetic? A new language? Some new axioms in metamath?
b) metamath was avoiding a number model to stay simple; this violates that principle, right? Or is there a way to stay simple but do higher-level stuff?

It requires modeling everything in the target language. That might include fixed length integer types, an execution model, and so on. You can modulate the goal by aiming for a higher level, say compiling into a simple subset of C, and then you only have to worry about the features at that abstraction level rather than the hardware.

Metamath doesn't have a number model, but it built one from set theory. In the same way you can model any programming language in ZFC (or even simpler axioms like PA) without additional axioms. You just define what it means for a program to execute.
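As a toy illustration (my own encoding, not MM0's actual machine model): once a single step/run function is fixed, "program P computes r" is just an ordinary statement about iterating that function, which a logic like ZFC or PA can state and prove.

```python
# Toy illustration (my own encoding, not MM0's machine model): fixing one
# run function *is* the execution model. "Program P computes r" is then
# an ordinary mathematical statement about this function.
def run(prog, regs):
    pc = 0
    while True:
        op, *args = prog[pc]
        if op == "halt":
            return regs[args[0]]
        d, a, b = args
        if op == "add":
            regs[d] = regs[a] + regs[b]
        elif op == "mul":
            regs[d] = regs[a] * regs[b]
        pc += 1

# The "ZFC |- 2 + 2 = 4" side: this two-instruction program provably
# computes 4 from registers holding 2 and 2.
two_plus_two = [("add", 0, 0, 1), ("halt", 0)]
```

In ZFC the same `run` becomes a definable set-theoretic function, and correctness claims about programs become theorems about it, with no new axioms required.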

I would really love to try building such a higher-level verifier (say, with a simple first aim) that does fast additions and/or fast multiplications,
and I would open-source its code in the public domain so it could be used by everyone.

I think that aiming for full bootstrapping verification, as I am, is a high goal and I wouldn't suggest you undertake it lightly. There is a much easier approach, which is to do code extraction *without* verification. You just run the code and trust the result, and that's the whole story. This is much simpler to implement and unless you are being totally paranoid about trust like I am there isn't that much to gain in going to the low level stuff. Lean does this by way of a virtual machine - the user is able to write programs in the logic, and it will turn them into bytecode and then execute them with a virtual machine. This is slightly slower than full compilation but it allows you to avoid building in a dependency on gcc or something for executing your program on the target machine. For a CAS with a REPL this is probably the way you want to go anyway, with a Python style interpreter because no one wants to wait for their single function call to compile and link first.

Concerning your metamath zero project.

If I understand correctly, you just need to implement it for one platform, preferably a simple (not bloated with useless hardware), widely available and cheap platform.

Yes, although it should be hardware accessible to the end user, so it can't be too obscure.

Something like the Raspberry Pi Zero (for 5€), an Intel Compute Stick (70+€), or some ARM hardware

Raspberry Pi actually sounds like a pretty nice idea. I think a Raspberry Pi set.mm verifier would make a nice headline :)

You don't need a whole OS, as you just need to input strings and do simple output (strings or booleans?)
But you still need to handle parts of the stuff associated with USB, memory, ...

In the core of the program, you don't need anything except instructions like add and branch and memory load/store. But you need to input a file and output a boolean (for straight verification) or a string (for a compiler), and you need to communicate it to the user. So the driver needs to be able to interact with the system in some way. You can get that down to either a few system calls on an OS, or writing to a memory-mapped VGA buffer without an OS, or probably a memory-mapped device driver to blink a light, say, on Raspberry Pi.

A single-purpose minimalist bare-metal "OS" (more like malloc and friends) sounds doable; the meat of the thing
would indeed be the Metamath Zero language design and modeling verifiers and bootstrapping things

I guess that the professional mathematical community could welcome a 5€ device that plugs into a USB port and allows everyone to proof-check maths...

It sounds to me like a worthy goal that could bring a lot of value.

That's my hope! I think it's really not something we've seen before, and it takes software safety to a whole new level.

Mario

Giovanni Mascellani

Mar 15, 2019, 8:47:06 AM3/15/19
to meta...@googlegroups.com
Hi,

Il 15/03/19 10:15, OlivierBinda ha scritto:
> If I understand correctly, you just need to implement it for one
> platform, preferably a simple (not bloated with useless hardware),
> widely available and cheap platform.

I am currently working on the asmc project[1], whose aim is to provide
an extremely simple kernel that can run on bare metal and whose only
feature is a compiler for a language that I called G, designed to be
relatively similar to C, but easy to compile at a very low level. The
entire OS and G compiler currently fit in less than 7KB of compiled
binary code (less than 8KB if you want some debug features to be
enabled, but they are not required). I am still working on making it as
small and as easily verifiable as I can, although I think I am now getting
close to what is possibly achievable (patches are welcome!).

[1] https://gitlab.com/giomasce/asmc

When I began writing asmc I was interested in compiler and operating
system bootstrapping, so the basic idea behind asmc is that the G
compiler compiles a simple C compiler, which then compiles a more
advanced C compiler, sufficiently advanced to be able to compile the
Linux kernel and eventually GCC and the GNU toolchain, so that in the
end you have a running Linux system with a development toolchain having
started with just 7 KB of binary code and a lot of sources.

A few weeks ago, in discussion with Mario Carneiro on this list, it
became apparent that asmc could also be used for bootstrapping an MM0
verifier (or also an MM verifier), which could then be used for verifying
the asmc kernel and G compiler itself, creating a very tight trust loop
that could then be used to trustedly bootstrap other programs or proofs
(including the thing I described as the original asmc target). This is
currently work in progress: as soon as the specs for MM0 are frozen,
I will implement them in G. In the meantime, I have begun collecting the
exact specification asmc assumes from the underlying computing device
(basically a simplified x86 CPU), so that the MM0 infrastructure to
prove asmc correctness could eventually be set up. It will be a lot of
work anyway, but interesting and useful work.

Maybe, since MM0 is still work in progress, I could write a verifier for
MM in the meantime. It should not be hard and it would be nice to have a
verification of set.mm on bare metal.

> Something like the raspberry pi 0 (for 5€) an intel compute stick (70+€)
> or some ARM hardware

As I said, asmc is currently based on x86. It is possible to port it to
new architectures, with some work. The architecture I really would like
to target is RISC-V, which, given its open source status (and given the
wide availability of open source implementations) is particularly well
suited for establishing a trust chain.

> You don't need a whole os as you just need to input strings and do
> simple output (strings or booleans ?)
> But you still need to handle parts of the stuff associated with usb,
> memory, ...
>
> A single purpose minimalist bare metal "os" (more like malloc and
> friends) sounds doable, the meat of the thing
> would be indeed the metamath zero language design and modeling verifiers
> and bootstrapping things

There are quite a lot of initiatives in the open hardware community that
might be useful here. For example, I own a little fomu hacker
edition[2], which is basically a little FPGA that fits inside a USB
port. The FPGA chip has a nice open source toolchain thanks to a bunch
of people who managed to reverse engineer it (most FPGA toolchains are
proprietary), and a simple RISC-V soft CPU can be synthesized on it. In
principle, if asmc were ported to RISC-V (which is on my to-do list,
although not really at the top of it) we would have something like what
you say, built with a fully open-source toolchain.

[2] https://github.com/im-tomu/fomu-hardware/tree/master/hacker

The problem is that the FPGA does not have a lot of RAM and storage, so
this is still a little bit theoretical. Maybe a streaming MM or MM0
might be implemented, so that the FPGA does not need to store the whole
database in RAM, but can verify it while the host computer is sending it.
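Since MM is a one-pass format, such a streaming verifier would only have to retain the statements proved so far, never the proofs themselves. A toy Kotlin sketch of that loop (types and the proof-checking callback are invented for illustration):

```kotlin
// Hypothetical streaming loop: statements arrive one at a time; after a proof
// is checked, only the label and statement are retained, so memory grows with
// the number of theorems, not with the size of any proof.
sealed class Stmt
data class Axiom(val label: String, val expr: List<String>) : Stmt()
data class Theorem(val label: String, val expr: List<String>, val proof: List<String>) : Stmt()

fun verifyStream(
    stream: Sequence<Stmt>,
    checkProof: (Theorem, Map<String, List<String>>) -> Boolean
): Boolean {
    val known = mutableMapOf<String, List<String>>()   // label -> statement only
    for (s in stream) when (s) {
        is Axiom -> known[s.label] = s.expr
        is Theorem -> {
            if (!checkProof(s, known)) return false    // proof tokens can be dropped after this
            known[s.label] = s.expr
        }
    }
    return true
}
```

A real streaming verifier would also have to track frames (hypotheses and disjoint-variable conditions) per statement, but those are similarly small compared to proofs.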

However, the RISC-V community has a lot of other devices and chips that
might be useful as well, and definitely many others will arrive in the
years to come.

> I guess that the professional mathematical community could welcome a 5€
> device that plugs into a USB port and allows everyone to proof-check maths...
>
> It sounds to me like a worthy goal that could bring a lot of value.

Sure. Let's keep piling up bricks to build it!

Mario Carneiro

unread,
Mar 15, 2019, 10:28:28 AM3/15/19
to metamath
I think it's of more theoretical interest, but I recently stumbled on the ForwardCom instruction set project (https://www.forwardcom.info/), which is also open source and tries to address the issues with the proprietary x86 architecture.
 
> The problem is that the FPGA does not have a lot of RAM and storage, so
> this is still a little bit theoretical. Maybe a streaming MM or MM0
> might be implemented, so that the FPGA does not need to store the whole
> database in RAM, but can verify it while the host computer is sending it.

That's an interesting idea. One of the advantages of writing at a low level is that you can get a seriously small footprint, and I can believe that you can verify set.mm with only 128KB of RAM if you are clever. Of course, both MM and MM0 can handle streaming databases, since MM is a one-pass file format and MM0 has a state-transition-like proof file format (once I finish writing it...).
 
Mario

Olivier Binda

unread,
Mar 16, 2019, 5:54:11 PM3/16/19
to Metamath
This is some impressive stuff that you are doing. Wow.
A compiler that compiles a compiler that compiles a compiler that compiles linux... and everything in 8kb...
Gosh, the boasting rights ! ^^ 

G really looks like a perfect fit for a metamath (zero) verifier.

Giovanni Mascellani

unread,
Mar 17, 2019, 4:02:11 AM3/17/19
to meta...@googlegroups.com
Hi,

Il 16/03/19 22:54, Olivier Binda ha scritto:
> This is some impressive stuff that you are doing. Wow.
> A compiler that compiles a compiler that compiles a compiler that
> compiles linux... and everything in 8kb...
> Gosh, the boasting rights ! ^^ 

To be fair, 8KB is just the first compiler (in its assembled binary
form). All the rest is much bigger, but it enters the system in source
code form, so it is easier to check that it has no surprises. So, in a
sense, 8KB is the "trusted source", i.e., the binary blob that you have
to trust (or manually inspect with the Intel programming manuals at
hand) to, for example, be sure that no backdoor was left over from a
previous compilation.

Also, the Linux step is not achieved yet. I have a patched Linux version
that more or less compiles with tinycc (the compiler just before it),
but there are some tricks I have to put in place to correctly link it,
because tinycc is not as powerful as GCC at linking things.

Also, compiling Linux is not nearly enough for running Linux: the
userland has to be compiled as well, and that is still to do. Still,
overall, I cannot deny that what I have already done gives me some
satisfaction...

Olivier Binda

unread,
Apr 4, 2019, 11:38:32 AM4/4/19
to Metamath
I have lately implemented addition, subtraction, multiplication, division, power and factorial reduction for QQ, ZZ and NN0.

It allowed me to improve the API/methods I was using to do it. Now, the code for multiplication looks like this:

package org.lakedaemon.math.engine.trees.operations

import org.lakedaemon.kotlin.asInt
import org.lakedaemon.kotlin.debug
import org.lakedaemon.math.BigInteger
import org.lakedaemon.math.engine.*


/** Computes ( a x. b ) AND proves that
 * |- statement = reduceMultiplication(statement)
 *
 * It must only be called on a co node, and it ensures that it is only called on a co(x.) node */
fun Engine.reduceMultiplication(statement: Statement): Statement {
    // val statement = from(originalStatement)
    val (a, op, b) = statement.children
    // if ((!a.matchDigit() && !a.matchDecimalConstructor()) || (!b.matchDigit() && !b.matchDecimalConstructor())) return statement
    if (!op.equals(MM_MULTIPLICATION)) throw Exception("wrong operation : $op")
    return when {
        a.matchZero() -> statement.s(laMul02i)
        b.matchZero() -> statement.s(laMul01i)
        a.matchOne() -> statement.s(laMulid2i)
        b.matchOne() -> statement.s(laMulid1i)
        a.matchMultiplication() -> statement.a(Engine::reduceMultiplication).eqIfDifferent(statement, Engine::reduceMultiplication)
        b.matchMultiplication() -> statement.c(Engine::reduceMultiplication).eqIfDifferent(statement, Engine::reduceMultiplication)
        a.matchDigit() && b.matchDigit() -> statement.s(Engine::multiplies2digits)
        a.matchDigit() && b.matchDecimalConstructor() -> statement.s(laMulcomi).s(Engine::reduceMultiplication)
        a.matchDecimalConstructor() && b.matchDecimalConstructor() -> {
            val (b1, b2) = b.children
            /** b2 is supposed to be a digit */
            if (!b2.matchDigit()) return statement

            val sA = "( ∨1c x. ∨2c )".with(a, b1)
            val sB = "( ∨1c x. ∨2c )".with(a, b2)
            statement.s { result(neoremDec2Mul, a, b, "; ∨1c ∨2c".with(sA, statement0), sB, b1, b2) }
                .map(P.a.a to Engine::reduceMultiplication, P.c to Engine::reduceMultiplication)
                .s(Engine::reducePositiveIntegerAddition)
        }
        a.matchDecimalConstructor() && b.matchDigit() -> {
            val (a1, a2) = a.children
            /** a2 is supposed to be a digit */
            if (!a2.matchDigit()) throw Exception("$a2 is not a digit")
            val a2TimesB = multiplies2digits("( ∨1c x. ∨2c )".with(a2, b))
            if (!a2TimesB.matchDecimalConstructor()) {
                val upperResult = "( ∨1c x. ∨2c )".with(a1, b)
                statement.s { result(neoremDecMul, a, b, upperResult, a2TimesB, a1, a2) }.a(Engine::reduceMultiplication)
            } else {
                val (sA, sB) = a2TimesB.children
                val s1 = "( ∨1c x. ∨2c )".with(a1, b)
                val upperResult = "( ∨1c + ∨2c )".with(s1, sA)
                statement.s { result(neoremDecMulCarry, a, b, upperResult, sB, a1, a2, sA) }
                    .map(P.a.a to Engine::reduceMultiplication)
                    .a(Engine::reducePositiveIntegerAddition)
            }
        }
        a.matchMinus() && b.matchMinus() -> statement.s(laMul2negi).s(Engine::reduceMultiplication)
        a.matchMinus() -> statement.s(laMulneg1i).a(Engine::reduceMultiplication)
        b.matchMinus() -> statement.s(laMulneg2i).a(Engine::reduceMultiplication)
        (a.matchPositiveRational() || a.matchPositiveInteger()) && (b.matchPositiveRational() || b.matchPositiveInteger()) -> {
            // first simplify each fraction with normalized fractions
            val aNormalized = a.normalized
            val bNormalized = b.normalized
            // use ( n / 1 ) for integers
            val changedA = if (aNormalized.matchInteger()) aNormalized.s2(laDiv1i) else aNormalized
            val changedB = if (bNormalized.matchInteger()) bNormalized.s2(laDiv1i) else bNormalized
            val s1 = statement.map(P.a to { changedA }, P.c to { changedB })

            // then try to cancel factors
            // ( a / c ) x. ( b / d ) = ( ( a x. b ) / ( c x. d ) )
            val s2 = s1.s(laDivmuldivi)
            val (numA, denA) = aNormalized.rationalValue!!
            val (numB, denB) = bNormalized.rationalValue!!
            val g1 = numA.gcd(denB)
            val g2 = numB.gcd(denA)

            // cancel the common factor in numA and denB, if any
            val s3 = if (g1 != BigInteger.ONE) {
                val numA1 = numA.divide(g1)
                val denB1 = denB.divide(g1)
                val sG1 = g1.toStatement()
                s2.map(P.a to { it.a { reduceMultiplication("( ∨1c x. ∨2c )".with(numA1.toStatement(), sG1)).inv() }.s(laMul32i) },
                       P.c to { it.c { reduceMultiplication("( ∨1c x. ∨2c )".with(denB1.toStatement(), sG1)).inv() }.s2(laMulassi) })
                    .s(neoremDivcan5rd)
            } else s2

            // cancel the common factor in numB and denA, if any
            val s4 = if (g2 != BigInteger.ONE) {
                val denA2 = denA.divide(g2)
                val numB2 = numB.divide(g2)
                val sG2 = g2.toStatement()
                s3.map(P.a to { it.c { reduceMultiplication("( ∨1c x. ∨2c )".with(numB2.toStatement(), sG2)).inv() }.s2(laMulassi) },
                       P.c to { it.a { reduceMultiplication("( ∨1c x. ∨2c )".with(denA2.toStatement(), sG2)).inv() }.s(laMul32i) })
                    .s(neoremDivcan5rd)
            } else s3

            s4.map(P.a to Engine::reduceMultiplication, P.c to Engine::reduceMultiplication).s { it.normalized }
        }

        else -> statement
    }.reductionContract(statement)
}

private fun Engine.multiplies2digits(statement: Statement): Statement {
    val (a, _, b) = statement.children
    return when {
        a.matchZero() -> result(laMul02i, b)
        b.matchZero() -> result(laMul01i, a)
        a.matchOne() -> result(laMulid2i, b)
        b.matchOne() -> result(laMulid1i, a)

        else -> {
            // handle other digits
            val digitA = a.asInt()
            val digitB = b.asInt()
            val s = result(laMul(digitA, digitB))
            val s1 = if (digitA < digitB) result(laEqtri, statement, s, result(laMulcomi, a, b)) else s
            (if (digitA * digitB == 10) result(laEqtri, statement, result(laDec10), s1) else s1)
        }
    }
}
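For comparison, the recursion behind the DecMul/DecMulCarry cases is ordinary schoolbook long multiplication. A plain-integer Kotlin model of it (illustrative only, not the engine code): a decimal constructor "; a1 a2" has value 10*a1 + a2, and multiplying by a digit b produces a kept low digit (sB in the code above) plus a carry folded into the upper product (sA):

```kotlin
// Schoolbook multiplication of a digit list by a single digit, rightmost
// digit first, mirroring the DecMulCarry recursion: each digit product
// splits into a kept digit (p % 10) and a carry (p / 10) passed upward.
fun mulByDigit(digits: List<Int>, b: Int): List<Int> {
    var carry = 0
    val out = ArrayDeque<Int>()
    for (d in digits.asReversed()) {   // rightmost digit first, like a2
        val p = d * b + carry
        out.addFirst(p % 10)           // the kept digit (sB)
        carry = p / 10                 // folded into the upper product (sA)
    }
    if (carry > 0) out.addFirst(carry)
    return out.toList()
}
```

For example 27 x 8: the low step gives 7*8 = 56, so the kept digit is 6 with carry 5, and the upper step gives 2*8 + 5 = 21, yielding 216.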

I am quite happy with the API, but I foresee some further improvements there.
And if multi-receivers for methods, as in this proposal, ever get released in a future Kotlin version, there is hope of improving things a lot as well.

Now, I am mostly going to work on proving stuff like 

x e. ( 1 ... 5 ) -> someFormula = someValue
x =3 -> someFormula = someValue

(This is necessary for a lot of stuff, like sums, products, integration...)

and more generally towards turning a proof of |- A -> |- B into a proof of |- ( A -> B ), the Mario Carneiro way
(this is absolutely necessary to build a useful app for student-level mathematicians).

This should be a lot more challenging (but not too challenging, probably, so I still hope to succeed).


Filip Cernatescu

unread,
May 29, 2019, 4:01:24 AM5/29/19
to Metamath
Hello Olivier!

Good luck with your CAS implementation! It is a wonderful work!

But I have 2 questions for you:

1. The proofs of your calculations would be in formal math, not in classic math; do you think there would be a segment of the population that would need your proofs?
2. Lots of math formulas are non-existent in set.mm; how do you deal with that?

Thank you very much!
Filip

OlivierBinda

unread,
May 29, 2019, 12:00:13 PM5/29/19
to 'Filip Cernatescu' via Metamath

Hello Filip :)

On 29/05/2019 at 10:01, 'Filip Cernatescu' via Metamath wrote:
> Hello Olivier!
>
> Good luck with your CAS implementation! It is a wonderful work!
>
> But I have 2 questions for you:
>
> 1. The proofs of your calculations would be in formal math, not in classic math; do you think there would be a segment of the population that would need your proofs?

The segments of the population that I am targeting are:
- the average math student
- the professional math researcher

Basically, humans would enter the proof either

- through a well-thought-out UI that looks and behaves as much as possible like traditional/classic math
(which makes it usable for the average math student).

- optionally/additionally, through text input (for power users like professional mathematicians: they know what they are doing and it is faster this way)

Basically, at the moment, it works this way :

You may have hypotheses/antecedents like

x = -2
y e. ZZ

You start with a (valid) math statement like

( x x. ; 1 2 ) + y

and then you may point to (select) parts of the statement and apply operations

like reduction, commutativity, associativity, to transform the statement (with theorems, the way you want/can), and the API makes sure that, at the end, you get

- a formal proof of the resulting statement
- a formal proof of the transition

like |- ( x = 1 -> ( y e. ZZ -> ( x x. ; 1 2 ) + y = ; 1 2 + y ) )


It is still a work in progress though. At the moment, I'm refactoring the API to make it work nicely with antecedents. I have made some progress, but I still have work to do to make
the usual operations (+ - * /) work with rationals AND antecedents, after which I should get sum, product and function support quite effortlessly.

Once I get that to work, I'll :
1) work on the export step from my theorem format to mm (or mm0 ?).
Because I'm no longer using regular set.mm theorems (got to call too many of them to handle 1 operation with antecedents)

2) start building a prototype app and a UI :

to allow the first beta users to build formal proofs interactively and export them to mm (or mm0 ?)

> 2. Lots of math formulas are non-existent in set.mm; how do you deal with that?


At the moment, to develop my app,

I'm using my own theorems like this one

val A_plus_0 = tf("( ∨1c + 0 ) = ∨1c", "∨1c e. CC")

tf stands for "theorem family" because it also allows one to use it with antecedents like

|- ( v2w -> v1c e. CC )
gives
|- ( v2w -> ( v1c + 0 ) = v1c )

v1c is for class
v1s for set
v1w for wff

Also, it removes the need for variants:

|- v1c = v2c
|- v3c = v4c
gives
|- ; v1c v3c = ; v2c v4c

and likewise

|- v1c = v2c
gives
|- ; v1c v3c = ; v2c v3c

(you have to use such variants to lessen the huge amount of hypotheses that the system has to prove automatically to use the theorems)


such a theorem family could be easily proven (I believe) with the theorems that are in set.mm

But mm lacks variadic arguments, which makes it painful to work with antecedents, so this is the way I decided to work around that (my app would first create a proof relying on some theorem family and then prove all the specialized theorems it uses).
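To illustrate the idea (with invented names, not the actual tf implementation), a theorem family can be seen as a statement schema served either as a plain inference or in deduction form, where every hypothesis and the conclusion gain the same antecedent:

```kotlin
// Toy model of a "theorem family": one schema, two ways to instantiate it.
class TheoremFamily(val conclusion: String, val hypotheses: List<String>) {
    // inference form: |- hyp_i  =>  |- conclusion
    fun inference(): Pair<List<String>, String> = hypotheses to conclusion

    // deduction form: |- ( ant -> hyp_i )  =>  |- ( ant -> conclusion )
    fun deduction(ant: String): Pair<List<String>, String> =
        hypotheses.map { "( $ant -> $it )" } to "( $ant -> $conclusion )"
}
```

So the A_plus_0 family above, instantiated in deduction form with antecedent ∨2w, is exactly the "|- ( v2w -> v1c e. CC ) gives |- ( v2w -> ( v1c + 0 ) = v1c )" pattern.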


As soon as I get arithmetic operations to work nicely, I'll post a progress update



> Thank you very much!
> Filip

Thank you for the interest :)
Olivier

Olivier Binda

unread,
Jul 23, 2019, 12:39:25 PM7/23/19
to Metamath
Hello
 
I have just broken through the second milestone in the development for a CAS system over Metamath (or MM0).
Basically, now it can do this 
"sum_ ∨1s e. ( 1 ... 5 ) ∨1s".test("; 1 5")
"sum_ ∨1s e. ( 1 ... 5 ) ( ( 2 x. ∨1s ) + 1 )".test("; 3 5")
"sum_ ∨1s e. ( 1 ... 9 ) ( ∨1s ^ 2 )".test("; ; 2 8 5")
"sum_ ∨1s e. ( 1 ... 4 ) ( ( 1 / 2 ) ^ ∨1s )".test("( ; 1 5 / ; 1 6 )")
where test just applies a "reduce" method on the left term and checks that it leads to the second term (and proves it)

More generally, the nice API of Milestone 1 evolved to be able to handle statements with antecedents.
Basically, through the M2 API, you can :
1) take any statement like "A. v1s e. ZZ 0 < ( v1s ^ 2)"
2) apply some (or many) changes, like NN C_ ZZ
3) you get the statement "A. v1s e. NN 0 < ( v1s ^ 2)"
AND you have proved that |- ( A. v1s e. ZZ 0 < ( v1s ^ 2) -> A. v1s e. NN 0 < ( v1s ^ 2) )

EVEN if it has antecedents, it should handle those just fine (or if it still doesn't I'll fix it and it will)

For example, this test just does that (it also adds antecedents to see if it works with those)
"A. ∨1s e. ZZ 0 <_ ( ∨1s ^ 2 )".impTest("A. ∨1s e. NN 0 <_ ( ∨1s ^ 2 )") { it.imp(P.b + "|- NN C_ ZZ") }

But basically, what a developer needs in order to do steps 1, 2, 3 is only 1 line of code, which is
"A. ∨1s e. ZZ 0 <_ ( ∨1s ^ 2 )".toAssertion().imp(P.b + "|- NN C_ ZZ")
P.b identifies the node of the statement where the change should happen.
|- NN C_ ZZ tells the code what (proved) change should be used to transform the statement.
And the code does what is necessary to get the result AND the proof of the transformation.

Basically, you point to the place where you want to change something and you give the system the tiny step you want to apply.
The system figures out whether you get a ->, a <- or a <-> for the transformation
(you use eq for <->, imp for -> and limp for <-, if you want the system to check that you get the result that you expect).
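The way the system figures out the direction can be seen as polarity tracking: each descent into the antecedent of an implication, or under a negation, flips the direction of the rewrite. A minimal Kotlin sketch (my reading of the description above, not the actual Mephistolus code):

```kotlin
// FORWARD is ->, BACKWARD is <-, BOTH is <-> (equalities never flip).
enum class Dir { FORWARD, BACKWARD, BOTH }

fun flip(d: Dir): Dir = when (d) {
    Dir.FORWARD -> Dir.BACKWARD
    Dir.BACKWARD -> Dir.FORWARD
    Dir.BOTH -> Dir.BOTH
}

// path: one entry per ancestor on the way down to the edit position,
// true when that step enters a flipping position (antecedent, negation)
fun resultingDirection(stepDir: Dir, path: List<Boolean>): Dir =
    path.fold(stepDir) { d, flips -> if (flips) flip(d) else d }
```

For instance, applying a forward step |- 1 < 2 underneath a single negation comes out backward, which is why strengthening -. 3 < 2 to -. 3 < 1 is a limp (<-) transformation.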

some other examples (in test form)
"( ( 1 + 2 ) = 3 /\\ ( 2 + 3 ) = 5 )".eqTest("( 3 = 3 /\\ 5 = 5 )") { it.eq4(P.a.a * add(1, 2), P.b.a * add(2, 3)) }

assume("( ∨3w -> ( ∨6w -> ∨4w ) )")
assume("( ( ∨3w -> ∨6w ) -> ( ∨5w <-> ∨5w ) )")
"( ( ∨3w -> ∨4w ) -> ∨5w )".impTest("( ( ∨3w -> ∨6w ) -> ∨5w )") { it.imp(P.a.b + "( ∨6w -> ∨4w )") }

"-. 3 < 1".limpTest("-. 3 < 2") { it.limp(P.a.c + "|- 1 < 2") }

Applying a theorem is quite simple, as the system is usually able to infer most arguments from the statement and the place where you want to apply it.

For example, applying ( a * c / b * c ) = ( a / b ) requires:
- nothing if you apply it on ( 2 * 3 / 5 * 3 ) -> a = 2, c = 3, b = 5

- 1 argument if you apply it on ( 2 / 5 ) -> a = 2, b = 5, but what do you want to take for c ?




The API and the services that make it work (checking which sets statements belong to, checking that something is =/= 0, ...)
work quite nicely and make me think that I'll be able to provide a nice user experience for doing maths through metamath,
especially since it will be possible to code a nice UI that makes things simpler for the user (like providing suggestions of steps, theorems to apply, ...).


I have not decided yet on what I want to work next.
My options are : 
A) export proofs to metamath
B) research sequences and functions
C) try to write some simple and useful proofs
D) Improve the work already done on rationals and sums (like enabling changing the set variable used in sums...)
E) working on real and complex stuff; this requires more than reducing (like factoring, expanding, grouping...)

I think that it would be a waste to do A before C, but I fear that I won't find simple and useful proofs before I do B.

In order to develop the system further, I would really love to have some (simple) target theorems that it should be able to prove.

Could someone please suggest some simple theorems I could aim to prove with my code?

Best regards,
Olivier

Olivier Binda

unread,
Jul 24, 2019, 9:22:07 AM7/24/19
to meta...@googlegroups.com
I went for step C. I'm not very good at explaining what I want to achieve, but I may succeed with the following example.

I want to build a system (probably named "Mephistolus") that makes it easy for humans (math students) to compute expressions and write computer-checked maths proofs.
Here is my first go at writing a simple and useful proof (inspired by a basic exercise I give to my 1st year students):

proving that ( A e. CC -> ( C e. NN0 -> ( ( 1 - A ) x. sum_ k e. ( 0 ... C ) ( A ^ k ) ) = ( 1 - ( A ^ ( C + 1 ) ) ) ) )
// the comments are only here to help follow the computations and understand what the theorems do

/** proving that ( A e. CC -> ( C e. NN0 -> ( ( 1 - A ) x. sum_ k e. ( 0 ... C ) ( A ^ k ) ) = ( 1 - ( A ^ ( C + 1 ) ) ) ) )
* or with A=∨1c, C=∨3c, k=∨2s
* ( ∨1c e. CC -> ( ∨3c e. NN0 -> ( ( 1 - ∨1c ) x. sum_ ∨2s e. ( 0 ... ∨3c ) ( ∨1c ^ ∨2s ) ) = ( 1 - ( ∨1c ^ ( ∨3c + 1 ) ) ) ) ) */
val assertion = "( ( 1 - ∨1c ) x. sum_ ∨2s e. ( 0 ... ∨3c ) ( ∨1c ^ ∨2s ) )".toAssertion("∨1c e. CC", "∨3c e. NN0")
val result = "( 1 - ( ∨1c ^ ( ∨3c + 1 ) ) )".toStatement()
val step = "( ∨1c e. CC -> ( ∨3c e. NN0 -> ( ( 1 - ∨1c ) x. sum_ ∨2s e. ( 0 ... ∨3c ) ( ∨1c ^ ∨2s ) ) = ( 1 - ( ∨1c ^ ( ∨3c + 1 ) ) ) ) )".toStatement()
val proof = assertion
.s(A_times_Sum)
// sum_ ∨2s e. ( 0 .. ∨3c ) ( ( 1 - ∨1c ) x. ( ∨1c ^ ∨2s ) )
.c(A_times_B_com)
// sum_ ∨2s e. ( 0 .. ∨3c ) ( ( ∨1c ^ ∨2s ) x. ( 1 - ∨1c ) )
.c(A_times_BsubC)
// sum_ ∨2s e. ( 0 .. ∨3c ) ( ( ( ∨1c ^ ∨2s ) x. 1 ) - ( ( ∨1c ^ ∨2s ) x. ∨1c ) )
.eq(P.c.a.th(A_times_1), P.c.c.th(A_exp_Bplus1)).s(Sum_AsubB)
// sum_ ∨2s e. ( 0 .. ∨3c ) ( ( ∨1c ^ ∨2s ) - ( ∨1c ^ ( ∨2s + 1 ) ) )
.s(Sum_AsubB)
// ( sum_ ∨2s e. ( 0 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨2s e. ( 0 .. ∨3c ) ( ∨1c ^ ( ∨2s + 1 ) ) )
.c(Sum_AplusB_in) { arrayOf("∨5s".toStatement(), statement1) }
// ( sum_ ∨2s e. ( 0 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨5s e. ( ( 0 + 1 ) .. ( ∨3c + 1 ) ) ( ∨1c ^ ∨5s ) )
.eq(P.c.b.a.op(Engine::reduceAddition))
// ( sum_ ∨2s e. ( 0 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨5s e. ( ( 1 .. ( ∨3c + 1 ) ) ( ∨1c ^ ∨5s ) )
.eq(P.a.th(firstA_plus_Sum_ofA), P.c.th(Sum_ofA_plus_lastA))
// ( ( ( ∨1c ^ 0 ) + sum_ ∨2s e. ( ( 0 + 1 ) .. ∨3c ) ( ∨1c ^ ∨2s ) ) - ( sum_ ∨5s e. ( ( 1 .. ( ( ∨3c + 1 ) - 1 ) ) ) ( ∨1c ^ ∨5s ) ) + ( ∨1c ^ ( ∨3c + 1 ) ))
.eq(
P.a.a.op(Engine::reduce),
P.a.c.b.op(Engine::reduce),
P.c.b.c.th(A_plus_B_sub_C_ass))
// ( ( 1 + sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) ) - ( sum_ ∨5s e. ( ( 1 .. ( ∨3c + ( 1 - 1 ) ) ) ( ∨1c ^ ∨5s ) ) + ( ∨1c ^ ( ∨3c + 1 ) ))
.eq(P.c.b.c.op(Engine::reduce))
// ( ( 1 + sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) ) - ( sum_ ∨5s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨5s ) ) + ( ∨1c ^ ( ∨3c + 1 ) ) )
.s(A_plus_B_sub_C_ass)
// ( 1 + ( sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) - ( sum_ ∨5s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨5s ) ) + ( ∨1c ^ ( ∨3c + 1 ) ) )
.c(A_sub_BplusC_ass)
// ( 1 + ( ( sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨5s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨5s ) ) - ( ∨1c ^ ( ∨3c + 1 ) ) )
.eq(P.c.a.c.th(Sum_AswitchB_in) { arrayOf("∨2s".toStatement(), "( ∨1c ^ ∨2s )".toStatement()) })
// ( 1 + ( ( sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) ) - ( ∨1c ^ ( ∨3c + 1 ) ) )
.eq(P.c.op(Engine::reduce))
// ( 1 - ( ∨1c ^ ( ∨3c + 1 ) ) )
/** done */
assertEquals(assertion.origin, proof.origin)
assertEquals(result, proof.result)
assertEquals(step, proof.step)
assertTrue(isProven(proof.step))

Next, I'll try to make this test pass (I haven't run it yet)



And when it runs, I'll start working on step A (export of the Mephistolus proofs to Metamath proofs).
That way, it won't just be vapourware and it'll maybe start being useful to some people.

Best regards,
Olivier 

Olivier Binda

unread,
Aug 4, 2019, 1:41:06 PM8/4/19
to meta...@googlegroups.com
I managed to pass the aforementioned (slightly changed) test

/** proving that ( ∨1c e. CC -> ( ∨3c e. NN0 -> ( ( 1 - ∨1c ) x. sum_ ∨2s e. ( 0 .. ∨3c ) ( ∨1c ^ ∨2s ) ) = ( 1 + -u ( ∨1c ^ ( ∨3c + 1 ) ) ) ) ) */
val assertion = "( ( 1 - ∨1c ) x. sum_ ∨2s e. ( 0 ... ∨3c ) ( ∨1c ^ ∨2s ) )".toAssertion("∨1c e. CC", "∨3c e. NN0")
val result = "( 1 + -u ( ∨1c ^ ( ∨3c + 1 ) ) )".toStatement()
val step = "|- ( ∨1c e. CC -> ( ∨3c e. NN0 -> ( ( 1 - ∨1c ) x. sum_ ∨2s e. ( 0 ... ∨3c ) ( ∨1c ^ ∨2s ) ) = ( 1 + -u ( ∨1c ^ ( ∨3c + 1 ) ) ) ) )".toStatement()


val proof = assertion
.s(A_times_Sum) // sum_ ∨2s e. ( 0 .. ∨3c ) ( ( 1 - ∨1c ) x. ( ∨1c ^ ∨2s ) )
.c(A_times_B_com) // sum_ ∨2s e. ( 0 .. ∨3c ) ( ( ∨1c ^ ∨2s ) x. ( 1 - ∨1c ) )
.c(A_times_BsubC) // sum_ ∨2s e. ( 0 .. ∨3c ) ( ( ( ∨1c ^ ∨2s ) x. 1 ) - ( ( ∨1c ^ ∨2s ) x. ∨1c ) )
    .eq(P.c.a.th(A_times_1), P.c.c.th(A_exp_Bplus1)) // sum_ ∨2s e. ( 0 .. ∨3c ) ( ( ∨1c ^ ∨2s ) - ( ∨1c ^ ( ∨2s + 1 ) ) )
    .s(Sum_AsubB) // ( sum_ ∨2s e. ( 0 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨2s e. ( 0 .. ∨3c ) ( ∨1c ^ ( ∨2s + 1 ) ) )
    .c(Sum_AplusB_in) { arrayOf("∨5s".toStatement(), statement1, "( ∨1c ^ ( ( ∨5s - 1 ) + 1 ) )".toStatement()) } // ( sum_ ∨2s e. ( 0 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨5s e. ( ( 0 + 1 ) .. ( ∨3c + 1 ) ) ( ∨1c ^ ( ( ∨5s - 1 ) + 1 ) ) )
.eq(P.c.b.a.op(Engine::reduceAddition)) // ( sum_ ∨2s e. ( 0 ... ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨5s e. ( 1 ... ( ∨3c + 1 ) ) ( ∨1c ^ ( ( ∨5s - 1 ) + 1 )
.eq(P.c.c.c.th(AsubB_plus_C_rot)) // ( sum_ ∨2s e. ( 0 ... ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨5s e. ( 1 ... ( ∨3c + 1 ) ) ( ∨1c ^ ( ∨5s - ( 1 - 1 ) ) )
.eq(P.c.c.op(Engine::reduce)) // ( sum_ ∨2s e. ( 0 ... ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨5s e. ( 1 ... ( ∨3c + 1 ) ) ( ∨1c ^ ∨5s ) )
.eq(P.a.th(firstA_plus_Sum_of) { arrayOf("( ∨1c ^ 0 )".toStatement()) }, P.c.th(Sum_ofA_plus_lastA) { arrayOf("( ∨1c ^ ( ∨3c + 1 ) )".toStatement()) }) // ( ( ( ∨1c ^ 0 ) + sum_ ∨2s e. ( ( 0 + 1 ) ... ∨3c ) ( ∨1c ^ ∨2s ) ) - ( sum_ ∨5s e. ( 1 ... ( ( ∨3c + 1 ) - 1 ) ) ( ∨1c ^ ∨5s ) + ( ∨1c ^ ( ∨3c + 1 ) ) ) )
.eq(P.a.a.op(Engine::reduce), P.a.c.b.op(Engine::reduce), P.c.a.b.c.th(AplusB_sub_C_ass)) // ( ( 1 + sum_ ∨2s e. ( 1 ... ∨3c ) ( ∨1c ^ ∨2s ) ) - ( sum_ ∨5s e. ( 1 ... ( ∨3c + ( 1 - 1 ) ) ) ( ∨1c ^ ∨5s ) + ( ∨1c ^ ( ∨3c + 1 ) ) ) )
.eq(P.c.a.b.c.op(Engine::reduce)) // ( ( 1 + sum_ ∨2s e. ( 1 ... ∨3c ) ( ∨1c ^ ∨2s ) ) - ( sum_ ∨5s e. ( 1 ... ∨3c ) ( ∨1c ^ ∨5s ) + ( ∨1c ^ ( ∨3c + 1 ) ) ) )
.s(AplusB_sub_C_ass) // ( 1 + ( sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) - ( sum_ ∨5s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨5s ) ) + ( ∨1c ^ ( ∨3c + 1 ) ) )
    .c(A_sub_BplusC_ass) // ( 1 + ( ( sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨5s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨5s ) ) - ( ∨1c ^ ( ∨3c + 1 ) ) )
    .eq(P.c.a.c.th(Sum_AswitchB_i) { arrayOf("∨2s".toStatement(), "( ∨1c ^ ∨2s )".toStatement()) }) // ( 1 + ( ( sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) - sum_ ∨2s e. ( 1 .. ∨3c ) ( ∨1c ^ ∨2s ) ) - ( ∨1c ^ ( ∨3c + 1 ) ) )
.eq(P.c.op(Engine::reduce)) // ( 1 + -u ( ∨1c ^ ( ∨3c + 1 ) ) )
/** done */
assertEquals(assertion.origin, proof.origin)
assertEquals(result, proof.result)
assertEquals(step, proof.step)
assertTrue(isProven(proof.step)) 


I had to improve a few parts of the system (including the part tracking in which "closure" (NN0, NN, ZZ, QQ, RR, CC) things are),
and this test very efficiently showed places where things were lacking and deserved better handling.

Now that I have a Mephistolus proof for a really interesting, high-level, real-world exercise,
I am going to work on:
1) exporting Mephistolus proofs to Metamath (this requires bootstrapping Mephistolus theorems from Metamath ones)
2) writing other Mephistolus proofs for real-world, university-level maths, so that I can test, debug and progressively enhance the system (hopefully, writing them will become faster and faster).

This being said, I am quite happy with this result, because I posted 1 year ago in this group to ask if the people here thought that it was possible to
do real maths in a nice, human-friendly way, and I got my answer. It is definitely possible (but it requires quite a lot of man-months worth of work to implement).

The above test looks quite like real maths (especially if you consider that the Mephistolus references for the theorems used will be settable by users (with sensible defaults), so that everyone can decently use stuff like A + B = B + A with a label/id they like, in their own language).

But I think that it should be possible to enhance the way the proof is written slightly further.
For example, when the last term of the sum is taken out, the system should be able to compute what the value should be.
So, there should be no need to specify "( ∨1c ^ ( ∨3c + 1 ) )".toStatement()

Asking the user only for values/parts that the system cannot compute should make things a lot easier for the human user.

Also, with some enhancements, the above proof can be shortened of a few steps (only the important stuff should remain)

As I am not an mmj2 user, I was wondering how Mephistolus fares against mmj2.
How hard is it to write the above proof in mmj2? How many steps does it require?

Best regards,
Olivier

Olivier Binda

unread,
Aug 9, 2019, 4:43:38 PM8/9/19
to Metamath
I might have just produced my first correct metamath proof from a Mephistolus proof

When trying to prove ⊢ ( ( 1 + 1 ) + 1 ) = 3
The following test code 
val step = "( ( 1 + 1 ) + 1 )".toAssertion().s(Engine::reduce).step
val proof = Mephistolus2Metamath.mmProofFor(step, this@with)
MetamathProofChecker.process("test", NormalizedTheorem(step, listOf(), 0, setOf()), proof)
succeeds and produces the (uncompressed) metamath proof:
c1
c1
caddc
co
c1
caddc
co
c2
c1
caddc
co
c3
c1
c1
caddc
co
c2
c1
caddc
1p1e2apr1   (<- I know that it is deprecated but it should do for the moment)
oveq1i
2p1e3
eqtri

Could someone proof-check it with another metamath proof checker? (I recycled a proof checker that I implemented a few months ago; there might be bugs, as the code has been lying untended for a long time.)

I'll now work on :
0) improving the coverage of tests for proof conversion
1) compressing metamath proofs
2) Improving the range of Mephistolus theorems converted to Metamath.

This might not be hard, as Mephistolus theorems are just slightly different from metamath ones
(one difference: I'm pretty sure that x. in Mephistolus will only be allowed for complex numbers, not for any class expression).
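On point 1, compressed Metamath proofs are the normal label list plus an integer encoding described in Appendix B of the Metamath book: the final base-20 digit of each step number is written with A-T, leading base-5 digits with U-Y, and Z tags a step for later reuse. A small Kotlin sketch of just the number encoding:

```kotlin
// Encode a 1-based step number in Metamath compressed-proof notation:
// A-T carry the final base-20 digit, U-Y carry leading base-5 digits.
fun encodeCompressed(n: Int): String {
    require(n >= 1)
    var m = n - 1
    val sb = StringBuilder().append('A' + m % 20)
    m /= 20
    while (m > 0) {
        sb.insert(0, 'U' + (m - 1) % 5)
        m = (m - 1) / 5
    }
    return sb.toString()
}
```

So steps 1-20 are A-T, step 21 is UA, step 120 is YT, step 121 is UUA, and so on.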

Jim Kingdon

unread,
Aug 9, 2019, 5:03:14 PM8/9/19
to meta...@googlegroups.com
Here's what I did. I put the following in set.mm (right before
"Appendix: Typesetting definitions", although the exact location doesn't
matter much):

  mephistolus1 $p |- ( ( 1 + 1 ) + 1 ) = 3 $=
    c1 c1 caddc co c1 caddc co c2 c1 caddc co c3 c1 c1 caddc co c2 c1 caddc
    1p1e2apr1 oveq1i 2p1e3 eqtri $.

and then I ran metamath.exe and gave it the following commands:

read set.mm

verify proof mephistolus1

whereupon it printed "mephistolus1", which means it verified (an error
would have printed an error message).

Exciting to see Mephistolus generating valid proofs!

Olivier Binda

unread,
Aug 10, 2019, 5:24:50 PM8/10/19
to Metamath
Thanks for the feedback. It works! :)

I see that I should also output the statement and hypotheses along with the proof, so that it can be properly copy-pasted/proof-checked.
I should also put in an option to either output lemmas or to inline them in one big proof.

I'll now shut up, work some more and only report back when I have more interesting proofs converted.