SMM3 proof assistant design

Mario Carneiro

unread,

Jul 14, 2016, 9:44:57 AM7/14/16

to metamath

Hi All,

I'd like to open up discussion on the direction of the proof assistant SMM3, a.k.a smetamath-rs (https://github.com/sorear/smetamath-rs). It is nearly feature complete in other areas (and it has already delivered unprecedented verification times for set.mm), but the proof assistant is still a big question mark.

The program is designed to work as a back end, so text editing facilities can be provided through integration with an existing IDE. Beyond that, how should proof input look? In the most conservative design, it could just copy mmj2. That is, it reads an .mmp file, and validates and updates it (with other behaviors adjusted via command line options). This would give a very similar experience to the ctrl-U editing style in mmj2.

Looking at the way other programming languages and proof assistants approach the problem, though, makes me think that having the user and computer edit the same text is problematic, and something similar to a "proof script" is easier for authors. I've talked about this somewhat in https://groups.google.com/forum/#!msg/metamath/ZlRle52FVO0/94ZTXaAfCQAJ , so let me propose a slightly different idea.

The inspiration this time is Haskell. Treat a proof as a datatype, and theorems as functions on this datatype. This idea is nothing new; in fact it is the Curry-Howard isomorphism (https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence). The arguments to the function are the hypotheses, and grouping is optionally indicated with parentheses. Type ascription (that is, indicating that an expression has a given type, or in Metamath parlance, that a step evaluates to a given expression) is indicated by writing the turnstile, followed by the metamath formula. Grammar parsing is used so that the end of the expression does not otherwise need to be marked, and wildcards are allowed (probably a single $ mark, similar to MM-PA dummy vars).

An example:

$p id1 |- ( ( ph /\ ps /\ ch ) -> th )

$e .1 |- ( ph -> ( ps -> ( ch -> th ) ) )

sylbi df-3an imp31 .1

The first two lines, supplied by the assistant, define the theorem being proven and establish local names for the hypotheses. In particular, "name.n" style hypotheses can be referred to simply as ".n" if this is unambiguous. The rest is the proof itself, as might be written by the user. The above is an extremely terse style, and there are more long winded ways to write it:

(sylbi (df-3an) (imp31 .1))

Note that parentheses are evaluated lisp-style, that is, they surround the entire expression, including the function being applied. So this would not be legal:

sylbi (df-3an imp31 .1)

because df-3an does not take an argument. With type ascriptions:

sylbi |- ( ( ph /\ ps /\ ch ) -> th )

df-3an |- ( ( ph /\ ps /\ ch ) <-> ( ( ph /\ ps ) /\ ch ) )

(imp31 |- ( ( $ /\ ch ) -> th )
.1 |- ( ph -> ( ps -> $ ) ) )

Spaces are required within and immediately after a metamath string, but not otherwise. The dollar is a wildcard that represents a well-formed expression of unknown type (I'm not sure if this can cause ambiguity in parsing the end of the expression, but I think not). Judicious use of partial type ascriptions allow you to give names to dummy variables in the proof.

The dollar can also be used where a proof is expected, in which case it indicates an incomplete proof. So:

sylbi df-3an $

is a valid but incomplete proof. A dollar suffix can be used after a theorem to mark all of its 0 or more hypotheses as incomplete. Thus "sylbi df-3an$ imp31 .1" is parsed as (sylbi df-3an$ (imp31 .1)), which is actually a complete proof because df-3an has no hyps, while "sylbi df-3an imp31$" is parsed as (sylbi df-3an imp31$) which expands to (sylbi df-3an (imp31 $)), and similarly "sylbi$" becomes (sylbi $ $).

This covers the basic language. The other half of this, which is needed so that the script doesn't become impossible to read, is constant feedback from the compiler about errors in the file. This would probably manifest as red underlines on any syntax errors first, then red underlines on the dollars in the proof which would print the expected type of the hole (i.e. the incomplete proof that should go here). In type ascriptions, the dollars would be green (info) underlined, and would report the calculated expression that fills the hole.

For extra automation, there are two mechanisms. First, if the program can guess the proof at a hole, it will do so silently, marking the dollar with an info underline as in the case of holes in the expression. Second, for more directed hints we have named macros (TBD) which are marked with a leading !. For example,

$p 1kp2ke3k

!arith

might run the arithmetic prover and either complete the proof, or else the macro would be marked with an error and would report the problem, and the desired goal.

To avoid duplication, one of the first implemented "auto proof" methods will be to prove any statement that appears earlier or later in the same proof. For example:

$p pm1.2 |- ( ( ph \/ ph ) -> ph )

jaoi |- ( ( ph \/ ph ) -> ph )
id |- ( ph -> ph )
$ |- ( ph -> ph )

Here it will notice that the hole has the same type as a different step in the proof, and will automatically duplicate the subtree (internally it will be a pointer to the same proof). In fact, I'm of half a mind to just disallow distinct proofs of the same statement, because that is always inefficient, and it allows me to internally hash by type. That is, the proof

jaoi id (mpd ax-1 ax-1)

will not be accepted, saying an error like "step proven twice; use $ instead", although "jaoi id id" is okay (because this doesn't have the same problems with compression).

These rules are primarily intended for efficient authoring in the context of set.mm, although they will work okay for other grammatical systems. The biggest potential problem in other databases is the behavior of the $ in expressions, and the lack of an end delimiter for expressions. set.mm has the property that the expression has a definite end, so a mechanism like this is feasible; in other grammars, as a fallback, I can support ` backtick ` quotes like in math mode comments.

I really need some feedback from users on what kind of interface would be nice to use. I have too many ideas and not enough direction, and it's a big decision that affects the future of SMM3 as a useful proof assistant.

Mario

Mario Carneiro

unread,

Jul 14, 2016, 9:58:23 AM7/14/16

to metamath

Also, for forward proving, it is possible to make definitions. For example:

$p pm1.2

= my-id id

jaoi my-id id

The expression "= my-id id" assigns a proof tree to a local name, which can collide with symbol names but not labels. They are referred to by name. They can appear anywhere where an expression is expected, and they parse like let-bindings in Haskell. Unlike Haskell, these are not scoped or ordered; explicit cycle checking is used to ensure that definitions are not recursive.

David A. Wheeler

unread,

Jul 14, 2016, 12:42:12 PM7/14/16

to metamath

On Thu, 14 Jul 2016 09:44:54 -0400, Mario Carneiro <di....@gmail.com> wrote:
> Hi All,
>
> I'd like to open up discussion on the direction of the proof assistant
> SMM3, a.k.a smetamath-rs (https://github.com/sorear/smetamath-rs). It is
> nearly feature complete in other areas (and it has already delivered
> unprecedented verification times for set.mm), but the proof assistant is
> still a big question mark.
>
> The program is designed to work as a back end, so text editing facilities
> can be provided through integration with an existing IDE. Beyond that, how

> should proof input look? ...

I need to spend more time to grok this.

However, there are real advantages to mmj2's current "control-U" behavior.

I've used proof script tools (PVS, Coq, etc.), and they are often
too inflexible about the order of expressions that you have to prove.

In contrast, I find mmj2's approach very freeing.
Once I load the qed, I often first
load up expressions that I expect that I *might* need
(e.g., some definitions) - without knowing exactly how
or when I'll need them.

I also typically do "forward reasoning" by starting at a step,
make a small change, and then see if the tool can find a match
with an existing rule. Given 18K proofs, I can't memorize them all,
so I find it's faster to simply enter a small next step (as a tweaked expression)
and let the *tool* find the step. The current mmj2 behavior is awesome
since in many cases I know what the next step's expression (reasoning forward)
will be, I just don't remember what the rule is.

Indeed, I'd like the ability to do this going backwards: I know that expression X
should be preceded by expression Y - can it list what rules would cause that?

If the *only* way to input things is a proof script, I worry that you'll
lose the ability to start & move in all sorts of ways.
That flexibility is very helpful.

Having the *ability* to support proof scripts is very useful. However, it
should support *much* more than just direct invocations of a given theorem.
All proof script systems I'm aware of have a large library of automated reasoning
functions, each of which apply some different approach to attempt to
find the proof. Proof scripts are especially useful if what you're proving changes
(this is important for proving programs).

It seems to me you could support both.

Perhaps more importantly, you may want to focus on creating a nice library
that's easy to invoke to do various jobs. Then you can build all sorts of things
on top. In the longer term I think that you want a neural network - I think
a neural network could be trained to do a *lot* of automation.

--- David A. Wheeler

Mario Carneiro

unread,

Jul 14, 2016, 2:03:32 PM7/14/16

to metamath

On Thu, Jul 14, 2016 at 12:42 PM, David A. Wheeler <dwhe...@dwheeler.com> wrote:

On Thu, 14 Jul 2016 09:44:54 -0400, Mario Carneiro <di....@gmail.com> wrote:
> Hi All,
>
> I'd like to open up discussion on the direction of the proof assistant
> SMM3, a.k.a smetamath-rs (https://github.com/sorear/smetamath-rs). It is
> nearly feature complete in other areas (and it has already delivered
> unprecedented verification times for set.mm), but the proof assistant is
> still a big question mark.
>
> The program is designed to work as a back end, so text editing facilities
> can be provided through integration with an existing IDE. Beyond that, how
> should proof input look? ...

I need to spend more time to grok this.

However, there are real advantages to mmj2's current "control-U" behavior.

I've used proof script tools (PVS, Coq, etc.), and they are often
too inflexible about the order of expressions that you have to prove.

In contrast, I find mmj2's approach very freeing.
Once I load the qed, I often first
load up expressions that I expect that I *might* need
(e.g., some definitions) - without knowing exactly how
or when I'll need them.

Indeed, I want to be able to support both forward and reverse proving using this method. The main difference is that the computer won't be reformatting your code every few seconds (although there may be options for that as well, if you want).

I also typically do "forward reasoning" by starting at a step,
make a small change, and then see if the tool can find a match
with an existing rule. Given 18K proofs, I can't memorize them all,
so I find it's faster to simply enter a small next step (as a tweaked expression)
and let the *tool* find the step. The current mmj2 behavior is awesome
since in many cases I know what the next step's expression (reasoning forward)
will be, I just don't remember what the rule is.

Indeed, I'd like the ability to do this going backwards: I know that expression X
should be preceded by expression Y - can it list what rules would cause that?

If the *only* way to input things is a proof script, I worry that you'll
lose the ability to start & move in all sorts of ways.
That flexibility is very helpful.

Having the *ability* to support proof scripts is very useful. However, it
should support *much* more than just direct invocations of a given theorem.
All proof script systems I'm aware of have a large library of automated reasoning
functions, each of which apply some different approach to attempt to
find the proof. Proof scripts are especially useful if what you're proving changes
(this is important for proving programs).

It seems to me you could support both.

Let's use your recent video (which uses both forward and reverse proving) as an example for how to type in this new style.

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )
[$]

This is the starting point. The brackets here indicate that the dollar is red-underlined, and is reporting "incomplete proof: |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )".

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt[$]

Now it is reporting

> 1. incomplete: |- ( &S1 = A -> &C1 = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

> 2. incomplete: |- sgn = ( &S1 e. RR* |-> &C1 )
> 3. incomplete: |- if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) e. _V

which indicates that there are three subgoals hidden by the dollar and they are all incomplete. Next we add df-sgn:

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt [$] df-sgn [$]

> incomplete: |- ( &S1 = A -> if ( &S1 = 0 , 0 , if ( &S1 < 0 , -u 1 , 1 ) ) =
if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

> incomplete: |- if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) e. _V

It is now reporting two errors at the two dollar locations. Next we want to manually select the set variable &S1, which means a partial type ascription:

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt {$}
df-sgn |- {$} = ( x e. RR* |-> {$} )
[$]

We can of course fill in the equation as much or as little as we want, although in this case we at least have to get down to the mapping variable so we can write "x" there. Now the first $ is highlighted green (indicated with braces here), indicating that it has found the solution. (Actually if I were writing this today I would have had it solve this goal even when it was a metavariable, but let's stick to mmj2-level automation capabilities.) The four dollars now report, respectively:

> (ifbieq2d eqeq1 (ifbid breq1))

> sgn
> if ( x = 0 , 0 , if ( x < 0 , -u 1 , 1 ) )

> incomplete: |- if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) e. _V

In order to replicate the behavior of the JS macro, which does not produce complete proofs (in this case blocked by the lack of theorems c0ex and 1ex), let's say it is called !rec:

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt {$}
df-sgn |- {$} = ( x e. RR* |-> {$} )

!rec[$]

The dollar suffix reports:

> 1. incomplete: |- 0 e. _V

> 2: incomplete: |- 1 e. _V

In other words, the macro has the same signature as a hypothetical theorem with two hypotheses |- 0 e. _V and |- 1 e. _V, and outputting |- if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) e. _V .

Next, we do a forward proving step. We want to type |- 0 e. CC, have it find the theorem, then use that to finish the missing step.

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt {$}
df-sgn |- {$} = ( x e. RR* |-> {$} )

= 100 {$} |- 0 e. CC

!rec[$]

I'm defining a new local step called "100", providing a type ascription, an skipping the proof by writing $. It is highlighted green because it was able to find a proof, and the last two dollars now report

> 0cn

> 1. (elexi 100)

> 2: incomplete: |- 1 e. _V

The macro still has two subgoals, but one of them was automatically proven since the step 100 is now in scope. We do the same for 1:

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt {$}
df-sgn |- {$} = ( x e. RR* |-> {$} )

= 100 {$} |- 0 e. CC

= 200 {$} |- 1 e. CC

!rec{$}

and now the proof is complete. (I'm still TBD on what to do with the completed proof, but there will probably be commands to update the .mm file.) I think that an "expand at point" IDE command would also be useful here, which would perform a (very localized) edit of the file at the dollar at the cursor, to add a solved gap into the proof script, or expand the result of a macro. That way it's not hidden in the bowels of the computer.

Proof 2:

$p sgn0ALT |- ( sgn ` 0 ) = 0
[$]

> incomplete: |- ( sgn ` 0 ) = 0

We start out with a forward proof:

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 100 sgnval

[$]

It's not reporting anything different right now, because it doesn't show all the statements unless you tell it to. Let's see that statement:

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 100 sgnval |- {$}

[$]

We use type ascription to do the job, and it is reporting

> ( &C1 e. RR* -> ( sgn ` &C1 ) = if ( &C1 = 0 , 0 , if ( &C1 < 0 , -u 1 , 1 ) ) )

at the first dollar now. You don't want any metavariables in the video, so you replace it with 0:

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 100 sgnval |- ( 0 e. RR* -> {$} )

[$]

> ( sgn ` 0 ) = if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) )

Next is another forward proof:

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 80 {$} |- 0 e. RR*

= 100 sgnval |- ( 0 e. RR* -> {$} )

[$]

The first dollar reports "0xr". Next we apply ax-mp (except you don't want to write that so you write the formula instead):

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 80 $ |- 0 e. RR*

= 100 sgnval |- ( 0 e. RR* -> $ )

= 120 ($ 80 100) |- ( sgn ` 0 ) = if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) ) [)]

$

Oops, all the errors are gone except for one on the last parenthesis: syntax error! Okay, fixed it:

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 80 {$} |- 0 e. RR*

= 100 sgnval |- ( 0 e. RR* -> {$} )

= 120 ({$} 80 100) |- ( sgn ` 0 ) = if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) )

[$]

The dollar after 120 now reports "ax-mp". Note that the parentheses around ($ 80 100) are required, in order to force it to look for theorems using 80 and 100, instead of a zero hyp theorem followed by unrelated expressions.

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 80 {$} |- 0 e. RR*

= 100 sgnval |- ( 0 e. RR* -> {$} )

= 120 ({$} 80 100) |- ( sgn ` 0 ) = if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) )

= 200 {$} |- 0 = 0
[$]

Yup, "eqid".

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 80 {$} |- 0 e. RR*

= 100 sgnval |- ( 0 e. RR* -> {$} )

= 120 ({$} 80 100) |- ( sgn ` 0 ) = if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) )

= 200 {$} |- 0 = 0
= 300 iftrue
[$]

after trying to type this by formula, you give up and write the theorem name. Inserting the subexpression:

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 80 {$} |- 0 e. RR*

= 100 sgnval |- ( 0 e. RR* -> {$} )

= 120 ({$} 80 100) |- ( sgn ` 0 ) = if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) )

= 200 {$} |- 0 = 0
= 300 iftrue |- ( {$} -> if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) = {$} )
[$]

Typing subexpressions is not as nice here; probably an "insert expression at point" feature will be needed. Anyway, the two dollars in the expression report "0 = 0" and "0".

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 80 {$} |- 0 e. RR*

= 100 sgnval |- ( 0 e. RR* -> {$} )

= 120 ({$} 80 100) |- ( sgn ` 0 ) = if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) )

= 200 {$} |- 0 = 0
= 300 iftrue |- ( {$} -> if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) = {$} )
= 320 ({$} 200 300) |- if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) = 0
{$}

Again, typing expressions is not as nice here - you can't actually copy this from the previous line, because "if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) = $" produces a *new* metavariable at the location of the dollar. Naming the holes is a possibility if this becomes too complicated, say:

$p sgn0ALT |- ( sgn ` 0 ) = 0

= 80 {$} |- 0 e. RR*

= 100 sgnval |- ( 0 e. RR* -> {$} )

= 120 ({$} 80 100) |- ( sgn ` 0 ) = if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) )

= 200 {$} |- 0 = 0
= 300 iftrue |- ( {$} -> if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) = {$1} )
= 320 ({$} 200 300) |- if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) = {$1}
{$}

where the $1 is used to indicate that it is the same variable in both places, so that they both unify with 0.

At this point we are actually done; the final dollar is now reporting "eqtri 120 320". It didn't do this automatically in the video because you had a question mark in the prompt (which suppresses autocomplete). TBH I'm not sure how it got there in your video; it is not there in the first proof, but I think you changed your mmj2 settings between the two parts.

Mario

Mario Carneiro

unread,

Jul 14, 2016, 2:29:53 PM7/14/16

to metamath

By the way, here is how I would prove those two theorems, using a more reverse proving style, when you know the theorem names.

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )
[$]

> incomplete: ( A e. RR* ->

( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt[$]

> 1. incomplete: |- ( &S1 = A -> &C1 = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

> 2. incomplete: |- sgn = ( &S1 e. RR* |-> &C1 )
> 3. incomplete: |- if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) e. _V

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )
fvmpt {$} df-sgn [$]

> incomplete: |- if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) e. _V

The first part unifies assuming that it can handle the proof with metavariables.

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt {$} df-sgn !rec[$]

> 1. incomplete: |- 0 e. _V

> 2. incomplete: |- 1 e. _V

$p sgnvalALT |- ( A e. RR* ->
( sgn ` A ) = if ( A = 0 , 0 , if ( A < 0 , -u 1 , 1 ) ) )

fvmpt {$} df-sgn !rec (elexi 0cn) (elexi 1cn)

and we're done. (In hindsight, I could even drop the elexi and just write ($ 0cn) ($ 1cn) to do the same job.)

Proof 2:

$p sgn0ALT |- ( sgn ` 0 ) = 0
[$]

> incomplete: |- ( sgn ` 0 ) = 0

$p sgn0ALT |- ( sgn ` 0 ) = 0

eqtri [$] [$]

> incomplete: |- ( sgn ` 0 ) = &C1

> incomplete: |- &C1 = 0

$p sgn0ALT |- ( sgn ` 0 ) = 0

eqtri ax-mp {$} sgnval [$]

> incomplete: |- if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) ) = 0

$p sgn0ALT |- ( sgn ` 0 ) = 0

eqtri ax-mp {$} sgnval ax-mp {$} iftrue

> incomplete: |- if ( 0 = 0 , 0 , if ( 0 < 0 , -u 1 , 1 ) ) = 0

and we're done.

Particularly in reverse proving style, this seems to need almost no additional typing - if you know what you are doing you can only type the important lines and the result is just as good as mmj2, without the need for constant clicking to readjust the cursor location.

Mario Carneiro

unread,

Jul 14, 2016, 2:50:26 PM7/14/16

to metamath

On Thu, Jul 14, 2016 at 12:42 PM, David A. Wheeler <dwhe...@dwheeler.com> wrote:

Indeed, I'd like the ability to do this going backwards: I know that expression X
should be preceded by expression Y - can it list what rules would cause that?

Yes, mmj2 has this capability, in the "step selector search". Type something like

1:: |- Y

2:1: |- X

and double click on line 2, and it will come up with all theorems that use line 1 to get line 2. Or:

1:: |- Y

2:1,: |- X

now double clicking on line 2 will give theorems that use line 1 and possibly other statements to get line 2.

If the *only* way to input things is a proof script, I worry that you'll
lose the ability to start & move in all sorts of ways.
That flexibility is very helpful.

I'm not sure I understand what this means. It is a possibility to have script export utilities to produce fully annotated scripts (with all parentheses and ascriptions added). "Moving" is handled by the IDE, which presumably comes with a variety of text input commands and behavior.

Having the *ability* to support proof scripts is very useful. However, it
should support *much* more than just direct invocations of a given theorem.
All proof script systems I'm aware of have a large library of automated reasoning
functions, each of which apply some different approach to attempt to
find the proof. Proof scripts are especially useful if what you're proving changes
(this is important for proving programs).

All "advanced functionality" will be via !macros, which have the basic syntax of expressions but otherwise are arbitrarily complicated: They can take string arguments, produce non-proof objects, evaluate arbitrary code, and so on.

Perhaps more importantly, you may want to focus on creating a nice library
that's easy to invoke to do various jobs. Then you can build all sorts of things
on top.

Okay, then let's talk about that. In order to be a service for general proof automation, we need an API / interchange format for incomplete proofs. The obvious candidate is .mmp format, but does that mean we should just be another mmj2? Thierry Arnoux has already shown that mmj2 can be used as a back end for an IDE, so I think that particular role is already filled.

In the longer term I think that you want a neural network - I think
a neural network could be trained to do a *lot* of automation.

We are still quite far behind "the other guys" when it comes to automation, and they have more users and funding, as well as a great practical need for automation of large projects. So I think it is fairly safe to track in their general direction, which means implementing ATP support in the long term. It is much more effective to hook in to a dedicated program that focuses on automated theorem proving rather than reinvent the wheel, if we are getting serious about it.

Mario

David A. Wheeler

unread,

Jul 14, 2016, 7:31:29 PM7/14/16

to metamath, metamath

On Thu, 14 Jul 2016 09:44:54 -0400, Mario Carneiro <di....@gmail.com> wrote:

> Hi All,
>
> I'd like to open up discussion on the direction of the proof assistant
> SMM3, a.k.a smetamath-rs (https://github.com/sorear/smetamath-rs). It is
> nearly feature complete in other areas (and it has already delivered
> unprecedented verification times for set.mm), but the proof assistant is
> still a big question mark.

> The inspiration this time is Haskell. Treat a proof as a datatype, and
> theorems as functions on this datatype. This idea is nothing new; in fact
> it is the Curry-Howard isomorphism (
> https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence).

As a computer theorem-proving approach this way predates Haskell.
It's essentially the same as LCF (1972),
from which HOL and family are descended. For history buffs:
http://www.cl.cam.ac.uk/~mjcg/papers/HolHistory.pdf
http://i.stanford.edu/pub/cstr/reports/cs/tr/72/288/CS-TR-72-288.pdf (Milner's original paper).

A key is that there's a small "kernel" that manages the data structures, and
only trusted operations can manipulate it. Then you can build many tactics on
top but they can't corrupt it; they may not prove what you want, but they'll
never generate something false.

--- David A. Wheeler

David A. Wheeler

unread,

Jul 15, 2016, 12:11:32 AM7/15/16

to metamath, metamath

On Thu, 14 Jul 2016 09:44:54 -0400, Mario Carneiro <di....@gmail.com> wrote:

> Hi All,
>
> I'd like to open up discussion on the direction of the proof assistant

> SMM3, a.k.a smetamath-rs (https://github.com/sorear/smetamath-rs)...

> $p id1 |- ( ( ph /\ ps /\ ch ) -> th )
> $e .1 |- ( ph -> ( ps -> ( ch -> th ) ) )
>
> sylbi df-3an imp31 .1
>
> The first two lines, supplied by the assistant, define the theorem being
> proven and establish local names for the hypotheses. In particular,
> "name.n" style hypotheses can be referred to simply as ".n" if this is
> unambiguous. The rest is the proof itself, as might be written by the user.
> The above is an extremely terse style, and there are more long winded ways
> to write it:
>
> (sylbi (df-3an) (imp31 .1))

I'm not sure that this is an improvement, but I'm still trying to grok the
proposal. So let's see where this takes us...

> Note that parentheses are evaluated lisp-style, that is, they surround the
> entire expression, including the function being applied. So this would not
> be legal:
>
> sylbi (df-3an imp31 .1)

If you want mostly-Lisp syntax, you could also support neoteric expressions, where
"term(...)" without a space before "(" is just an abbreviation for "(term ...)".
This enables the more conventional-looking things like
ax-mp(Whatever1 Whatever2)

> because df-3an does not take an argument. With type ascriptions:
>
> sylbi |- ( ( ph /\ ps /\ ch ) -> th )
> df-3an |- ( ( ph /\ ps /\ ch ) <-> ( ( ph /\ ps ) /\ ch ) )
> (imp31 |- ( ( $ /\ ch ) -> th )
> .1 |- ( ph -> ( ps -> $ ) ) )

That nesting looks complicated.

> sylbi df-3an $
>
> is a valid but incomplete proof. A dollar suffix can be used after a
> theorem to mark all of its 0 or more hypotheses as incomplete.

It seems to me that there's no need to include "$" at the end.
If you ask for something, then its preconditions should just show up in the
set of incomplete items. That way, instead of endless nesting, you can
nest if you wish, but otherwise stop after some point and jump around.

I find it important to jump around; a command to say "jump to this hypothesis"
would seem useful to me.

> For extra automation, there are two mechanisms. First, if the program can
> guess the proof at a hole, it will do so silently, marking the dollar with
> an info underline as in the case of holes in the expression. Second, for
> more directed hints we have named macros (TBD) which are marked with a
> leading !. For example,
>
> $p 1kp2ke3k
> !arith
>
> might run the arithmetic prover and either complete the proof, or else the
> macro would be marked with an error and would report the problem, and the
> desired goal.

Sometimes you want to speculatively run something and see the results;
other times you want to keep the results "only if it resolved things".

--- David A. Wheeler

Mario Carneiro

unread,

Jul 15, 2016, 2:05:30 AM7/15/16

to metamath

On Fri, Jul 15, 2016 at 12:11 AM, David A. Wheeler <dwhe...@dwheeler.com> wrote:

If you want mostly-Lisp syntax, you could also support neoteric expressions, where
"term(...)" without a space before "(" is just an abbreviation for "(term ...)".
This enables the more conventional-looking things like
ax-mp(Whatever1 Whatever2)

This is also a possible option. I think it might also be good to use brackets instead of parens, because metamath uses parens a lot and it gets hard to read if you mix more parens in.

> because df-3an does not take an argument. With type ascriptions:
>
> sylbi |- ( ( ph /\ ps /\ ch ) -> th )
> df-3an |- ( ( ph /\ ps /\ ch ) <-> ( ( ph /\ ps ) /\ ch ) )
> (imp31 |- ( ( $ /\ ch ) -> th )
> .1 |- ( ph -> ( ps -> $ ) ) )

That nesting looks complicated.

I think my preferred style for "fully explicit" is type ascriptions and indentation, but no parens. The parens around (imp31 .1) there are optional, but I threw them in to show what is possible.

> sylbi df-3an $
>
> is a valid but incomplete proof. A dollar suffix can be used after a
> theorem to mark all of its 0 or more hypotheses as incomplete.

It seems to me that there's no need to include "$" at the end.
If you ask for something, then its preconditions should just show up in the
set of incomplete items. That way, instead of endless nesting, you can
nest if you wish, but otherwise stop after some point and jump around.

This syntax relies heavily on being able to tell where an expression ends, based on the terms of the expression itself. That is, sylbi is known to have two hypotheses, so "sylbi" is a function of type (proof, proof) -> proof, (sylbi $) is not well formed, (sylbi $ $) is well-formed of type proof, and (sylbi $ $ $) is not well-formed. Without this, it would be impossible to parse parenthesis-free expressions like "sylbi df-3an imp31 .1". So it follows that the trailing $ is required in order to make a complete expression, because this could well be a term in a larger expression and we don't want to lose track of what hypotheses go where. If you type

sylbi df-3an[ ]

it will highlight the space after the last hypothesis like so to indicate that sylbi doesn't have enough arguments. If you instead write

sylbi[$] [df-3an]

this is an error at the df-3an because this is actually two expressions; the dollar suffix causes sylbi$ to be parsed as a complete expression with no hypotheses (well, stubbed out hyps), and then the df-3an is one statement too many. The parentheses can be used to force (or assert) a certain grouping of hypotheses to go with a theorem. (A saved proof script will be much more robust w.r.t hypothesis additions/removals in upstream theorems if all optional parentheses are inserted; otherwise it will be very hard to recover the original grouping. This is what makes saved proofs so brittle.)

In order to indicate some but not all theorems that are given to a theorem, you could combine the dollar suffix with an explicit parameter list, for example (sylbi$ df-3an) which attempts to pass only df-3an to sylbi, but indicates that there are others, so it could mean either (sylbi df-3an $) or (sylbi $ df-3an), and the program will guess which ever one makes sense (preferring the first option if it doesn't have enough info). This is analogous to mmj2's "1::df-3an 2:1,?:sylbi" behavior.

I find it important to jump around; a command to say "jump to this hypothesis"
would seem useful to me.

I wonder if bracket-matching code in the IDE can be leveraged so that if you have your cursor on sylbi, df-3an and imp31 get highlighted. Actual bracket matching also helps with this, if the expression is parenthesized.

Since this presents a lot less information in front of your face by comparison to mmj2's stream-of-expressions style, and newlines are optional, I expect that you will be able to fit much more on screen during a typical proof, meaning that long-distance jumps become not as visually distant.

> For extra automation, there are two mechanisms. First, if the program can
> guess the proof at a hole, it will do so silently, marking the dollar with
> an info underline as in the case of holes in the expression. Second, for
> more directed hints we have named macros (TBD) which are marked with a
> leading !. For example,
>
> $p 1kp2ke3k
> !arith
>
> might run the arithmetic prover and either complete the proof, or else the
> macro would be marked with an error and would report the problem, and the
> desired goal.

Sometimes you want to speculatively run something and see the results;
other times you want to keep the results "only if it resolved things".

Although I call it a script, the behavior will be as if the whole script was run from the beginning at each read, so there are no side effects if you delete everything and start over. (In actuality it will cache the last run and structurally compare the new script to the old version, so that it doesn't have to recalculate everything each time.) This is also how smm3's incremental parsing of .mm files works. So "undoing" a macro or theorem application is as trivial as textual undo.

This is, I think, a big win for the current design over mmj2, since mmj2 does not easily allow you to partially undo an automatic proof without a lot of cleanup typing. With this, you can speculatively run a macro by entering it in the place of a hole, and it reports back in the error/info message (1) if it resolved the goal, (2) what steps were used, and (3) what theorems remain to be proven. Since this information doesn't appear in the text itself the only thing necessary to undo is the one word you typed. To partially apply the macro, there will be a key command to expand the macro's partial proof, and then you can stub out any bad parts of the proof.

Also, since it does not evaluate expressions in any particular order it is more like a constraint solver than a declarative program anyway, so perhaps "script" is a misnomer.

Finally, I should mention that, if it is feasible, this will not be run via a key command like ctrl-U, but is instead continually run as you type, like the java compiler messages in Eclipse. This should yield a much quicker interaction loop between author and proof assistant, with relevant information being conveyed only where the author is focused, at the hole with the error (which is often the last thing typed).

Mario

David A. Wheeler

unread,

Jul 15, 2016, 10:28:07 AM7/15/16

to meta...@googlegroups.com, Mario Carneiro

>I think my preferred style for "fully explicit" is type ascriptions and
>indentation, but no parens. The parens around (imp31 .1) there are
>optional, but I threw them in to show what is possible.

>>
>

>This syntax relies heavily on being able to tell where an expression
>ends,
>based on the terms of the expression itself. That is, sylbi is known to
>have two hypotheses, so "sylbi" is a function of type (proof, proof) ->
>proof, (sylbi $) is not well formed, (sylbi $ $) is well-formed of type
>proof, and (sylbi $ $ $) is not well-formed. Without this, it would be
>impossible to parse parenthesis-free expressions like "sylbi df-3an
>imp31

I think parsing to find endings based on previous definitions would be a mistake. It would make highlighting etc. in other tools very complicated. It would also make parsing very fragile.

You might consider indentation sensitive parsing, like Python. Or a syntax marker at means "end of expression" For example, [ ... ]. Or both.

You might take a look at my suite of "readable Lisp" notations.

>
>I wonder if bracket-matching code in the IDE can be leveraged so that
>if
>you have your cursor on sylbi, df-3an and imp31 get highlighted. Actual

>bracket matching also helps with this, if the expression is parenthsized.

It should.

--- David A.Wheeler

David A. Wheeler

unread,

Jul 15, 2016, 10:38:25 AM7/15/16

to meta...@googlegroups.com, Mario Carneiro

>Although I call it a script, the behavior will be as if the whole
>script
>was run from the beginning at each read, so there are no side effects
>if
>you delete everything and start over.

This is a nontrivial difference from many other tools. That doesn't make it wrong, but it is different.

> (In actuality it will cache the
>last
>run and structurally compare the new script to the old version, so that
>it
>doesn't have to recalculate everything each time.) This is also how
>smm3's
>incremental parsing of .mm files works.

There should be a way to expose cached calcukations and storing them as "hints". That way, you can shit down and restart later. No big deal if it takes a fraction of a second. It is a big deal if recalculating some macros takes 8 to 10 hours.

> So "undoing" a macro or theorem
>application is as trivial as textual undo.
>
>This is, I think, a big win for the current design over mmj2, since
>mmj2
>does not easily allow you to partially undo an automatic proof without
>a
>lot of cleanup typing.

This could be trially suppoted by control-Z.

>Also, since it does not evaluate expressions in any particular order it
>is
>more like a constraint solver than a declarative program anyway, so
>perhaps
>"script" is a misnomer.

I would expect to be first to last, and support many commands.

--- David A.Wheeler

Mario Carneiro

unread,

Jul 17, 2016, 10:53:19 AM7/17/16

to David A. Wheeler, metamath

On Sun, Jul 17, 2016 at 10:27 AM, David A. Wheeler <dwhe...@dwheeler.com> wrote:

On Sat, 16 Jul 2016 22:05:47 -0400, Mario Carneiro <di....@gmail.com> wrote:
> However, I don't want it to have the problems that .mm file format has with
> needing to be linearly parsed in order to understand the proof. One part of
> that is macro handling: earlier I said that !rec would act like a
> hypothetical theorem with two hypotheses, but since you don't know how many
> results will remain unproven this means that parsing requires running the
> macro, which I don't want. So instead, macros are always assumed to have
> zero arguments, except if you write !macro$ (which allows any number of
> arguments but puts holes in each of them) or (!macro arg1 arg2),
> parentheses required (which enumerates and fills the arguments as given).

I worry that *requiring* that every hypothesis be included as a parameter
may require so much nesting that it's hard to understand.

I think it'd be useful to prove *some* things, and return with an
environment (with the default "current" expression at the last incomplete part).
You could then start from there.

E.G., you could do this:
ax-mp (prove minor premise) (prove major premise)

*or* this:
ax-mp $ $
(prove major premise)
(prove minor premise)

You can actually do something like that using local variables:

= maj (prove major premise)

= min (prove minor premise)

ax-mp min maj

or, if min/maj have enough information to reconstruct the type:

= maj (prove major premise)

= min (prove minor premise)

ax-mp $ $

That probably wouldn't work without more smarts than we have now, because ax-mp $ $ introduces a dummy variable and there isn't anything tying it to the definitions of min/maj.

The general practice of requiring hypotheses to be inline, causing the whole proof to have a deeply nested tree structure, is deliberate. This matches the tree structure of the proof itself. Hopefully making parentheses optional will make it not get too "lispy".

Another possibility is to have a macro that allows you to create your own tree substitutions. For example:

$p nncn

= my3syl !def syl $ syl $ $

my3syl nnnn0 nn0re recn

The !def macro takes a single expression, and counts up all unprovable holes (in this case, all three). The result is a function which has that many arguments, and so can be used like a locally defined utility function. This differs from being able to reuse a step because you can reuse the theorem with different inner substitutions. (For example, a lengthy substitution used four times in an induction proof.)

This also has the problem that the arity of my3syl will not be known until it is parsed, so I think the rule about nonlinear parsing should be relaxed to just parsing between let bindings. That is, the line "= my3syl !def syl $ syl $ $" can be parsed without knowing any arities except the theorems, and after that line my3syl is parsed/evaluated, so that "my3syl nnnn0 nn0re recn" can be understood (since my3syl is now known to take 3 args).

Mario

David A. Wheeler

unread,

Jul 17, 2016, 3:31:48 PM7/17/16

to Mario Carneiro, metamath

>= maj (prove major premise)
>= min (prove minor premise)
>ax-mp min maj

Ok, I think that addresses my concern about too much nesting.

>Hopefully making
>parentheses optional will make it not get too "lispy".

I think I would get confused pretty quickly without some hint about what goes with what. I would personally use (ax-mp min maj) even when it's optional.

Would you mind also supporting ax-mp(min maj), that is, if a name is a merely followed by a parenthesis without any whitespace, treated as the first brand or within the parentheses? I realize these are not really function calls, but the syntactic similarity would make it easier to read in my opinion.

>Another possibility is to have a macro that allows you to create your
>own
>tree substitutions.

Sure. In the longer run, I would want the ability to define additional automatic pattern matches, to simplify common cases. That way, you could separate minor steps from the important ones. If particular patterns become common, they could be slowly moved into automated ones that everyone uses.

--- David A.Wheeler

Reply all

Reply to author

Forward