MMJ question


peter.h...@outlook.com

Aug 21, 2022, 12:32:57 PM
to meta...@googlegroups.com

I'm trying to decide whether it would be quicker to port MMJ to Smalltalk or to write new code.

If the MMJ code is elegant and well designed, I will spend time reading it.

If MMJ is badly designed and the code is messy, then I guess it will be quicker just to write new code without looking at it?

What do you think?

Antony Bartlett

Aug 21, 2022, 5:34:23 PM
to meta...@googlegroups.com

> I'm trying to decide whether it would be quicker to port MMJ to Smalltalk or to write new code.

> If the MMJ code is elegant and well designed, I will spend time reading it.

> If MMJ is badly designed and the code is messy, then I guess it will be quicker just to write new code without looking at it?


What's the broader context here, please? What are you trying to achieve? It's not always a good idea to try to answer a question like this without first attempting to understand what might lie behind it.

My only relevant experience here is that I recently succeeded in porting checkmm from C++ to TypeScript. That was just a verifier, and still not something to undertake lightly. I've neither used nor programmed MMJ, but it's a multi-faceted thing like metamath.exe, isn't it: a proof assistant with a suite of tooling surrounding it? If I were to contemplate porting it, I would be looking for a good starting point: some much smaller piece of useful functionality I could extract out of MMJ on its own.

My intuition (I've never broached this notion before, and I look forward to seeing if it falls completely flat on its face here) is that as things stand at the moment there are three pillars to working with mm files: verification, unification, and html generation. And I was planning to start asking questions about the unification algorithm sometime in the next few months, as it's the only part I'm really hazy on. (You can un-unify a regular proof by removing all the tokens whose expressions don't start with |- ('thereby'), but I don't know how you put them back in. It's important because having to specify all the symbols used in a proof would be really pedantic.) That's the piece of MMJ I'd be after first if it were me.

I was lucky in that I consider checkmm.cpp to be very good code, and I agree with you that it probably wouldn't have been worth bothering otherwise. That there's room for improving the quality of some of MMJ's code has been discussed here, but to be honest I'd expect that of any larger codebase, and I don't know what the overall picture is.

    Best regards,

        Antony


Peter H. Meadows

Aug 22, 2022, 2:53:16 AM
to meta...@googlegroups.com
> That's the piece of MMJ I'd be after first if it were me.

OK. Thanks for this clue! :)

> (you can un-unify a regular proof by removing all the tokens whose expressions don't start with |-

??? un-unify??

> but I don't know how you put them back in.

> It's important because having to specify all the symbols used in a proof would be really pedantic).

?


In a Metamath verifier, unification is trivial because mandatory hypotheses must be pushed onto the proof stack in the order in which they appear. In addition, each variable must have its type specified with a $f hypothesis before it is used, and each $f hypothesis has the restricted syntax of a typecode (a constant) followed by a variable. The typecode in the $f hypothesis must match the first symbol of the corresponding RPN stack entry (which will also be a constant), so the only possible match for the variable in the $f hypothesis is the sequence of symbols in the stack entry after the initial constant.
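
As a minimal sketch of that "trivial" case (the names here are my own illustration, not taken from any particular verifier), matching one $f hypothesis against its stack entry amounts to:

// Each $f hypothesis is a typecode (a constant) followed by a single
// variable, e.g. ['class', 'x']. The only possible substitution for the
// variable is everything after the typecode in the stack entry.
function matchFloating(
    typecode: string,
    variable: string,
    stackEntry: string[],
): { variable: string; substitution: string[] } | null {
    if (stackEntry[0] !== typecode) return null; // typecodes must agree
    return { variable, substitution: stackEntry.slice(1) };
}

// e.g. matchFloating('class', 'x', ['class', '(', 'A', '+', 'B', ')'])
// gives { variable: 'x', substitution: ['(', 'A', '+', 'B', ')'] }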


Discovering new proofs, on the other hand, is non-trivial because... Gödel... etc. P != NP...?

The general algorithm for unification described in the literature is somewhat complex. In the Proof Assistant, a more general unification algorithm is used.

Does MMJ use the same algorithm? How does MMJ do it?

What is the best description of the algorithm?

GtMM currently has some tricks/features, e.g. you can work forwards and backwards.

Forwards = starting with the essential hypotheses and moving from left to right.

Backwards = starting with the chain we want to make and moving from right to left: suppose we guess at the last machine we'll use; then GtMM can show us what chain(s) that machine would need for its inputs.

Chains are strings of blobs. Blobs are Metamath variables. Triangular blobs are variables that can be 'mapped' to other chains; square blobs cannot be mapped.

Unification = finding/doing that mapping?

|- = provable / starting chain / essential hypotheses

GtMM just ignores floating hypotheses?

The chain we need to make to complete the level is always shown top-center in GtMM.

Essential hypotheses = the chains given to start with; they are used as input for the machines. They are shown in the top left of GtMM.

The machines you can use (= things we've already proved) are shown on the right of the screen.

Click on a machine to add it to the workspace pane, which is always in the center of the GtMM screen.

4.3.1 The Concept of Unification

During the course of verifying a proof, when Metamath encounters an assertion label, it associates the mandatory hypotheses of the assertion with the top entries of the RPN stack. Metamath then determines what substitutions it must make to the variables in the assertion's mandatory hypotheses in order for these hypotheses to become identical to their corresponding stack entries. This process is called unification. (We also informally use the term “unification” to refer to a set of substitutions that results from the process, as in “two unifications are possible.”) After the substitutions are made, the hypotheses are said to be unified. If no such substitutions are possible, Metamath will consider the proof incorrect and notify you with an error message.

While a proof is being developed, sometimes not enough information is available to determine a unique unification. In this case Metamath will ask you to pick the correct one.

How is Metamath Zero different?

Antony Bartlett

Aug 22, 2022, 6:50:07 AM
to meta...@googlegroups.com
Sorry for causing confusion. Oh dear, I did say I was pretty hazy on the concept of unification, didn't I.

> in a metamath verifier unification is trivial

Yes. Because a proof contains both "essential hypotheses" (logical assertions) and "non-essential hypotheses" (syntax construction), there is really very little unification work left for a verifier to do. That means the proof assistant has to do it. The Metamath book uses the proof in demo0.mm as an example:

$( Prove a theorem $)
th1 $p |- t = t $=
$( Here is its proof: $)
tt tze tpl tt weq tt tt weq tt a2 tt tze tpl
tt weq tt tze tpl tt weq tt tt weq wim tt a2
tt tze tpl tt tt a1 mp mp
$.


It then guides the reader through proving it in metamath.exe with commands (Section 2.4, Your First Proof): assign 1 mp, assign 4 mp, assign 3 a2, etc.

So you have to specify all the essential hypotheses (a1, a2, mp), but the proof assistant usually manages to infer all the syntax construction stuff (tt, tze, tpl, weq, etc.). I currently have no clue where these come from or how it manages to do that, but this is the unification algorithm I was referring to.

> Does MMJ use the same algorithm?
> How does MMJ do it?

Yes, you're asking all the right questions, I'd like to know too :-)

> My intuition (I've never broached this notion before, and I look forward to seeing if it falls completely flat on its face here) is that as things stand at the moment there are three pillars to working with mm files: verification, unification, and html generation.

Now I've said it out loud (so to speak) and had a chance to think about it, this isn't profound at all, is it. All I'm really saying is that as a metamath user I'd like to be able to:
1. Write my proof and have most of the syntax construction bits automatically filled in (unification)
2. Verify that my proof is correct (verification)
3. Have my proof displayed to myself and others in a nice format (html generation)
So it's obvious why those are the most fundamental parts of the system.  Oh well.

    Best regards,

        Antony

Glauco

Aug 22, 2022, 2:36:36 PM
to Metamath
In my opinion, mmj2 is readable code.

If I'm not mistaken, some code is spent on computing maximum array sizes. Since some of the code is old, it may be that Java back then didn't have the variable-length ADTs it has today.

W.r.t. the unification algorithm, I'm pretty sure there's a comment where Mel writes that he's using Robinson's mgu finder.

BR
Glauco


peter.h...@outlook.com

Aug 22, 2022, 7:20:16 PM
to meta...@googlegroups.com

> some code is spent on computing maximum array sizes.

> it may be that Java back then didn't have the variable-length ADTs

> Robinson's mgu finder.

If U is the most general unifier of a set of expressions, then any other unifier V can be expressed as V = UW, where W is another substitution.
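
(A concrete instance, my own toy example rather than one from any reference: unifying f(x, y) with f(z, z) has mgu U = {x := z, y := z}; the less general unifier V = {x := 3, y := 3, z := 3} then factors as V = UW with W = {z := 3}.)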

 

What's the best thing to read to grok this algorithm?

MMJ? But I don't know Java, so is there a description of the algorithm in English?

thx

Mario Carneiro

Aug 22, 2022, 7:53:37 PM
to metamath
The principles of first order unification are not so bad. Here's a sketch of the algorithm.

Let's say you want to apply a theorem like negeqd: ( ph -> A = B ) |- ( ph -> -u A = -u B ) where the current goal is |- ( X < Y -> ?a = -u 3 ). Here ?a is a metavariable; mmj2 calls these "work variables" and writes them like &C1. They represent terms that have not yet been determined. They should be distinguished from X and Y, which are regular variables which are being held fixed for the purpose of the proof - they may as well be constants like 3. (Some sources will even call them "local constants" to emphasize this perspective.)

First, we instantiate all the variables in the theorem with fresh metavariables, resulting in ( ?ph -> ?A = ?B ) |- ( ?ph -> -u ?A = -u ?B ) where ?ph, ?A and ?B are new metavariables of their respective types (?ph is 'wff', ?A and ?B are 'class'). Each metavariable corresponds to a hole in the proof that we must eventually fill before generating the final proof output. (Metamath has no direct concept of metavariables / work variables, although for historical reasons the variables that metamath normally uses are sometimes also called metavariables.)

Next, we have to solve the "unification problem" ( ?ph -> -u ?A = -u ?B ) =?= ( X < Y -> ?a = -u 3 ). That is, we have two expressions, each containing metavariables, and we must come up with an assignment to the metavariables such that these two expressions come out identical. The procedure for this is as follows:

1. If both sides are equal, do nothing and succeed. That is, e =?= e is trivially satisfied.
2. If both sides are applications of the same term constructor, unify all the arguments. That is, from f(e1, ..., en) =?= f(e1', ..., en') we get subproblems e1 =?= e1', ..., en =?= en'.
3. If both sides are applications of different term constructors, fail: there is no possible resolution. (This case is more complicated if you can unfold definitions, but luckily this isn't an issue in metamath unification.)
4. If one side is a metavariable and the other side is not, assign the metavariable. That is, ?m =?= e is resolved by setting ?m := e. (There is a caveat, see below.)
5. If both sides are metavariables, either one can be assigned to the other one.

When a metavariable is assigned, all occurrences of it should either be immediately replaced with the assignment, or else you have to keep track of a map of metavariable assignments and do all matching and equality testing modulo this assignment map.

In our example, it works as follows:
* ( ?ph -> -u ?A = -u ?B ) =?= ( X < Y -> ?a = -u 3 ) is an implication on both sides, so we use rule 2 and get two subgoals
  * ?ph =?= X < Y is solved by assigning ?ph := X < Y (rule 4)
  * -u ?A = -u ?B =?= ?a = -u 3 has an equality on both sides, so we use rule 2 and get two subgoals
    * -u ?A =?= ?a is solved by assigning ?a := -u ?A
    * -u ?B =?= -u 3 is an application of -u on both sides, so use rule 2 with one subgoal
      * ?B =?= 3 is solved by assigning ?B := 3

At the end, we have the assignment {?ph := X < Y, ?a := -u ?A, ?B := 3}, which gives us the substitution arguments to negeqd, and also results in the instantiated hypothesis subgoal ( X < Y -> ?A = 3 ), which is passed on to the user or the next step. Note that ?A was not solved by this process, and we will hopefully solve it later by more theorem applications. If after all theorems have been applied these variables are still sticking around, mmj2 will arbitrarily insert some variables to assign to the metavars, for example setting ?A := A (the literal variable A), to get rid of all the metavars and produce a valid metamath proof. Also note that ?a was solved even though it was not a new metavariable: this will require rewriting any occurrences of ?a anywhere else in the proof with the assignment -u ?A.

The caveat mentioned in rule 4 is that we must check that metavariable assignments are not cyclic. If we have the goal ?a =?= ?a then rule 1 applies, so we don't have to assign ?a, but if the goal is ?a =?= -u ?a, then it would seem that we should use rule 4 and assign ?a := -u ?a, but this would result in an infinite expression ?a := -u -u -u -u ... which is bad. To check this, we just have to make sure when assigning a metavariable ?a := e due to rule 4 that e does not contain ?a in it. If it does, we fail, because it is impossible to solve the equation with a finite substitution in this case.
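
For what it's worth, the rules above translate almost directly into code. Here is a minimal TypeScript sketch (the Term type and all names are illustrative, not mmj2's actual implementation):

type Term =
    | { kind: 'var'; name: string }                 // metavariable / work variable, e.g. '?A'
    | { kind: 'app'; ctor: string; args: Term[] };  // term constructor application

type Assignment = Map<string, Term>;

// Chase metavariable assignments so comparisons see fully-resolved terms.
function resolve(t: Term, a: Assignment): Term {
    while (t.kind === 'var' && a.has(t.name)) {
        t = a.get(t.name)!;
    }
    return t;
}

// The occurs check from the caveat to rule 4: e must not contain ?m.
function occurs(name: string, t: Term, a: Assignment): boolean {
    t = resolve(t, a);
    if (t.kind === 'var') return t.name === name;
    return t.args.some((arg) => occurs(name, arg, a));
}

// Extends the assignment map and returns true on success; returns false
// if no unifier exists. (A production version would undo any partial
// assignments made before a failure.)
function unify(t1: Term, t2: Term, a: Assignment): boolean {
    t1 = resolve(t1, a);
    t2 = resolve(t2, a);
    if (t1.kind === 'var' && t2.kind === 'var' && t1.name === t2.name) {
        return true;                                // rule 1: ?m =?= ?m
    }
    if (t1.kind === 'var') {                        // rules 4 and 5: assign ?m := e
        if (occurs(t1.name, t2, a)) return false;   // cyclic, e.g. ?a =?= -u ?a
        a.set(t1.name, t2);
        return true;
    }
    if (t2.kind === 'var') return unify(t2, t1, a); // symmetric case of rule 4
    if (t1.ctor !== t2.ctor || t1.args.length !== t2.args.length) {
        return false;                               // rule 3: different constructors
    }
    const args2 = t2.args;                          // rule 2: same constructor,
    return t1.args.every((arg, i) => unify(arg, args2[i], a)); // unify arguments pairwise
}

Running unify on the two sides of the negeqd example leaves exactly the assignment map {?ph := X < Y, ?a := -u ?A, ?B := 3} described above.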

There are a variety of more complex extensions to this basic idea where you might need to delay solving constraints or combine multiple unification constraints, but the approach described here suffices for metamath unification.

Mario Carneiro



Antony Bartlett

Aug 23, 2022, 8:33:50 AM
to meta...@googlegroups.com
Thanks to Glauco for getting this conversation back on track.

Thanks to Mario for the description of the principles of first order unification. When I come to grapple with unifiers I'll probably start by studying the code and stepping through it in a debugger. Or maybe I'll just try to port it to TypeScript, and sometime during that process discover I understand it well enough for practical purposes, like I did with the checkmm verifier (worst case, the person doing the porting fulfills the role of "Chinese Room", but that's never been my experience).

Perhaps we could look briefly at what implementations currently exist?

metamath-exe: https://github.com/metamath/metamath-exe/blob/master/src/mmunif.c
metamath-knife: I don't think this has a proof assistant yet; formula.rs comes up in searches, but it is part of the verifier?

Are there any more implementations anyone wants to offer, please?

I think it's quite possible that anything goes when it comes to writing a unification algorithm? Maybe it doesn't matter in the slightest how you come by a unification, because the very next thing you're going to do is run a validator to ensure the proof is correct anyway, and because a perfect algorithm does not currently exist, you're always going to have to be able to fall back on the option of manual unification. Heuristics, AI, quantum superposition of all potential unifications, whatever. I'll almost certainly stick to the standard algorithm anyway, but it might become tempting to massage away some difficulties (if any) encountered while attempting to re-unify theorems in set.mm.

Or maybe it does matter because it's important to know whether it's the proof or the unification that was wrong.

    Best regards,

        Antony
 

Mario Carneiro

Aug 23, 2022, 1:53:37 PM
to metamath
On Tue, Aug 23, 2022 at 8:33 AM Antony Bartlett <a...@akb.me.uk> wrote:
> Are there any more implementations anyone wants to offer, please?

Another implementation you might want to look at is
mm0-rs: https://github.com/digama0/mm0/blob/master/mm0-rs/src/elab/refine.rs#L692-L772

It's MM0, not metamath, but it basically represents what I would do for a metamath unification algorithm, with the slight extension that it supports unfolding definitions as well. You can just ignore the cases that call self.unfold (and the returned conversion proof) for mmj2-style unification.
 
> I think it's quite possible that anything goes when it comes to writing a unification algorithm? Maybe it doesn't matter in the slightest how you come by a unification, because the very next thing you're going to do is run a validator to ensure the proof is correct anyway, and because a perfect algorithm does not currently exist, you're always going to have to be able to fall back on the option of manual unification. Heuristics, AI, quantum superposition of all potential unifications, whatever. I'll almost certainly stick to the standard algorithm anyway, but it might become tempting to massage away some difficulties (if any) encountered while attempting to re-unify theorems in set.mm.

Kind of. It's true that the result of a unification algorithm is checked, but it is possible to get the wrong answer. For example, if you unify ?a =?= ?b with ?a := 1 and ?b := 2, that's obviously wrong and will cause issues in the proof assistant and confusion for the user; and if you resolve it with ?a := 1 and ?b := 1, that's also wrong because it's too specific and may cause the user to not be able to complete the proof. It's not true that there is no perfect algorithm though: for first order unification the "most general unifier" (mgu) is the unique best answer, and it is decidable by exactly the algorithm I described. You will see that all extant verifiers are implementing something like that algorithm.

There is a caveat regarding the metamath-exe algorithm though, which is that it doesn't actually have a complete and correct math parser so the classical algorithm doesn't exactly apply (since it goes by recursion on the structure of the term), and moreover it will sometimes give syntactically incorrect outputs and/or require user input to determine how to parse the expression. This is a metamath peculiarity which only exists for non-grammatical databases, but most other proof assistants do some kind of parsing to handle this step correctly.
 
Mario

Glauco

Aug 23, 2022, 2:12:45 PM
to Metamath
Hi Antony,

in the tool I'm working on, I definitely went for the Martelli-Montanari mgu algorithm (you will find a lot of clear descriptions of the algorithm, and even a nice YouTube video with a "concrete" example).

Please be aware that the unifier associates 'parse trees' (not formula tokens) with working vars (the &Wnn and &Cmm stuff).

The hard part is not implementing the (nice) algorithm; you first need to parse your statements and build parse trees (the algorithm works with 'terms', but 'parse trees' play the same role, thanks to set.mm's non-ambiguity).

And before building parse trees, you have to build the grammar :-)
And this is done "on the fly" (it depends on the specific theory, i.e. the .mm file).
And, of course, you need a parser to feed with the (generated) grammar and each statement (I used nearley.js, thus I did not need to write a new one).

And once you've got parse trees, you'll need to build "substitutions" to feed the mgu algorithm with the 'starting pairs' of parse trees to unify.
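
Roughly, the shape of the data involved looks like this (a sketch with made-up names, not the actual code from my tool):

// A parse-tree node: interior nodes are labeled by syntax axioms (e.g.
// 'wi' for ( ph -> ps )); leaves are variables or working vars.
interface ParseNode {
    label: string;
    children: ParseNode[];
}

// Before unifying, the variables of the assertion being applied are
// renamed to fresh working vars (&W1, &W2, ...; a real tool would pick
// &W/&C/... by type) so the mgu algorithm is free to assign them.
function withFreshWorkingVars(
    node: ParseNode,
    isVariable: (label: string) => boolean,
    renaming: Map<string, string> = new Map(),
    counter = { n: 1 },
): ParseNode {
    if (isVariable(node.label)) {
        if (!renaming.has(node.label)) {
            renaming.set(node.label, '&W' + counter.n++);
        }
        return { label: renaming.get(node.label)!, children: [] };
    }
    return {
        label: node.label,
        children: node.children.map((child) =>
            withFreshWorkingVars(child, isVariable, renaming, counter)),
    };
}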

Just for fun, this is a Jest test of what you can do with Diagnostic(s) (in a language server) for a failing unification (mp2Parser is not defined here: it is an object of the class MmParser that parses the .mm format and has been fed with a super-small subset of set.mm):

test('Expect Working Var unification error', () => {
    const mmpSource =
        'd1:: |- &W2\n' +
        'd2:: |- &W2\n' +
        'qed:d1,d2:ax-mp |- ph';
    const mmpParser: MmpParser = new MmpParser(mmpSource, mp2Parser.labelToStatementMap, mp2Parser.outermostBlock, mp2Parser.grammar, new WorkingVars());
    mmpParser.parse();
    expect(doesDiagnosticsContain(mmpParser.diagnostics, MmpParserErrorCode.workingVarUnificationError)).toBeTruthy();
    expect(mmpParser.diagnostics.length).toBe(4);
    mmpParser.diagnostics.forEach((diagnostic: Diagnostic) => {
        if (diagnostic.code == MmpParserErrorCode.workingVarUnificationError) {
            const errMsg = 'Working Var unification error: the  working var &W2 should be ' +
                'replaced with the following subformula, containing itself ( &W2 -> ph )';
            expect(diagnostic.message).toEqual(errMsg);
            expect(diagnostic.range.start.line == 0 || diagnostic.range.start.line == 1).toBeTruthy();
            expect(diagnostic.range.start.character).toBe(8);
            expect(diagnostic.range.end.character).toBe(11);
        }
    });
});

HTH
Glauco


Antony Bartlett

Aug 23, 2022, 3:54:17 PM
to meta...@googlegroups.com
Hi Glauco,

Please learn how to work with version control, so I can help, or at least cheer from the sidelines if you don't want help. There are nice YouTube videos for that too, and I'm sure it's much easier to understand than the Martelli-Montanari mgu algorithm. The basic usage is probably easier than learning how to do unit testing with Jest.

I've just started writing a prettier plugin for mm files (https://github.com/Antony74/prettier-plugin-mm), and fully half the task is tied up and neatly tucked away if I can get my hands on a serviceable parse tree.  And surely there's no point in us both writing unifiers in TypeScript when we could both just work on yours?
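
To show where that parse tree would slot in, here is a rough sketch of the general prettier plugin shape (the MmNode type is a placeholder for whatever the real parse tree turns out to be, not anything from the actual repository):

import type { Parser, Printer } from 'prettier';

// Placeholder for the serviceable parse tree under discussion.
interface MmNode {
    type: string;
    children: MmNode[];
}

export const parsers: Record<string, Parser<MmNode>> = {
    mm: {
        astFormat: 'mm-ast',
        parse: (text: string): MmNode => {
            throw new Error('TODO: parse .mm text into a tree');
        },
        locStart: () => 0,
        locEnd: () => 0,
    },
};

export const printers: Record<string, Printer<MmNode>> = {
    'mm-ast': {
        // TODO: walk the tree and emit nicely formatted .mm text.
        print: () => '',
    },
};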

My other option is to focus on the third pillar - html generation - while you carry on writing your unifier.

Mario writes:
> It's not true that there is no perfect algorithm though: for first order unification the "most general unifier" (mgu) is the unique best answer, and it is decidable by exactly the algorithm I described.

Fantastic, I wasn't looking forward to having to implement a fallback on manual unification like metamath-exe does. So I'm never likely to see a unifier failure as long as I don't try to copy the metamath-exe algorithm or play with non-grammatical databases. Great :-)

    Best regards,

        Antony

