About Metamath Zero (mm0)


Olivier Binda

Feb 14, 2020, 4:16:39 AM2/14/20
to meta...@googlegroups.com
Hello

I have started working with mm0 and I'll be using this thread to talk about it.

My goals with this thread : 

By explaining what I have understood about mm0, I might
   - gain even better insights, especially if people interact and correct me. 
   - get some sound advice
   - make it easier for other people to understand what mm0 is and what it can do 

I'll be comparing mm0 with metamath (and somewhat evangelising mm0) : 
   - because I have worked a lot with metamath and I need to know how the same things can be done in mm0, and what the benefits and pain points of using mm0 over metamath are

I'll somewhat evangelise mm0 (I'm sold on it, so I have a bias), though I'll try to be fair to the awesomeness that is metamath (but remember that I am human; please correct me when I am wrong)
  - at some point in time, when mm0 is ready (stable, with much better tooling than metamath), I hope that people will jump ship, because the (human brains/contributions of the) metamath community is really, really precious and such a community is needed for mm0 as well (at the moment, there is Mario... me since a few days... and hopefully others)

You have been warned :)

Let's dive into the subject : 

mm0 and metamath can have the exact same internal representation of math expressions as trees.
The tree Node(wi) { Node(ph) Node(wi) { Node(ps) Node(th) } }, where Node(id) { 0+ child nodes here },
- represents the metamath string "( ph -> ( ps -> th ) )"
- represents the mm0 string "( ph -> ( ps -> th ) )", or the string "ph -> ps -> th", or the string "ph -> (ps -> th)", or the string "ph -> (ps -> th )"
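For concreteness, here is a minimal Kotlin sketch of such a tree representation (the names are hypothetical and not taken from any actual tooling) :

    // A tree node : an identifier plus zero or more children.
    data class Node(val id: String, val children: List<Node> = emptyList())

    // ( ph -> ( ps -> th ) )  ==  wi(ph, wi(ps, th))
    val implication = Node("wi", listOf(
        Node("ph"),
        Node("wi", listOf(Node("ps"), Node("th")))
    ))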

The parser for mm0 is much simpler than the one for metamath.
- I wrote a pure Kotlin ANTLR parser for metamath (I'll open source it) that is somewhat heavyweight : 
  - it takes a few seconds to build from a cold start (I guess a huge automaton has to be built to make it work)
  - it has a dependency on a pure Kotlin ANTLR runtime
  - it can be CPU intensive in some complex cases (if I remember correctly)
  - it works on the JVM (battle tested), in the browser/JavaScript (tested) and should work on native platforms like Linux, Windows, Mac, iOS, Android and WebAssembly (not tested yet)
  - it is reasonably fast (it is possible to build decent tools with it, see the mephistolus videos)
  - it is static and only works for a particular version of set.mm

- I wrote a pure Kotlin primary mm0 parser (I'll open source it)
  - it is really lightweight : instantaneous to build (it is just a string and a few variables)
  - it is performant (it just looks at the next token for branching : really simple, no headaches, minimal branching, minimal string copy operations)
  - lightest on CPU/memory
  - works on every platform, no dependency, easily portable (360 LOC for the parser, 60 LOC for the tree structure)
  - it parses the 19.2MB mm0 translation of set.mm (obtained through mm0-hs fromMM, from the mm0 GitHub site) in 154ms 
- I wrote a pure Kotlin secondary dynamic mm0 parser (I'll open source it)
  - same characteristics as the primary parser
  - Major feature : it is dynamic, you can enhance it at runtime with 
  • sorts (stuff you care about) like wffVar, classVar, setvar, but it could also be nat (natural numbers ?) or other things (I guess that "|-" is not needed because it is embedded in the mm0 language/keywords)
    strict provable sort wff;
    pure sort set;
    strict sort class;
  • operators/tree nodes (term), like the implication or the negation logical operators
    term wi (ph ps: wff): wff;
    term wn (ph: wff): wff;

  • definitions (def), like when you define the logical and from the logical implication and negation
    def wa (ph ps: wff): wff = $ ~(ph -> ~ps) $;
  • binary operator symbols (simple notations) : you can associate one (or many) symbols with a binary operator (a term or a definition) to allow the parser to work with it
    infixl wa: $/\$ prec 20;
    infixr wi: $->$ prec 25;
    prefix wn: $~$ prec 40;

  • constructs (general notations) like { x | ph }
    notation cab {x: set} (ph: wff x): class = (${$:0) x ($|$:0) ph ($}$:0);

  • delimiters (tokens for the parser are split around delimiters)
    delimiter $ ( ) ~ { } $;

Major features : 
  • you can tune the parser to recognise ->, =>, \rightarrow, \longrightarrow for the same id wi, so users from different parts of the world can have the notation they like
  • someone without years of mathematical training can write (ph -> (ps -> th )) and still be understood by the parser, without complaints about missing spaces, etc. 
This helps lower the barrier to entry for computer-checked maths (a sketch of the precedence-driven parsing loop follows below).
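As an illustration of why such a precedence-driven parser stays small, here is a hedged Kotlin sketch of the core loop : a classic precedence-climbing parser over an already tokenized input. The Node/Infix/ExprParser names are illustrative only, not the real parser's API, and "prec max" is simply modelled as a large integer.

    data class Node(val id: String, val children: List<Node> = emptyList())
    data class Infix(val termId: String, val prec: Int, val rightAssoc: Boolean)

    class ExprParser(private val tokens: List<String>, private val infixes: Map<String, Infix>) {
        private var pos = 0
        private fun peek(): String? = tokens.getOrNull(pos)
        private fun next(): String = tokens[pos++]

        // Parse an expression whose operators all bind at least as tightly as minPrec.
        fun parse(minPrec: Int = 0): Node {
            var left = parseAtom()
            while (true) {
                val tok = peek() ?: return left
                val op = infixes[tok] ?: return left
                if (op.prec < minPrec) return left
                next() // consume the operator token
                // Left-associative operators require the right operand to bind strictly tighter.
                val right = parse(if (op.rightAssoc) op.prec else op.prec + 1)
                left = Node(op.termId, listOf(left, right))
            }
        }

        private fun parseAtom(): Node = when (val t = next()) {
            "(" -> parse(0).also { require(next() == ")") { "expected ')'" } }
            else -> Node(t) // a variable or a nullary term
        }
    }

    fun main() {
        val ops = mapOf(
            "->" to Infix("wi", prec = 25, rightAssoc = true),
            "+" to Infix("add", prec = 20, rightAssoc = false)
        )
        println(ExprParser(listOf("ph", "->", "ps", "->", "th"), ops).parse()) // wi(ph, wi(ps, th))
        println(ExprParser(listOf("1", "+", "2", "+", "3"), ops).parse())      // add(add(1, 2), 3)
    }

Registering a new infixl/infixr declaration at runtime then just adds one entry to the operator map.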


mm0 was designed for expressiveness and efficiency, so it cleanly separates, in the software, things (variables, different structures) that are all bundled together in a metamath string.
For example, in the metamath string
|- ( ph -> ( ps -> th ) )
you have
  • the provability notion |- 
  • variables of wff type : ph, ps, th
  • the term constructor wi : wff ( ph -> ps )
  • with substitution type checking/correctness : wff ( ph -> ps ), wff ( ph -> ( ps -> th ) )
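To make that separation concrete, here is a hedged Kotlin sketch of how those pieces could be modelled as distinct declaration kinds (types and names are purely illustrative) :

    // Each piece that metamath packs into one string becomes its own kind of declaration.
    sealed class Decl
    data class Sort(val name: String, val provable: Boolean) : Decl()                   // e.g. "provable sort wff;"
    data class Term(val name: String, val args: List<String>, val ret: String) : Decl() // e.g. "term wi (ph ps: wff): wff;"

    // "|- ( ph -> ( ps -> th ) )" then only needs a provable sort and a term constructor;
    // the provable flag on wff plays the role that the "|-" constant plays in metamath.
    val decls: List<Decl> = listOf(
        Sort("wff", provable = true),
        Term("wi", args = listOf("wff", "wff"), ret = "wff")
    )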


Well, that's all for documenting today... 
In the future, I would like to ask (or write, once I have understood it) about 
  • definition unfolding in proofs (what is the point of that ? does it make things simpler, more efficient ? I would like to work with <-> and not with its unfolding in terms of -> and not)
  • the possibility of having variadic antecedents in a theorem (mephistolus uses Gamma |- ph theorems instead of |- ph theorems... in the form ( v0w -> v1w ) + ( v0w -> ( v1w -> v2w ) ) ---> ( v0w -> v2w ), where v0w stands for any number of antecedents)
  • dummy variables
  • distinct variables (does mm0 have them ? does it need them, or is the free/bound variable handling built into mm0 enough to get things done ?) 
  • open terms (what are the possibilities there compared to HOL metavariables / compared to what mm has ?) 
  • modularisation and/or namespaces/aliases to avoid name clashes (people will want to build different formalisations of maths; what about reuse ?)
  • imports aren't in the grammar, but it looks like a really nifty feature to have (some people don't want big monolithic 19+MB mm0 files). Why aren't they in the grammar ?
  • software  languages have partial functions, tuples, objects, variadic arguments. mm0 strives to be minimal (for formalism efficiency). Is mm0 missing features ? 
  • input and output directives. That's for formalizing software programs and different ring mm0 compilers, right ? 

Best regards,
Olivier

Olivier Binda

Feb 17, 2020, 2:14:29 PM2/17/20
to Metamath
In this public repository, https://github.com/Lakedaemon/mm0-kotlin-tooling
I'll publish soon :
  • the antlr parsers I wrote for metamath (battle tested) 
  • the handwritten parsers I'm writing for metamath zero
  • a proof-checker I'm writing for metamath zero
As it is all pure Kotlin code, I'll try to provide a multiplatform project that gives the mm0 proof checker
  • a jvm target
  • a js target (for node.js/the browser)
  • a native target (linux or android)

I'll also publish a summary of my ramblings and understanding of metamath and Metamath Zero, so that more people can understand the benefits of metamath and mm0.

I already published a translation of set.mm (obtained through mm0-hs, written by Mario) as 
  • a set.mm.mm0 spec
  • a set.mm.mmu companion ascii proof file
  • a set.mm.mmb companion binary proof file
so that other people don't have to struggle with Haskell Stack to build the mm0 tools and do the translation themselves. 
With those files, you can directly delve into and use mm0. They are especially useful if you have prior experience with set.mm.

About compression : 

mm and mm0 both use back references to shorten proofs, a kind of compression scheme that is especially useful because it saves a lot of time, memory and energy (a small sketch of the idea follows after the size figures).
I used zip, tar.xz and 7z on the resulting files to see what compression could be achieved by archivers; here are the results : 
size of 
   set.mm (2020-02-13):
    - uncompressed 37.3MB
    - zip 13MB
    - tar.xz 10.5MB
  + set.mm0:
    - uncompressed 9MB
    - zip 921KB
    - tar.xz 707KB
  + set.mmu:
    - uncompressed 448MB
    - zip 22.7MB
    - tar.xz 13.4MB
  + set.mmb:
    - uncompressed 31.2MB
    - zip 12.3MB
    - tar.xz 9.6MB 
    - 7z 9.6MB
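To illustrate the back-reference idea mentioned above, here is a hedged Kotlin sketch : identical subtrees (by structural equality) are assigned an index the first time they appear, and later occurrences could be emitted as a back reference to that index. This is only a simplification of the idea, not what the proof formats literally do.

    data class Node(val id: String, val children: List<Node> = emptyList())

    // Intern every distinct subtree once; repeated subtrees collapse to a single table entry.
    fun intern(n: Node, table: LinkedHashMap<Node, Int>): Int =
        table.getOrPut(n) {
            n.children.forEach { intern(it, table) } // children are interned first
            table.size                               // next free index becomes n's back reference
        }

    // intern(Node("wi", listOf(Node("ph"), Node("ph"))), LinkedHashMap())
    //   stores "ph" once (index 0) and "wi(ph, ph)" once (index 1).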


Best regards,
Olivier

Olivier Binda

Feb 21, 2020, 6:44:02 AM2/21/20
to Metamath
I have finally written 3 parsers :
  • primary parser for mm0 files
  • dynamic parser for mm0 formulas
  • primary parser for mmu files
 and an mm0-mmu proof-checker (thanks to Mario's explanations).

Though I believe it to be incomplete (it doesn't handle bound/unbound variables yet), it still proof-checks the mm0/mmu translation of set.mm in 15s.

Once I fix things and clean the code, I'll publish everything in this public repository : https://github.com/Lakedaemon/mm0-kotlin-tooling

Now, the translation of set.mm to the mm0 format has :    
  • 3 sorts (wff, setVar, class)
  • 1201 terms (basically the class/setVar/wff statements)
  • 1276 axioms (true axioms and all mm definitions)
  • 34946 theorems
  • 0 coercions (I was expecting a set > class coercion)
  • 0 simple definitions (by adding those, you can customize the dynamic parser to your needs)
  • 0 definitions (that is where the mm constructs like { x | ph } should go)
  • 0 inputs (I see how this is useful for formalizing software but not for formalizing maths)
  • 0 outputs (same)
  • 0 comments (the "modification discouraged", "deprecated", "should not use" tags would be handy to have though)
So basically, at the moment, it is possible to do mm-like work in mm0, but set.mm.mm0 lacks the coercions/definitions/simple definitions needed to unlock all the mm0 benefits.

Maybe I'll write a plugin for IntelliJ IDEA to handle mm0/mmu files (no promises there; maybe it is too hard/complex/long/not needed for me).
If I manage to write one and if it is nifty, I'll publish it too (in the same repository).
I know that using VS Code would let me do what I want, but I really do not want to use VS Code (I love IntelliJ IDEA though) and also it might not even be possible for me (I live in Ubuntu land).

Also, I would like to explore the possibility of having formalized theorems in mm0 of the kind
if Context |- a = 1 and Context |- b = 2 then Context |- ( a + b ) = 3
where Context has any number of antecedents (0+). This would enable vararg antecedent -> stuff.

I need those for mephistolus (it allows proofs that are 2600 steps long instead of 1000000 single mm steps without compression)
and I had to use (Mario-style) ph -> stuff and implement mephistolus theorems on top of metamath theorems to get that.

But maybe this is not necessary anymore in mm0, with a 
sort context 

I've got to play with mm0 (or get Mario's advice) to see if it is possible.

Best regards,
Olivier

Olivier Binda

Feb 22, 2020, 3:47:49 PM2/22/20
to meta...@googlegroups.com
Apparently, it is quite easy to write plugins for IntelliJ IDEA.

I just tried, and one day later I had written lexers and primary parsers for the mm0 and mmu files with JFlex and GrammarKit (thanks to Mario's answers and the experience gained by writing home-made parsers for mm0/mmu).

A few minutes ago, I accidentally pushed my prototype plugin to https://github.com/Lakedaemon/mm0-kotlin-tooling
I did not push the plugin to the marketplace yet though (it is missing lots of functionality; I'll do that later, when it is ready)

So, I'll probably be able to develop and distribute a fully featured plugin adding mm0 and mmu support to IntelliJ IDEA (which has a free community version), with the usual IntelliSense goodness added to it.
This should be a nice alternative to VS Code.

There is an issue that must be addressed though :

IntelliJ IDEA refuses to open set.mm.mmu outside of raw mode (no syntax highlighting/advanced features) because it is much too big (450MB).
Which is quite understandable. But that means that people will have to break mmu (and mm0) files into smaller parts to be able 
to work with them comfortably.

There is an "import file.mm0;" statement at the top of https://github.com/digama0/mm0/blob/master/examples/string.mm0
that could mitigate the issue, by allowing to split files in smaller parts and to reassemble them when veryfing.
BUT the fact that these statements aren't in the mm0 grammar should render vscode and intellij idea unable to work with these files (because parsing these files that hold an invalid statement will fail).

So, it would be nice to have a solution that allows : 
  • reusing mm0/mmu files (like modules)
  • limiting their length / splitting them
  • aggregating them

Mario recently published a tool that can aggregate files with an "import" at their top.
But that doesn't fix the issue for IntelliJ IDEA.

Maybe a solution would be to strip the import statements from the mm0 files and have a third filetype (.mmerge) 
that holds a structured document telling how to build big mm0/mmu files from smaller parts.

Best regards,
Olivier

Mario Carneiro

Feb 22, 2020, 10:24:26 PM2/22/20
to metamath
On Sat, Feb 22, 2020 at 12:47 PM Olivier Binda <olus....@gmail.com> wrote:
Apparently, it is quite easy to write plugins for intellij idea.

I just tried and one day later, I had written lexers and primary parsers for the mm0 and mmu files, with JFlex and GrammarKit (and thanks to mario's answers and the experience gained by writting home made parsers for mm0/mmu), 

A few minutes ago, I just pushed accidently my prototype plugin to https://github.com/Lakedaemon/mm0-kotlin-tooling
I did not push the plugin to the market place yet though (because it misses lots of functionnalities, I'll do that later when it is ready)

So, I'll probably be able to develop and distribute a fully feature plugin adding mm0 and mmu support to intellij idea (which has a free community version), with the usual intellisense goodness added to it.
This should be a nice alternative to vs-code.

There is an issue that must be addressed though :

Intellij idea refuses to open set.mm.mmu outside of raw mode (no syntax highlighting/advanced features) because it is much too big (at 450MB).
Which is quite understandable. But that means that people will have to break mmu (and mm0) files into smaller parts to be able 
to work with them confortably.

You probably know this already, but I've been focusing my user interface stuff on the .mm1 file format. The mmu file format currently doesn't even have a syntax highlighter, although it could without too much trouble (it's just an s-expression). But don't make users write mmu files, that's not nice. That's like asking people to write metamath proofs (the contents of $p $= proof... $.) by hand.

Writing an .mm1 IDE is not a trivial matter, not least because users want many things from an IDE, but now there are two, and they both conform to the LSP specification explicitly so that you can plug them into alternative editors. So if you like IntelliJ idea, see if you can write a plugin that communicates to mm0-rs via the json protocol. That's what it was designed for. (If you check the marketplace / plugin database, I will bet you can already find a LSP client implementation for intelliJ, and I've never even used it.)
 
There is an "import file.mm0;" statement at the top of https://github.com/digama0/mm0/blob/master/examples/string.mm0
that could mitigate the issue, by allowing to split files in smaller parts and to reassemble them when veryfing.
BUT the fact that these statements aren't in the mm0 grammar should render vscode and intellij idea unable to work with these files (because parsing these files that hold an invalid statement will fail).

So,it would be nice to have a solution that allows : 
  • reusing mm0/mmu files (like modules)
  • limit their length/split them
  • agregate them
Yep, in my mind this is already a solved problem. Imports aren't in the mm0 grammar because raw mm0 should not get that large. The set.mm translation is supposed to be passed an -f argument to specify what theorem or theorems in set.mm you care about, and those go into the mm0 file. Even for very advanced theorems the resulting mm0 files are only a few hundred KB. The mmb file is much larger, but it is not meant to be edited by a text editor anyway. The mmu file is even larger because it is a textual version of the mmb file, but it is also not meant to be edited by a text editor, and it's rather overwhelming for non-toy problems.

These issues all come up when working on set.mm as well. Imports can be used to lop off the "introduction" section of a development and put it in a different file. As of recently, you can now directly import a .mmu file into an mm1 file (mmb import support is coming), meaning that you could have a small mm1 file that imports set.mm.mmb and then adds a few hundred theorems of your own on top. The imported file will not be rechecked on every keypress, so you get fast response times, and you can write your local development that way.
 
Mario recently published a tool that can agregate files with "import at their top".
But that doesn't fix the issue for intellij idea

Maybe a solution would be to strip the import statements from the mm0 files and have a third filetype (.mmerge) 
that holds a structured document telling how to build big mm0/mmu files from smaller parts

It sounds like you are proposing what the .mm0 files in the examples folder already are, which is to say, mm0 files which are non-conforming because of the addition of import statements, which can be assembled into conforming mm0 files by the aggregation utility. It is critical that import statements remain a "non-conforming extension" to the mm0 file format, rather than actually part of the spec, because it is extremely difficult to formalize file includes in a reasonable manner, and they aren't needed for scalability at the low level. (The aggregator is careful to preserve things like formatting when building the mm0 files, so that the output is still readable, yet self contained, because it is the source of truth for the target axiomatization.)

Mario

Olivier Binda

Feb 23, 2020, 5:14:57 AM2/23/20
to meta...@googlegroups.com
Hi Mario
Many thanks for the answer


You probably know this already, but I've been focusing my user interface stuff on the .mm1 file format. The mmu file format currently doesn't even have a syntax highlighter, although it could without too much trouble (it's just an s-expression). But don't make users write mmu files, that's not nice. That's like asking people to write metamath proofs (the contents of $p $= proof... $.) by hand.


Yes. I'll remove the part about editing/looking at mmu files and concentrate on mm0 (I experimented with mmu first because the grammar is easier).

Before I remove it, could you please have a look at the mmu lexer rules and the mmu primary parser BNF 
and use those to update the specs of mmu in the official mm0 repository ?
That way, it'll be easier for other developers to adopt mm0 and build their own tools.

 
Writing an .mm1 IDE is not a trivial matter, not least because users want many things from an IDE, but now there are two, and they both conform to the LSP specification explicitly so that you can plug them into alternative editors. So if you like IntelliJ idea, see if you can write a plugin that communicates to mm0-rs via the json protocol.

I'll look into that
 
That's what it was designed for. (If you check the marketplace / plugin database, I will bet you can already find a LSP client implementation for intelliJ,

Good, it'll make things easier. I am just concerned about one bit : 
the plugin will probably have to ship native binaries (for multiple platforms). So it'll be at best more cumbersome than a Java/Kotlin/Groovy/Scala solution running on the JVM already shipped with IntelliJ IDEA.
But if it is doable, I'll do it. 
 
Yep, in my mind this is already a solved problem.

I knew it. :)

You are a bit like Fermat ("I have just solved a very important problem for mankind") except 
- he wrote short esoteric blog posts in the margins of his proofs, while you write short proofs in your esoteric blog posts
- that we can wring answers out of you and implement those solutions in a matter of days instead of centuries, which is definitely nice. ;)

Imports aren't in the mm0 grammar because raw mm0 should not get that large. The set.mm translation is supposed to be passed an -f argument to specify what theorem or theorems in set.mm you care about, and those go into the mm0 file.

Mmmh. This is nice
 
Even for very advanced theorems the resulting mm0 files are only a few hundred KB. The mmb file is much larger, but it is not meant to be edited by a text editor anyway. The mmu file is even larger because it is a textual version of the mmb file, but it is also not meant to be edited by a text editor, and it's rather overwhelming for non-toy problems.

Yes. The only people who will want to have a look at mmu/mmb files are probably software developers who want to implement mm0-backed features.
The rest of humanity doesn't need to look inside them. It just needs to work (as expected).


These issues all come up when working on set.mm as well.

Yes 

Imports can be used to lop off the "introduction" section of a development and put it in a different file. As of recently, you can now directly import a .mmu file into an mm1 file (mmb import support is coming),

OK. I'm not sure I'll ever want to go in the mm1 direction, mostly because  

"no one likes to read piles of s-expressions"
 
and selection doesn't seem to look favorably on Lisp-backed languages (which might just be a corollary of the first quote)
 
I try to be open minded, but I think it'll take a lot of convincing to get me to even open an mm1 file. 

meaning that you could have a small mm1 file that imports set.mm.mmb and then adds a few hundred theorems of your own on top. The imported file will not be rechecked on every keypress, so you get fast response times, and you can write your local development that way.
 
Mario recently published a tool that can agregate files with "import at their top".
But that doesn't fix the issue for intellij idea

Maybee, a solution would be to stip the mm0 files from the import statement and have a third filetype (.mmerge) 
that holds a structured document telling how to build big mm0/mmu files from smaller parts

It sounds like you are proposing

Actually, I'm not proposing anything. I mostly want to talk about the issue to figure out what is best done about it, so that we can reap benefits in the future (and not shed tears later). 
- You have pondered for a long, long time and did a marvelous job designing mm0, but most of the explanations are still in your mind. 
I need to know the reasoning behind things before I implement stuff, to do a good job (and also because I'm curious and not a robot).
- It gives people (including me) the opportunity to speak their minds, suggest ideas, give advice and maybe find an even better solution.

what the .mm0 files in the examples folder already are, which is to say, mm0 files which are non-conforming because of the addition of import statements, which can be assembled into conforming mm0 files by the aggregation utility.
 
It is critical that import statements remain a "non-conforming extension" to the mm0 file format, rather than actually part of the spec, because it is extremely difficult to formalize file includes in a reasonable manner, and they aren't needed for scalability at the low level.

Yes. I completely agree on that now (but I had to think about it / try to put myself in your shoes to reach that conclusion).
 
(The aggregator is careful to preserve things like formatting when building the mm0 files, so that the output is still readable, yet self contained, because it is the source of truth for the target axiomatization.)

mm0/mmu/mmb files excel at what they were designed for.

IMO, aggregating is an external concern that shouldn't touch mm0/mmu/mmb files, and it would be nice to allow everyone (including non-mm1 believers) to do the aggregation themselves. 

So, it would be nice to design a very simple and open solution that elegantly handles the issue.
Also, there is the issue of proof authoring (which is a fundamental part of getting mm0 to succeed). I'm not saying that you should solve proof authoring too.

But again, it would be nice if everything was in place so that other people could have the basic core features needed to write authoring tools on top of mm0/mmb.

These are quite lengthy subjects, so I'll just throw them around in later posts, but we already spoke of : 
1) splitting mm0 files into short parts (looks like you are OK with that)
2) having custom mm0 files with selected theorems (looks like you already did it)
3) aggregating mm0 files together (already brushed on)

somehow this reminds me of libraries and software development and screams package managers. :)

4) having a standard file format to allow spec/proof conversion between theorem provers (you already did that)
 Yet, you say 
"A production quality verifier should probably use a binary file format, but the information contained in the file should be about the same"

which somewhat implies that you don't intend mmb to become an interchange format, as it would be best for a binary format to exploit the platform/endianness it targets, for performance reasons, I guess.

So, do you intend .mmu files to be used to exchange proofs between people, worldwide ?
Say, you exchange .mm0 files (in a contest) and someone A writes proofA on his Linux PC/Android tablet (say with mephistolus);
he sends the exported .mmu proof to someone B on his Windows/Amiga/weird-endianness box to read the proof and import it into HOL (or whatever) ?

We need a proof exchange format, a format that everyone can target to get inter-compatibility
(one that we can then locally convert into a binary format for performance purposes).


Also, I reread your marvelous (but still esoteric) paper and now I can see how awesome the mmb format is (no memory allocations, structures usable in place, wow) and I really, really, really want to implement that.

So, I'll be pestering you further to get the necessary details so that I can implement an mmb proof checker too.
Best regards,
Olivier

 


Mario Carneiro

Feb 23, 2020, 7:56:26 AM2/23/20
to metamath
On Sun, Feb 23, 2020 at 2:14 AM Olivier Binda <olus....@gmail.com> wrote:
Hi Mario
Many thanks for the answer

You probably know this already, but I've been focusing my user interface stuff on the .mm1 file format. The mmu file format currently doesn't even have a syntax highlighter, although it could without too much trouble (it's just an s-expression). But don't make users write mmu files, that's not nice. That's like asking people to write metamath proofs (the contents of $p $= proof... $.) by hand.


Yes. I'll remove the part about editing/looking at mmu files and concentrate on mm0 (I experimented first on mmu because the grammar is easier)

Before I remove it, could you please have a look at the  mmu lexer rules and the mmu primary parser bnf 
And use those to update the specs of mmu in the official mm0 repository, please.
That way, it'll be easier for other developers to adopt mm0 and build their own tools

Yep, there is a reason https://github.com/digama0/mm0/issues/13 is still open. :)

The mmu lexer is good (the whitespace rule can be simplified to just [ \n]+).

In the parser, the rule for term-binder is:

term-binder ::= LEFT_PARENTHESIS var-name sort-name LEFT_PARENTHESIS identifier* RIGHT_PARENTHESIS RIGHT_PARENTHESIS

but the third argument is optional (and is used to distinguish bound and regular variables). So it should read

term-binder ::= LEFT_PARENTHESIS var-name sort-name (LEFT_PARENTHESIS identifier* RIGHT_PARENTHESIS)? RIGHT_PARENTHESIS

(Edit: I see you've called this "theorem-binder". These two nonterminals can be merged.)
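For what it's worth, a hedged Kotlin sketch of the corrected rule, with the dependency list optional (a toy recursive-descent reading of the BNF above, not the real implementation; TokenStream and Binder are made-up names) :

    class TokenStream(private val toks: List<String>) {
        private var i = 0
        fun peek(): String? = toks.getOrNull(i)
        fun next(): String = toks[i++]
        fun expect(t: String) = require(next() == t) { "expected '$t'" }
    }

    data class Binder(val name: String, val sort: String, val deps: List<String>?)

    // term-binder ::= '(' var-name sort-name ( '(' identifier* ')' )? ')'
    fun parseBinder(tok: TokenStream): Binder {
        tok.expect("(")
        val name = tok.next()
        val sort = tok.next()
        // The optional third component is what distinguishes the two kinds of variables
        // (as explained above) : absent vs. present but possibly empty.
        val deps = if (tok.peek() == "(") {
            tok.next()
            val ids = mutableListOf<String>()
            while (tok.peek() != ")") ids += tok.next()
            tok.expect(")")
            ids
        } else null
        tok.expect(")")
        return Binder(name, sort, deps)
    }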

For input and output statements, the grammar is

input-stmt ::= 'input' input-kind expr*
input-kind ::= ID
output-stmt ::= 'output' output-kind expr*
output-kind ::= ID

with the constraint on valid input and output kinds in the later processing rather than in the parser. The supported input and output kinds are implementation defined, and I would assume you don't need to support any. Currently mm0-hs supports the 'string' input and output kinds, and mm0-rs supports no input or output kinds.

Your rule for identifiers:

identifier ::= ID | SORT | TERM

refers to lexer classes SORT and TERM that I can't find defined. I think this can just be "identifier ::= ID".

For theorem visibility, I think I may have gone back and forth on whether to use "local theorem" and "theorem", or "theorem" and "pub theorem", but in the latest version the mmu format only sees "local theorem" and "theorem", so the "pub" keyword is not used (although it is used in mm1 files). In any case, the semantics is that a (pub) theorem is one that bumps the mm0 cursor to the next statement and checks that the next statement is a theorem that matches this one in the mmu file, while a local theorem is one that does not appear in the mm0 file and so has no effect on the current position in the file.
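A hedged Kotlin sketch of that bookkeeping, assuming a verifier that walks the parsed .mm0 statements with an iterator (names are hypothetical) :

    // A public "theorem" bumps the mm0 cursor and must match the next mm0 statement;
    // a "local theorem" does not appear in the mm0 file, so the cursor is untouched.
    fun <S> checkVisibility(mm0Cursor: Iterator<S>, isLocal: Boolean, provedStatement: S) {
        if (isLocal) return
        check(mm0Cursor.hasNext()) { "mmu proves a public theorem the mm0 spec does not declare" }
        check(mm0Cursor.next() == provedStatement) { "public theorem does not match the next mm0 statement" }
    }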

expr ::= var-name | term-name | LEFT_PARENTHESIS term-name (expr)* RIGHT_PARENTHESIS

The "| term-name" part is not necessary here. (It is necessary in the mm0 grammar, but in mmu the parentheses around nullary term constructors are mandatory.)

proof-expr ::= ... |  LEFT_PARENTHESIS assert-name (identifier)* RIGHT_PARENTHESIS

This production is not necessary. All theorem applications have the form (T (es) ps), so even if es and ps is empty it still looks like (T ()).

convertibilityProof ::= ... | LEFT_PARENTHESIS ':unfold' identifier expr LEFT_PARENTHESIS (identifier)* RIGHT_PARENTHESIS convertibilityProof RIGHT_PARENTHESIS

the "expr" in this should be a list of expressions, that is, "LEFT_PARENTHESIS (expr)* RIGHT_PARENTHESIS". The first one in the list is not applied to the rest, which TBH makes it look a bit odd; you get things like

(:conv an (ph ps) () (not (im ph (not ps))))

indicating that (an ph ps) unfolds to (not (im ph (not ps))), and the parentheses around (ph ps) look like they are in the wrong place. What are your thoughts on changing the syntax to

convertibilityProof ::= ... | '(' ':unfold' '(' identifier (expr)* ')' '(' (identifier)* ')' convertibilityProof ')'

where the parentheses around the expression are moved to before the target term? That way the same proof looks like

(:conv (an ph ps) () (not (im ph (not ps))))

which makes a bit more visual sense. We could do the same thing with proofs, moving the opening paren back so that T is at the head of the list, but then they would have the form ((T es) ps) with a double paren at the beginning. This is fine for parsing purposes, and it's still a valid s-expression, but it does make the form a bit unusual.

That's what it was designed for. (If you check the marketplace / plugin database, I will bet you can already find a LSP client implementation for intelliJ,

Good, it'll make things easier. I am just concerned about one bit : 
the plugin will probably have to ship native binaries (for multiple platforms). So, it'll be at best more cumbersome that a java/kotlin/groovy/scala solution that runs on the jvm already shipped with intellij idea.
But if it is doable, I'll do it. 
 
Yep, in my mind this is already a solved problem.

I knew it. :)

You are a bit like Fermat ("I have just solved a very important problem for mankind") except 
- he wrote short esoteric blog posts in the margin of his proofs while you write short proofs in your esoteric blog posts
- that we can wring answers out of you and implement those solutions in a matter of days instead of centuries, which is definately nice. ;)

Imports aren't in the mm0 grammar because raw mm0 should not get that large. The set.mm translation is supposed to be passed an -f argument to specify what theorem or theorems in set.mm you care about, and those go into the mm0 file.

Mmmh. This is nice
 
Even for very advanced theorems the resulting mm0 files are only a few hundred KB. The mmb file is much larger, but it is not meant to be edited by a text editor anyway. The mmu file is even larger because it is a textual version of the mmb file, but it is also not meant to be edited by a text editor, and it's rather overwhelming for non-toy problems.

yes.  The only people who will want to have a look at mmu/mmb files are probably software developers that want to implement mm0 backed features.
The rest of humanity don't need to look inside that. It just needs to work (as excpected).

Right. The core of the architecture is that rather than looking at the proofs directly, you have a program to do the looking for you and look at the program instead to make sure it is correct.

I'm not sure I'll ever want to go in the mm1 direction mostly because  

no one likes to read piles of s-expressions
 
and selection doesn't seem to look favorably on Lisp-backed languages (which might just be a corollary of the first quote)
 
I try to be open minded but I think It'll take a lot of convinving to get me to even open a mm1 file.

The use of s-expressions in mm1 is sugared to make it easier to use, and in particular to avoid the parentheses explosion you see in mmu files (for which lisp is known). But more complex languages have more complex parsers, and I don't know if I can continue to deliver the same quality of syntax highlighting, error reporting and fast processing with more syntax hurdles. But syntax questions are largely orthogonal to the rest of the language, so if you have ideas for a better syntax be my guest. One thing that I think is certainly coming is a way to configure or replace the secondary parser, the thing that turns $ math expressions $ into data. So you can write your own domain specific language and put it in $ math quotes $ to get it parsed via the current default parser, or put it in "string quotes" and explicitly call a parsing function of your own. In these places you can have whatever syntax you like, although you may have to do extra work to get the same amount of support for hovers and such in the middle of a string.

However, MM1 is not core to the architecture the same way MM0/MMB is. If you want to make your own front end to compile to MM0, it need not have any syntax in common with MM0, and it may even only vaguely present MM0 primitives to the front end. I think that MM1 is the best tool for writing "pure MM0" code, in the same way that C is a good language for writing low level assembly, but you may want to layer a high level language on top of it and this language may look nothing at all like MM1. (Actually, MM1 has rather better metaprogramming facilities than C does, macros are a poor substitute. Maybe D is a better comparison.)
 
Maybe a solution would be to strip the import statements from the mm0 files and have a third filetype (.mmerge) 
that holds a structured document telling how to build big mm0/mmu files from smaller parts

It sounds like you are proposing

Actually, I'm not proposing. I mostly want to talk about the issue to know what should be best done about it, so that we can reap benefits in the future (and not shed tears later) 
- You have pondered a long long time and you did a marvelous job at designing mm0, but most of the explanations are still in your mind. 
I need to know the reasoning behind things before I implement stuff, to do a good job (and also, because I'm curious and not a robot).
- it gives people (including me) the opportunity to speak their minds, suggest ideas, give advice and maybe find an even better solution

Sure. I am happy to explain these things as necessary, but I need a bit more to see the ways in which my attempt to solve the import problem misses the mark.
 
(The aggregator is careful to preserve things like formatting when building the mm0 files, so that the output is still readable, yet self contained, because it is the source of truth for the target axiomatization.)

mm0/mmu/mmb files excell at what they where designed for.

Imo, aggregating is an external concern and shouldn't touch mm0/mmu/mmb files and it would be nice to allow everyone (including non mm1 believers) to do the aggregation themselves. 

If you are a nonbeliever, then you use some high level proof language, like say lean, that has nothing whatsoever to do with MM0. A backend tool processes your proof and spits out an MM0/MMB pair, with the MM0 file readable enough that the target theorem is recognizable. No aggregation is necessary at the MM0 level, because the output is a single file (well, two files). Lots of aggregating might have happened with multiple lean files, but that's its problem.

The point of aggregating mm0 files is in order to maintain a readable collection of theorem statements, in particular if they cover multiple subject matters and don't necessarily form a linear dependency graph. I'm currently using this for the definition of the MM0 verifier, which requires two entirely distinct theories: the definition of MM0 as a formal system, and the definition of the x86 architecture. It would be awkward to have the definition of mm0 import x86, or for x86 to import mm0. But the final theorem requires both, so there is a diamond dependency. Using separate files and an aggregator allows me to concisely specify all the downward-closed subposets of this lattice without repetition.
 
So, it would be nice to design a very simple and open solution that would elegantly handle the issue.
Also, there is the issue of proof authoring (which is a fundamental part of getting mm0 to succeed). I'm not saying that you should solve proof authoring too.

Well, as I've said MM1 is my answer to proof authoring, and I'm using it "in production" so to speak. Of course I am only one user, and I have specific tastes so the language has grown in rather lopsided ways as a result to support my proof style. More and different opinions are needed to make it more generally usable.

But again it would be nice if everything was in place so that other people could have the basic core features to be able to write authoring tools on top of mm0/mmb.

This is quite a lengthy subjects so I'll just throw subjects around in later posts but we already spoke of : 
1) spliting mm0 files in short parts (looks like you are ok with that)
2) having custom mm0 files with selected theorems (looks like you already did it)
3) aggregating mm0 files together (already brushed)

somehow this reminds me of libraries and software development and screams package managers. :)

Quite so. Package managers have a tendency toward over-engineering, though, and all the projects I have overseen have tended more toward a "monorepo" style where the entire ecosystem is globally visible so that sweeping changes are possible.
 
4) having a standard file format to allow spec/proof conversion between theorem provers (You already did that)
 Yet, you say 
A production quality verifier should probably use a binary file format, but the information contained in the file should be about the same

which somewhat implies that you don't intend mmb to become an interchange format, as it would be best for the binary formats to get the best of the platform/endianness it is target at for performances reasons, I guess

Actually, that quote is from early days, before the .mmb format existed. I am alluding to it. (For context, "about the same" means about the same as the .mmu file format.) While some minor changes are still needed to the mmb format to make it more suitable as an interchange format, specifically regarding more ability to annotate proofs and terms in the index, I *do* intend it as one. It's true that there are endianness concerns, and for better or worse I've decided to just write in the spec that integers in mmb are little endian and let big endian machines deal with it. But verifiers are allowed to define their own formats if they want in any case, and one easy and common way to indicate that the endianness is reversed is to see if the magic number "MM0B" in the first 4 bytes comes out in reverse order. So if a verifier wanted to declare that it checks MM0B formats in native endianness, that's fine too, and proof authoring tools would need to keep that in mind when targeting that verifier. But this sort of thing gets annoying fast, and MM0B is already basically a bytecode interpreter, so it's not a huge deal to ask for platform independence.
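For illustration, a hedged Kotlin sketch of that magic-number check on the first four bytes of an .mmb file (nothing beyond the magic is assumed about the header here) :

    import java.io.File

    // "MM0B" read in order means the file follows the spec's little-endian layout;
    // a reversed magic would signal a byte-swapped variant.
    fun mmbByteOrder(path: String): String {
        val magic = ByteArray(4)
        File(path).inputStream().use { it.read(magic) }
        return when (String(magic, Charsets.US_ASCII)) {
            "MM0B" -> "little-endian layout (as per the spec)"
            "B0MM" -> "reversed magic : byte-swapped layout"
            else -> error("not an mmb file")
        }
    }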

So, do you intend .mmu files are what will be used to exchange proofs between people, worldwide ?
Say, you exchange .mm0 files (in a contest) and the someone A writes proofA on his linux pc, with mephistolus
he sends the exported .mmu proof to someoneB on his windows/amiga/weird endianness box to read the proof and import it in hol (or whatever) ?

You could do that. MMB is much better than MMU for large proofs though because it does more thorough deduplication, in particular between terms. In the worst case you can get an exponential blowup with large terms in mmu that are totally fine in mmb.
 
Mario

Olivier Binda

Mar 1, 2020, 4:07:35 PM3/1/20
to Metamath
I have just started having a look at working with the translation of set.mm to mm0 (which I'll call set.mm.mm0).

I started with trying to turn "1 + 1 = 2" into a tree

As set.mm.mm0 is a raw translation from set.mm (without mm0 features like operators (simple notations), notations (constructs), definitions or coercions (set -> class)),
I used the dynamic parser built against set.mm.mm0 to parse  
"( wceq ( co c1 caddc c1 ) c2 )"
to get
wceq {
    co {
        c1
        caddc
        c1
    }
    c2
}


The tree is correct, but the string I had to write to get it is not to my liking; I just wanted to write "1 + 1 = 2" to get that tree.
But for that, you have to introduce some simple notations.

For my second try, I experimented with "1 + 2 + 3".
First, I enhanced set.mm.mm0 with some operators : 

prefix c1: $1$ prec max;
prefix c2: $2$ prec max;
prefix c3: $3$ prec max;
infixl mm0AddC: $+$ prec max;
def mm0AddC(a b: class): class = $(co a caddc b)$;

and then, I got 
mm0AddC {
   mm0AddC { 
     c1
     c2
   }
   c3
}
out of 
"1 + 2 + 3"

and I got
mm0AddC {
  c1
  mm0AddC {
    c2
    c3
  }
}

out of 1 + (2+3)

Quite satisfying but this raises a question :
The parser of mm0 is great at parsing stuff with binary operators and turns things like 
1 + 2 into
a tree of the form
+ {
  1
  2
}
Yet metamath was designed around co, which takes 3 classes A B C to define ( A B C ),
so the metamath tree for the same expression is
co {
  1
  +
  2
}

I tried to work around that by adding a definition
def mm0AddC(a b: class): class = $(co a caddc b)$;
that acts like a bridge between the 2 trees.
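On trees, that bridge amounts to a definition expansion; a hedged Kotlin sketch (tree shapes as above, names hypothetical) :

    data class Node(val id: String, val children: List<Node> = emptyList())

    // Expand every mm0AddC{a b} definition node back into the metamath-style co{a caddc b}.
    fun unfoldAddC(n: Node): Node {
        val kids = n.children.map { unfoldAddC(it) }
        return if (n.id == "mm0AddC" && kids.size == 2)
            Node("co", listOf(kids[0], Node("caddc"), kids[1]))
        else
            Node(n.id, kids)
    }

    // unfoldAddC(mm0AddC{mm0AddC{c1 c2} c3}) == co{co{c1 caddc c2} caddc c3}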

But now, this made me think : 
1) metamath trees are unique, but mm0 trees are not, because you can have definitions that allow the same tree to be expressed in different ways.
As you can see, 1 + 2 + 3 can be represented by both trees in mm0 : 
just terms (no definition; definitions are expanded)
co {
    co {
      1
      +
      2
    }
    +
    3
}
and (with definition)
mm0AddC {
   mm0AddC { 
     c1
     c2
   }
   c3
}

When definitions are expanded, this uniqueness should come back. But it implies more complicated reasoning/checks on trees.

2. set.mm.mm0 might not be the correct way to express maths in mm0 (OK, I'm naive).
It provides nice tests for the parsers/verifiers, but eventually math will have to be rewritten in mm0.

For example, maybe the 3-argument structures like co should disappear.
They make it nice to have equalities like A = B, C = D, E = F -> ( A C E ) = ( B D F ) for a lot of operators, but they don't play well with the mm0 parser/trees.

the logical stuff (and, or, not) / 2-argument stuff should be simpler to migrate to mm0

But tools that refactor the mm theory/proofs into mm0-friendly expressions should probably be written and used to translate set.mm into a more workable shape for mm0.

Probably quite a lot of things will have to change (and I had this hope of implementing mephistolus on mm0 in 2 months, ahah, I'm so dumb...). 
It will be hard, as refactoring set.mm.mm0 requires refactoring the proofs at the same time, which is not easy at a point in time where there aren't that many tools available. 

3. I'm wondering how set.mm should be turned into mm0. What are the major implementation differences ? 

Oh well, back to play with it...
Best regards,
Olivier

Mario Carneiro

Mar 1, 2020, 5:59:03 PM3/1/20
to metamath
On Sun, Mar 1, 2020 at 1:07 PM Olivier Binda <olus....@gmail.com> wrote:
Quite satisfying but this raises a question :
The parser of mm0 is great at parsing stuff with binary operators and turns things like 
1 + 2 into
a tree of the form
+ {
  1
  2
}
Yet Metamath was designed around co, that takes 3 classes A B C to define ( A B C )
so the Metamath for the same tree is
co {
  1
  +
  2
}

This is correct. I have said on a few occasions that the metamath and mm0 notation systems are incompatible, and this is why. In MM0, every piece of notation is associated to a syntax constructor, while in metamath syntax constructors are associated to productions in a global CFG that need not have any particular characteristics. While parentheses are reserved for grouping, you could use another character as a notation character for the "co" constructor. Thus:

notation co (a b c: class): class = ($[$:max) a b c ($]$:0);
prefix cadd: $+$ prec max;
Now you can write [a + [b + c]] and it will be parsed as (co a cadd (co b cadd c)) just as in metamath. Note that in this situation + is not a binary operator at all, it is a "nullary prefix" operator, which is to say, a constant symbol.
 
I tried to workaround that by adding a definition
def mm0AddC(a b: class): class = $(co a caddc b)$;
that is like a bridge between the 2 trees.

But now, this made me think
1) metamath trees are unique but not mm0 trees, because you can have definitions that allow to express the same tree in different manners
As you can see 1 + 2 + 3 can be represented by both trees in mm0 : 
just terms (no definition, which are expended)
co {
    co {
      1
      +
      2
    }
    +
    3
}
and (with definition)
mm0AddC {
   mm0AddC { 
     c1
     c2
   }
   c3
}

I would say: MM0 trees are unique; these are different trees, that happen to be provably equal. Metamath does not have unique trees in the sense that a given token string can have two parses (although set.mm doesn't have this pathology). You have created these two trees by using different parser configurations for the same string, which is of course fine because the parser configuration (the sequence of "infix", "notation", etc) is fixed for a given database.

This is "mm0AddC" is another way to get metamath-ish notations inside MM0, and probably what I would do in a hand translation. It has some drawbacks, though. Because the terms are not literally the same as the corresponding metamath terms, a bunch of additional proofs have to be inserted, and it's not completely trivial to work out where those proofs should go. What's more, in metamath because there is only one operation, "co", involved, there is only one theorem "opeq2d" that gets used for a wide variety of operations, for example it can prove "B = C -> A + B = A + C" and also "B = C -> A - B = A - C". In MM0, these are two different syntax constructor, rather than the same syntax constructor with a different middle argument, so you need different theorems for these two cases. (MM1 will automatically generate all such equality theorems so the user pain around this is minimal.) But in an automated translation that means you have to do something clever whenever this theorem gets used.
 
2. set.mm.mm0 might not be the correct way to express maths in mm0 (ok, I'm naive).
It provides nice tests for the parsers/verifiers but eventually math will have to be rewritten in mm0

For example, maybee the 3 arguments structs like co should disappear.
They make it nice to have equalities like A=B, C= D, E=F -> ( A C E ) = ( C D F ) for a lot of operators but they don't play well with mm0 parser/trees

the logical stuff (and or, not)/2 args stuff should be simpler to migrate to mm0

Actually, I find MM0 much nicer to use than metamath when it comes to functions with 3 or more arguments, which come up a lot in the computer science type applications I have been working on. That's because there is a notation that requires no "notation", the default notation that everything gets, which is "foo x y z". This generalizes trivially to any number of arguments and does not require any parser directives, so it is the natural option for big and complicated predicates with many arguments. In metamath you would have to define a new notation for this, and the syntax is not so great; for functions you have the options of ( foo ` <. a , b , c >. ) (which doesn't generalize past 3), ( foo ` <. a , <. b , c >. >. ), ( foo ` <" a b c "> ), or ( ( ( foo ` a ) ` b ) ` c ) (which doesn't always work for class functions). The most generally applicable form is ( foo ` <. a , <. b , c >. >. ), but the syntax is the worst among all the options. (The <" A B ... Z "> notation was created specifically for this use case, but you still have to select the right constructor depending on how many arguments you have, and the notation gets verbose again once you get past 8 arguments.) You can also do funny things with arguments on the left and right with stuff like ( c ( a foo b ) d ) to make it more compact but this is pretty mind bending to read.

But tools that refactor the mm theory/proofs into mm0 friendly expression should probably be written/used to translate set.mm into a more working condition for mm0

Probably quite a lot of things will have to change (And I had this hope to implement mephistolus on mm0 in 2 months, ahah, I'm so dumb...) 
It will be hard as refactoring set.mm.mm0 requires refactoring the proof at the same time, which is not easy at a point in time where  there aren't that many tools available 

3. I'm wondering how set.mm should be turned into mm0. What are the major implementation differences ? 

Well, I think you've already seen the answer. The logical part is a one time cost; there is some work on the translator and then boom, you've got megabytes of raw MM0 to play with. That's already happened, but as you can see this leaves a lot to be desired when it comes to presentation. If you analyze the notations with an eye for the way set.mm writes things, you can probably reverse engineer a good number of MM0 notations. For example, things like df-an are easy (although you would have to specify the precedence and associativity manually, because metamath doesn't have any information to give in that regard). Some notations would have to change, like df-al and df-ral overlap their first symbol, which is not allowed. So for an automatic translation you have to somehow stick disambiguating marks on things without it looking terrible. And then there are notations "via co", which are difficult because you don't know which operations are going to be used in advance. If you want to declare a new constructor like your "mm0AddC" for each operation, you have to identify that df-pl, whose definition is |- + = ..., is a class that is meant to be used as a binary operation, whereas "|- 0 = ..." is a constant, not a binary operation, and "|- sin = ..." is a unary operation. One cue is that the binary operations are usually declared using df-mpt2, but not always. But there will always be "funny terms" like ( a ( x e. A , y e. B |-> C ) b ) that will actually be using co directly, so there is a fallback option if detection doesn't work out, and this can be viewed as an optimization.

Mario

Olivier Binda

Mar 2, 2020, 8:54:36 AM3/2/20
to Metamath

This is correct. I have said on a few occasions that the metamath and mm0 notation systems are incompatible, and this is why. In MM0, every piece of notation is associated to a syntax constructor, while in metamath syntax constructors are associated to productions in a global CFG that need not have any particular characteristics.

Ok
 
I would say: MM0 trees are unique; these are different trees, that happen to be provably equal.
Ok
 
Metamath does not have unique trees in the sense that a given token string can have two parses (although set.mm doesn't have this pathology).

Yes, I see your point. They were easy to reason about though (if you work with the (false) assumption that ambiguous parses don't happen in real life)...

Well, Mathematics is probably telling me : "if there is more than one tree for the same meaning, that's how it should be, deal with it"

 
You have created these two trees by using different parser configurations for the same string, which is of course fine because the parser configuration (the sequence of "infix", "notation", etc) is fixed for a given database.
For the sake of our potential future conversation, is "different parser configurations" equal to
1) "the different mm0 statements I made it gobble" ?
2) "the different strings that the parser had to parse" ?



This is "mm0AddC" is another way to get metamath-ish notations inside MM0, and probably what I would do in a hand translation. It has some drawbacks, though. Because the terms are not literally the same as the corresponding metamath terms, a bunch of additional proofs have to be inserted, and it's not completely trivial to work out where those proofs should go.

already smells like trouble :/
 
What's more, in metamath because there is only one operation, "co", involved, there is only one theorem "opeq2d" that gets used for a wide variety of operations, for example it can prove "B = C -> A + B = A + C" and also "B = C -> A - B = A - C". In MM0, these are two different syntax constructor, rather than the same syntax constructor with a different middle argument, so you need different theorems for these two cases.

Yes, I guessed so. 
 
(MM1 will automatically generate all such equality theorems so the user pain around this is minimal.)

I'll have to do that too. MM1 and Mephistolus share some concerns (both are proof authoring tools)
 


Actually, I find MM0 much nicer to use than metamath when it comes to functions with 3 or more arguments, which come up a lot in the computer science type applications I have been working on. That's because there is a notation that requires no "notation", the default notation that everything gets, which is "foo x y z". This generalizes trivially to any number of arguments and does not require any parser directives, so it is the natural option for big and complicated predicates with many arguments. In metamath you would have to define a new notation for this, and the syntax is not so great; for functions you have the options of ( foo ` <. a , b , c >. ) (which doesn't generalize past 3), ( foo ` <. a , <. b , c >. >. ), ( foo ` <" a b c "> ), or ( ( ( foo ` a ) ` b ) ` c ) (which doesn't always work for class functions). The most generally applicable form is ( foo ` <. a , <. b , c >. >. ), but the syntax is the worst among all the options. (The <" A B ... Z "> notation was created specifically for this use case, but you still have to select the right constructor depending on how many arguments you have, and the notation gets verbose again once you get past 8 arguments.) You can also do funny things with arguments on the left and right with stuff like ( c ( a foo b ) d ) to make it more compact but this is pretty mind bending to read.


S-expression stuff.
At the moment, I'm implementing mm0 trees as
data class MM0Tree(val id: String, val children: List<MM0Tree>)
but I could have 
data class MM0STree<T>(val args: List<T>)
with args[0] = id and args.drop(1) = children.
I guess that would be about the same optimization/view as the S-expression.

I still have to implement in-place verification of the mmb proofs. When I do that, I'll probably have to implement an S-expression-like view.
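For instance, a hedged Kotlin sketch of such an S-expression-like view over (a slight variant of) the same data class; it is purely illustrative and does not try to match the exact mmu conventions around nullary terms :

    data class MM0Tree(val id: String, val children: List<MM0Tree> = emptyList())

    // Render a tree as an s-expression : leaves print bare, applications as "(head child1 child2 ...)".
    fun MM0Tree.toSexpr(): String =
        if (children.isEmpty()) id
        else "($id ${children.joinToString(" ") { it.toSexpr() }})"

    // MM0Tree("wceq", listOf(MM0Tree("co", listOf(MM0Tree("c1"), MM0Tree("caddc"), MM0Tree("c1"))), MM0Tree("c2")))
    //     .toSexpr() == "(wceq (co c1 caddc c1) c2)"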
 
 
But tools that refactor the mm theory/proofs into mm0 friendly expression should probably be written/used to translate set.mm into a more working condition for mm0

Probably quite a lot of things will have to change (And I had this hope to implement mephistolus on mm0 in 2 months, ahah, I'm so dumb...) 
It will be hard as refactoring set.mm.mm0 requires refactoring the proof at the same time, which is not easy at a point in time where  there aren't that many tools available 

3. I'm wondering how set.mm should be turned into mm0. What are the major implementation differences ? 

Well, I think you've already seen the answer.

Don't go sibylline/esoteric on me (I'm bad at catching subtle hints :/). I saw the mm0-hs fromMM translation tool that gets me a nice mm0 file from mm stuff. I like that, by the way.
 
The logical part is a one-time cost; there is some work on the translator and then boom, you've got megabytes of raw MM0 to play with. That's already happened, but as you can see this leaves a lot to be desired when it comes to presentation. If you analyze the notations with an eye for the way set.mm writes things, you can probably reverse engineer a good number of MM0 notations. For example, things like df-an are easy (although you would have to specify the precedence and associativity manually, because metamath doesn't have any information to give in that regard). Some notations would have to change; for example, df-al and df-ral overlap in their first symbol, which is not allowed. So for an automatic translation you have to somehow stick disambiguating marks on things without it looking terrible. And then there are notations "via co", which are difficult because you don't know which operations are going to be used in advance. If you want to declare a new constructor like your "mm0AddC" for each operation, you have to identify that df-pl, whose definition is |- + = ..., is a class that is meant to be used as a binary operation, whereas "|- 0 = ..." is a constant, not a binary operation, and "|- sin = ..." is a unary operation. One cue is that the binary operations are usually declared using df-mpt2, but not always. But there will always be "funny terms" like ( a ( x e. A , y e. B |-> C ) b ) that will actually be using co directly, so using co directly remains as a fallback if detection doesn't work out, and the detection can be viewed as an optimization.
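
To give a flavor of what that detection amounts to: it is mostly a matter of looking at the head constructor of the definition's right-hand side. A rough sketch (in Kotlin, since that's what you're working in; this is not what mm0-hs does, and the set.mm constructor labels here are from memory):

```
data class MM0Tree(val id: String, val children: List<MM0Tree>)   // as in your parser

enum class OpKind { BINARY_OP, UNARY_OP, CONSTANT }

// Classify a class constant from its definition "|- foo = <rhs>", looking only at the
// head constructor of the right-hand side. Misclassified or exotic cases simply keep
// being written with "co" directly, which is the fallback mentioned above.
fun classify(defRhs: MM0Tree): OpKind = when (defRhs.id) {
    "cmpt2" -> OpKind.BINARY_OP   // df-mpt2 style: ( x e. A , y e. B |-> C )
    "cmpt"  -> OpKind.UNARY_OP    // df-mpt style:  ( x e. A |-> B )
    else    -> OpKind.CONSTANT    // e.g. "|- 0 = ...": treat it as a plain constant
}
```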

I'm not sure that I understand all that you say but maybe I got the gist of it.

Anyway, here is my plan : 
1) I'll use mm0-hs fromMM to translate stuff from mm into a raw mm0 file
2) I'll implement a patcher/refactorer/remapper machine that eats a mm0+mmu file and spits out a new, enhanced mm0+mmu file (in Kotlin :/, I'll open source it though, because it is important for it to be available to anyone)... I'll get to mmb when I get to mmb. :)
3) I'll write public patches to slowly turn set.mm into a mm0 file that uses the features mm0 can provide (I hope to get some advice/guidance from you and others for that)
 
First, I'll start with id remapping: those c1, c2, ..., c9 scream to be renamed 1, 2, ..., 9 so they become usable directly in the math parser without further additions. A sketch of the pass is below.
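
The pass itself can be a dumb token-level rewrite over the .mm0 and .mmu text, something like this (the target names in the table are placeholders):

```
// Rename term/def identifiers consistently across a .mm0 or .mmu file.
// Deliberately dumb: it only rewrites whole identifier tokens, never substrings,
// so remapping "c1" leaves "c12" or "wcel1" alone.
fun remapIds(source: String, renames: Map<String, String>): String =
    Regex("[A-Za-z_][A-Za-z0-9_]*").replace(source) { m -> renames[m.value] ?: m.value }

fun main() {
    val renames = mapOf("c1" to "n1", "c2" to "n2")   // hypothetical target names
    println(remapIds("term c1: class; term c12: class;", renames))
    // prints: term n1: class; term c12: class;
}
```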

I'll strive to replace stuff, remove stuff and add stuff only when absolutely necessary.

Best regards,
Olivier

Olivier Binda

Apr 3, 2020, 3:11:59 PM
to Metamath
I have been playing some more with mm0 lately and made some progress : 

- I implemented AGAIN a mm0 parser, a mmu parser and a mm0/mmu proof checker, with a different design
 (my first try used immutable structures and lots of allocations; I tried to reduce allocations by using mutable structures/views/events)

 My first try took 15 s to proof check set.mm.mm0
 My second (mutable) try took 3min30s (lol), which made me realize how stupid I can be.
 My third try is at 22s (but with unique string usage, a lot less memory) and I realized a few things about parsers, so there is still room for performance improvements. I would really like to reach something like 5s at some point, which might not be easy when you have to handle 480 MB of data.

But performance is not the issue here. A second go at implementing parsers/proof checkers gave me a chance to further understand mm0/mmu and how things should be done.

So, my second implementation is actually more correct.

Also, it is a good enough foundation for me to start working on :
 - adding mm0 features to set.mm
 - Mephistolus on mm0

So, I'll now work on writing a patcher routine that takes set.mm.mm0/set.mm.mmu and slowly :
 - adds simple notations (to make the parser work with stuff like 1 + 2 + 3; see the sketch after this list)
 - replaces terms with definitions
 - replaces axioms on terms with theorems on definitions (using convertibility proofs)
 - replaces co usage like "co ( 1 + 2 )" with regular binary operators like "1 + 2"
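
For the first point, the dynamic part of the parser only needs a table of infix notations with precedences. A minimal precedence-climbing sketch (my own stripped-down types; the real mm0 grammar also has prefix notations, coercions, a max precedence, etc., which I'm ignoring here):

```
data class MM0Tree(val id: String, val children: List<MM0Tree>)
data class Infix(val termId: String, val prec: Int, val rightAssoc: Boolean = false)

// Minimal precedence-climbing parser over an already-tokenized math string.
// Once "+" is registered, "1 + 2 + 3" parses as cadd (cadd 1 2) 3 (left-associative).
class NotationParser(private val infixOps: Map<String, Infix>) {
    private var toks: List<String> = emptyList()
    private var pos = 0

    fun parse(tokens: List<String>): MM0Tree {
        toks = tokens; pos = 0
        return parseExpr(0)
    }

    private fun parseExpr(minPrec: Int): MM0Tree {
        var lhs = parseAtom()
        while (pos < toks.size) {
            val op = infixOps[toks[pos]] ?: break
            if (op.prec < minPrec) break
            pos++
            val rhs = parseExpr(if (op.rightAssoc) op.prec else op.prec + 1)
            lhs = MM0Tree(op.termId, listOf(lhs, rhs))
        }
        return lhs
    }

    // Atoms are parenthesized subexpressions or single tokens (assumes well-formed input).
    private fun parseAtom(): MM0Tree =
        if (toks[pos] == "(") {
            pos++                              // consume "("
            val e = parseExpr(0)
            pos++                              // consume ")"
            e
        } else MM0Tree(toks[pos++], emptyList())
}

fun main() {
    val p = NotationParser(mapOf("+" to Infix("cadd", 64)))   // "cadd" and prec 64 are placeholders
    println(p.parse(listOf("c1", "+", "c2", "+", "c3")))
    // a tree with head "cadd", shaped as cadd (cadd c1 c2) c3
}
```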

So far, I have only changed c1 into 1 (same for c2, c3, c4, c5, c6, c7, c8, c10), which requires
replacing every instance of "c1" in the mmu and mm0 files with "1" (some work, but not very hard)

and added simple notations for the logical operators (this is even easier)

but now, I'll try to do something hard : 
- either replace the term for logical or with a definition (which means refactoring lots of proofs)
- or replace the co usage for + with a term (which probably means more refactoring)

I am not at all sure that I'll be able to pull this off. I do not know how long it will take me either.
(I only know that if I can pull this off, I'll be able to improve stuff a lot)

I'll probably go with replacing the or term with a definition (which will let me write convertibility proofs, a necessary requirement for me).

Wish me luck !

Best regards, 
Olivier
ps : I'll soon publish my patched mm0/mmu files (in case someone is interested) and also my mm0/mmu parsers (first).
They can/should still be improved, but that can be done later.


Stay home and stay safe ! :)

Mario Carneiro

Apr 3, 2020, 7:23:41 PM
to metamath
On Fri, Apr 3, 2020 at 12:12 PM Olivier Binda <olus....@gmail.com> wrote:
So far, I have only changed c1 into 1 (same for c2, c3, c4, c5, c6, c7, c8, c10), which requires
replacing every instance of "c1" in the mmu and mm0 files with "1" (some work, but not very hard)

Note that "1" is not a valid mm0 identifier. You have to define "1" as a notation for a constant with a valid identifier, such as "c1".

and added simple notations for the logical operators (this is even easier)

but now, I'll try to do something hard : 
- either replace the term for logical or with a definition (which means refactoring lots of proofs)
- or replace the co usage for + with a term (which probably means more refactoring)

The MM0 pretty printer already knows how to print notations, so all you would have to do is to add a notation for wo. But it would have to happen between reading the incoming .mm file and producing the output, that is, inside mm0-hs, so that is on me to some extent. It needs to somehow accept configuration information to guide the process, and passing that by the command line would be very clunky. The internal metamath parser already knows how to read $j commands, so one option would be to add a new $j command declaring the notation.

I am not at all sure that I'll be able to pull this off. I do not know how long it will take me either.
(I only know that if I can pull this off, I'll be able to improve stuff a lot)

I'll probably go with replacing the or term with a definition (which will let me write convertibility proofs, a necessary requirement for me).

Doing this alone is pretty easy, especially if you are doing manual fixups on the output file. Put the definition of or in for "def wo", and prove df_or by reflexivity (after unfolding)

Mario

Olivier Binda

Apr 4, 2020, 6:14:42 AM
to meta...@googlegroups.com


On Saturday, April 4, 2020 at 01:23:41 UTC+2, Mario Carneiro wrote:


On Fri, Apr 3, 2020 at 12:12 PM Olivier Binda <olus....@gmail.com> wrote:
So far, I have only changed c1 into 1 (same for c2, c3, c4, c5, c6, c7, c8, c10), which requires
replacing every instance of "c1" in the mmu and mm0 files with "1" (some work, but not very hard)

Note that "1" is not a valid mm0 identifier. You have to define "1" as a notation for a constant with a valid identifier, such as "c1".

Yes, you are right. I messed up (it was like that in my first try, but not in the second one, where I changed the way things were done and forgot about it).
I will fix that as soon as possible (I want to produce conforming files).

I guess that the restriction on id is this way so that we can do nice stuff with notations (mostly with numbers).

For example, at some point, I plan to allow users to write 1283 and to have that (maybe in a preprocessing step) turned into (; ; ; 1 2 8 3 ) because normal users will want to use numbers the way they are used to
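
The preprocessing step itself is tiny; a sketch (the digit and decimal constructor ids, "c1"... and "cdc", are placeholders for whatever the set.mm.mm0 translation actually calls them):

```
data class MM0Tree(val id: String, val children: List<MM0Tree>)

// Turn a decimal literal into nested applications of the binary decimal constructor,
// e.g. "1283" -> cdc (cdc (cdc c1 c2) c8) c3, i.e. "; ; ; 1 2 8 3" in prefix notation.
fun decimalTree(digits: String): MM0Tree =
    digits.map { d -> MM0Tree("c$d", emptyList()) }
        .reduce { acc, digit -> MM0Tree("cdc", listOf(acc, digit)) }
```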

I messed up (not the first time, not the last time) but, as always, there are benefits to failures :
I now have a patcher routine that can rename ids and propagate the changes... this will be useful in the future



and added simple notations for the logical operators (this is even easier)


Thanks for your guidance.
 
but now, I'll try to do something hard : 
- either replace the term for logical or with a definition (which means refactoring lots of proofs)
- or replace the co usage for + with a term (which probably means more refactoring)

The MM0 pretty printer already knows how to print notations, so all you would have to do is to add a notation for wo. But it would have to happen between reading the incoming .mm file and producing the output, that is, inside mm0-hs, so that is on me to some extent. It needs to somehow accept configuration information to guide the process, and passing that by the command line would be very clunky. The internal metamath parser already knows how to read $j commands, so one option would be to add a new $j command declaring the notation.

You hint that it could be done with mm0-hs. Ok. 
Still, I don't want to rely on that. I need to be able to do that myself, because Mephistolus needs it.
So, I will only rely on mm0-hs to translate mm stuff into basic mm0 stuff,

and then Mephistolus will take over from there (there will be duplicated effort, because I'm pretty sure you are doing wonderful stuff in mm0-hs... but though Mephistolus can rely on the mm0/mmu specification, it cannot rely on external software: it needs to be portable and standalone).

That's too bad for the duplicated effort, but that doesn't mean we cannot collaborate and share stuff so that our work is easier.
(and it is in my best interest to collaborate with mm0 users, so I will)
 

I am not at all sure that I'll be able to pull this off. I do not know how long it will take me either.
(I only know that if I can pull this off, I'll be able to improve stuff a lot)

I'll probably go with replacing the or term with a definition (which will let me write convertibility proofs, a necessary requirement for me).

Doing this alone is pretty easy, especially if you are doing manual fixups on the output file. Put the definition of or in for "def wo", and prove df_or by reflexivity (after unfolding)

I'll do automatic fixups. I do not want to do manual fixups. I need to implement the capability for users of Mephistolus to do those fixes themselves.

I took some time to think yesterday (I've got a little daughter and not as much time to ponder things as I would like).
And I think that : 

I will start implementing Mephistolus on mm0 (on the raw, simple set.mm.mm0 translation that I have).
I should get really basic Mephistolus features out of it.

Yet, I'll still implement the best graphics/UI Mephistolus can have with this,
and also the possibility to dynamically improve the mathematical structures it uses.

1) In brief, users will be able to start from set.mm.mm0 or from scratch
2) they will be able to define notations/definitions/simple notations (that enhance the mathematical structures Mephistolus relies on)
3) they'll be able to patch/modify this structure to slowly remove axioms and replace them with definitions/notations/theorems

That could be done with purely software steps, but it will be much faster with graphical support (regular math display/operations),
and it should make things a lot quicker and easier.

Also, this is a feature that Mephistolus needs :
Mephistolus needs to enable mathematicians to change the axioms/definitions they use,
so that metamathematicians can work with a good workflow.

Once it is done, I (and others) will be able to add mm0 features to set.mm.mm0 more easily.
For example, proving df_ stuff should be orders of magnitude easier with the right software UI/backend support.

Well, I am talking big. Now I have to make that happen. This will keep me busy for the next 2-3 months.

Please keep on guiding me in the future
Best regards,
Olivier




 


Mario Carneiro

Apr 4, 2020, 1:19:40 PM
to metamath
On Sat, Apr 4, 2020 at 3:14 AM Olivier Binda <olus....@gmail.com> wrote:
I guess that the restriction on id is this way so that we can do nice stuff with notations (mostly with numbers).

For example, at some point, I plan to allow users to write 1283 and to have that (maybe in a preprocessing step) turned into (; ; ; 1 2 8 3 ) because normal users will want to use numbers the way they are used to

I haven't quite got there yet, but I plan to do something like this in MM1. Inside a math expression, you can use antiquotation (denoted by a prefix comma) to interpolate a lisp expression, like $ 2 + 2 = ,(->string {2 + 2}) $ would call the ->string function on the result of evaluating {2+2}, resulting in "4" getting placed in the math expression. (Actually in this case you would have to do a bit more to make it typecheck; you need to prepend "c" to the string and then convert the string to an atom 'c4, and return that.) The resulting expression, '(eq (add c2 c2) c4), is just as if you had typed $ 2 + 2 = 4 $ to begin with.

That part works already today, but I want to extend this with a hook for the to-expr function to allow "malformed" expressions which are not just s-expressions of atoms but also include numbers and strings directly, calling a user defined callback to convert the number into a term. That way, you could write $ 2 + 2 = ,4 $ and the 4 would get elaborated into 'c4 by a user hook, and this would work just as well for larger terms and computations like $ ,11111 * ,11111 = ,{11111 * 11111} $. Similarly, you will be able to write $ ,"hello world" $ instead of the current much uglier $ _h :' _e :' _l :' _l :' _o :' __ :' _w :' _o :' _r :' _l :' _d :' s0 $.
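
(The string side of that conversion is of course just a fold; in your Kotlin terms, something like the sketch below, using the constructor names from the example above. The interesting part is the to-expr hook itself, which is what doesn't exist yet.)

```
// Build the verbose character-list term that $ ,"hello world" $ would elaborate to,
// using the names from the example above: _x for characters, __ for space,
// :' as the string cons, s0 as the empty string.
fun stringTerm(s: String): String =
    s.map { c -> if (c == ' ') "__" else "_$c" }
        .foldRight("s0") { ch, rest -> "$ch :' $rest" }

fun main() {
    println(stringTerm("hello world"))
    // _h :' _e :' _l :' _l :' _o :' __ :' _w :' _o :' _r :' _l :' _d :' s0
}
```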

but now, I'll try to do something hard : 
- either replace the term for logical or with a definition (which means refactoring lots of proofs)
- or replace the co usage for + with a term (which probably means more refactoring)

The MM0 pretty printer already knows how to print notations, so all you would have to do is to add a notation for wo. But it would have to happen between reading the incoming .mm file and producing the output, that is, inside mm0-hs, so that is on me to some extent. It needs to somehow accept configuration information to guide the process, and passing that by the command line would be very clunky. The internal metamath parser already knows how to read $j commands, so one option would be to add a new $j command declaring the notation.

You hint that it could be done with mm0-hs. Ok. 
Still, I don't want to rely on that. I need to be able to do that myself, because Mephistolus needs it.
So, I will only rely on mm0-hs to translate mm stuff into basic mm0-stuff

If you already have it parsed and ready to analyze, then it shouldn't be too hard to add notations on your end, but you will have to write a pretty printer for mm0 if you want to produce notation-upgraded mm0 files. This isn't very hard though, especially if you don't worry about things like proper indentation. You won't be able to acquire information about notations through $j directives this way, unless you intend to also parse the source .mm file (in which case you may as well write the mm->mm0 translator yourself), and the set.mm0 output file loses most metadata so you will have to supply this in some other way, perhaps a configuration file of your own devising.
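
To be concrete about how little is needed, a notation-aware printer is essentially this (a Kotlin sketch in your terms, not mm0-hs's actual printer; it only handles left-associative infix notations and the default application form, with no line breaking or indentation):

```
data class MM0Tree(val id: String, val children: List<MM0Tree>)
data class InfixNota(val symbol: String, val prec: Int)

const val MAX_PREC = Int.MAX_VALUE   // precedence of atoms and of the default "foo x y z" form

// Print a term, parenthesizing a subterm only when its precedence is below
// what the surrounding context requires.
fun pp(e: MM0Tree, notations: Map<String, InfixNota>, minPrec: Int = 0): String {
    val nota = notations[e.id]
    return when {
        e.children.isEmpty() -> e.id
        nota != null && e.children.size == 2 -> {
            // left-associative: the left child may reuse the same precedence,
            // the right child needs strictly higher precedence to stay unparenthesized
            val s = pp(e.children[0], notations, nota.prec) + " ${nota.symbol} " +
                    pp(e.children[1], notations, nota.prec + 1)
            if (nota.prec < minPrec) "($s)" else s
        }
        else -> {
            // default application "foo x y z"; its arguments are printed at max precedence
            val s = e.id + e.children.joinToString("") { " " + pp(it, notations, MAX_PREC) }
            if (minPrec == MAX_PREC) "($s)" else s
        }
    }
}
```

With "+" registered at some precedence, a tree shaped as cadd (cadd a b) c comes back out as "a + b + c", while cadd a (cadd b c) becomes "a + (b + c)".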
 
Mario
