Minimalist Metamath


Norman Megill

Mar 8, 2019, 11:36:21 PM
to Metamath
Metamath was originally designed with a great deal of freedom in its syntax to allow math expressions that resemble what humans are used to.  It also was intended to allow for ambiguous grammars such as that in miu.mm (which in retrospect I half wish I'd never heard of :).  My assumption, perhaps misguided, was that $a's, even for syntax, are "axioms" to be vetted by humans, just like we need a human to vet the axioms of logic and ZFC, and should not be tinkered with by novices who don't understand the consequences.  Keep in mind that I originally designed the language primarily for my own modest use.  I never imagined that it would grow into the incredible project it has become.

Careless or intentional ambiguities in a Metamath database grammar can allow us to prove contradictions without violating the spec.  The language does exactly what it is intended to do, which is to prove consequences of $a statements, and as the saying goes, "garbage in, garbage out."  While grammar soundness can be (and now is, thanks to Mario and David) automatically checked at a higher level (outside of spec validation and proof verification) by tools such as mmj2, it still makes the language conceptually much more complex than it could be.

The Metamath Solitaire applet, written several years after metamath.exe, took a different approach that I was pleased with.  I made the grammar much more restrictive, requiring formulas to be expressed in Polish prefix notation, with a separate mechanism for displaying human-readable formulas.  The syntax definitions in the applet store both Polish prefix for working internally and a display notation for the screen.  (Polish prefix is much easier to work with in the code.)  The user is not aware of the Polish prefix underneath.  After writing the applet, I considered modifying the Metamath language to become a variant of the Metamath Solitaire "language" (which is hard-coded into the program since at the time applets could not read files), but I didn't pursue it.

Polish prefix notation is essentially immune to ambiguous grammar abuse.  It is also trivial to enforce in the Metamath language: simply require that all $a statements for syntax consist of two constants (a typecode and a new constant for the operator) followed by zero or more variables.  This restriction does not allow us to express a language with an ambiguous grammar such as that in miu.mm, but I don't see that as a major loss.
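The restriction described above is simple enough that a checker fits in a few lines.  Here is a hypothetical Python sketch (the function name, data structures, and the requirement that the variables be distinct are my own illustration, not part of any existing verifier) of testing whether the math string of a syntax $a has the required shape: a typecode constant, one operator constant, then zero or more variables.

```python
# Hypothetical sketch of the proposed "Polish" restriction on syntax $a
# statements; names and the distinct-variable check are illustrative,
# not taken from any actual Metamath tool.

def is_polish_syntax_axiom(tokens, constants, variables):
    """tokens: the math symbols of a syntax $a,
    e.g. ['wff', '/\\3', 'ph', 'ps', 'ch']."""
    if len(tokens) < 2:
        return False
    typecode, operator = tokens[0], tokens[1]
    if typecode not in constants or operator not in constants:
        return False
    rest = tokens[2:]
    # everything after the operator must be a variable, each used once
    return all(t in variables for t in rest) and len(set(rest)) == len(rest)

constants = {'wff', '/\\3'}
variables = {'ph', 'ps', 'ch'}
print(is_polish_syntax_axiom(['wff', '/\\3', 'ph', 'ps', 'ch'],
                             constants, variables))   # True
print(is_polish_syntax_axiom(['wff', '(', 'ph', '/\\', 'ps', ')'],
                             constants, variables))   # False
```

A verifier's "Polish" option would amount to running a shape check like this on every $a whose typecode is a syntax typecode.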

For those who like the simplicity and perceived additional rigor and robustness of Polish prefix, it would be relatively straightforward to automate the creation of a Polish prefix version of, say, set.mm, and also do it in a way that it can be converted back to the original.  The Polish prefix version would continue to be a valid Metamath language file that can be verified with existing verifiers.  If desired, a verifier could be enhanced with a "Polish" option to ensure the syntax $a restriction I mentioned.

In order not to lose the "human" notation for converting back to the original set.mm, we could add a pseudo-keyword tag such as $h (analogous to $t or $j) for use in comments, that would restate the $a statement with the original notation.  For example, w3a in the original set.mm

  $( Extend wff definition to include 3-way conjunction ('and'). $)
  w3a $a wff ( ph /\ ps /\ ch ) $.

would become in the Polish set.mm

  $c /\3 $.
  $( Extend wff definition to include 3-way conjunction ('and').
     $h ` wff ( ph /\ ps /\ ch ) ` $)
  w3a $a wff /\3 ph ps ch $.

The new token /\3 would have to be created by the converter if the preferred token has been used (such as /\ for wa), ideally using some reasonable heuristic.  (I think Raph's Metamath to Ghilbert translator does something similar, not sure.)  But since Polish prefix isn't meant to be directly human-readable it doesn't matter much what the new tokens are called; we could also use a variant of the label like _w3a.  The new tokens would be discarded upon conversion back to the original set.mm.

We would need $h comment tags only for the syntax $a's.  This provides enough information so that we can losslessly convert the Polish prefix version back to the original set.mm.  (A slightly tricky problem is how to preserve indentation and line breaks in the original set.mm's $a/$p/$e math expressions, but I think I have a solution to that.)

Note that proofs would stay exactly the same and don't have to be touched.

If there is a strong demand for this minimalist subset of the Metamath language, in principle our various software tools could be modified so the user would see and interact only in the "human" syntax.  However, conversion from set.mm to Polish set.mm and back should be reasonably fast as well as lossless, so either one could be used for development and display.  (There are a few logicians such as Ted Ulrich who work primarily in Polish notation: https://web.ics.purdue.edu/~dulrich/Home-page.htm )

Although a conversion program can probably be written without an inordinate amount of effort, I don't have any immediate plans for such a project unless there is a serious demand.  I am mainly describing what is possible in principle, if it helps lessen the perception that "the soundness story in current Metamath is rough."

Norm

Olivier Binda

Mar 9, 2019, 1:44:00 AM
to Metamath
I second that. Simple, efficient, computer-friendly, fast parsing, lossless.
You cannot use a text editor to directly hack on the resulting database, but it should be easy to build WYSIWYM applications.
You can still use the old set.mm database if you want.

Sauer, Andrew Jacob

Mar 9, 2019, 12:33:07 PM
to meta...@googlegroups.com
Thirded. I heard talk by Mario about a project called "Metamath Zero" that might be aiming for the same thing.


--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.
To post to this group, send email to meta...@googlegroups.com.
Visit this group at https://groups.google.com/group/metamath.
For more options, visit https://groups.google.com/d/optout.

Norman Megill

Mar 9, 2019, 2:36:22 PM
to Metamath
It was definitely inspired by Metamath Zero and the discussions there, and initially I was going to add my post to that thread.  Metamath Zero, though, is a different language/framework that would require new verifiers and other tools.  I looked at what could be done within the existing framework to address concerns about potentially ambiguous grammars, so that existing verifiers and tools would continue to work.  In addition, I felt that lossless translation back and forth would make it more acceptable, since no one would have to change how they work if they don't want to.

This is something I've had in the back of my mind for a number of years, vaguely assuming that it would require a new language or at least extensions incompatible with existing tools.  But after some thought it seems the existing language will suffice, with a minor markup addition (which isn't part of the language per se) and an extra trivial check to ensure the syntax $a's are Polish.

All of this is still hypothetical, though.  If it is implemented, would people really use it, or would it just be an academic exercise?

Norm

Mario Carneiro

Mar 9, 2019, 3:04:10 PM
to metamath
The thing I like least about metamath zero is the fact that it is honestly significantly more complicated than metamath. I've been trying to grapple with how much is necessary for metamath patching reasons and how much is just me being silly and putting in unnecessary features. The $h approach leads roughly to the "double writing" problem in the initial metamath zero proposal. This requires that people write their terms in prefix notation, which they don't seem to like to do, and the human readable version is only non-misleading if the grammar is unambiguous. (Perhaps this is too much to ask for, but if the $h version is generated by pretty printing then it's still susceptible to this kind of representation hacking.)

This also only addresses the grammar soundness issue; it does not handle soundness checking of definitions (I don't see any easy way to hack a verifier to check definitions without a lot more metadata), and it does not prevent unsoundness from terms of the "set" sort (which requires more metadata about the sorts).

I think it is possible to get metamath zero checking with a superset of the metamath specification (i.e. a compatible syntax), and I will be doing just that when I get around to writing a translation from metamath zero to metamath. The result would be a metamath database with lots of $j annotations that is interconvertible with a metamath zero file, such that a metamath verifier can at least check all the proofs in the same way as metamath zero, although it wouldn't be able to do all the auxiliary soundness stuff.


Norman Megill

Mar 9, 2019, 3:56:31 PM
to Metamath
On Saturday, March 9, 2019 at 3:04:10 PM UTC-5, Mario Carneiro wrote:
The thing I like least about metamath zero is the fact that it is honestly significantly more complicated than metamath. I've been trying to grapple with how much is necessary for metamath patching reasons and how much is just me being silly and putting in unnecessary features. The $h approach leads roughly to the "double writing" problem in the initial metamath zero proposal. This requires that people write their terms in prefix notation, which they don't seem to like to do, and the human readable version is only non-misleading if the grammar is unambiguous. (Perhaps this is too much to ask for, but if the $h version is generated by pretty printing then it's still susceptible to this kind of representation hacking.)

The $h would be generated automatically when going from original set.mm to Polish set.mm and stripped out when going back.  The Polish would have the same typecode as the non-Polish, followed by a new constant (reusing an existing constant from the non-Polish syntax when possible, like in the Syntax Hints on the web pages), followed by the same variables in the same order as in the non-Polish.

Note that the $h would exist only in the $a statements for syntax.  There isn't any double writing of the logical $a's, $p's, $e's; those would be in Polish only.  (I suppose $h's could be put on those automatically by the conversion if people prefer it for better readability, but it is optional.)

The non-Polish set.mm would continue to be checked by mmj2 as it is now.  If we purposely put an ambiguous grammar in the non-Polish version then convert to Polish, I haven't thought much about what would happen.  But I would guess that the conversion of some logical $a/$p statements would fail, or some proofs would fail, or if everything passes, the Polish version would have "corrected" the ambiguity with something definite (whether it is what you intended or not, at least it would be sound).
 

This also only addresses the grammar soundness issue; it does not handle soundness checking of definitions (I don't see any easy way to hack a verifier to check definitions without a lot more metadata),

Correct, it only addresses the grammar soundness.  Definitional soundness would continue to need an outside tool.

BTW is there a description of how the definitional soundness check works?  What is the metadata that it needs?
 
and it does not prevent unsoundness from terms of the "set" sort (which requires more metadata about the sorts).

I'm not sure what you mean.

Norm

Mario Carneiro

Mar 9, 2019, 4:16:22 PM
to metamath
On Sat, Mar 9, 2019 at 3:56 PM Norman Megill <n...@alum.mit.edu> wrote:
and it does not prevent unsoundness from terms of the "set" sort (which requires more metadata about the sorts).

I'm not sure what you mean.

The declaration

v0 $a set (/) $.

is inconsistent in set.mm, which is surprising for a few reasons. On the one hand it's a natural thing to want to say if you think that "set" means sets, since this is saying that "(/)" is a set, which is of course true and a reasonable axiom. And I haven't even added any axioms about it, it's just a new term constructor. How could it break anything? Here's how:

1::equid |- (/) = (/)
2:1:ax-gen |- A. (/) (/) = (/)
3::dtru |- -. A. (/) (/) = (/)
qed:2,3:pm2.24ii |- F.

Obviously something's strange here since we are using (/) as if it was a variable. But that's what the declaration "set (/)" permits us to do. Furthermore, we substituted both x and y for (/) in dtru, even though it requires x and y be distinct. This is because constants are distinct from everything, including themselves.

All of the above still holds if we declare a constant "v0" with "set v0". This is also a natural thing to try to do, because we claim that metamath is a metalogic over object language expressions including variables like v0, v1, ... and it would be nice to have explicit ways to talk about these variables. With this the proof talks about "|- A. v0 v0 = v0" which doesn't look so weird, but |- -. A. v0 v0 = v0 is still provable by dtru, where we would want to argue that "v0" is not disjoint from itself, but metamath thinks that it's fine because it's a constant.

 

Norman Megill

Mar 9, 2019, 6:49:35 PM
to Metamath
On Saturday, March 9, 2019 at 4:16:22 PM UTC-5, Mario Carneiro wrote:

On Sat, Mar 9, 2019 at 3:56 PM Norman Megill wrote:
and it does not prevent unsoundness from terms of the "set" sort (which requires more metadata about the sorts).

I'm not sure what you mean.

The declaration

v0 $a set (/) $.

is inconsistent in set.mm, which is surprising for a few reasons. On the one hand it's a natural thing to want to say if you think that "set" means sets, since this is saying that "(/)" is a set, which is of course true and a reasonable axiom. And I haven't even added any axioms about it, it's just a new term constructor. How could it break anything? Here's how:

1::equid |- (/) = (/)
2:1:ax-gen |- A. (/) (/) = (/)
3::dtru |- -. A. (/) (/) = (/)
qed:2,3:pm2.24ii |- F.

Obviously something's strange here since we are using (/) as if it was a variable. But that's what the declaration "set (/)" permits us to do. Furthermore, we substituted both x and y for (/) in dtru, even though it requires x and y be distinct. This is because constants are distinct from everything, including themselves.

Right.  The MPE home page and other places state that "A set variable can be substituted only with another set variable (in other words with an expression of length one, whose only symbol is a set variable)", so such a $a is illegal - indeed any $a starting with "set" is illegal (and there are none in set.mm).

However, the spec doesn't protect against this (by design, to provide maximum freedom for $a's), and it seems our other tools don't either (that I know of), so it's currently a $a dependent on human vetting and an example of a construct intimately tied in with the logical $a's.  I'm not sure where the best place to add this check is, but we should add it somewhere (maybe I'll temporarily hard code it into metamath.exe as a stopgap).

How would Metamath Zero handle this?  Where would you propose to check it in set.mm?
 

All of the above still holds if we declare a constant "v0" with "set v0". This is also a natural thing to try to do, because we claim that metamath is a metalogic over object language expressions including variables like v0, v1, ... and it would be nice to have explicit ways to talk about these variables. With this the proof talks about "|- A. v0 v0 = v0" which doesn't look so weird, but |- -. A. v0 v0 = v0 is still provable by dtru, where we would want to argue that "v0" is not disjoint from itself, but metamath thinks that it's fine because it's a constant.

It is getting into a murky area to call the object language variables "constants" (as they might be in a set-theoretical model of them).  For example, a proof language that works directly with object language variables probably doesn't treat them the same way as it does 0-ary constants.  But as stated on the MPE home page, we can only derive schemes from other schemes and never instances of the intended object language: "In fact, the Metamath language cannot even express specific, actual theorems such as ( v1 = v2 → v1 = v2). We would have to do that outside of the Metamath system."  Even the "constants" in set.mm can be considered representations of schemes; e.g. if we define (/) as { x | x =/= x }, it would represent any of the object language instances { v0 | v0 =/= v0 }, { v1 | v1 =/= v1 }, etc. (to get technical we would expand wffs containing these to eliminate the set builders and arrive at primitive language wffs, as described in the Theory of Classes on the MPE home page).

Even though set.mm can't represent object-language variables, we can of course emulate them to a reasonable degree with set variables that are mutually distinct.

Norm

Mario Carneiro

Mar 9, 2019, 7:24:47 PM
to metamath
On Sat, Mar 9, 2019 at 6:49 PM Norman Megill <n...@alum.mit.edu> wrote:
How would Metamath Zero handle this?  Where would you propose to check it in set.mm?

In Metamath Zero this is addressed by means of the modifiers on sorts. When you declare 'set' via

pure nonempty sort set;

the "pure" means that it is illegal to declare a term of this sort, so only variables can inhabit it. Actually, even without this modifier, mm0 is safe because an mm0 sort translates to two mm sorts, one for variables and one for terms, and the terms always land in the term sort while the "bound variables" always live in the variable sort. The "pure" modifier is just an indication for the translation process that the term sort won't be needed.

The drawback of this approach is that there are now "two kinds of variables", corresponding to the two mm sorts for each mm0 sort, which I call "bound variables" vs. "regular variables".  In set.mm the distinction isn't taken advantage of so much, because none of the three sorts really has an exact counterpart of the other kind. One can imagine a set-term sort, which contains terms that are manifestly sets, like (/), but in set.mm this role is taken by the "class" sort instead, which doesn't have a one-to-one correspondence to the variables. In normal FOL one would expect variables and terms to range over the same domain, so that would be an example.

All of the above still holds if we declare a constant "v0" with "set v0". This is also a natural thing to try to do, because we claim that metamath is a metalogic over object language expressions including variables like v0, v1, ... and it would be nice to have explicit ways to talk about these variables. With this the proof talks about "|- A. v0 v0 = v0" which doesn't look so weird, but |- -. A. v0 v0 = v0 is still provable by dtru, where we would want to argue that "v0" is not disjoint from itself, but metamath thinks that it's fine because it's a constant.

It is getting into a murky area to call the object language variables "constants" (as they might be in a set-theoretical model of them).  For example, a proof language that works directly with object language variables probably doesn't treat them the same way as it does 0-ary constants.

Right. Metamath doesn't have the necessary language to even talk about these object language variables, and clearly treating them as constants doesn't work.

  But as stated on the MPE home page, we can only derive schemes from other schemes and never instances of the intended object language: "In fact, the Metamath language cannot even express specific, actual theorems such as ( v1 = v2 → v1 = v2). We would have to do that outside of the Metamath system."  Even the "constants" in set.mm can be considered representations of schemes; e.g. if we define (/) as { x | x =/= x }, it would represent any of the object language instances { v0 | v0 =/= v0 }, { v1 | v1 =/= v1 }, etc. (to get technical we would expand wffs containing these to eliminate the set builders and arrive at primitive language wffs, as described in the Theory of Classes on the MPE home page).

Even though set.mm can't represent object-language variables, we can of course emulate them to a reasonable degree with set variables that are mutually distinct.

In Metamath Zero all bound variables are distinct, which may cause some incompatibility but saves a lot of headache on the HOL translation.
 
Mario

David A. Wheeler

Mar 10, 2019, 3:55:14 PM
to Norman Megill, Metamath
I certainly agree that it would be possible to generate a Polish representation of set.mm. It's not clear to me that it would help with a serious problem.

It would enable the detection of certain kinds of ambiguities, yes, but there are other ways to do it that are probably simpler. In addition, this would require writing some code to do the translations back and forth, which is exactly the kind of external programming that I thought you were trying to avoid.

I think very, very few people would ever want to read the Polish representation, especially since it has no beginning and ending markers, unlike s-expressions. Even s-expressions are difficult to get people to use, and s-expressions are far clearer. I've done Lisp for many years and can easily read s-expressions, but I believe the majority of people who develop software loathe s-expressions and will not use them. I think the number of mathematicians who will want to read s-expressions or Polish syntax is even smaller. Ghilbert uses s-expressions, and while they're certainly powerful, I think that has been one of the things that has limited the number of its possible users.

So before building the Polish translator, you might want to make sure that it is actually solving a problem that is best solved that way. I may be misunderstanding your goals.

As far as misusing the set type code, I think the main problem is that the type code should be named something else. Something like "setvar" (which we have discussed before) would eliminate the problem. The name "set" is very misleading, because that is not the intended semantics. Good names are hard, but good names eliminate a lot of problems without adding a lot of complexity.

--- David A.Wheeler

Raph Levien

Mar 10, 2019, 4:19:18 PM
to Metamath Mailing List, Norman Megill
Old (Python) Ghilbert uses s-expressions. New (Rust) Ghilbert uses a simple Pratt parser with precedence.

I got way too ambitious, trying to add types and solving problems of interop between different logical systems, as well as a sophisticated module system. I'm considering taking another pass at it, cleaning up what I already have (which is pretty close to Metamath, certainly compared to my explorations into lambda-pi). I think it actually solves most of the problems targeted by MM0 and this thread. However, my time is scarcer than usual.

Raph


Thierry Arnoux

Mar 10, 2019, 5:01:04 PM
to meta...@googlegroups.com
On 10/03/2019 20:55, David A. Wheeler wrote:
> So before building the Polish translator, you might want to make sure
> that it is actually solving a problem that is best solved that way. I
> may be misunderstanding your goals.

Imagine you want to write a tool which needs to programmatically read a
Metamath database: at first you'll be prototyping, trying something out.
Using the Polish notation version will save you from writing a dynamic
parser (one which learns its syntax on the fly), which is much more
straightforward. You can get directly to the core of your work:
basically, the Polish notation version is a pre-parsed version. An early
version of the tool may require that Polish version as an input. Once the
tool is more advanced, a better parser can be implemented. I believe that
simplifying the entry step for new tools may be a worthy goal.


David A. Wheeler

Mar 10, 2019, 7:06:45 PM
to metamath, metamath
On Sun, 10 Mar 2019 22:01:03 +0100, Thierry Arnoux <thierry...@gmx.net> wrote:
> Using the Polish notation version will save you writing a
> dynamic parser (one which learn its syntax on the fly)...
> Once the tool is more advanced, a better parser can be implemented. I
> believe that simplifying the entry step for new tools may be a worthy goal.

Simplifying the creation of new tools is a worthy goal.
But Polish notation (as described here) doesn't help you fully avoid dynamic parsing.

Standard Polish notation has no begin/end marker, so a parser has to know
how many parameters are used by each operation, which the tool has to
figure out dynamically (or at least before trying to read an expression that uses it).
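To illustrate the point, here is a minimal recursive reader for Polish notation in Python (the arity table, token names, and tree representation are invented for the example); nothing works until the arity of every operator has been collected from the syntax axioms, which is the "dynamic" step described above:

```python
# Sketch only: the ARITY table would be built by scanning the syntax $a
# statements of the database first; the entries here are made up, with
# variables treated as arity-0 atoms.

ARITY = {'/\\3': 3, '/\\': 2, '-.': 1, 'ph': 0, 'ps': 0, 'ch': 0}

def parse_polish(tokens):
    """Parse one Polish-prefix expression; return (tree, remaining tokens)."""
    head, rest = tokens[0], tokens[1:]
    args = []
    for _ in range(ARITY[head]):       # must already know the arity
        arg, rest = parse_polish(rest)
        args.append(arg)
    return (head, args), rest

tree, leftover = parse_polish(['/\\3', 'ph', 'ps', 'ch'])
print(tree)      # ('/\\3', [('ph', []), ('ps', []), ('ch', [])])
print(leftover)  # []
```

With s-expressions the `range(ARITY[head])` loop would instead stop at a closing parenthesis, so no arity table is needed, which is exactly the trade-off under discussion.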

Would S-expressions (or some variant) be a better notation,
if that's the goal? Processing s-expressions is *really* well-understood,
and then a tool doesn't need to track syntax to figure out where an
expression begins and ends.

--- David A. Wheeler

David A. Wheeler

Mar 10, 2019, 7:14:37 PM
to metamath, metamath
> Would S-expressions (or some variant) be a better notation,
> if that's the goal? Processing s-expressions is *really* well-understood,
> and then a tool doesn't need to track syntax to figure out where an
> expression begins and ends.

(Replying to myself):

Just to be clear, generating and processing an s-expression style of syntax is
certainly possible with the existing metamath processors, and then
the data reader does *NOT* need to read the syntax definitions to figure
out how to do basic parsing (if that is your goal).

You can generate df-an's version of /\ in various ways, e.g.:
( /\ ph ps )
or
( _df-an ph ps )

You could generate df-3an's version of /\ in various ways, e.g.:
( /\ ph ps ch )
or
( _df-3an ph ps ch )

You could add the ability to generate data in formats
(such as s-expressions or JSON), perhaps
as part of some program preprocessing.

--- David A. Wheeler

Mario Carneiro

Mar 10, 2019, 7:17:54 PM
to metamath
Polish notation isn't a problem for metamath math strings, because the arity of each term constructor is fixed. But it is definitely less readable than s-exprs, because humans don't know the arities by heart like the computer does, and something like ` ( FOO ph ps ) ` would also accomplish the same purpose with parentheses added for readability. But I think this is a kind of worthless half-measure: now it's not pure Polish, so the verifier is more complex, and it's still not all that readable, which is why $h was suggested in the first place.

Mario

André L F S Bacci

Mar 10, 2019, 7:40:52 PM
to meta...@googlegroups.com
On Sun, Mar 10, 2019 at 8:14 PM David A. Wheeler <dwhe...@dwheeler.com> wrote:
> Would S-expressions (or some variant) be a better notation,
> if that's the goal?   Processing s-expressions is *really* well-understood,
> and then a tool doesn't need to track syntax to figure out where an
> expression begins and ends.

(Replying to myself):

Just to be clear, generating and processing an s-expression style of syntax is
certainly possible with the existing metamath processors, and then
the data reader does *NOT* need to read the syntax definitions to figure
out how to do basic parsing (if that is your goal).

Well, at this stage defining the goal (or goals) is necessary. I was reading the recent emails to figure out what the objectives are, and I'm not sure anymore.

And different notations may be necessary for different objectives:
- A pure-text, purely human-edited single file, like now;
- An easy-to-write format side by side with an easy-to-parse equivalent;
- An easy-to-parse format and its tooling;
- A low-level language and a high-level tool that only outputs the low-level form.


You can generate df-an's version of /\ in various ways, e.g.:
  ( /\ ph ps )
or
  ( _df-an ph ps )

You could generate df-3an's version of /\ in various ways, e.g.:
  ( /\ ph ps ch )

Here, a bit of precedence handling or explicit AST conversion may be necessary:

( /\ ph ps ch ) -> ( /\ ph ( /\ ps ch ) )
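A tiny sketch of that conversion, assuming the right-nesting shown on the line above (the helper name `to_binary` is mine, and whether the converter should nest right or left is a convention it would have to fix):

```python
# Hypothetical sketch: flatten an n-ary connective application into
# nested binary applications, nesting to the right as in the example
# above. Tuples stand in for AST nodes.

def to_binary(op, args):
    """Turn op applied to n arguments into nested binary applications."""
    if len(args) == 1:
        return args[0]
    return (op, args[0], to_binary(op, args[1:]))

print(to_binary('/\\', ['ph', 'ps', 'ch']))
# ('/\\', 'ph', ('/\\', 'ps', 'ch'))
```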

You could add the ability to generate data in formats
(such as s-expressions or JSON), perhaps
as part of some program preprocessing.

Parentheses/brackets hell should follow.

André

Norman Megill

Mar 10, 2019, 11:58:26 PM
to Metamath
One goal of all this is to provide a format that is easier for a computer to read and do things with, in addition to possibly providing some more psychological assurance about the grammar not being ambiguous.  Human readability isn't a goal, although we could have an option to include a $h (human-readable expression) in every $a/$p/$e comment for those who want to read the Polish source in an easier way.

From my experience with programs written before metamath.exe (mostly playing with propositional calculus subsystems) and with writing MM Solitaire, Polish notation (or equivalently RPN) was a pleasure to work with in code, and I always used it because that seemed to me it was the best way.  You naturally just step through it token by token, pushing and popping a stack.  It is a natural fit to unification algorithms like the one in MM Solitaire as well as condensed detachment (a form of unification).  Subformulas are very easy to identify and are easy to "break out" as substrings of tokens for various purposes including unification.
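The single pass described above can be sketched in a few lines.  This is my own illustration, with an invented arity table: checking that a Polish string is well formed needs nothing but a counter (the stack reduced to its depth).

```python
# Sketch only: token names and arities are illustrative. A well-formed
# Polish-prefix string is consumed in one left-to-right pass, tracking
# how many subexpressions are still owed.

ARITY = {'/\\': 2, '-.': 1, 'ph': 0, 'ps': 0}

def well_formed(tokens):
    pending = 1                      # one whole expression expected
    for tok in tokens:
        if pending == 0:             # trailing tokens after a full expression
            return False
        pending += ARITY[tok] - 1    # consume one slot, open arity more
    return pending == 0

print(well_formed(['/\\', 'ph', '-.', 'ps']))   # True:  ( ph /\ -. ps )
print(well_formed(['/\\', 'ph']))               # False: missing an operand
```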

Since the arity of all connectives/operators is fixed, S-expressions don't buy anything, and the parentheses would just get in the way because you need extra bits of code to skip over them and extra care to keep track of the stack operation step vs. token position.

If Polish is displayed in an indented format, it isn't terribly hard to read if you have the vertical screen space.  That's how I often used to do it.  But I also typically had a simple conversion function to normal infix notation that I'd use to display output for studying it.  It would be trivial to write a Polish to S-expression converter for display purposes, but it seems like a somewhat poor compromise, and you might as well go all the way to standard notation for studying, debugging, etc.

Once a set.mm converter to Polish is written (although at this point it is primarily a thought experiment), it would probably be relatively simple to add an option to produce an S-expression version as well, for those who want it.


On Sunday, March 10, 2019 at 7:14:37 PM UTC-4, David A. Wheeler wrote:
> Would S-expressions (or some variant) be a better notation,
> if that's the goal?   Processing s-expressions is *really* well-understood,
> and then a tool doesn't need to track syntax to figure out where an
> expression begins and ends.

(Replying to myself):

Just to be clear, generating and processing an s-expression style of syntax is
certainly possible with the existing metamath processors, and then
the data reader does *NOT* need to read the syntax definitions to figure
out how to do basic parsing (if that is your goal).

You can generate df-an's version of /\ in various ways, e.g.:
  ( /\ ph ps )
or
  ( _df-an ph ps )

You could generate df-3an's version of /\ in various ways, e.g.:
  ( /\ ph ps ch )

Overloading operators would introduce a new level of complexity.  A goal of the conversion to Polish is to keep the language as simple as possible (regardless of human readability).  If we overload operators like this, then yes, we would need S-expressions, and code for things like unification would probably be more complex.
 

or
  ( _df-3an ph ps ch )

You could add the ability to generate data in formats
(such as s-expressions or JSON), perhaps

If people want a JSON version of set.mm, that shouldn't be hard to produce from the Polish version, but for manipulating low-level expressions in code, I would guess that Polish would usually be more efficient.  I suppose if you already have JSON libraries and don't care about run times, that might be more convenient for some implementations.  I've never used JSON so I don't really know how its efficiency compares, though, or what its advantages might be for manipulating mathematical expressions.
 
> as part of some program preprocessing.
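To illustrate how thin a JSON layer over the Polish version could be, here is a sketch that renders a Polish-prefix expression as nested JSON arrays. The arity table is invented for the example and is not from set.mm:

```python
import json

# Hypothetical arity table; names are illustrative, not from set.mm.
ARITY = {"wi": 2, "wn": 1}

def polish_to_json(tokens):
    """Turn a Polish-prefix token list into a nested-array JSON string."""
    it = iter(tokens)
    def parse():
        tok = next(it)
        n = ARITY.get(tok, 0)  # unknown tokens are 0-ary (variables)
        return [tok] + [parse() for _ in range(n)] if n else tok
    return json.dumps(parse())
```

For example, `polish_to_json("wi ph wn ps".split())` produces `["wi", "ph", ["wn", "ps"]]`, which any JSON library can consume directly.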

Norm

Marnix Klooster

unread,
Mar 11, 2019, 5:09:49 AM3/11/19
to Metamath
Hello all,

With the up-front apology that I haven't been able to properly respond in earlier cases when I brought up this same parsing solution, let me try again. :-)

Forbidding |- set (/) seems simple, if one requires that for every axiom $a blah blah $. there is exactly one proof of $p IS_TOPLEVEL blah blah $= ? $. given the following:

$c IS_TOPLEVEL $.
TOP.wff $a IS_TOPLEVEL wff ph $.
TOP.turnstile $a IS_TOPLEVEL |- ph $.
TOP.set $a IS_TOPLEVEL set x $.
TOP.class $a IS_TOPLEVEL class x $.

And that proof is then the parse tree of that axiom.

The non-existence of a proof of $p IS_TOPLEVEL |- set (/) $= ? $. simply means that this is a syntactically incorrect statement.

That way no external parser is necessary, we don't need to look at Polish notation, LL(whatever) parsers, etc etc.  Just a single "every axiom must have a unique parse tree" requirement.

I might very well be missing something; could someone explain why such a small extension to Metamath would not be sufficient?

Thanks!

Regards,
 <><
Marnix

On Sun, Mar 10, 2019 at 01:24, Mario Carneiro <di....@gmail.com> wrote:


--
Marnix Klooster
marnix....@gmail.com

Mario Carneiro

unread,
Mar 11, 2019, 7:05:19 PM3/11/19
to metamath
On Mon, Mar 11, 2019 at 5:09 AM Marnix Klooster <marnix....@gmail.com> wrote:
> Hello all,
>
> With the up-front apology that I haven't been able to properly respond in earlier cases when I brought up this same parsing solution, let me try again. :-)
>
> Forbidding |- set (/) seems simple, if one requires that for every axiom $a blah blah $. there is exactly one proof of $p IS_TOPLEVEL blah blah $= ? $. given the following:
>
> $c IS_TOPLEVEL $.
> TOP.wff $a IS_TOPLEVEL wff ph $.
> TOP.turnstile $a IS_TOPLEVEL |- ph $.
> TOP.set $a IS_TOPLEVEL set x $.
> TOP.class $a IS_TOPLEVEL class x $.
>
> And that proof is then the parse tree of that axiom.
>
> The non-existence of a proof of $p IS_TOPLEVEL |- set (/) $= ? $. simply means that this is a syntactically incorrect statement.
>
> That way no external parser is necessary, we don't need to look at Polish notation, LL(whatever) parsers, etc etc.  Just a single "every axiom must have a unique parse tree" requirement.

And how do you propose to check that an axiom has a unique parse tree? It's one thing to check that an axiom has a parse tree, when we require that the user provide the parse (as a proof). But to check that there is exactly one parse tree is much more complex. To do it in general requires complicated and nonuniform proof methods (it's undecidable as I've mentioned elsewhere), and I wouldn't even be surprised if there is a grammar that is ambiguous but not provably so in ZFC. If you restrict your concern to only a subset of grammars of interest, then you are in the "LL(whatever) parsers" area.

More to the point though, I think you misinterpreted the problem. I'm not talking about forbidding "$a |- set (/)" at all. In fact this is a useless axiom which does no harm because "set (/)" is not a wff so it can't be applied. I'm talking about forbidding "$a set (/)", which is asserting that (/) is a term with typecode "set". This is perfectly analogous to other declarations like "$a class _V" so there is no a priori reason for metamath to reject it.

Moreover, requiring a proof for every axiom is viciously circular - you can't prove anything until you have axioms with which to prove things. Suppose you want to introduce a term $a wff T. $. By your rules, we first have to prove "IS_TOPLEVEL wff T.", which we obtain by applying TOP.wff. Now we have to prove "wff T.", but that's the axiom we wanted to introduce so it's not yet available.
 
Mario

Norman Megill

unread,
Mar 13, 2019, 10:42:49 PM3/13/19
to Metamath
On Sunday, March 10, 2019 at 3:55:14 PM UTC-4, David A. Wheeler wrote:
...

> As far as misusing the set type code, I think the main problem is that the type code should be named something else. Something like "setvar" (which we have discussed before) would eliminate the problem. The name "set" is very misleading, because that is not the intended semantics. Good names are hard, but good names eliminate a lot of problems without adding a lot of complexity.

I am now in agreement with David.  If there are no objections, I will change the typecode "set" to "setvar" in a week or so.

Norm

naip moro

unread,
Mar 18, 2019, 10:59:24 PM3/18/19
to metamath-forum
On Fri, Mar 8, 2019 at 11:36 PM Norman Megill <n...@alum.mit.edu> wrote:
> Polish prefix notation is essentially immune to ambiguous grammar abuse.  It is also trivial to enforce in the Metamath language: simply require that all $a statements for syntax consist of two constants (a typecode and a new constant for the operator) followed by zero or more variables.  This restriction does not allow us to express a language with an ambiguous grammar such as that in miu.mm, but I don't see that as a major loss.

If the goal is a general theory of formal systems, it's definitely a loss. Metamath is not perfect; for one thing, it is restricted to linear notations, so is unable to treat diagrammatic systems natively. But it's the closest thing we have to a general formal system. It would be a shame to abandon that high ground.

regards,
naipmoro

Mario Carneiro

unread,
Mar 19, 2019, 1:46:23 AM3/19/19
to metamath
What do you mean by a "general theory of formal systems"? The Metamath specification is not a general theory of any kind. It is an implementation of a particular proof verification system, which happens to be quite general but is necessarily not generic enough for all possible things people might want to consider. But it is general enough to define ZFC, and in that language you actually can define just about anything you can dream up. In particular, you can define the proof theory of MIU and even prove that MU is not provable, which is not something miu.mm can formalize. I would argue that this is in fact closer to the original intent of Hofstadter's example, which is supposed to be a baby version of proof theory in preparation for Gödel's incompleteness theorems.
 
Mario

Thierry Arnoux

unread,
Mar 19, 2019, 10:47:45 AM3/19/19
to meta...@googlegroups.com, Mario Carneiro
A note from my side:

I had read Hofstadter before discovering Metamath, and I think seeing
how Metamath worked on a very simple system like MIU helped me a lot to
grasp the way Metamath worked for more complex systems like set.mm.

I still recall that Aha! moment when I understood that Metamath was
based on just a bit more than substitutions rules - just like MIU - and
therefore, so was ZFC. Simple, but powerful, and beautiful. Nothing
complex, nothing hidden.

Norman Megill

unread,
Mar 19, 2019, 1:53:37 PM3/19/19
to Metamath
On Monday, March 18, 2019 at 10:59:24 PM UTC-4, naipmoro wrote:

> On Fri, Mar 8, 2019 at 11:36 PM Norman Megill wrote:
> > Polish prefix notation is essentially immune to ambiguous grammar abuse.  It is also trivial to enforce in the Metamath language: simply require that all $a statements for syntax consist of two constants (a typecode and a new constant for the operator) followed by zero or more variables.  This restriction does not allow us to express a language with an ambiguous grammar such as that in miu.mm, but I don't see that as a major loss.

> If the goal is a general theory of formal systems, it's definitely a loss.

(Mario addressed that, although perhaps you meant "general framework for formal systems"?)

> Metamath is not perfect; for one thing, it is restricted to linear notations, so is unable to treat diagrammatic systems natively.

While I'm not sure how you would propose to do this, handling that natively would seem to be a huge complication antithetical to the language's philosophy of simplicity.  I see no reason why, in principle, diagrammatic systems can't be represented linearly in the language and then converted to two dimensions for display or interaction purposes by an IDE or other tool.
 
> But it's the closest thing we have to a general formal system. It would be a shame to abandon that high ground.

There is a misconception about what I proposed.

There is no intent in my proposal to change anything in the Metamath language.  miu.mm and everything else would still exist.  set.mm in its present version would still be available and would continue to be the preferred version for development.  Nothing would be replaced or abandoned.

The proposal is to provide an optional conversion of set.mm into Polish notation, using a subset of the Metamath language's grammar capabilities.  It would have the advantage that checking for unambiguous grammar becomes trivial, putting to rest any concerns about the complexity of that checking.  (The grammar check for Polish would still be done outside of spec/proof verification.)  It may also make it faster to write specialized or one-off tools for various purposes since essentially no parsing is required.  Polish is hard for humans to read but easy for computer algorithms to work with.
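To make "trivial" concrete, here is a sketch of what the grammar check for one syntax $a statement could look like. The function name and the symbol sets are invented for illustration; a real database-level check would additionally ensure each operator constant is introduced by at most one syntax axiom:

```python
def is_valid_polish_syntax_axiom(tokens, constants, variables):
    """Check the restriction on a Polish syntax $a statement:
    a typecode constant, one operator constant, then zero or more
    distinct variables."""
    if len(tokens) < 2:
        return False
    typecode, op, rest = tokens[0], tokens[1], tokens[2:]
    return (typecode in constants
            and op in constants
            and all(t in variables for t in rest)
            and len(set(rest)) == len(rest))
```

With `constants = {"wff", "wa"}` and `variables = {"ph", "ps"}`, the statement `wff wa ph ps` passes, while `wff ph wa ps` is rejected because the operator position does not hold a constant.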

An important requirement (to me at least) is that the conversion to and from Polish be bidirectional and lossless, so that a round trip conversion would restore the starting set.mm exactly (i.e. diff would have no output).  This way a tool written for the Polish version can do its thing and the result converted back to the original set.mm format.  A second requirement is that the Polish version would still be a legal Metamath language file, so that all existing verifiers would continue to work.
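The round-trip requirement can be illustrated on a toy grammar. This sketch converts a fully parenthesized infix token list to Polish and back; the composition is the identity on the token level. It assumes binary operators only and is not real set.mm tooling:

```python
# Toy operator set, illustrative only (binary operators, fully parenthesized infix).
OPS = {"/\\", "\\/", "->"}

def infix_to_polish(tokens):
    """'( ph /\\ ps )' token list -> '/\\ ph ps' token list."""
    def parse(pos):
        if tokens[pos] == "(":
            left, pos = parse(pos + 1)
            op = tokens[pos]
            right, pos = parse(pos + 1)
            assert tokens[pos] == ")"
            return [op] + left + right, pos + 1
        return [tokens[pos]], pos + 1
    out, _ = parse(0)
    return out

def polish_to_infix(tokens):
    """Inverse conversion, reinserting the parentheses."""
    def parse(pos):
        if tokens[pos] in OPS:
            op = tokens[pos]
            left, pos = parse(pos + 1)
            right, pos = parse(pos)
            return ["("] + left + [op] + right + [")"], pos
        return [tokens[pos]], pos + 1
    out, _ = parse(0)
    return out
```

A real set.mm converter would of course be driven by the database's own syntax axioms rather than a hard-coded operator table, but the lossless round trip is the same idea: `polish_to_infix(infix_to_polish(t))` returns `t` unchanged.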

When I said no "major loss" I meant that miu.mm and your lof.mm (G. Spencer-Brown's Laws of Form) cannot be represented directly with Polish notation, so such a conversion would not apply to them.  They would continue to exist in their present form.
 
Norm

naip moro

unread,
Mar 19, 2019, 5:47:17 PM3/19/19
to metamath-forum
Hello,
I suppose I was unduly alarmed by Norm's lament, "I half wish I'd never heard of [miu]." I read that as, "I half wish metamath lacked the power to encode such edge cases." I'm looking in the other direction: any formal system that doesn't trivially generate every possible formula should be a candidate for encoding. And now I wonder (since I never considered this question before) whether metamath is actually able to do that in the case of all linear notations (I'm guessing not...).

I understand about encoding miu (and others) in ZFC. We could encode it in Conway's Game of Life, too. But there's something to be said about a native representation (under some highly technical definition of 'native').

The history of ideas is complicated! Sometimes I think the originator of a novelty understood it better than subsequent experts. Perhaps it's revealing that the two independent inventors of predicate logic, Frege and Peirce, both expressed their systems in a planar notation.

Generalizing metamath to diagrammatic systems[1] would be a blast.

[1] Logical Reasoning with Diagrams, Allwein and Barwise.

regards,
naipmoro


Norman Megill

unread,
Mar 25, 2019, 5:13:25 PM3/25/19
to Metamath
The typecode "setvar" is now incorporated in set.mm as well as in mmset.html
(http://us2.metamath.org:88/mpeuni/mmset.html), mmcomplex.html, and mmzfcnd.html.

There is a small glitch.  The mmj2 definition check has "set" hard-coded into it, and to use it you need to temporarily change "setvar" to "set" in set.mm (just do a blind global substitution, which will work for mmj2 definition check purposes).  A temporary workaround that does this has been put into the Travis check for pull request purposes.
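The blind global substitution can be done with standard tools; here is a sketch, using a one-line sample file in place of set.mm so the example is self-contained (the file names are made up):

```shell
# Blind global substitution "setvar" -> "set", as described above.
# sample.mm stands in for set.mm here.
printf 'vx $f setvar x $.\n' > sample.mm
sed 's/setvar/set/g' sample.mm > sample-for-mmj2.mm
cat sample-for-mmj2.mm
# prints: vx $f set x $.
```

Since this is a blind substitution, it would also rewrite any other occurrence of the string "setvar" (e.g. in comments), which is harmless for mmj2 definition-check purposes but is why the edit is only temporary.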

iset.mm and nf.mm haven't been changed yet.  Probably we should wait until mmj2 is fixed.

Norm

Norman Megill

unread,
Mar 25, 2019, 5:40:48 PM3/25/19
to Metamath


If you are working only with set.mm, it might be more convenient to change the string "set" (including quotes) to "setvar" in mmj2jar/macros/definitionCheck.js and reload mmj2.  Then you don't have to edit set.mm each time.