Are we listening?

664 views
Skip to first unread message

vvs

unread,
Feb 10, 2020, 7:52:10 AM2/10/20
to Metamath
He is trying to deliver this message to all, so I think people here might be interested too:

Each formal proof verification system (Lean, Coq, Isabelle/HOL, UniMath, all of the others) has its own community, and it is surely in the interests of their members to see their communities grow. These systems are all claiming to do mathematics, as well as other things too (program verification, research into higher topos theory and higher type theory etc). But two things have become clear to me over the last two years or so:
  1. There is a large community of mathematicians out there who simply cannot join these communities because the learning curve is too high and much of the documentation is written for computer scientists.
  2. Even if a mathematician battles their way into one of these communities, there is a risk that they will not find the kind of mathematics which they are being taught, or teaching, in their own department, and there is a very big risk that they will not find much fashionable mathematics.

My explicit question to all the people in these formal proof verification communities is: what are you doing about this?


Full post is here: https://xenaproject.wordpress.com/2020/02/09/where-is-the-fashionable-mathematics/

OlivierBinda

unread,
Feb 10, 2020, 8:38:45 AM2/10/20
to meta...@googlegroups.com

Here is my opinion.

Metamath is a very simple, yet very powerful and expressive language, completely usable by both computers and humans, because it mimics/is really close to mathematics

With a quite thin layer of tooling upon metamath, it is possible to use it like math, but with enhancements brought by  a computer and to allow :

12+ years old to do high level maths and produce computer-checked proofs.
For that, you only need to select parts of math expression with the mouse and to select a mutation in a list of possible results. This brings down considerably the training you need to do proper maths (and metamath).

In the same way that calculators allow nowadays, anybody to do 1234.3657 * 578932.231 in a few seconds when it was hell in year 1560 .

Also, for professional mathematicians, it should enable them to write their proof, export them to LaTeX (if they want) send them to research papers, bypass the peer review process to get their research checked/approved...and accelerate the whole damn research process, while allowing anyone, anywhere to have access to the whole database of mathematical discoveries


So, my answer to both questions are :
1) it is happening NOW for metamath (thanks to MathJax). When people will look at what metamath can achieve, others (coq/lean/..) will either throw the towel and convert or do the same thing (good luck to them, with their otherly complex C/C++ code)

It is just a question of time (give us 1 or 2 years). People will be able to do maths simply with the help of a computer.

2) let's build a decentralized modular database of math knowledge.
That way, people will be able to do the maths they want, the way they want without censorship/having to conform
(I followed the discussion about metamath goals and there was quite a lot of heat).
Ideally, I want to allow anyone to do maths the way they want and to have what they want.

It could be done that way  :
human to the tool : "use the definitions and theorems of Zfc, this theory, that theory, category, groups, monoid"...
Tool : "done, these theorems can be used, those cannot" 


Also, I am researching mm0 right now and developing a love-hate relation with Mario's work (documentation... :/).
In the process of writing a precedence parser that handles it, I'm understanding things.
for example, I hate precedence levels, nobody should have to learn a precedence level, that's not how we do maths...

But, I guess that the people who don't know precedence can write

ph -> (ps -> th)

whereas the wizards (or just Mario, the wizard) will write

ph -> ps -> th

And I kinda like a lot this flexibility and that it still allows people to do what they want/know


Also, there might be way for the tool to compute precedence levels for the user (to avoid him the tedious parts)
like having + <. x <. ... instead of +=20 x=45


ah well stopping my rant and getting back to implementing the part about custom notations in the mm0 parser...

Best regards,
Olivier

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/metamath/6478d27f-2d3b-4296-9f95-9236562bd7e8%40googlegroups.com.

Ken Kubota

unread,
Feb 23, 2020, 10:24:59 AM2/23/20
to meta...@googlegroups.com, cl-isabe...@lists.cam.ac.uk, hol-...@lists.sourceforge.net, coq-...@inria.fr, Prof. Kevin Buzzard, Prof. Peter B. Andrews, Prof. Lawrence C. Paulson, Prof. Jeremy Avigad, Leonardo de Moura, Daniel Selsam
Let me comment on Kevin Buzzard's intervention (blog: xenaproject.wordpress.com, website: wwwf.imperial.ac.uk/~buzzard)
Where is the fashionable mathematics?
placed into the Metamath list by vvs <vvs...@gmail.com> at

Notes on Lean are placed in section A.2.,
notes on HOL and Isabelle/HOL in section B.

Freek Wiedijk's criticism quoted at https://owlofminerva.net/files/fom_2018.pdf#page=10,
and my comment on the Curry-Howard isomorphism at the end of section A further below.



A. General debate

Quote:
The historical approach of formalizing all (or all important) mathematical knowledge is the QED Project: mizar.org/qed


My proposal for a solution consists of two steps:


1. Creating a formal language according to Andrews' notion of expressiveness, i.e., using the simplest, weakest means in order to express all of mathematics in a natural way.
This would result in a Hilbert-style system with lambda notation (and use of the description operator) such as Andrews' logic Q0:
"Q0 requires only two basic types [...] and only two basic constants [...], thus reducing the language of formal logic and mathematics to a minimal set of basic notions."

• Peter B. Andrews on expressiveness as the main criterion for the design of Q0: “Therefore we shall turn our attention to finding a formulation of type theory which is as expressive as possible, allowing mathematical ideas to be expressed precisely with a minimum of circumlocutions, and which is as simple and economical as is possible without sacrificing expressiveness. The reader will observe that the formal language we find arises very naturally from a few fundamental design decisions.” [Andrews, 2002, pp. 205 f.]

"A bottom-up representation (which is better suited for foundational research) 
is clearly a Hilbert-style system: It has the elegance of having only a few 
rules of inference (in Q0 even only a single rule of inference - Andy Pitts: 
"From a logical point of view, it would be better to have a simpler 
substitution primitive, such as 'Rule R' of Andrews' logic Q0, and then to 
derive more complex rules from it." [Gordon and Melham, 1993, p. 213]). 
Metatheorems are not expressible within the formalism; the metatheorems are 
literally "meta" ("above" - i.e., outside of - the logic). In software 
implementations of Q0 or descendants (Prooftoys by Cris Perdue or my R0 
implementation), the metalogical turnstile (⊢) symbol is replaced by the 
logical implication, which shows the tendency to eliminate metalogical elements 
from the formal language."
https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2018-May/msg00013.html

Adding dependent types (or at least type abstraction) results in R0, in which abstract algebra (here: group theory) can be expressed very naturally:

(Note that Metamath isn't really Hilbert-style, in my opinion, since it can express metatheorems. It seems to start directly at a higher level, like all logical frameworks.
A true Hilbert-style system consists only of the (mathematical) object language. So I disagree with Mario Carneiro at this point:


2. Implementing a natural deduction variant of the formal language created in step one (i.e., making metamathematical theorems symbolically expressible as required for automation):

Maybe one should go to the other extreme in this step and establish/use a logical framework (e.g., Twelf, Isabelle):

"A top-down representation (which is better suited for applied mathematics: 
interactive/automated theorem proving) is either a natural deduction system 
(like HOL) or a logical framework (like Isabelle): It has a proliferation of 
rules of inference (e.g., eight rules for HOL [cf. Gordon and Melham, 1993, pp. 
212 f.]). Metalogical properties (metatheorems) are expressible to a certain 
extent, e.g., using the turnstile (⊢) symbol (the conditionals / the parts 
before the turnstile may differ in the hypothesis and the conclusion), or the 
meta-implication (⇒) in Isabelle's metalogic M (not to be confused with the 
implication (⊃) of the object-logic), see
Unfortunately, the gain of expressiveness in terms of metalogical properties by 
making metatheorems symbolically representable is obtained at the price of 
philosophical rigor and elegance in expressing the underlying object logic 
(object language)."

According to the concept of expressiveness, unnecessary types or constants should be avoided.

The list type as implemented in Lean
is unnecessary, as the definition of ordered pairs [Andrews 2002 (ISBN 1-4020-0763-9), p. 208] can be used to establish vectors, which may serve as lists:
• Ordered Pairs With One Type Variable
https://www.owlofminerva.net/files/formulae.pdf#page=376
• Vectors (Dependent Type Theory)

The Pi type, or dependent function type, as implemented in Lean
is unnecessary, as it simply can be expressed with type abstraction. The following statement is incorrect:
"It is clear that cons α should have type α → list α → list α. But what type should cons have? A first guess might be Type → α → list α → list α, but, on reflection, this does not make sense: the α in this expression does not refer to anything, whereas it should refer to the argument of type Type. In other words, assuming α : Type is the first argument to the function, the type of the next two elements are α and list α. These types vary depending on the first argument, α."
It is overlooked that exactly this dependency ("does not refer to anything") can be expressed within the formal language by type abstraction (as suggested by Mike Gordon and as implemented in R0), which overcomes the strict separation of the term level from the type level:
"In order to express this properly, type abstraction is required, which abstracts
from the concrete type ("their concrete presentations"). It creates a link between
the type and the term level, e.g.
[\t.\*:ttt . a * id = a ]
or
[\t.\x_{ttt} . a x id = a ]
for some identity element id.
Note that the variable t appears both at the term and at the type (subscript) level.

Standard HOL doesn't provide this feature, which was already suggested by
HOL founder Mike Gordon (he called it "'second order' λ-terms" [Gordon, 2001, p. 22]).
[...]
Group theory is best suited to demonstrate this, and I also chose group theory:
See the definitions "Grp" and "GrpIdy" there."
https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2018-September/msg00045.html

A natural approach should avoid using the Curry-Howard isomorphism.
"All constructive foundations, including Coq, use the encoding of proofs based on 
the Curry-Howard isomorphism. But the concept of expressiveness yields the 
_natural_ expression of mathematics, hence the desire is not an _encoded_, but 
an _unencoded_ (direct) form of expression."



B. Kevin Buzzard on Isabelle/HOL

"Where is the fashionable mathematics?
Posted on February 9, 2020

[...]

Isabelle/HOL users

Isabelle/HOL: Manuel Eberl is tirelessly generating 20th century analysis and analytic number theory. This is a highly respectable thing to do — he is planting the seeds. He is doing the basics. He cannot do it alone! Furthermore, HOL has no dependent types. Does this mean that there are entire areas of mathematics which are off limits to his system? I conjecture yes. Prove me wrong. Can you define the Picard group of a scheme in Isabelle/HOL? I am not sure that it is even possible to write this code in Isabelle/HOL in such a way that it will run in finite time, because you have to take a tensor product of sheaves of modules to define the multiplication, and a sheaf is a dependent type, and your clever HOL workarounds will not let you use typeclasses to define module structures on the modules which are values of the sheaves. So how will you do the tensor product? I don’t know. Does anyone know? Is this actually an interesting open research problem? Picard groups of schemes are used all over the place in the proof of Fermat’s Last Theorem. They are basic objects for a “proper mathematician”. What can Isabelle/HOL actually do before it breaks? Nobody knows. But what frustrates me is that nobody in the Isabelle/HOL community seems to care. Larry Paulson says that it’s easy to understand: different kinds of maths work best in different systems, and so you might want to choose the system depending on the maths you want to do. But do you people want to attract “working mathematicians”? Then where are the schemes? Can your system even do schemes? I don’t know. Does anyone know? If it cannot then this would be very valuable to know because it will help mathematician early adopters to make an informed decision about which system to use."

There is no question that Isabelle's theoretical foundation
is a scientific achievement and that Isabelle itself is an impressive piece of software.

But it is also clear that HOL's type system (and therefore Isabelle/HOL's type system) is poor.

Freek Wiedijk on the (current) HOL family: “There is one important reason why the HOLs are not yet attractive enough to be taken to be the QED system:
• The HOL type system is too poor. As we already argued in the previous section, it is too weak to properly do abstract algebra.
But it is worse than that. In the HOL type system there are no dependent types, nor is there any form of subtyping. (Mizar and Coq both have dependent types and some form of subtyping. In Mizar the subtyping is built into the type system. In Coq a similar effect is accomplished through the use of automatic coercions.)
For formal mathematics it is essential to both have dependent types and some form of subtyping.” [Wiedijk, 2007, p. 130, emphasis as in the original]


Not even group theory (abstract algebra) can be expressed adequately – without circumlocations or workarounds –, since type abstraction is still missing, which HOL founder Mike Gordon suggested nearly 20 years ago for HOL:

14. Mike Gordon on explicit type variable quantification in HOL: “In future versions of HOL it is expected that there will be explicit type variable quantification [Melham, 1993b], i.e., terms of the form ∀α.t (where α is a type variable). The right hand side of definitions will be required to be closed with respect to both term and type variables. Melham has shown that this will make defining mechanisms much cleaner and also permit an elegant treatment of type specifications.” [Gordon, 2000, p. 175, fn. 7]

17. Mike Gordon suggesting type abstraction for HOL: “The syntax and semantics of type variables are currently being studied by several logicians. A closely related area is the theory of ‘second order’ λ-terms like λα. λx:α. x, perhaps such terms should be included in the HOL logic.” [Gordon, 2001, p. 22]

No visible attempts are made to implement type abstraction, neither in the current HOL4 nor in the Isabelle/HOL community, although this is the natural next step.

Andrei Popescu and Ondřej Kunčar have some philosophical intuition concerning (negative) self-reference.


But I am not aware of any current attempts to overcome the obvious restrictions, which is not satisfying for a logician/mathematician.

Following Andrews' concept of expressiveness, i.e., naturally expressing all of formal logic and mathematics with a small set of symbols and rules (which includes what Kevin Buzzard calls "proper mathematics" or "fashionable mathematics"), it shouldn't be too difficult to implement type abstraction (and maybe full dependent types) in HOL4 and Isabelle/HOL.

Why doesn't this happen?



Kind regards,

Ken Kubota

____________________________________________________

Norman Megill

unread,
Feb 24, 2020, 2:48:55 PM2/24/20
to Metamath
On Sunday, February 23, 2020 at 10:24:59 AM UTC-5, Ken Kubota wrote:
... 
(Note that Metamath isn't really Hilbert-style, in my opinion, since it can express metatheorems. It seems to start directly at a higher level, like all logical frameworks.
A true Hilbert-style system consists only of the (mathematical) object language. So I disagree with Mario Carneiro at this point:
...
I think this is somewhat quibbling.  Yes, Metamath proves schemes from other schemes.  However, any of these schemes, as well as any step of any proof, can be immediately and trivially instantiated with an object-language substitution, resulting in a Hilbert-style proof per your requirements if we do that all the way back to axioms.  To be pedantic I guess one could say that Metamath "directly represents" Hilbert-style proofs if you disagree that it "is" Hilbert-style.

In real life no one does object-language proofs, not even textbooks that claim to work with Hilbert-style systems, because you can't reuse the theorems for other instances, and proofs become astronomically long.  Let us say the object-language variables are v1, v2,... There is a Metamath (set.mm) scheme called "id" that states "phi -> phi" for any wff phi.  If a proof in the style you are insisting on needed two instances of this, say "v1 = v2 -> v1 = v2" and "v3 e. v4 -> v3 e. v4" ("e." meaning "element of"), they would each have to be proved all over again starting from appropriate instantiations of the starting axiom schemes.

http://us.metamath.org/mpeuni/mmset.html#2p2e4length gives an example of this.  "As an example of the savings we achieve by using only schemes, the 2p2e4 proof described in 2 + 2 = 4 Trivia above requires 26,323 proof steps [which include the complex number construction] to be proved from the axioms of logic and set theory. If we were not allowed to use different instances of intermediate theorem schemes but had to show a direct object-language proof, where each step is an actual axiom or a rule applied to previous steps, the complete formal proof would have 73,983,127,856,077,147,925,897,127,298 (about 7.4 x 10^28) proof steps."

Norm

Ken Kubota

unread,
Feb 24, 2020, 3:55:21 PM2/24/20
to meta...@googlegroups.com
Am 24.02.2020 um 20:48 schrieb Norman Megill <n...@alum.mit.edu>:

On Sunday, February 23, 2020 at 10:24:59 AM UTC-5, Ken Kubota wrote:
... 
(Note that Metamath isn't really Hilbert-style, in my opinion, since it can express metatheorems. It seems to start directly at a higher level, like all logical frameworks.
A true Hilbert-style system consists only of the (mathematical) object language. So I disagree with Mario Carneiro at this point:
...
I think this is somewhat quibbling.  Yes, Metamath proves schemes from other schemes.  However, any of these schemes, as well as any step of any proof, can be immediately and trivially instantiated with an object-language substitution, resulting in a Hilbert-style proof per your requirements if we do that all the way back to axioms.  To be pedantic I guess one could say that Metamath "directly represents" Hilbert-style proofs if you disagree that it "is" Hilbert-style.

You argue the same way as Mario does.

My knowledge about Metamath is limited, but I believe that Mario is correct in that the form of reasoning resembles that of a Hilbert-style system.

However, my perspective is different since my intention is to find the natural language of mathematics (following Andrews' notion of expressiveness).
This means that (like in Andrews' Q0) metatheorems cannot be expressed directly using the means of the formal language only.

While implementations based on Q0 (such as Cris Perdues' Prooftoys or my R0 software) use logical implication (located at the object language level) as the key symbol of the Hilbert-style system, logical frameworks use the turnstile or another symbols (located at the meta-language level) which denotes derivability / provability.

So by definition Metamath (as a logical framework / a metalogic) doesn't fit into the definition of a Hilbert-style system in the narrow sense, which simply reflects the fact that the design purposes are different and therefore the language levels are.


In real life no one does object-language proofs, not even textbooks that claim to work with Hilbert-style systems, because you can't reuse the theorems for other instances, and proofs become astronomically long.  Let us say the object-language variables are v1, v2,... There is a Metamath (set.mm) scheme called "id" that states "phi -> phi" for any wff phi.  If a proof in the style you are insisting on needed two instances of this, say "v1 = v2 -> v1 = v2" and "v3 e. v4 -> v3 e. v4" ("e." meaning "element of"), they would each have to be proved all over again starting from appropriate instantiations of the starting axiom schemes.

Exactly this is done in the R0 software implementation: it does object language proofs only.
Instead of schemes, templates are used (*.r0t files instead of the regular *.r0 files), so you can re-use the template with different initial values and simply include the template file.
In his 2002 textbook, Andrews carefully distinguishes meta-language proofs from object language proofs by introducing boldface "syntactical variables" [Andrews, 2002, p. 11].

The corresponding theorem in R0 is K8005 (H => H) and uses the logical implication, so a direct comparison is difficult here.
But substitution may serve as an example here:

##  Proof A6102:  Peano's Postulate No. 5 for Andrews' Definition of Natural Numbers

##  Source: [Andrews 2002 (ISBN 1-4020-0763-9), p. 262]

## .1

%K8005

## use Proof Template A5221 (Sub):  B  -->  B [x/A]
:= $B5221 %0
:= $T5221 o
:= $X5221 x{$T5221}
:= $A5221 ((&{{{o,o},o}}_(p{{o,$S}}{{o,$S}}_(AZERO{{{o,{o,\3{^}}},^}}_t{^}{^}){$S}){o}){{o,o}}_((A{{{o,{o,\3{^}}},^}}_$S{^}){$To2S}_[\x{$S}{$S}.((=>{{{o,o},o}}_((ANSET{{{o,{o,{o,\4{^}}}},^}}_t{^}{^}){{o,$S}}_x{$S}{$S}){o}){{o,o}}_((=>{{{o,o},o}}_(p{{o,$S}}{{o,$S}}_x{$S}{$S}){o}){{o,o}}_(p{{o,$S}}{{o,$S}}_((ASUCC{{{{o,{o,\4{^}}},{o,{o,\4{^}}}},^}}_t{^}{^}){{$S,$S}}_x{$S}{$S}){$S}){o}){o}){o}]{{o,$S}}){o})
<< A5221.r0t.txt
:= $B5221; := $T5221; := $X5221; := $A5221

The line starting with "<<" includes the template file.
In the preceding lines the initial values are set (pattern:   := variable value   ).
In the following lines the values are unset (pattern:   := variable   ).
The semicolon is equivalent to the end of line.

So in fact all proofs are done at the object-language level.


http://us.metamath.org/mpeuni/mmset.html#2p2e4length gives an example of this.  "As an example of the savings we achieve by using only schemes, the 2p2e4 proof described in 2 + 2 = 4 Trivia above requires 26,323 proof steps [which include the complex number construction] to be proved from the axioms of logic and set theory. If we were not allowed to use different instances of intermediate theorem schemes but had to show a direct object-language proof, where each step is an actual axiom or a rule applied to previous steps, the complete formal proof would have 73,983,127,856,077,147,925,897,127,298 (about 7.4 x 10^28) proof steps."

It is true that proofs sometimes become slow, but in the approach logical rigor was more important than practical considerations.
Run
make A6102.r0.dbg.txt && open $_
on a Mac to see the verbose proof of the Principle of Mathematical Induction, where each proof instance is shown (29M filesize, 624908 lines).
For comparison, the abbreviated proof (not expanding the template proofs) obtained by
make A6102.r0.out.txt && open $_
has only a filesize of 44K and 973 lines.


In order to fulfil both philosophical and practical needs, my suggestion is a two-step approach:

The first step focuses on logical rigor and uses the object language only. (This is my current R0 software implementation.)
In the second step the logic is reformulated in a meta-language where metatheorems can be expressed directly.

But it is important to separate these two steps.

Mario Carneiro

unread,
Feb 24, 2020, 4:31:07 PM2/24/20
to metamath
On Mon, Feb 24, 2020 at 12:55 PM Ken Kubota <ma...@kenkubota.de> wrote:
My knowledge about Metamath is limited, but I believe that Mario is correct in that the form of reasoning resembles that of a Hilbert-style system.

However, my perspective is different since my intention is to find the natural language of mathematics (following Andrews' notion of expressiveness).
This means that (like in Andrews' Q0) metatheorems cannot be expressed directly using the means of the formal language only.

Do you have a reference for "Andrews' notion of expressiveness"? I see many references to your work in your posts, but not any external citations. (I don't know who Andrews is in this comment.) I think that expressiveness is indeed key, but to me this also includes the avoidance of certain in-principle exponential blowups that occur with some formalisms, naively implemented. More on that below.

While implementations based on Q0 (such as Cris Perdues' Prooftoys or my R0 software) use logical implication (located at the object language level) as the key symbol of the Hilbert-style system, logical frameworks use the turnstile or another symbols (located at the meta-language level) which denotes derivability / provability.

I do not know how to place metamath by this metric. You can identify both an implication arrow in the logic, and a turnstile that is not quite in the metalogic but is not directly accessible to reasoning.

So by definition Metamath (as a logical framework / a metalogic) doesn't fit into the definition of a Hilbert-style system in the narrow sense, which simply reflects the fact that the design purposes are different and therefore the language levels are.

I believe it is fair to characterize Metamath as a logical framework, although it uses a weaker metalogic than LF does (essentially only composition of universally quantified horn clauses).

But if you are too restrictive in your definition of a Hilbert style system, you may end up excluding Hilbert as well. *No one* actually works at the object level, certainly not the logicians that describe the object level. Metamath aims to cover enough of the metalevel to be able to do "logician proofs" about the object level. That is an expressiveness concern, in case you can't tell.
In real life no one does object-language proofs, not even textbooks that claim to work with Hilbert-style systems, because you can't reuse the theorems for other instances, and proofs become astronomically long.  Let us say the object-language variables are v1, v2,... There is a Metamath (set.mm) scheme called "id" that states "phi -> phi" for any wff phi.  If a proof in the style you are insisting on needed two instances of this, say "v1 = v2 -> v1 = v2" and "v3 e. v4 -> v3 e. v4" ("e." meaning "element of"), they would each have to be proved all over again starting from appropriate instantiations of the starting axiom schemes.

Exactly this is done in the R0 software implementation: it does object language proofs only.
Instead of schemes, templates are used (*.r0t files instead of the regular *.r0 files), so you can re-use the template with different initial values and simply include the template file.
In his 2002 textbook, Andrews carefully distinguishes meta-language proofs from object language proofs by introducing boldface "syntactical variables" [Andrews, 2002, p. 11].

If that's the case, then the system is fundamentally limited, and subject to exponential blowup, long before you get to the level that mathematicians care about.

For comparison to a system you may be more familiar with, HOL has a rule called INST, which allows you to derive from A1,...,An |- B the sequent A1(sigma),...,An(sigma) |- B(sigma) where sigma = [x1 -> t1, ..., xm -> tm] is a substitution of free variables in the A's and B with terms. (You can also substitute type variables with a separate rule.)

This is what metamath does in its "one rule". The point here is that you can prove a theorem once and instantiate it twice, at two different terms. If you don't do this, if you reprove the theorem from axioms every time, then if T2 uses T1 twice at two different terms, the proof of T2 will be (at least) twice as long as that of T1. If T3 uses T2 twice and T4 uses T3 twice and so on, the lengths of the proofs grow exponentially, and if you are rechecking all these proof templates at all instantiations then the check time also increases exponentially. This is not a merely theoretical concern; metamath.exe has a mechanism to calculate the length of direct from axioms proofs and they are quite often trillion trillions of steps. If you have no mechanism to deal with this, your proof system is dead on arrival.

In order to fulfil both philosophical and practical needs, my suggestion is a two-step approach:

The first step focuses on logical rigor and uses the object language only. (This is my current R0 software implementation.)
In the second step the logic is reformulated in a meta-language where metatheorems can be expressed directly.

The thing is, the trust has to go into the higher level system too, because you can't run object level proofs. (To this you will say you have an implementation already, but as I and Norm have argued this system will die on most proofs people care about.) The meta level system is stronger, but only in the sense that weak subsystems of PA can be subsumed by stronger subsystems of PA. Basically you have to believe in the existence of large numbers with short descriptions, like 2^2^2^60. If the system is strong enough to prove these numbers exist, then you can do proof existence proofs to collapse that exponential blowup.

That's the logician analysis of what a substitution axiom does for a proof system. Yes it's conservative, but it is also essential for getting things done.

In summary, the object language (the first step) is not important, except as a specification for the second step. Only the second step matters for correctness and practical usability.

Mario

Ken Kubota

unread,
Feb 24, 2020, 5:20:51 PM2/24/20
to meta...@googlegroups.com
Most of your arguments concern (are answered with) the two-step approach, so let me answer your questions on Andrews and expressiveness first and then jump to the very end, where the two-step approach is discussed.

____________________________________________________


Am 24.02.2020 um 22:30 schrieb Mario Carneiro <di....@gmail.com>:



On Mon, Feb 24, 2020 at 12:55 PM Ken Kubota <ma...@kenkubota.de> wrote:
My knowledge about Metamath is limited, but I believe that Mario is correct in that the form of reasoning resembles that of a Hilbert-style system.

However, my perspective is different since my intention is to find the natural language of mathematics (following Andrews' notion of expressiveness).
This means that (like in Andrews' Q0) metatheorems cannot be expressed directly using the means of the formal language only.

Do you have a reference for "Andrews' notion of expressiveness"? I see many references to your work in your posts, but not any external citations. (I don't know who Andrews is in this comment.) I think that expressiveness is indeed key, but to me this also includes the avoidance of certain in-principle exponential blowups that occur with some formalisms, naively implemented. More on that below.

In section 1 of my original post I provided a quote by Andrews and an explanation:
The basic idea is to naturally express all of mathematics with a minimum of requirements.
Q0 basically has lambda notation, two primitive symbols and two primitive types, and a single rule of inference.
R0 has only little more.
Both are Hilbert-style systems (operating at the object language level only).

Andrews is Prof. Peter B. Andrews, creator of the higher-order logic Q0, Professor of Mathematics, Emeritus, Carnegie Mellon University.



While implementations based on Q0 (such as Cris Perdues' Prooftoys or my R0 software) use logical implication (located at the object language level) as the key symbol of the Hilbert-style system, logical frameworks use the turnstile or another symbols (located at the meta-language level) which denotes derivability / provability.

I do not know how to place metamath by this metric. You can identify both an implication arrow in the logic, and a turnstile that is not quite in the metalogic but is not directly accessible to reasoning.

So by definition Metamath (as a logical framework / a metalogic) doesn't fit into the definition of a Hilbert-style system in the narrow sense, which simply reflects the fact that the design purposes are different and therefore the language levels are.

I believe it is fair to characterize Metamath as a logical framework, although it uses a weaker metalogic than LF does (essentially only composition of universally quantified horn clauses).

But if you are too restrictive in your definition of a Hilbert style system, you may end up excluding Hilbert as well. *No one* actually works at the object level, certainly not the logicians that describe the object level. Metamath aims to cover enough of the metalevel to be able to do "logician proofs" about the object level. That is an expressiveness concern, in case you can't tell.
In real life no one does object-language proofs, not even textbooks that claim to work with Hilbert-style systems, because you can't reuse the theorems for other instances, and proofs become astronomically long.  Let us say the object-language variables are v1, v2,... There is a Metamath (set.mm) scheme called "id" that states "phi -> phi" for any wff phi.  If a proof in the style you are insisting on needed two instances of this, say "v1 = v2 -> v1 = v2" and "v3 e. v4 -> v3 e. v4" ("e." meaning "element of"), they would each have to be proved all over again starting from appropriate instantiations of the starting axiom schemes.

Exactly this is done in the R0 software implementation: it does object language proofs only.
Instead of schemes, templates are used (*.r0t files instead of the regular *.r0 files), so you can re-use the template with different initial values and simply include the template file.
In his 2002 textbook, Andrews carefully distinguishes meta-language proofs from object language proofs by introducing boldface "syntactical variables" [Andrews, 2002, p. 11].

If that's the case, then the system is fundamentally limited, and subject to exponential blowup, long before you get to the level that mathematicians care about.

For comparison to a system you may be more familiar with, HOL has a rule called INST, which allows you to derive from A1,...,An |- B the sequent A1(sigma),...,An(sigma) |- B(sigma) where sigma = [x1 -> t1, ..., xm -> tm] is a substitution of free variables in the A's and B with terms. (You can also substitute type variables with a separate rule.)

This is what metamath does in its "one rule". The point here is that you can prove a theorem once and instantiate it twice, at two different terms. If you don't do this, if you reprove the theorem from axioms every time, then if T2 uses T1 twice at two different terms, the proof of T2 will be (at least) twice as long as that of T1. If T3 uses T2 twice and T4 uses T3 twice and so on, the lengths of the proofs grow exponentially, and if you are rechecking all these proof templates at all instantiations then the check time also increases exponentially. This is not a merely theoretical concern; metamath.exe has a mechanism to calculate the length of direct from axioms proofs and they are quite often trillion trillions of steps. If you have no mechanism to deal with this, your proof system is dead on arrival.

In order to fulfil both philosophical and practical needs, my suggestion is a two-step approach:

The first step focuses on logical rigor and uses the object language only. (This is my current R0 software implementation.)
In the second step the logic is reformulated in a meta-language where metatheorems can be expressed directly.

The thing is, the trust has to go into the higher level system too, because you can't run object level proofs. (To this you will say you have an implementation already, but as I and Norm have argued this system will die on most proofs people care about.) The meta level system is stronger, but only in the sense that weak subsystems of PA can be subsumed by stronger subsystems of PA. Basically you have to believe in the existence of large numbers with short descriptions, like 2^2^2^60. If the system is strong enough to prove these numbers exist, then you can do proof existence proofs to collapse that exponential blowup.

That's the logician analysis of what a substitution axiom does for a proof system. Yes it's conservative, but it is also essential for getting things done.

In summary, the object language (the first step) is not important, except as a specification for the second step. Only the second step matters for correctness and practical usability.

This is not quite correct.
Not only that the first step is the logical foundation on which the second is established upon.
Moreover, a reference implementation is required to study certain cases.
In particular the violation of type restriction has to be avoided, as otherwise the system is rendered inconsistent.
This is quite tricky, and case studies are necessary, especially if you want to implement new features such as type abstraction and dependent types.
Also, you want to show that in principle all of mathematics can be expressed in the reference implementation (and the second step is due to practical considerations only).
While of course you need the second step for numbers such as 2^2^2^60, all principle questions (such as whether an antinomy can be expressed) can be addressed with the first step.

Finally, only in the first step (the ideal formal language) the formulation is very elegant, and logical rigor preserved in a philosophical sense.
For example, a certain condition in Q0 appears as two distinct (independent) conditions in Isabelle's metalogic (a natural deduction system).

Quote:

For example, in Isabelle's metalogic M by Larry Paulson, the Eigenvariable 
conditions appear as two distinct (independent) conditions:
[...]
By contrast, in the logic Q0 by Peter Andrews, the restrictions in these 
(derived) rules have their origin in the substitution procedure of (the 
primitive) Rule R', which is valid only provided that it is not the case that 
"x is free in a member of [the set of hypotheses] H and free in [A = B]." 
[Andrews, 2002, p. 214].
[...]
Hence, in a bottom-up representation (like Q0) - unlike in a top-down 
representation (like Isabelle's metalogic M) - it is possible to trace the 
origin of the two Eigenvariable conditions back to a common root, i.e., the 
restriction in Rule R'.
https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2018-May/msg00013.html

So the formal language obtained in the first step is much more elegant than a practical system obtained in the second step.
For foundational and philosophical research, the first is preferred, for advanced mathematics such as with large numbers the second is preferred.

Mario Carneiro

unread,
Feb 24, 2020, 10:27:20 PM2/24/20
to metamath
On Mon, Feb 24, 2020 at 2:20 PM Ken Kubota <ma...@kenkubota.de> wrote:
Most of your arguments concern (are answered with) the two-step approach, so let me answer your questions on Andrews and expressiveness first and then jump to the very end, where the two-step approach is discussed.

____________________________________________________


Am 24.02.2020 um 22:30 schrieb Mario Carneiro <di....@gmail.com>:



On Mon, Feb 24, 2020 at 12:55 PM Ken Kubota <ma...@kenkubota.de> wrote:
My knowledge about Metamath is limited, but I believe that Mario is correct in that the form of reasoning resembles that of a Hilbert-style system.

However, my perspective is different since my intention is to find the natural language of mathematics (following Andrews' notion of expressiveness).
This means that (like in Andrews' Q0) metatheorems cannot be expressed directly using the means of the formal language only.

Do you have a reference for "Andrews' notion of expressiveness"? I see many references to your work in your posts, but not any external citations. (I don't know who Andrews is in this comment.) I think that expressiveness is indeed key, but to me this also includes the avoidance of certain in-principle exponential blowups that occur with some formalisms, naively implemented. More on that below.

In section 1 of my original post I provided a quote by Andrews and an explanation:
The basic idea is to naturally express all of mathematics with a minimum of requirements.
Q0 basically has lambda notation, two primitive symbols and two primitive types, and a single rule of inference.
R0 has only little more.
Both are Hilbert-style systems (operating at the object language level only).

Why do you say Q0 or R0 and not "simple type theory"? There are contentions that HOL is too weak a type theory for much of mathematics. Simple type theory is even weaker. So in what sense is it expressive? (Granted, you can do a great deal with just PA, and I'm sure your system is at least as strong. But there is plenty of mathematics that goes beyond this.)
In summary, the object language (the first step) is not important, except as a specification for the second step. Only the second step matters for correctness and practical usability.

This is not quite correct.
Not only that the first step is the logical foundation on which the second is established upon.
Moreover, a reference implementation is required to study certain cases.
In particular the violation of type restriction has to be avoided, as otherwise the system is rendered inconsistent.
This is quite tricky, and case studies are necessary, especially if you want to implement new features such as type abstraction and dependent types.
Also, you want to show that in principle all of mathematics can be expressed in the reference implementation (and the second step is due to practical considerations only).
While of course you need the second step for numbers such as 2^2^2^60, all principle questions (such as whether an antinomy can be expressed) can be addressed with the first step.

Finally, only in the first step (the ideal formal language) the formulation is very elegant, and logical rigor preserved in a philosophical sense.

But the move to the second step adds new axioms, so even if the ideal formal language is elegant and rigorous, that property may not be preserved by the second level language, which is also the one that people actually use.

For example, a certain condition in Q0 appears as two distinct (independent) conditions in Isabelle's metalogic (a natural deduction system).

Quote:

For example, in Isabelle's metalogic M by Larry Paulson, the Eigenvariable 
conditions appear as two distinct (independent) conditions:
[...]
By contrast, in the logic Q0 by Peter Andrews, the restrictions in these 
(derived) rules have their origin in the substitution procedure of (the 
primitive) Rule R', which is valid only provided that it is not the case that 
"x is free in a member of [the set of hypotheses] H and free in [A = B]." 
[Andrews, 2002, p. 214].
[...]
Hence, in a bottom-up representation (like Q0) - unlike in a top-down 
representation (like Isabelle's metalogic M) - it is possible to trace the 
origin of the two Eigenvariable conditions back to a common root, i.e., the 
restriction in Rule R'.
https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2018-May/msg00013.html

So the formal language obtained in the first step is much more elegant than a practical system obtained in the second step.
For foundational and philosophical research, the first is preferred, for advanced mathematics such as with large numbers the second is preferred.

What is the computer implementation for then? As a practical tool, it is too limited, and in particular the exponential blowup means that you are pushing the upper bound caused by the finiteness of the hardware down to below many theorems that matter to people. If you want a system "for foundational and philosophical research", you can do that on paper. If you want to prove properties about the system, you need to formalize the system itself inside a theorem prover (presumably with a stronger axiom system). Within that system, you will be able to prove the provability of a lot of things you wouldn't be able to do directly in the system, by making use of the metalogical natural numbers to prove schematic constructions.

Mario

Ken Kubota

unread,
Feb 25, 2020, 7:44:28 AM2/25/20
to meta...@googlegroups.com

Why do you say Q0 or R0 and not "simple type theory"? There are contentions that HOL is too weak a type theory for much of mathematics. Simple type theory is even weaker. So in what sense is it expressive? (Granted, you can do a great deal with just PA, and I'm sure your system is at least as strong. But there is plenty of mathematics that goes beyond this.)

Q0 is still simple type theory (the most elegant formulation, an improvement of Church's type theory and equivalent to it),
but R0 is much stronger.
R0 has type abstraction and provides some of the means for dependent type theory.
It can naturally express group theory, which isn't possible in polymorphic type theory (e.g., HOL) or the even weaker simple type theory (e.g., Q0).

For a short overview (and comparison), see p. 1 of: https://owlofminerva.net/files/fom_2018.pdf

I believe that with "substitution at the type (subscript) level" (p. 12), the only feature still missing,
what can be expressed in the language of R0 coincides with (all of) mathematics.

(Of course, meta-mathematical theorems / metatheorems cannot be expressed within a Hilbert-style system.)


In summary, the object language (the first step) is not important, except as a specification for the second step. Only the second step matters for correctness and practical usability.

This is not quite correct.
Not only that the first step is the logical foundation on which the second is established upon.
Moreover, a reference implementation is required to study certain cases.
In particular the violation of type restriction has to be avoided, as otherwise the system is rendered inconsistent.
This is quite tricky, and case studies are necessary, especially if you want to implement new features such as type abstraction and dependent types.
Also, you want to show that in principle all of mathematics can be expressed in the reference implementation (and the second step is due to practical considerations only).
While of course you need the second step for numbers such as 2^2^2^60, all principle questions (such as whether an antinomy can be expressed) can be addressed with the first step.

Finally, only in the first step (the ideal formal language) the formulation is very elegant, and logical rigor preserved in a philosophical sense.

But the move to the second step adds new axioms, so even if the ideal formal language is elegant and rigorous, that property may not be preserved by the second level language, which is also the one that people actually use.

It depends on the purpose you use the system for.
(See below.)


For example, a certain condition in Q0 appears as two distinct (independent) conditions in Isabelle's metalogic (a natural deduction system).

Quote:

For example, in Isabelle's metalogic M by Larry Paulson, the Eigenvariable 
conditions appear as two distinct (independent) conditions:
[...]
By contrast, in the logic Q0 by Peter Andrews, the restrictions in these 
(derived) rules have their origin in the substitution procedure of (the 
primitive) Rule R', which is valid only provided that it is not the case that 
"x is free in a member of [the set of hypotheses] H and free in [A = B]." 
[Andrews, 2002, p. 214].
[...]
Hence, in a bottom-up representation (like Q0) - unlike in a top-down 
representation (like Isabelle's metalogic M) - it is possible to trace the 
origin of the two Eigenvariable conditions back to a common root, i.e., the 
restriction in Rule R'.
https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2018-May/msg00013.html

So the formal language obtained in the first step is much more elegant than a practical system obtained in the second step.
For foundational and philosophical research, the first is preferred, for advanced mathematics such as with large numbers the second is preferred.

What is the computer implementation for then? As a practical tool, it is too limited, and in particular the exponential blowup means that you are pushing the upper bound caused by the finiteness of the hardware down to below many theorems that matter to people. If you want a system "for foundational and philosophical research", you can do that on paper. If you want to prove properties about the system, you need to formalize the system itself inside a theorem prover (presumably with a stronger axiom system). Within that system, you will be able to prove the provability of a lot of things you wouldn't be able to do directly in the system, by making use of the metalogical natural numbers to prove schematic constructions.

Arguing that way means that Russell and Whitehead should have skipped all three volumes of Principia Mathematica but sketch the formal system only and do some demonstration proofs.
But it was important for them to write and publish all three volumes (today one would do this in the computer, which I did for R0).
The difference is, that type systems nowadays are much more sophisticated, such that a software implementation is required even for step one.

With an advanced type system (in particular, with type abstraction), paper proofs aren't reliable anymore, which holds for step one, too.
So you need computer verification for this.
The type restrictions become more sophisticated since the Eigenvariable conditions have to be preserved for both the term as well as the type level, which aren't strictly separated anymore. See, for example:
Scope Violation in Lambda Conversion at Type Level
"[λtτ.[λuτ.xt]()]uτ"
Moreover, you need a type referencing mechanism similar to de Bruijn indices in order to preserve the dependencies.
Even in simple symbols such as the universal quantifier (with type abstraction) you have dependencies over three levels, which cannot be effectively handled on paper in a reliable way:
"definition of the universal quantifier with type abstraction := [λtτ .[λpot.(=o(ot)(ot)[λxt.To]pot)o](o(ot))] = [λt.[λp.(= [λx.T ] p)]]o(o\3)τ , with depth/nesting index \3 in o(o\3)τ referring to τ , hence ω is of type o(o\3)[ω / \3] = o())."

One might consider formalizing the systems of both step one and step two and prove them equivalent.

____________________________________________________

Reply all
Reply to author
Forward
0 new messages