This will be a bit long but ...
I'm working to combine Axiom and LEAN. LEAN is a CMU/MS project[0][1]
They have a proof assistant and a rapidly growing database of theorems
and proofs. The whole database (mathlib[2]) is strongly typed so it is easy
to know if a theorem applies to a step in your proof. They have a program
which will show a graph of dependency graph of each proof so it is clear
what proof depends on what axiom/proof/definition.
I want to prove the algorithms used in Axiom. Buchberger's Groebner
basis algorithm has already been formally proven (attached). I am working
on a proof hierarchy parallel to the category/domain hierarchy. Proofs of
group theory theorems are attached to the corresponding Axiom categories.
That way they get inherited by the domains. Functions in a domain can use
the theorems and proofs to prove each domain function. Ultimately I'd like
to see a proof of the Risch algorithm.
I'm also deeply involved with AI. Most of my career is actually in the AI
area. My Masters degree work was in machine vision and a lot of my
jobs were in Robotics, including at Unimation, the company that invented
Industrial Robots. I also authored Expert Systems software from IBM
and done work at CMU to change a car tire with Human/Robot cooperation.
These days I'm heavily involved with AI. I have 71 LLMs which I am
benchmarking on their ability to do proofs. Most of them already can.
The basic research goal is to use AI techniques to automate proofs.
Historically this has been a nonsense idea. The basic argument is that
the only effective method was enumerating possible next steps. This
leads to a combinatorial explosion. So, the argument goes, there is
no possible automated solution.
Several things have changed.
First, reinforcement learning has shown that it can be trained to play
any game (chess, go, video games, etc). It requires STATES (the current
state of the play), ACTIONS (what moves can be made in each state),
SUCCESS/FAIL feedback (either only at the end of the game or on each
step if that is available).
Second, LEAN has developed an IDE that maintains STATES. It will
show you where you are in the proof. It will manage multiple branches
so you can work on any one branch at a time.
LEAN provides ACTIONS. Either a proof applies or there are programs
called tactics which automate a lot of the steps needed to move the
proof ahead. The important part is that each LEAN theorem / tactic
is strongly typed which severely limits the available actions. Notice
that this undercuts the combinatorial explosion argument.
LEAN provides SUCCESS/FAIL feedback. It can keep track of what
branches of a proof are complete and which are still open. It provides
SUCCESS when the proof is complete.
So all of the elements required for reinforcement learning already exist
in LEAN. Viewed as a game, LEAN proofs can be automated.
Also of note, after the LLM AI systems are trained there is a final step
called RLHF, Reinforcement Learning with Human Feedback. I intend
to choose certain LLMs to already do proofs and train them on LEAN
using reinforcement learning.
I have collected the theorem / proofs from my Advanced Calculus
college book[3] to use as new, unseen test cases.
All of this is a subgoal of the larger goal of merging computer algebra
and proof assistants to form "computational mathematics". These two
fields of automated mathematics don't seem to talk. Of the hundreds
of papers I've read only James Davenport is mentioned in the bibliography
of both fields.
All of that said, I will note that the latest Gemini 1.5 and GPT4.2 are
so much more powerful than the LLMs I'm currently benchmarking.
I used to be skeptical that LLMs could do proofs but I not only believe
that they can ... I now believe that they will soon exceed us.
Bringing such foundational mathematics to computer algebra systems
will greatly enhance our efforts.
Tim
[0] Jeremy Avigad "Hoskinson Center For Formal Mathematics"
[1] Leonardo de Maura et al. "The LEAN Theorem Prover"
[2] A Mathlib Overview
[3] Angus E. Taylor; W. Robert Mann "Advanced Calculus 2nd Ed."
Xerox College Publishing 0-536-00587-7 (1972)