Hi,
I'm interested in tautology proofs. Since in this domain there's no significant difference between Ghilbert and Metamath, I figured I'd post here to reach a wider audience.
It occurs to me that automatically generating tautology proofs from SAT solvers should be practical. Many (but not all) modern SAT solvers can produce traces when they determine that the input is unsatisfiable. The state of the art seems to be DRAT format, but for my purposes I'm thinking it might be easier to work with TraceCheck. Heule and Biere's
Satisfiability Solvers is an excellent introduction to both in addition to background on SAT solvers and satisfiability in general. I've found the easiest source for TraceCheck proofs is PicoSat.
Here's the outline of my approach. Assume for the sake of simplicity we want to prove a tautology as a theorem (ie, no hypotheses). Clearly this is the most general case, because we can always add hypotheses as antecedents and use modus ponens, although I suspect there are some optimization opportunities.
For each unique node in the AST of the input formula (other than not), assign a SAT variable. To handle not, just invert the sense of the variable. One such SAT variable will represent the entire expression, and also each variable in the input will correspond to a SAT variable. For example, picking con1b,
x1: ( ( -. ph -> ps ) <-> ( -. ps -> ph ) )
x2: ( -. ph -> ps )
x3: ( -. ps -> ph )
x4: ph
x5: ps
Then, for each node in the AST, generate clauses based on the type of the node.
x is of form A \/ B
x \/ -A
x \/ -B
-x \/ A \/ B
x is of form A /\ B
x \/ -A \/ -B
-x \/ A
-x \/ B
x is of form A -> B
x \/ A
x \/ -B
-x \/ -A \/ B
x is of form A <-> B
x \/ A \/ B
x \/ -A \/ -B
-x \/ A \/ -B
-x \/ -A \/ B
The above example then gets one instantiations of the <-> case and two of ->:
-x1 \/ x2 \/ -x3
-x1 \/ -x2 \/ x3
x1 \/ x2 \/ x3
x1 \/ -x2 \/ -x3
x2 \/ -x4
x2 \/ -x5
-x2 \/ x4 \/ x5
x3 \/ -x4
x3 \/ -x5
-x3 \/ x4 \/ x5
Finally, to these add -x1, ie we want to prove the negation of our tautology unsatisfiable. I've attached a file in DIMACS standard notation, trivially found unsatisfiable by any decent SAT solver. It should be clear that this reduction is linear in the size of the input.
Ok, informally it should be clear that finding the system unsatisfiable is a proof, but how do we get it into a form that checks in Metamath (or Ghilbert)?
The trace basically contains a sequence of resolutions. This is the core resolution step, expressed as a deduction:
ph -> ( ps \/ ch )
ph -> ( ta \/ -. ch )
-----------------------
ph -> ( ps \/ ta )
There are variations, as ps or ta might be empty (false). The case where both are empty is pm2.65i, which proves -. ph. In the general case, we probably have to do some manipulation of the clauses (associativity and commutativity) so that the shared ch is last, and also clean up ps \/ ta so that duplications are absorbed.
The proof works with expressions of the form -. A -> B \/ C \/ ... \/ D, where A is the tautology we want to prove, and B, C, D represent either some subterm or its negation. Since \/ is a binary operator, the actual Metamath expression is a binary tree. An optimized implementation would keep these trees balanced, so pulling a subterm out is O(log n) in the size of the clause.
The clauses generated from AST nodes are instances of simple theorems; they're tautologies even without the -. A antecedent, so that can be added with a1i. The clause corresponding to -x1 is a straightforward instance of id: -. A -> -. A.
Each resolution in the trace generates another such clause. These are stored on the RPN stack as needed, or saved with "Z" in compressed proof format if the clause will be used more than once. The last resolution should result in the empty clause and thus be a proof of -. -. A (so proved with pm2.65i followed by notnotri).
By my count, there are 13 resolutions in the trace for con1b (also attached). This seems like a reasonable number. My intuition is telling me that the resulting proofs would be tend to be a bit larger than hand-written one, but not hugely so. And of course this technique promises to scale into the thousands of variables. I also suspect that there are more optimization opportunities.
Thoughts?
Raph