Tautology proofs from SAT solvers?

Raph Levien

unread,

Aug 23, 2017, 4:25:03 PM8/23/17

to Metamath Mailing List

Hi,

I'm interested in tautology proofs. Since in this domain there's no significant difference between Ghilbert and Metamath, I figured I'd post here to reach a wider audience.

It occurs to me that automatically generating tautology proofs from SAT solvers should be practical. Many (but not all) modern SAT solvers can produce traces when they determine that the input is unsatisfiable. The state of the art seems to be DRAT format, but for my purposes I'm thinking it might be easier to work with TraceCheck. Heule and Biere's Satisfiability Solvers is an excellent introduction to both in addition to background on SAT solvers and satisfiability in general. I've found the easiest source for TraceCheck proofs is PicoSat.

Here's the outline of my approach. Assume for the sake of simplicity we want to prove a tautology as a theorem (ie, no hypotheses). Clearly this is the most general case, because we can always add hypotheses as antecedents and use modus ponens, although I suspect there are some optimization opportunities.

For each unique node in the AST of the input formula (other than not), assign a SAT variable. To handle not, just invert the sense of the variable. One such SAT variable will represent the entire expression, and also each variable in the input will correspond to a SAT variable. For example, picking con1b,

x1: ( ( -. ph -> ps ) <-> ( -. ps -> ph ) )

x2: ( -. ph -> ps )

x3: ( -. ps -> ph )

x4: ph

x5: ps

Then, for each node in the AST, generate clauses based on the type of the node.

x is of form A \/ B

x \/ -A

x \/ -B

-x \/ A \/ B

x is of form A /\ B

x \/ -A \/ -B

-x \/ A

-x \/ B

x is of form A -> B

x \/ A

x \/ -B

-x \/ -A \/ B

x is of form A <-> B

x \/ A \/ B

x \/ -A \/ -B

-x \/ A \/ -B

-x \/ -A \/ B

The above example then gets one instantiations of the <-> case and two of ->:

-x1 \/ x2 \/ -x3

-x1 \/ -x2 \/ x3

x1 \/ x2 \/ x3

x1 \/ -x2 \/ -x3

x2 \/ -x4

x2 \/ -x5

-x2 \/ x4 \/ x5

x3 \/ -x4

x3 \/ -x5

-x3 \/ x4 \/ x5

Finally, to these add -x1, ie we want to prove the negation of our tautology unsatisfiable. I've attached a file in DIMACS standard notation, trivially found unsatisfiable by any decent SAT solver. It should be clear that this reduction is linear in the size of the input.

Ok, informally it should be clear that finding the system unsatisfiable is a proof, but how do we get it into a form that checks in Metamath (or Ghilbert)?

The trace basically contains a sequence of resolutions. This is the core resolution step, expressed as a deduction:

ph -> ( ps \/ ch )

ph -> ( ta \/ -. ch )

-----------------------

ph -> ( ps \/ ta )

There are variations, as ps or ta might be empty (false). The case where both are empty is pm2.65i, which proves -. ph. In the general case, we probably have to do some manipulation of the clauses (associativity and commutativity) so that the shared ch is last, and also clean up ps \/ ta so that duplications are absorbed.

The proof works with expressions of the form -. A -> B \/ C \/ ... \/ D, where A is the tautology we want to prove, and B, C, D represent either some subterm or its negation. Since \/ is a binary operator, the actual Metamath expression is a binary tree. An optimized implementation would keep these trees balanced, so pulling a subterm out is O(log n) in the size of the clause.

The clauses generated from AST nodes are instances of simple theorems; they're tautologies even without the -. A antecedent, so that can be added with a1i. The clause corresponding to -x1 is a straightforward instance of id: -. A -> -. A.

Each resolution in the trace generates another such clause. These are stored on the RPN stack as needed, or saved with "Z" in compressed proof format if the clause will be used more than once. The last resolution should result in the empty clause and thus be a proof of -. -. A (so proved with pm2.65i followed by notnotri).

By my count, there are 13 resolutions in the trace for con1b (also attached). This seems like a reasonable number. My intuition is telling me that the resulting proofs would be tend to be a bit larger than hand-written one, but not hugely so. And of course this technique promises to scale into the thousands of variables. I also suspect that there are more optimization opportunities.

Thoughts?

Raph

con1b.cnf

con1b.trace

David A. Wheeler

unread,

Aug 23, 2017, 4:58:00 PM8/23/17

to metamath, Metamath Mailing List

On Wed, 23 Aug 2017 16:25:01 -0400, Raph Levien <raph....@gmail.com> wrote:
> ... The *trace* basically contains a sequence of resolutions. This is the core

> resolution step, expressed as a deduction:
>
> ph -> ( ps \/ ch )
> ph -> ( ta \/ -. ch )
> -----------------------
> ph -> ( ps \/ ta )

My mathbox already contains a proof of resolution, called "resolution".
In my version the "part to be removed" is the second instead of the
first parameter:

$( Resolution rule. This is the primary inference rule in some automated
theorem provers such as prover9. The resolution rule can be traced back
to Davis and Putnam (1960). (Contributed by David A. Wheeler,
9-Feb-2017.) $)
resolution $p |- ( ( ( ph /\ ps ) \/ ( -. ph /\ ch ) ) -> ( ps \/ ch ) )
$=
( wa wn simpr orim12i ) ABDBAEZCDCABFHCFG $.

> There are variations, as ps or ta might be empty (false). The case where
> both are empty is pm2.65i, which proves -. ph. In the general case, we
> probably have to do some manipulation of the clauses (associativity and
> commutativity) so that the shared ch is last, and also clean up ps \/ ta so
> that duplications are absorbed.

It might be easier in metamath to have separate proofs when one is empty.
Then, for each source step, it's easy to map to a theorem.

> ... By my count, there are 13 resolutions in the trace for con1b (also

> attached). This seems like a reasonable number. My intuition is telling me
> that the resulting proofs would be tend to be a bit larger than
> hand-written one, but not hugely so. And of course this technique promises
> to scale into the thousands of variables. I also suspect that there are
> more optimization opportunities.

Once you *have* a proof, mechanisms like metamath's "minimize" might
be especially effective, so even a relatively long starting proof might
(after minimization) turn out to be quite reasonable.

--- David A. Wheeler

Cris Perdue

unread,

Aug 23, 2017, 8:13:56 PM8/23/17

to meta...@googlegroups.com

Raph,

Could you elaborate a bit on the nature of your interest in proofs of tautologies? It might help focus discussion.

Depending on your objectives, your goals, approaches other than SAT solvers might be more attractive. My experience has been that high performance has not been an issue at all for any tautology proofs I have needed.

By the way, thank you for the link and suggested references related to SAT solving.

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metamath+unsubscribe@googlegroups.com.
To post to this group, send email to meta...@googlegroups.com.
Visit this group at https://groups.google.com/group/metamath.
For more options, visit https://groups.google.com/d/optout.

Norman Megill

unread,

Aug 23, 2017, 11:54:29 PM8/23/17

to Metamath

On Wednesday, August 23, 2017 at 4:58:00 PM UTC-4, David A. Wheeler wrote:

On Wed, 23 Aug 2017 16:25:01 -0400, Raph Levien wrote:

[...]

> ... By my count, there are 13 resolutions in the trace for con1b (also
> attached). This seems like a reasonable number. My intuition is telling me
> that the resulting proofs would be tend to be a bit larger than
> hand-written one, but not hugely so. And of course this technique promises
> to scale into the thousands of variables. I also suspect that there are
> more optimization opportunities.

Once you *have* a proof, mechanisms like metamath's "minimize" might
be especially effective, so even a relatively long starting proof might
(after minimization) turn out to be quite reasonable.

I'm sure people like Wolf Lammen will have a field day finding shorter proofs. :)

Of course interfacing to a SAT prover is an interesting and potentially useful thing to have. While it would certainly be nice to know we can solve complex tautologies that SAT provers are good at, I do wonder how it will help us for typical proofs.

The propositional calculus section grows very slowly, leading me to think that we have essentially all the ones we need for the kinds of proofs that we have been doing.

Right now there are about 1151 axiom, definitions, and theorems in the propositional calculus section of set.mm. I broke them down by year in the table below. (I looked for "Contributed by" and forgot to remove axioms and definitions, but I didn't want to make the table a 2nd time.)

For recent years, the 14 theorems added in each of 2014 and 2015 were miscellaneous ones, about half by Wolf Lammen to shorten other proofs and half by Mario during a period of major contributions he made to the rest of set.mm. Most of the ones in 2016 were by Mario for an adder problem; these are very specialized and seem unlikely to be used again. Most of the ones in 2017 were 2 series of theorems ad3antlr...ad10adantlr and simp-4l...simp-11r added by Mario.

Other notable older years were 2005, when Roy Longton and I added the rest of the Principia Mathematica theorems (perhaps a mistake, since I think most aren't used); 2011, when Anthony Hart showed the equivalence of many different axiomatizations (most of which can't be used outside of those sections); and 2012 when a large series of simp* and syl*anc theorems were added by me and used to shorten a bunch of proofs.

From the trend, I would guess that it will be rare that we will add more than a dozen or so general-purpose propositional calculus theorems per year going forward.

(Along the same lines, I have wondered the same about how FOL provers might help us - it is rare that we add new FOL theorems, and proving FOL theorems is typically not the "hard" part of doing advanced proofs, or at least that hasn't been the case for me.)

Norm

year   # of prop calc stmts
1993    219
1994    182
1995     77
1996     34
1997      4
1998      4
1999     27
2000      1
2001      2
2002     40
2003      6
2004     43
2005    197
2006     49
2007     62
2008     27
2009     56
2010     33
2011    133
2012    171
2013     30
2014     14
2015     14
2016     43
2017     43
       ----
Total     1151

fl

unread,

Aug 24, 2017, 6:56:13 AM8/24/17

to Metamath

Right now there are about 1151 axiom, definitions, and theorems in the propositional calculus section of set.mm.

When you think that in a ,natural deduction system, they have a fistful of logical propositions and are happy with that.!

It is just the moment you realize Gentzen was a great man and that the Russians should have been more careful

when they caught him.

Exactly like the English with Turing in my opinion.

--

FL

fl

unread,

Aug 24, 2017, 7:08:02 AM8/24/17

to Metamath

It is just the moment you realize Gentzen was a great man and that the Russians should have been more careful
when they caught him.

I meant "a great mathematican" obviously.

--

FL

Raph Levien

unread,

Aug 24, 2017, 9:36:10 AM8/24/17

to Metamath Mailing List

I realize SAT is not always the best approach, especially when the number of variables is small. In those cases, I strongly suspect that a brute force search through the library will yield the best results, possibly combined with case analysis to remove one variable at a time if the search fails.

My interest is largely intellectual. It was inspired, though, by the Boolean Pythagorean triple problem, see https://en.m.wikipedia.org/wiki/Boolean_Pythagorean_triples_problem . For a problem of that scale, a SAT solver is probably the only reasonable hope.