Proof Validation Track for SMT-COMP


Jochen Hoenicke

Jul 19, 2021, 10:44:17 AM
to smt...@googlegroups.com, smt-...@cs.nyu.edu
Hello,

Yesterday, I mentioned the plans for the Proof Validation Track. I
know that finding a good solver-independent proof format is difficult.
I would like to go with something simple with a minimal but complete
set of axioms. Then every solver has the same burden of translating
its own proof steps into this simple format. In this mail I want to
outline the proposal I have in mind in a bit more detail. I can
elaborate further, but maybe it's better to get feedback first on
whether you think this approach can work.

One question is how to explain the CNF encoding. When the proof
format supports arbitrary SMT-LIB terms as atoms, this is simple: an
assert just introduces a unit clause with one atom. Then you add
all Tseitin-encoding clauses, like "~(or a b), a, b", and use
resolution. A solver that doesn't use Tseitin encoding can still
explain the clauses of its CNF conversion using these axioms and
resolution.
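
To make this concrete, a resolution step on such a Tseitin clause
could look roughly like this (the +/- polarity notation for literals
is only illustrative, nothing is fixed yet):

  (assert (or a b))            ; introduces the unit clause:  +(or a b)
  Tseitin axiom for "or":        -(or a b), +a, +b
  resolution on (or a b):        +a, +b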

By allowing the prover to use SMT-LIB terms in the proof that do not
occur in the input formula, the proof calculus supports extended
resolution and arbitrary intermediate transformations, e.g., a solver
can use de-Morgan to eliminate "and" in favor of "or", or vice versa.
All these transformation rules need to be proven by the respective
solver (or a proof-postprocessor) using only the minimal set of
axioms.

I experimented a bit and it looks like the standard Tseitin-encoding
axioms for logical operators,
reflexivity/symmetry/transitivity/congruence for equality, and a few
rules for ite/distinct/chained equalities should already be complete
for QF_UF. For UF, one would add the axioms for universal and
existential quantifiers using some syntax for Hilbert's
epsilon-operator. Other theories can be added by adding theory
specific axioms, like McCarthy's array axioms. But we could keep the
proof validation track limited to the easier logics at first. At the
moment I have no idea how to extend this to the bitvector theory, for
example.
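
To give a rough idea, the axiom schemata I have in mind for QF_UF
would produce clause instances of the following shape (again, the
clause notation is only illustrative):

  Tseitin "or":    -(or a b), +a, +b
                   +(or a b), -a
                   +(or a b), -b
  reflexivity:     +(= a a)
  symmetry:        -(= a b), +(= b a)
  transitivity:    -(= a b), -(= b c), +(= a c)
  congruence:      -(= a b), +(= (f a) (f b))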

Replaying rewrite-based proofs in this format is a bit tedious, as you
first have to prove the rules you use. As an example, proving (= (or
a b) (or b a)) would take eight axioms and seven resolution steps.
However, this is a constant overhead so it would cause only linear
proof size increase and only for the pre-processing part. And luckily
we can automate the tedious work of adding these proof steps :)
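
To sketch the flavor, the propositional core of one direction of that
example looks roughly like this (the remaining axioms and steps
introduce the equality itself and handle the symmetric direction):

  a1: -(or a b), +a, +b      Tseitin elimination for (or a b)
  a2: -a, +(or b a)          Tseitin introduction for (or b a)
  a3: -b, +(or b a)          Tseitin introduction for (or b a)
  res(a1, a2) on pivot a:    -(or a b), +b, +(or b a)
  res with a3 on pivot b:    -(or a b), +(or b a)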

Finally, for a compact representation of the proof, you need to
abbreviate common subterms. For this, I would suggest using the
standard let syntax from SMT-LIB. A difficulty added by the let
encoding is that terms need to be considered equal if one is the let
expansion of the other term.
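
For example, a term with a repeated subterm would be written as

  (let ((s (g a b)))
    (= (f s s) (h s)))

and the checker has to treat this as equal to
(= (f (g a b) (g a b)) (h (g a b))).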

Probably the most controversial part of this proposal is the question
of which terms are equal, i.e., which cancel each other out when used
as the pivot in a resolution step. I would like to be very strict:
even "(or a b c)" and "(or (or a b) c)" should be different, and
"((or a b) :named c)", "(or a b)", and "c" are three different terms.
A let expansion is considered equal to the unexpanded term, though;
otherwise, proofs cannot be compactly represented.

I hope I don't step on too many toes by pushing ahead a bit with this proposal.

I would be interested in some feedback. Especially if you are
concerned that certain features of your proofs cannot be represented
without quadratic or even exponential overhead, then I would like to
hear these concerns.

Some SMT-solvers use SAT solvers that create DRAT proofs. While these
proofs are not easily expressible using resolution, they can be
converted to extended resolution proofs with polynomial overhead. My
feeling is that I would rather keep the proof validator simple and let
someone provide an engine that converts DRAT to extended resolution
that DRAT-based solvers can use. The polynomial overhead seems
unavoidable, since checking a DRAT proof incurs polynomial overhead,
if I'm not mistaken.

Regards,
Jochen Hoenicke

Tjark Weber

Jul 20, 2021, 10:04:44 AM
to smt...@googlegroups.com, smt-...@cs.nyu.edu
Jochen,

On Mon, 2021-07-19 at 16:44 +0200, Jochen Hoenicke wrote:
> Yesterday, I mentioned the plans for the Proof Validation Track. I
> know that finding a good solver independent proof format is
> difficult.

Indeed it is. Nearly ten years ago, Sascha Böhme and I -- having worked
extensively with existing proof formats at the time -- discussed some
high-level considerations that I think are still relevant today: see
http://user.it.uu.se/~tjawe125/publications/boehme11designing.pdf

> I would like to go with something simple with a minimal but complete
> set of axioms. Then every solver has the same burden to translate
> its own proof steps into this simple format.

There are trade-offs: a simple proof format makes it easier to
implement a proof checker but makes it more difficult/laborious to
generate proofs, and might also lead to larger proofs.

> I would be interested in some feedback. Especially if you are
> concerned that certain features of your proofs cannot be represented
> without quadratic or even exponential overhead, then I would like to
> hear these concerns.

My initial impression (not based on any experimental data) is that your
proposal isn't necessarily hitting the sweet spot in the available
design space. I understand that you want the proof checker to be
simple, but for instance differentiating between (or a b) and (or b a)
could be a pain in solvers (and lead to considerable proof overhead),
while it would probably be relatively easy in the checker to treat a
disjunction as a set of disjuncts (as long as there is no ambiguity).

> Some SMT-solvers use SAT solvers that create DRAT proofs. While
> these proofs are not easily expressible using resolution, they can be
> converted to extended resolution proofs with polynomial overhead. My
> feeling is that I would rather keep the proof validator simple and
> let someone provide an engine that converts DRAT to extended
> resolution that DRAT-based solvers can use.

My understanding is that the SAT community is using DRAT for good
reasons (including performance). Given that SMT is an extension of SAT,
and that SMT solvers often use a SAT solver internally, I think it
might be worth looking into DRAT in more detail, and investigate if it
wouldn't be beneficial to base a proof format for SMT on (a suitable
extension of) DRAT. An easy test for any SMT proof format is how it
performs on (large) purely propositional problems.

Best,
Tjark

Jochen Hoenicke

Jul 20, 2021, 1:46:40 PM
to Tjark Weber, smt...@googlegroups.com, smt-...@cs.nyu.edu
Hello Tjark,

Thanks for your feedback.
The problem here is where to draw the line: if you want commutative
"or", you probably also want associative "or", and the same for "and",
"xor", +, *, bvmul, bvadd. Then maybe commutativity for fpmul and
fpadd, which aren't associative. Then you naturally also want to
rewrite - to + and negation so that associativity can be used there as
well. In the end you have an almost intractable problem for the proof
checker: determining which formula was actually proven. I don't want
the resolution rule to care about which builtin operator is
associative or commutative, especially when a full proof may take
millions of resolution steps. I'd rather put the burden on the solver
authors, who know which simplification steps their solver may do and
when the solver does them. I don't want to write a universal
simplifier that works for most of the solvers and needs to be extended
whenever a new solver enters the field.

But after some discussions with Pascal, Clark, and Bruno, it looks like
it would not be feasible for most solvers to produce fully verifiable
low-level proofs, especially when they use advanced techniques like
symmetry breaking. As a compromise, we could allow a solver to assert
arbitrary clauses, giving just a hint about which input these clauses
were generated from, plus some solver-specific information that helps
its own proof checker understand the proof. These solver-specific
rules would be skipped by the proof checker, which could also produce
a proof obligation, in the form of an SMT file, to check that the
asserted clauses are indeed a consequence. Thus, if a solver does some
commutativity rewrite, it could just assert the converted formula and
state that it is a consequence of the input formula. The end goal
would be to get rid of these trusted steps, e.g., by a postprocessor,
but this may take years to achieve.
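
In a hypothetical syntax (nothing here is fixed), such a trusted step
and the proof obligation the checker would emit for it might look like
this:

  ; trusted step inside the proof:
  (assume c1 (or b a) :from (input1) :solver-hint "commutativity")

  ; proof obligation emitted by the checker, as an SMT file:
  (assert (or a b))           ; the hinted premise
  (assert (not (or b a)))     ; negation of the asserted clause
  (check-sat)                 ; must report unsat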

Another concern that was raised is that some solvers want to introduce
auxiliary variables for which they don't know an equivalent SMT term.
An auxiliary variable can always be introduced by "let", but this
requires knowing an equivalent SMT-LIB term. A solver that just dumps
its internal converted CNF using the unchecked rule may want to use
new uninterpreted symbols and only claim equisatisfiability of the CNF
with the original formula. I'm thinking about a kind of "exists"
binder for the whole proof to create new uninterpreted auxiliary
variables.
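
As a sketch of the idea (again, hypothetical syntax), the whole proof
would be wrapped in a binder that introduces the fresh symbols:

  (proof-exists ((aux1 Bool) (aux2 Bool))
    ; clauses dumped via the unchecked rule may mention aux1, aux2;
    ; the proof then only claims equisatisfiability with the input
    ...)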

A proof validation competition may be too ambitious a goal for next
year. If you allow unexplained steps in a proof, you cannot compare
the proofs of two solvers. One solver may use hundreds of unexplained
steps that each do a small rewrite that is easy to see, while the
other solver may use a single unexplained step that derives false from
the input formula. In most metrics the second solver would win, even
though the first does a much better job. Maybe we can have a proof
exhibition instead, but this time with a fixed syntax, so that an
independent proof checker can track the resolution steps of the proof
and document how many untrusted steps were needed, and which ones. We
could also preprocess some benchmarks into a simple CNF form and run
the solvers on the simplified benchmarks, so that no major
preprocessing has to be done.

Lastly, the common proof format doesn't have to become the default
proof format of every solver. If you like your proof format, then by
all means you can keep it. Just write a post-processor that
translates the easy steps into low-level proof steps and skips the
hard steps in your proof.

>
> My understanding is that the SAT community is using DRAT for good
> reasons (including performance). Given that SMT is an extension of SAT,
> and that SMT solvers often use a SAT solver internally, I think it
> might be worth looking into DRAT in more detail, and investigate if it
> wouldn't be beneficial to base a proof format for SMT on (a suitable
> extension of) DRAT. An easy test for any SMT proof format is how it
> performs on (large) purely propositional problems.

I definitely will look into DRAT and plan to write a translator for
these proofs, at least to test the overhead this induces. The problem
with including the unmodified RAT rule is that it changes the meaning
of an atom. This is possible in the SAT context, where you know every
clause containing the atom and where atoms are uninterpreted. I don't
see how to do this in an SMT proof, especially when an atom is an SMT
term and you always want to be able to introduce arbitrary theory
lemmas for every literal in your proof. But it is possible to use
extended resolution to compactly represent a DRAT proof. There one
would explicitly redefine the meaning of literals and reprove the
known facts for the redefined literal.
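
The classic extended-resolution move such a translation relies on is
to introduce a fresh literal by definition; e.g., defining x to stand
for (or a b) adds the three clauses

  -x, +a, +b      ; x implies (a or b)
  +x, -a          ; a implies x
  +x, -b          ; b implies x

after which the known facts can be rederived in terms of x.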

Jochen

Simon Cruanes

Jul 20, 2021, 2:25:57 PM
to smt...@googlegroups.com, Tjark Weber, smt-...@cs.nyu.edu
Hi Jochen, Tjark,

On Tue, 20 Jul 2021, Jochen Hoenicke wrote:
> On Tue, 20 Jul 2021 at 15:51, Tjark Weber <tjark...@it.uu.se> wrote:
> > There are trade-offs: a simple proof format makes it easier to
> > implement a proof checker but makes it more difficult/laborious to
> > generate proofs, and might also lead to larger proofs.

I'm going to go in the other direction and suggest that proof formats
should favor solvers, not checkers. A strong benefit of DRUP/DRAT for
SAT solvers is that, while they're harder to check, they're very easy
to emit: just dump clauses as you infer them, and maybe emit
`d 1 2 3 0` when you GC a clause. I argue that this is almost
necessary for universal adoption: the easier a format is to emit, the
more likely its adoption. Adopt it, and checkers will come (if there
is at least one initial checker).
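
Schematically, such a log is nothing more than clause lines in DIMACS
notation, with `d` marking deletions and `0` terminating each line:

  1 2 0        derived clause   x1 v x2
  d 1 -3 0     delete clause    x1 v -x3
  1 0          derived unit     x1
  0            empty clause: unsat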

For SMT this need is even more pressing, as producing proofs is a lot
of work for a solver. My intuition is that a solver should be able to
emit a small number of "steps" for each clause it infers internally
(not necessarily one-to-one), at least for the common theories, where
you have UF lemmas, arithmetic lemmas, array axiom instantiations, etc.

If a step is unambiguous and can be checked in O(n log(n)), then the
checker should bear the burden of reimplementing a congruence closure
or a RUP-like reconstruction of resolution proofs.

Another benefit of such high-level proofs is performance and proof size
(smaller proofs mean less IO and less time spent generating the proof).

> But after some discussions with Pascal, Clark, and Bruno, it looks like
> it would not be feasible for most solvers to produce fully verifiable
> low-level proofs, especially when they use advanced techniques like
> symmetry breaking. As a compromise, we could allow a solver to assert
> arbitrary clauses, giving just a hint about which input these clauses
> were generated from, plus some solver-specific information that helps
> its own proof checker understand the proof. These solver-specific
> rules would be skipped by the proof checker, which could also produce
> a proof obligation, in the form of an SMT file, to check that the
> asserted clauses are indeed a consequence. Thus, if a solver does some
> commutativity rewrite, it could just assert the converted formula and
> state that it is a consequence of the input formula. The end goal
> would be to get rid of these trusted steps, e.g., by a postprocessor,
> but this may take years to achieve.

Such steps are checkable too if you state them as `t = u` with metadata
indicating that the AC status of `or` was used (or, really, of a set of
known symbols). The checker can normalize both t and u and check their
equality (as DAGs). That seems reasonably efficient to me.
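
For instance (in a made-up syntax), such a step would tell the checker
exactly which normalization to apply before comparing the two sides:

  (step t5 (= (or a (or b c)) (or c b a))
        :rule trusted-rewrite :args (:ac (or)))
  ; checker: flatten and sort the arguments of "or" on both sides,
  ; then compare the resulting DAGs for syntactic equality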


> Another concern that was raised is that some solvers want to
> introduce auxiliary variables for which they don't know an equivalent
> SMT term. An auxiliary variable can always be introduced by "let",
> but this requires knowing an equivalent SMT-LIB term.

A problem with `let` as a means to introduce sharing in the proof
is that its scope is limited to its body. I think it's cleaner to have
proof-level "definitions" (like `define-const` but extra-logical, i.e.,
the newly defined symbol and the term are actually the same once
parsing is done).

> Lastly, the common proof format doesn't have to become the default
> proof format of every solver. If you like your proof format, then by
> all means you can keep it. Just write a post-processor that
> translates the easy steps into low-level proof steps and skips the
> hard steps in your proof.

If people go through the hard work of implementing a proof format, it'd
better be the one everybody standardizes on :-).

Cheers,

--
Simon Cruanes

Jochen Hoenicke

Jul 20, 2021, 5:03:06 PM
to smt...@googlegroups.com
Hello list,

I accidentally hit reply to sender only, so here is my reply to Hans-Jörg.

---------- Forwarded message ---------

Hi Hans-Jörg,

On Tue, 20 Jul 2021 at 20:11, Hans-Jorg Schurr
<hans-jor...@inria.fr> wrote:
>
> Hi Jochen, Hello everyone,
>
> > Yesterday, I mentioned the plans for the Proof Validation Track. I
> > know that finding a good solver independent proof format is difficult.
>
> Thanks for taking on this certainly not easy task.
>
> I have some thoughts about all of this, mostly informed by our work
> on using SMT proofs in Isabelle and and subsequent effort of using the
> Alethe format in other contexts.
>
> I get the impression that we have some similar ideas. The QF_UF fragment
> of Alethe is very similar to what you described:

My proof format is a bit inspired by SMTCoq which I think was written
for veriT proofs. So I guess there are common ancestors.

> Nevertheless, I think one should think about an upgrade path for existing
> solvers. Even seemingly innocuous changes can be difficult to implement.
> For example, I spent two weeks last year trying to change veriT such
> that it doesn't always consider "a = b" and "b = a" as equal -- I had
> to give up in the end.

Yes, I think we have to include a "hole" rule that contains a
user-defined annotation, so that solvers can at least verify their own
proofs.

> I think, it is also very useful to make a distinction between clauses
> and disjunctions. Otherwise it's not clear if "(or a b)" is a unit or
> binary clause. I'm not sure if this is part of your proposal.

Yes, this is a good point. The original goal was to never write these
clauses, only the axiom names and their arguments. This avoids the
need for a concrete syntax for clauses. With a "hole" rule one may
have to give a concrete syntax.

I'm still not sure whether one should also distinguish between negated
literals and "not"-terms. It would create even more overhead and
would require Tseitin axioms for not. It could be avoided by
automatically converting every term starting with not to a literal
without not, handling any number of nested not. Only, it's a little
against my philosophy of keeping it really strict :)

> > Finally, for a compact representation of the proof, you need to
> > abbreviate common subterms. For this, I would suggest using the
> > standard let syntax from SMT-LIB. A difficulty added by the let
> > encoding is that terms need to be considered equal if one is the let
> > expansion of the other term.
>
> In Alethe we use the standard SMT-LIB :named annotation. This draws
> a clear distinction between `let`, which is part of the logic, and
> the sharing. It solves the problem you mentioned.

I don't see how it solves the problem that terms need to be compared
modulo expansion. You still need to expand the symbol to its defining
term when necessary. It would work the same as for let variables,
except that it is technically now a define-fun expansion.

I personally dislike :named, as just creating a named term has
side effects. In our API this can lead to inconsistencies if a user
creates a named term and then never uses it, or if the named term
doesn't appear in the right position in the term that is printed. It
also breaks commutativity: e.g., (+ (! (f x) :named s) (f s)) is fine,
but (+ (f s) (! (f x) :named s)) complains about the undefined symbol s.

Jochen

Jochen Hoenicke

Jul 20, 2021, 6:18:30 PM
to smt...@googlegroups.com
Hello Simon,

On Tue, 20 Jul 2021 at 20:26, Simon Cruanes <simon.cru...@m4x.org> wrote:
>
> Hi Jochen, Tjark,
>
> On Tue, 20 Jul 2021, Jochen Hoenicke wrote:
> > On Tue, 20 Jul 2021 at 15:51, Tjark Weber <tjark...@it.uu.se> wrote:
> > > There are trade-offs: a simple proof format makes it easier to
> > > implement a proof checker but makes it more difficult/laborious to
> > > generate proofs, and might also lead to larger proofs.
>
> I'm going to go the other direction and suggest that proof formats
> should favor solvers, not checkers. A strong benefit of DRUP/DRAT for
> SAT solvers is that, while they're harder to check, they're very easy to
> emit. Just dump clauses as you infer them, and maybe emit `d 1 2 3 0` when you
> GC the clause. I argue that this is almost necessary for universal adoption:
> the easier to emit, the more likely the adoption. Adopt it, and checkers
> will come (if there is at least one initial checker).

I want to mention that a let-based format still allows printing the
proof on the fly to a log file:

(let ((p1 ...))
(let ((input1 (assert p999)))
(let ((consequence1 (res p999 input1 (axiom ...))))
...
(let ((emptyclause (res ...)))
emptyclause
)))))))))))))))))))))))))))))))

If you combine this with your own high-level proof rules, you can get
a compact log format that can be later post processed.

> If a step is unambiguous and can be checked in O(n log(n)), then the
> checker should bear the burden of reimplementing a congruence closure
> or a RUP-like reconstruction of resolution proofs.

Here I strongly disagree. If the proof checker needs some of the same
components and complexity as the SMT solver (congruence closure,
unit-propagation engine), there is the danger that the code gets
reused and has exactly the same bugs. Who checks the proof checker?

> Another benefit of such high level proofs is performance and proof size
> (smaller proofs means less IO and less time spent generating the proof).

I think for high-level proofs every solver needs its own proof
checker. A common proof format with high-level proofs would just end
up as the union of all the rules used by the solvers, and you'd get a
mess. The "solver independent" proof validator would probably have
more bugs than any of the solvers. And it wouldn't be solver
independent, but would just support the solvers whose rules were
included. I would rather have solver-specific high-level "proof
explainers" that replay the proofs of their solver using their own
congruence closure engine and then explain to the solver-independent
proof validator the steps involved in this proof. This way, if the
proof explainer has a bug, the validator would catch it. And it's much
simpler to only support the proof format of your own solver, where you
know exactly which rules can be used at which time.

> > Thus, if a
> > solver does some commutativity rewrite, it could just assert the
> > converted formula and state that it's a consequence of the input
> > formula.
>
> Such steps are checkable too if you state them as `t = u` with metadata
> indicating that AC status of `or` was used (or, really, of a set of known
> symbols). The checker can normalize both t and u and check their
> equality (as a DAG). That seems reasonably efficient to me.

But that would mean that the solver-independent proof checker needs to
implement a rule for one of the solvers that does AC rewriting, even
though other solvers may never make use of this rule. The next solver
wants to convert everything into negation normal form, so you also
need a rule for that.

> > Another raised concern is that some solvers want to introduce
> > auxiliary variables for which they don't know an equivalent SMT term
> > for. An auxiliary variable can always be introduced by "let", but
> > this requires knowing an equivalent SMT-LIB term.
>
> A problem with `let` as a means to introduce sharing in the proof
> is that it means scoping to its body. I think it's cleaner to have
> proof-level "definitions" (like `define-const` but extra-logical, i.e.
> the newly defined symbol and the term are actually the same once parsing
> is done).

This may lift a small burden from the validator, but the validator has
to handle the full let syntax anyway, since it needs to parse the
input formula. If you worry that the scope of "let" in the proof you
produce may be wrong, just use the proof logging syntax I sketched
above.

> > Lastly, the common proof format doesn't have to become the default
> > proof format of every solver. If you like your proof format, then by
> > all means you can keep it. Just write a post-processor that
> > translates the easy steps into low-level proof steps and skips the
> > hard steps in your proof.
>
> If people go through the hard work of implementing a proof format, it'd
> better be the one everybody standardizes on :-).

People have different needs. We want to compute interpolants from
proofs and have specific needs for this, e.g., the solver must not
create new terms in quantifier-free logics. I wouldn't want to force
our needs onto the community; I'm not sure how many solver authors
would agree that our extensionality lemma for arrays, which avoids
creating the diff(a,b) term, belongs in the "right" proof format.

My goal for this solver-independent proof format is to get more trust
in the results of SMT solvers that use it, and also to decide the
status of unknown benchmarks by having independently verifiable
proofs.

Regards,

Jochen Hoenicke


--
Jochen Hoenicke,
Email: hoen...@informatik.uni-freiburg.de Tel: +49 761 203 8243

Simon Cruanes

Jul 20, 2021, 6:25:48 PM
to smt...@googlegroups.com
I'll add another bring-your-own-proof-format instance to this
discussion, because it illustrates some of the points Jochen makes.
I call it "Quip" (for "quick proofs").

I have a WIP checker [here](https://github.com/c-cube/quip)
and a WIP reference document
[here](https://c-cube.github.io/quip-book/).

A small example (eq_diamond2 in QF_UF):
https://c-cube.github.io/quip-book/reference/example.html

Roughly I'd say it's close to Alethe but with coarser grained
inferences, explicit polarities in clauses, and a lot of vaporware when
it comes to instantiations since I don't have any actual quantified
proofs in it yet.

On Tue, 20 Jul 2021, Jochen Hoenicke wrote:
> > I think, it is also very useful to make a distinction between clauses
> > and disjunctions. Otherwise it's not clear if "(or a b)" is a unit or
> > binary clause. I'm not sure if this is part of your proposal.
>
> Yes, this is a good point.
> [...]
> I'm still not sure whether one should also distinguish between negated
> literals and "not"-terms. It would create even more overhead and
> would require Tseitin axioms for not. It could be avoided by
> automatically converting every term starting with not to a literal
> without not, handling any number of nested not. Only, it's a little
> against my philosophy of keeping it really strict :)

Agreed that it's useful to have first-class clauses. For the polarities,
I currently have them; a clause looks like this:
`(cl (+ a) (+ b) (- (or a b)))`. However I've found the need for a rule
named "nn" (negation normalization) which turns `(cl (- (not a)) …)`
into `(cl (+ a) …)`, and it's a bit ugly and unclear when the solver
needs to produce it.

On the other hand it's very handy to have these literals when talking
about EUF lemmas, as it makes it clear what is proven from what.

> > > Finally, for a compact representation of the proof, you need to
> > > abbreviate common subterms. For this, I would suggest using the
> > > standard let syntax from SMT-LIB. A difficulty added by the let
> > > encoding is that terms need to be considered equal if one is the let
> > > expansion of the other term.

Assuming a proof checker represents terms as DAGs (as it should!
Efficiency requires it, I think), the abbreviations can just be at the
syntax level and fully expanded before checking, just like SMT solvers
do.

I'm dubious my format will become popular in such a (suddenly) crowded
space, but hopefully it can illustrate a different point in the design
space.

Cheers,

--
Simon Cruanes

Jochen Hoenicke

Jul 21, 2021, 4:03:57 AM
to smt...@googlegroups.com
On Wed, 21 Jul 2021 at 00:25, Simon Cruanes <simon.cru...@m4x.org> wrote:
>
> A small example (eq_diamond2 in QF_UF):
> https://c-cube.github.io/quip-book/reference/example.html

Thanks for the input. Your proof format is on the same level as the
format I have in mind, and converting between the formats should be
straightforward (at least for the proof on this page). Your bool-c
rule corresponds to my Tseitin axioms; the ccl rule in that example is
my transitivity axiom and would be written (transitivity x0 y0 x1). My
planned syntax is a bit less verbose, omitting the proved clause from
the stepc (in my syntax, deft and stepc are just let). I'm not sure
whether including the clause is helpful for debugging. I think if the
resolution rule takes the pivot literal and checks for the presence of
the pivot literal in the antecedents, that already helps enough. The
proof checker could also dump the clauses for debugging purposes.

> Roughly I'd say it's close to Alethe but with coarser grained
> inferences, explicit polarities in clauses, and a lot of vaporware when
> it comes to instantiations since I don't have any actual quantified
> proofs in it yet.

I guess your ccl rule allows more than just transitivity. We have a
similarly complicated cc-lemma in our prover. It produces the same
clause as your ccl rule and annotates it with all the paths used in
the CC graph (for fast proof checking; we have to traverse the paths
anyway to collect the equalities). When combining this with the array
theory (which is integrated into the CC theory), the complexity became
too much to handle, especially for our interpolator. We found that
post-processing the cc-lemma to use only transitivity, congruence, and
hyper-resolution is not so difficult, and it made interpolation of CC
lemmas and proof checking almost trivial. The post-processor just
builds a transitivity axiom for each path and a congruence axiom for
each equality on a path that is not already in the clause, orders
these axioms by dependency, and then combines the axioms with a single
hyper-resolution step.
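
Schematically, a CC lemma such as "-(= a b), -(= b c), +(= (f a) (f c))"
decomposes into

  trans:                   -(= a b), -(= b c), +(= a c)
  cong:                    -(= a c), +(= (f a) (f c))
  hyper-res on (= a c):    -(= a b), -(= b c), +(= (f a) (f c))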

> On Tue, 20 Jul 2021, Jochen Hoenicke wrote:
> > I'm still not sure whether one should also distinguish between negated
> > literals and "not"-terms. It would create even more overhead and
> > would require Tseitin axioms for not. It could be avoided by
> > automatically converting every term starting with not to a literal
> > without not, handling any number of nested not. Only, it's a little
> > against my philosophy of keeping it really strict :)
>
> Agreed that it's useful to have first-class clauses. For the polarities,
> I currently have them; a clause looks like this:
> `(cl (+ a) (+ b) (- (or a b)))`. However I've found the need for a rule
> named "nn" (negation normalization) which turns `(cl (- (not a)) …)`
> into `(cl (+ a) …)`, and it's a bit ugly and unclear when the solver
> needs to produce it.

The Tseitin clauses for "not" are just "+(not a), +a" and
"-(not a), -a", and resolving with one of them is your nn step.
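
Concretely, resolving with the first of these clauses performs your nn
normalization:

  clause:                 C, -(not a)
  Tseitin "not":          +(not a), +a
  resolvent on (not a):   C, +a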

If "not" itself is used to represent negated literal in clauses, you get a
problem with double negation. Resolving "C \/ (not (not a))" with
"(not a) \/ a" on the pivot "(not a)" does not only look strange, but you
are resolving with a tautological clause containing a literal with both
polarities. To avoid this in this setting, a double-negation must be
removed automatically and silently.

I have a preliminary write-up of my axioms and overall format, but I
wanted to clean it up and incorporate the feedback I got so far. I
also haven't started on the proof checker yet, although it wouldn't
differ much from the proof checker we have in our solver with all the
high-level proof rules stripped out.

Regards,
Jochen

Nikolaj Bjorner

Jul 21, 2021, 9:04:15 AM
to smt...@googlegroups.com, Tjark Weber, smt-...@cs.nyu.edu
For proofs created during search, I am now adopting a form of DRUP(T).
This format does not cover preprocessing.
I found that creating low-level proof objects during search imposes
too much complication, when the main feature needed for proof checking
is mainly to re-validate a sequence of claims.
The solver generates a proof log that extends DRUP (generating proper
DRAT is possible, but rare, for uninterpreted propositional variables;
I don't generate PR/SPR).
Recall that DRUP is just a sequence of claims: that a clause is a
consequence, or that future consequences don't depend on a deleted
clause. The log contains:

1. declarations (declare constants, functions, sorts using SMT-LIB 2 conventions)
2. term definitions (declare AST nodes)
3. input assertions (basically the formula after preprocessing)
4. consequences (propositional or theory consequences)
5. theory tautologies
6. deleted clauses
7. possibly also redundant clauses (that preserve satisfiability) with extra symbols.
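
Schematically (in a purely illustrative syntax; this is not the actual
log format), one entry of each kind might look like:

  (declare-fun f (Int) Int)          ; 1. declarations
  (declare-const x Int)
  (declare-const p Bool)
  (define t1 (f x))                  ; 2. term definition (AST node)
  (input (or (= t1 0) p))            ; 3. input assertion
  (infer (not p))                    ; 4. consequence, re-validated by RUP(T)
  (tautology (or (not (= t1 0)) (<= t1 0)))  ; 5. theory tautology
  (del (or (= t1 0) p))              ; 6. deleted clause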

For DIMACS input, the proof logs can be consumed by DRAT tools.

Currently, a similar format is produced for tracing search and
consumed by "axiom profilers". Having both tracing and proof
construction in one entity is preferable from the point of view of
maintaining such code (the format consumed by current axiom profilers
is prone to bit rot).

I am using this format for self-validation (a proof checker using Z3
to check consequences), but a preferred validator would be
independent. It can verify theory tautologies on its own and produce a
residue of unverified tautologies.

Nikolaj



BOBOT Francois

Jul 21, 2021, 9:04:15 AM
to smt...@googlegroups.com, chantal...@lri.fr
Hi,

On Wednesday, 21 July 2021 at 00:18 +0200, Jochen Hoenicke wrote:
> On Tue, 20 Jul 2021 at 20:26, Simon Cruanes <simon.cru...@m4x.org> wrote:
>
> > If a step is unambiguous and can be checked in O(n log(n)), then the
> > checker should bear the burden of reimplementing a congruence closure
> > or a RUP-like reconstruction of resolution proofs.
>
> Here I strongly disagree. If the proof checker needs some of the same
> components and complexity as the SMT solver (congruence closure,
> unit-propagation engine), there is the danger that the code gets
> reused and has exactly the same bugs. Who checks the proof checker?

Formally proving a congruence closure or unit-propagation engine
correct is no longer beyond the state of the art. We have the tools to
formally verify those questionable parts of the checker. Such a proof
helps to bridge the gap between an algorithm that we prove correct for
all inputs and one that we check on each instance.

My two cents,

Best,

--
François

Haniel Barbosa

Jul 21, 2021, 7:14:52 PM
to smt...@googlegroups.com
Hello,

So I see that we have many competing views, what a surprise. :)

I agree with the general feeling that a proof track in SMT-COMP, in
the interest of adoption, should not put a heavy (virtually
insurmountable?) burden on the developers. I believe that forcing the
solvers to implement their own elaborators increases this burden
prohibitively. Therefore my view is:

- one format is used for the proof track. Every entrant must print
proofs in that format. This format contains coarse-grained (hard to
check) and fine-grained (easy to check) rules.

- a tool, the validator, is used by the competition; it can check
fine-grained steps, and maybe elaborate coarse-grained steps into
fine-grained ones to be checked.

There are a lot of unknowns in the above. For things the validator
can't elaborate (preprocessing passes may be a common instance of
this) one could, as Jochen suggested, generate SMT problems and throw
SMT solvers at them, as we do in the unsat-core track.

Defining ways to score solvers with these unknowns would be hard as
well. Probably there could be different tracks, for "valid proofs"
vs. "good proofs".

What is the format? What is the validator? I think it's not reasonable
to expect a lot of work on this solely for the purpose of running an
SMT-COMP track. Therefore I believe it makes sense to piggyback on
projects with a wider scope. The Alethe format (bias alert: I'm
involved with it) is supported by multiple SMT solvers (veriT and
cvc5), uses SMT-LIB as its term language, has coarse- and fine-grained
steps, and is integrated into Coq [1] and Isabelle/HOL [2]. The
project is actively developed, and multiple ways of extending it are
being considered. An independent checker, in Rust, is under
development; it is intended not only to elaborate proofs into Alethe's
own fine-grained steps but also to target other backends, like LFSC,
Lean, or other low-level proof checkers people may have and want to
contribute to. It's not a closed project.

So it might be of interest to consider it instead of new formats, or
translations into it from your format of choice.

A note on François' two cents: I agree that one can handle
coarse-grained steps in a trusted way with certified complex checkers,
but given the sheer scope of SMT-LIB theories and all the moving parts
that go into them, I believe the more flexible approach of pushing
this complexity into elaborators, which do not need to be correct by
construction, is more suitable.

Best,

[1] SMTCoq: A Plug-In for Integrating SMT Solvers into Coq
[2] Reliable Reconstruction of Fine-grained Proofs in a Proof Assistant
--
Haniel Barbosa
https://homepages.dcc.ufmg.br/~hbarbosa/

Viktor Kunčak

Jul 21, 2021, 7:14:52 PM
to smt...@googlegroups.com
Nikolaj,
Does this format also work for quantified formulas?
Do you have implications representing quantifier instantiations, so that they fall under 4. or 5.?
