feeding ChatGPT the ATP paper

60 views

Skip to first unread message

dr.mt...@gmail.com

unread,

Sep 1, 2025, 2:07:08 AMSep 1

to Shen

Based on taking the path of the least resistance, I fed ChatGPT

the paper I wrote with it. It seems to have icked up and after

a few hiccups started proving RCC theorems. The first is

[all x [all y [all z [[[tpp x y] & [ec y z]] => [[dc x z] v [ec x z]]]]]].

(34+) (define plan_tpp_ec_derivable
{--> plan}
-> (let* G (rcc-axioms)
Goal [all x [all y [all z [[[tpp x y] & [ec y z]] => [[dc x z] v [ec x z]]]]]]
\\ Lemmas that are definitional consequences and help reduce search:
Lem1 [all x [all y [[tpp x y] => [pp x y]]]]
Lem2 [all x [all y [[pp x y] => [p x y]]]]
Lem3 [all x [all y [[ec x y] => [c x y]]]]
Lem4 [all x [all y [[ec x y] => [~ [o x y]]]]]

[G Goal (initialise-plan)
(comment "Step 1: Reduce TPP ⇒ PP, PP ⇒ P.")
(kb-> G)
(<-kb Lem1)
(kb-> [Lem1 | G])
(<-kb Lem2)
(kb-> [Lem2 Lem1 | G])
(comment "Step 2: Capture that EC implies C and not O.")
(<-kb Lem3)
(kb-> [Lem3 Lem2 Lem1 | G])
(<-kb Lem4)
(kb-> [Lem4 Lem3 Lem2 Lem1 | G])
(comment "Step 3: With these simplifications, attempt the goal.")
(<-kb Goal)
(end-of-plan)]))
(fn plan_tpp_ec_derivable) : (--> plan)

(35+) (succeeds? (plan_tpp_ec_derivable))

run time: 0.0 secs
95 inferences, equality = false
depth = 20, complexity = -1, timeout = 5 secs

run time: 0.0 secs
184 inferences, equality = false
depth = 20, complexity = -1, timeout = 5 secs

run time: 0.0 secs
152 inferences, equality = false
depth = 20, complexity = -1, timeout = 5 secs

run time: 0.0 secs
218 inferences, equality = false
depth = 20, complexity = -1, timeout = 5 secs

run time: 6.671875 secs
16994605 inferences, equality = false
depth = 20, complexity = -1, timeout = 5 secs
true : boolean

Just look at the figures in that plan; taking the product of the lemmas as a guide to the size of raw search space this baby would take 113 years of CPU time - reduced to 6.67 sec - a speedup around 10^8 (figures calculated by ChatGPT).

ChatGPT does take short cuts (e.g. adding the theorem as an axiom!) which you have to block BUT the insight is quite something. I could not have arrived at that plan w.o. days of thought. What is more impressive is that, AFAIK, there are no sample proofs in RCC from which ChatGPT can draw. RCC is a very obscure and difficult first-order theory.