ann: THORN 18

58 views
Skip to first unread message

dr.mt...@gmail.com

unread,
Jun 19, 2025, 7:43:11 AMJun 19
to Shen
This is a revamped version.  I edited the code on paramodulation and improved
proof generation for equality proofs.   THORN 18 accepts orthodox first-order 
notation, (so x = y is [x = y]) .  THORN requires the standard library to be loaded.You can download THORN here.

Other than that, the system is remarkably simple to use.  Simply download,
enter Shen, type (load "THORN 18.shen") w.o. type checking enabled and 
(load "datatypes.shen") and switch on the type checker.   Go to the folder
Problems and load any of the files there.  There's lots of examples of
THORN propositions there.  BTW prop is a type in THORN.   For any
successful run there is a file prf.txt in your home directory containing a proof.

The doc page for THORN is here.  It details the settings you can twiddle, including
time limits, complexity limits, depth of search etc.  

I'd be very interested in hearing the results from Mac users under Shen/Scheme on
entering (load "schubert.shen") in the Problems folder.   I get 

(<-kb (steamroller))
run time: 2.5940093994140625 secs
5088693 inferences, paramodulation = false
depth = 20, complexity = -1, timeout = 60 secs
true : boolean

under Windows SBCL/Shen.  I think you Mac+Shen/Scheme guys will beat this.
Have a look at prf.txt in your home directory for the gemerated proof.

Mark

Bruno Deferrari

unread,
Jun 19, 2025, 9:56:34 AMJun 19
to qil...@googlegroups.com
This is what I got (this is a quite fast macbook I bought last year, with and M3 Max CPU):

run time: 0.2779609999999999 secs

5088693 inferences, paramodulation = false
depth = 20, complexity = -1, timeout = 60 secs
true : boolean


--
You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+un...@googlegroups.com.
To view this discussion, visit https://groups.google.com/d/msgid/qilang/f80b8110-ee1e-4216-a2d6-7dd5ff62a31en%40googlegroups.com.


--
BD

dr.mt...@gmail.com

unread,
Jun 19, 2025, 10:48:11 AMJun 19
to Shen
An order of magnitude - about 20 MLips which is quite respectable.
There is an old paper from 1984 detailing Schubert's Steamroller
which was unsolved by computer at that time.  But the guy who
got the computer to produce a proof used a many sorted axiomatisation
to do it which effectively meant the problem was annotated with 
clues to its solution.  

Interestingly ChatGPT gets the history of this problem wrong.

The first attempt to tackle Schubert’s Steamroller using unsorted first-order logic was made in the late 1970s with the Markgraf Karl Refutation Procedure (MKRP)—a theorem prover in development at Karlsruhe. In 1978, Christoph Walther and colleagues fed the unsorted axiomatization into MKRP, but it failed to find a refutation sciencedirect.com+8swi-prolog.discourse.group+8d-nb.info+8.

So, to be clear:

  • MKRP (unsorted resolution) was the first system to try proving the problem without using many‑sorted logic—and it did not succeed en.wikipedia.org+6swi-prolog.discourse.group+6d-nb.info+6.

  • The first successful automated proof came only later—and only when sorted logic was employed (SAM and Walther’s work in the 1980s).

In short, no system without sorted logic ever solved Steamroller; MKRP was the first to try and the first to fail.

The statement in red is false.  I can't find a reference for the first ATP without using sorts to solve this problem; but Prover9 can and Vampire will I'm sure.  

Mark

dr.mt...@gmail.com

unread,
Jun 20, 2025, 10:40:05 AMJun 20
to Shen
Very good.  I think I'll take the MAC/Scheme timings as canonical for what THORN can do
in my write-up.  If you load eq.shen in Problems you'll also get a figure.  Mine under SBCL/Shen is

run time: 4.203125 secs
4144451 inferences, paramodulation = false
depth = 13, complexity = -1, timeout = 60 secs
true : boolean

run time: 4.578125 secs

How do you fare on the Mac? That is a humanly unsolvable problem BTW from equivalential calculus.

Mark  

On Thursday, 19 June 2025 at 14:56:34 UTC+1 Bruno Deferrari wrote:

Bruno Deferrari

unread,
Jun 20, 2025, 10:46:05 AMJun 20
to qil...@googlegroups.com
This is what I got:

run time: 0.46019299999999996 secs

4144451 inferences, paramodulation = false
depth = 13, complexity = -1, timeout = 60 secs
true : boolean

run time: 0.517352 secs



--
BD

dr.mt...@gmail.com

unread,
Jun 20, 2025, 11:07:41 AMJun 20
to Shen
Great; an order of magnitude.  In the file pelletier.shen, you will find
a list of all the Pelletier challenge problems for propositional calculus
including one that is commented out

\\ (<-kb [[[p <=> q] <=> r] <=> [p <=> [q <=> r]]])  blows SBCL stack

I think you can uncomment that problem.  If you do that and load
pelletier.shen what do you get?

I can't do this in SBCL/Shen as said.

Mark

Bruno Deferrari

unread,
Jun 20, 2025, 11:24:15 AMJun 20
to qil...@googlegroups.com
This is the result (in red the one that blows the stack in SBCL):

run time: 2.169999999999117e-4 secs
416 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 5.100000000002325e-5 secs
21 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 5.699999999997374e-5 secs
29 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 1.340000000000785e-4 secs
384 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 1.1899999999998023e-4 secs
303 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 3.4000000000089514e-5 secs
21 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 3.500000000000725e-5 secs
21 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 6.799999999995698e-5 secs
126 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 7.769999999998056e-4 secs
4212 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 0.002286999999999928 secs
9356 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 0.019735000000000058 secs
142158 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean


run time: 3.769999999998497e-4 secs
2494 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 5.450000000000177e-4 secs
4148 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 5.200000000016303e-5 secs
126 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 2.799999999991698e-5 secs
29 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 0.0012090000000000156 secs
7552 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

run time: 0.04309800000000008 secs



--
BD

dr.mt...@gmail.com

unread,
Jun 20, 2025, 2:40:35 PMJun 20
to Shen
Brilliant wrt 

run time: 0.019735000000000058 secs
142158 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

If you run this on its own how long is the proof in prf.txt in steps?
You should find that file in the home directory.

Mark

Bruno Deferrari

unread,
Jun 20, 2025, 3:03:24 PMJun 20
to qil...@googlegroups.com
540 steps, you can see the full output here: https://pastebin.com/raw/RFhyAt9S

Btw, do you see any difference in performance in SBCL/Shen if you disable this file I/O?
In the case of Shen/Scheme I/O operations are somewhat optimized and take advantage of buffering, IIRC SBCL/Shen flushed output for every byte, so that may be adding considerable overhead.



--
BD

dr.mt...@gmail.com

unread,
Jun 20, 2025, 3:31:52 PMJun 20
to Shen
Yes, massive.  THORN can solve problems quickly but sometimes the proofs are very large.
They are however, quite readable if not exactly exciting.  

In Prover9 they print times in the form x secs + y secs.  The x refers to the time taken to solve
the problem and the y the time to produce the proof.   Undoubtedly in many of these problems
y is much greater than x.  But I haven't bothered seperating the two as does Prover9, partly because
giving the proof is IMO essential in solving the problem.

FYI, THORN actually solves every problem twice.  The first time it carries around a free variable Prf
which is bound to the rules invoked in the course of finding a solution and their order.  Then it runs the problem
again with the bound value of Prf.  The Prolog engine is forced to behave deterministically by this
bound value.  So what we are pleased to call a proof in THORN is a trace of the execution of a 
deterministic logic program.

Well done optimising I/O.  You have a talent for this work.

Mark

dr.mt...@gmail.com

unread,
Jun 20, 2025, 5:18:06 PMJun 20
to Shen
The last two challenge problems; again from equivalential calculus.
Again load ec.shen.   Now enter

(kb-> [(mp) (yqf)])

and then (<-kb (ec)).

And 

(kb-> [(mp) (yqj)])

and again

(<-kb (ec)).   See what you get in terms of proofs and performance.

And that wraps it all up.  Thanks again.

Mark Tarver

unread,
Jun 20, 2025, 5:21:07 PMJun 20
to qil...@googlegroups.com
BTW, there are a dozen or so formulations of equivalential calculus given code names
YQF and YQJ are two of them.   There are also levels of difficulty for these systems,
because some of these formulations make it harder to prove the same theorems.

M.

Bruno Deferrari

unread,
Jun 20, 2025, 7:34:24 PMJun 20
to qil...@googlegroups.com
For the first I get: https://pastebin.com/raw/6Qutnts5

run time: 0.0073190000000000754 secs
29244 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean

and for the second: https://pastebin.com/raw/ZdfkxEgT

run time: 0.24884700000000004 secs
1745565 inferences, paramodulation = false

depth = 20, complexity = -1, timeout = 60 secs
true : boolean



--
BD

dr.mt...@gmail.com

unread,
Jun 20, 2025, 8:22:35 PMJun 20
to Shen
That wraps it up.  Here is a summary.

Problem                                Time (sec) Inferences MegaLIPS Steps in Proof
Schubert’s Steamroller 0.278         5,088,693 18.3         78
Pelletier (all)                   0.025         171,270         6.85 (longest) 540
Equivalential Calculus
YQL axiomisation                 0.46          4,144,451 9.09         75
YQF axiomisation              0.007          29,244         4.17         85
YQJ axiomisation                 0.249          1,745,565  7.01         69

thx 

Mark

Reply all
Reply to author
Forward
0 new messages