Prolog performance

185 views
Skip to first unread message

Leszek Buczkowski

unread,
May 25, 2025, 3:06:01 PMMay 25
to Shen
If anyone was curious, here is a quick speed comparison of Prolog in Shen (both SBCL and Scheme version) with SWI-Prolog and Trealla Prolog.
The test was performed on a MacBook Air M3.

The Prolog version of the program:

permute_with_insert([], []).
permute_with_insert([L|Ls0], Ps) :-
    permute_with_insert(Ls0, Ls),
    insert(L, Ls, Ps).

insert(X, Ys, [X|Ys]).
insert(X, [Y|Ys], [Y|Zs]) :-
    insert(X, Ys, Zs).    

benchmark(Ls) :-
    time((permute_with_insert(Ls, _), fail ; true)).

main :-
    benchmark([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]),
    halt.

The Shen version of the program (insert renamed to insert1 because Shen SBCL was complaining):

(defprolog permute_with_insert
  [] [] <-- ;    
  [L|Ls0] Ps <-- (permute_with_insert Ls0 Ls) (insert1 L Ls Ps);
)  

(defprolog insert1
  X Ys [X|Ys] <-- ;
  X [Y|Ys] [Y|Zs] <-- (insert1 X Ys Zs);
)

(defprolog benchmark
  Ls <-- (permute_with_insert Ls _) (when false);
  Ls <-- ;
)

(time (prolog? (benchmark [1 2 3 4 5 6 7 8 9 10 11 12])))


Trealla Prolog

~/prolog % tpl -g main benchmark.pro
% Time elapsed 92.965s, 1480959528 Inferences, 15.930 MLips

SWI-Prolog

~/prolog % swipl -g main benchmark.pro
% 522,956,326 inferences, 31.920 CPU in 31.965 seconds (100% CPU, 16383472 Lips)

Shen SBCL

~/shen % shen-sbcl
Shen, www.shenlanguage.org, copyright (C) 2010-2024, Mark Tarver
version: S39, language: Common Lisp, platform: SBCL 2.0.0
port 3.3, ported by Mark Tarver

(0-) (load "benchmark.shen")
(fn permute_with_insert)
(fn insert1)
(fn benchmark)

run time: 449.5423288717866 secs
true

Shen Scheme

~/shen % shen-scheme eval -l benchmark.shen
(fn permute_with_insert)
(fn insert1)
(fn benchmark)

run time: 173.675523 secs
true


Leszek


dr.mt...@gmail.com

unread,
May 25, 2025, 3:36:23 PMMay 25
to Shen
That's interesting.  By default Shen Prolog uses occrs check unification.  Did you
try entering (occurs-check -) and then compiling and running this benchmark?  
It might make a difference.

Mark

Leszek Buczkowski

unread,
May 25, 2025, 4:10:33 PMMay 25
to Shen
There is no timing difference between:

(occurs-check -)
(time (prolog? (benchmark [1 2 3 4 5 6 7 8 9 10 11 12])))

and:

(occurs-check +) 

(time (prolog? (benchmark [1 2 3 4 5 6 7 8 9 10 11 12])))

in either the Scheme or SBCL versions.

Leszek

dr.mt...@gmail.com

unread,
May 25, 2025, 4:19:56 PMMay 25
to Shen
OK; this is somewhat different from my experience.  Did you recompile your program
after disabling occurs checking?   Because if not, then you won't get any difference.
This is a compiler directive not a dynamic instruction.   Here is what I get.


Shen, www.shenlanguage.org, copyright (C) 2010-2024, Mark Tarver
version: S39.0, language: Scheme, platform: chez-scheme 10.0.0
port 0.38, ported by Bruno Deferrari


(0-) (defprolog permute_with_insert

  [] [] <-- ;
  [L|Ls0] Ps <-- (permute_with_insert Ls0 Ls) (insert1 L Ls Ps);
)
(fn permute_with_insert)

(1-)

(defprolog insert1
  X Ys [X|Ys] <-- ;
  X [Y|Ys] [Y|Zs] <-- (insert1 X Ys Zs);
)
(fn insert1)

(2-)

(defprolog benchmark
  Ls <-- (permute_with_insert Ls _) (when false);
  Ls <-- ;
)
(fn benchmark)

(3-) (time (prolog? (benchmark [1 2 3 4 5 6 7 8 9 10 11])))

run time: 62.28125 secs
true

(4-) (occurs-check -)
false

(5-) (defprolog permute_with_insert

  [] [] <-- ;
  [L|Ls0] Ps <-- (permute_with_insert Ls0 Ls) (insert1 L Ls Ps);
)
(fn permute_with_insert)

(6-)

(defprolog insert1
  X Ys [X|Ys] <-- ;
  X [Y|Ys] [Y|Zs] <-- (insert1 X Ys Zs);
)
(fn insert1)

(7-)

(defprolog benchmark
  Ls <-- (permute_with_insert Ls _) (when false);
  Ls <-- ;
)
(fn benchmark)

(8-) (time (prolog? (benchmark [1 2 3 4 5 6 7 8 9 10 11])))

run time: 37.359375 secs
true

Mark

dr.mt...@gmail.com

unread,
May 25, 2025, 8:33:58 PMMay 25
to Shen
(3-) (do (time (prolog? (benchmark [1 2 3 4 5 6 7 8 9 10 11 12]))) (inferences))

run time: 454.0625 secs
1001957928

w.o. occurs check.

Interesting that Trealla, Shen/Scheme Prolog and SWI all disagree on the LIPS.
Shen/Scheme Prolog is running about 2.25 MLips under my machine.  The Air Book 
processor looks much faster than my Intel processor.

M.


dr.mt...@gmail.com

unread,
May 25, 2025, 8:35:35 PMMay 25
to Shen
And last of all

(do (time (prolog? (benchmark [1 2 3 4 5 6 7 8 9 10 11 12]))) (inferences))

run time: 434.484375 secs
1001957928

This 20 second reduction comes from mode declarations.

(defprolog permute_with_insert
  [] [] <-- ;    
  (- [L|Ls0]) Ps <-- (permute_with_insert Ls0 Ls) (insert1 L Ls Ps);

)  

(defprolog insert1
  X Ys [X|Ys] <-- ;
  X (- [Y|Ys]) [Y|Zs] <-- (insert1 X Ys Zs);

)

(defprolog benchmark
  Ls <-- (permute_with_insert Ls _) (when false);
  Ls <-- ;
)

M.

Leszek Buczkowski

unread,
May 26, 2025, 5:46:01 AMMay 26
to Shen
Indeed, previously I had (occurs-check -) placed in code and ran the script as a whole.

Here is the timing for the shen-scheme and the 12 element list:

shen eval -l benchmark.shen
run time: 173.494235 secs

shen eval -l benchmark-modedecl.shen
run time: 172.569299 secs

shen eval -e '(occurs-check -)' -l benchmark.shen
run time: 94.312878 secs

shen eval -e '(occurs-check -)' -l benchmark-modedecl.shen
run time: 93.851341 secs

With occurs checking disabled the speed of the Shen + Chez duo is identical to Trealla Prolog, which is an interesting result.

Leszek

dr.mt...@gmail.com

unread,
May 26, 2025, 10:41:52 AMMay 26
to Shen
Yes, particularly because Shen Prolog runs under 14 languages
and was constrained to avoid any platform specific optimisations!
The pre S version would not run your benchmark because it lacked
an effective memory manager for Shen Prolog.   However only Scheme 
and SBCL AFAIK run the S version.  If I am wrong here, somebody can 
correct me.

Trealla Prolog is apparently a Prolog interpreter written in C.   That
sort of takes me back because the first widely available Prolog -
C Prolog - was the same.  It came out in 1981 and ran at 1 KLIP
on a VAX-750 if I remember.   How it would measure against Trealla
on modern hardware is an interesting guess.

Mark 

Leszek Buczkowski

unread,
May 26, 2025, 12:23:46 PMMay 26
to Shen

We’ve got an in-house Prolog that was written for the HP 9000 back in the 1980s and hasn’t really been touched since. The benchmark runs in about 200 seconds. I’m guessing C-Prolog would perform similarly. I tried building it once, but the only source I found seems to be broken: https://www.softwarepreservation.org/projects/prolog/edinburgh/src.

Leszek

Bruno Deferrari

unread,
May 26, 2025, 12:37:23 PMMay 26
to qil...@googlegroups.com
In Shen/Scheme, code loaded through the REPL runs slower than code precompiled into the boot image.

Also I never made any attempt to optimize the output of the Shen-Prolog->KLambda to Scheme compiler, so there is room for improvement.

One thing I remember is that Shen-Prolog produces a decent amount of intermediary lambdas: some called in the function's body,
others to be used as continuations that are passed around.
The ones that never leave the function's body could be processed to lift them and make sure they do not reference any free variables,
similar to what happens with function factorization.
This allows the Chez Scheme compiler to skip closure allocations and use direct jumps instead of closure calls.

For the continuations that are passed around I am not sure, maybe some inlining could help, or a conversion that 
uses trampolines and avoids allocating and passing closures around.

--
You received this message because you are subscribed to the Google Groups "Shen" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qilang+un...@googlegroups.com.
To view this discussion, visit https://groups.google.com/d/msgid/qilang/9fadc2b4-0d6b-4aa5-9579-20e1bd84d3f5n%40googlegroups.com.


--
BD
Reply all
Reply to author
Forward
0 new messages