:- set_prolog_flag(toplevel_print_anon, false).
call_time(G,T) :- statistics(runtime,[T0|_]), G, statistics(runtime,[T1|_]), T is T1 - T0.
gen_sample_tree_(nil, N) :- N =< 0.
gen_sample_tree_(node(_,X,X), N) :- N > 0, N0 is N-1, gen_sample_tree_(X, N0).
num_leavesA(Tree, N) :- num_leavesA_(Tree, 0, N).
num_leavesA_(nil, N0, N) :- N is N0 + 1.
num_leavesA_(node(_,Left,Right), N0, N) :- num_leavesA_(Left, N0, N1), num_leavesA_(Right, N1, N).
num_leavesB(Tree, N) :- phrase(num_leavesB_(Tree), [0], [N]).
num_leavesB_(nil), [N] --> [N0], { N is N0 + 1 }.
num_leavesB_(node(_,Left,Right)) --> num_leavesB_(Left), num_leavesB_(Right).
$ swipl
Welcome to SWI-Prolog (Multi-threaded, 64 bits, Version 7.3.17)
Copyright (c) 1990-2015 University of Amsterdam, VU Amsterdam
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to redistribute it under certain conditions.
Please visit http://www.swi-prolog.org for details.
For help, use ?- help(Topic). or ?- apropos(Word).
?- compile(semi_dcg).
true.
?- length(_, Depth),
Depth > 20,
gen_sample_tree_(_Tree, Depth),
call_time(num_leavesA(_Tree,_N1), T1),
call_time(num_leavesB(_Tree,_N2), T2),
( _N1 =:= _N2 -> true ; throw(up) ).
Depth = 21, T1 = 405, T2 = 279 ;
Depth = 22, T1 = 809, T2 = 563 ;
Depth = 23, T1 = 1576, T2 = 1095 ;
Depth = 24, T1 = 3143, T2 = 2190 ;
Depth = 25, T1 = 6301, T2 = 4378 ...
num_leavesB_/3
respectively
num_leavesA_/3. Since a binary
tree has 2^n leaves and 2^n-1 nodes. So the pressure
on the second clause is equally high as on the first clause.
> Maybe some argument copying problem of the interpreter deep down
> in the way it calls predicates, so that unintentionally the DCG solution is
> more efficient, since it somehow shares the [_] as if it would be a value holder.
Interesting! Sounds to me like your hypothesis could be tested fairly easily (using the right `statistics/2` keys).
Also, I ran the same test with SICStus and observed (1) it was a lot faster but (2) the semicontext DCG code was moderately slower than its explicit counterpart. But that could get faster with further JIT improvements... why not?
(I have not run that test with the JIT turned off, so I can't tell if the JIT is the main reason why SICStus is faster.)
Here's another possible use of that semicontext notation: handling state changes that update some part(s) of the state a lot more often than others---one part is hot (say for counting leaves), the other one is not hot (say for record bookkeeping of various statistics). What do you think about that?
thank you for sharing the responses / reactions you got!
What's next? Two things come to mind:
- the SICStus JIT could get even better performance once the right optimization(s) are in place; matsC or perM could probably hack that over the weekend:)
- updating different parts of the threaded state with different frequencies (some parts being hot, other not so) should be do-able with that notation quite easily and efficiently. Partial deforestation, kind of.
DCGs are exciting.
Best regards,
Stefan.
?- length(_, Depth),
Depth > 20,
gen_sample_tree_(_Tree, Depth),
call_time(num_leavesA(_Tree,_N1), T1),
call_time(num_leavesB(_Tree,_N2), T2),
( _N1 =:= _N2 -> true ; throw(up) ).
Depth = 21, T1 = 283, T2 = 278 ;
Depth = 22, T1 = 562, T2 = 549 ;
Depth = 23, T1 = 1109, T2 = 1096 ;
Depth = 24, T1 = 2201, T2 = 2195 ...
?- <the same query as above>.
% JIT=disabled %% JIT=on
Depth = 21, T1 = 80, T2 = 90 ? ; %% Depth = 21, T1 = 40, T2 = 40 ? ;
Depth = 22, T1 = 180, T2 = 190 ? ; %% Depth = 22, T1 = 70, T2 = 80 ? ;
Depth = 23, T1 = 350, T2 = 340 ? ; %% Depth = 23, T1 = 120, T2 = 160 ? ;
Depth = 24, T1 = 660, T2 = 670 ? ; %% Depth = 24, T1 = 250, T2 = 300 ? ;
Depth = 25, T1 = 1320, T2 = 1490 ? ; %% Depth = 25, T1 = 480, T2 = 650 ? ;
Depth = 26, T1 = 2610, T2 = 2590 ? ... %% Depth = 26, T1 = 960, T2 = 1280 ? ...
?- listing(num_leavesA_/3).
num_leavesA_(nil, B, A) :-
A is B+1.
num_leavesA_(node(_, A, C), B, E) :-
num_leavesA_(A, B, D),
num_leavesA_(C, D, E).
?- listing(num_leavesB_/3).
num_leavesB_(nil, A, D) :-
A=[B|C],
E is B+1,
F=C,
D=[E|F].
num_leavesB_(node(_, A, C), B, E) :-
num_leavesB_(A, B, D),
num_leavesB_(C, D, E).
?- vm_list(num_leavesA_/3).
----------------------------------------
clause 1 (<clause>(0x7fc21e3849e0)):
----------------------------------------
0 h_atom(nil)
2 i_enter
3 b_var2
4 b_functor((+)/2)
6 b_argvar(1)
8 b_smallint(1)
10 b_pop
11 i_depart((is)/2)
13 i_exit
?- vm_list(num_leavesB_/3).
----------------------------------------
clause 1 (<clause>(0x7fc21e38bea0)):
----------------------------------------
0 h_atom(nil)
2 i_enter
3 b_unify_var(1)
5 h_list_ff(3,4)
8 b_unify_exit
9 a_add_fc(5,3,1)
13 b_unify_fv(6,4)
16 b_unify_var(2)
18 h_list
19 h_var(5)
21 h_var(6)
23 h_pop
24 b_unify_exit
25 i_exit