
Is it possible to count solutions that cannot fit in memory?


killy971

unread,
Jan 9, 2009, 11:59:00 AM1/9/09
to
I have a predicate partition(N, P) that generates the partitions of a
given N. For example, partition(5, P) would give:

?- partition(5, P).
P = [5] ;
P = [1, 4] ;
P = [2, 3] ;
P = [1, 1, 3] ;
P = [1, 2, 2] ;
P = [1, 1, 1, 2] ;
P = [1, 1, 1, 1, 1] ;
fail.

I would like to count all the solutions for N = 100. I tried to use
"findall", but it failed because of the global stack memory limit. Is
there a way just to count all the solutions, with a small memory
footprint?

Chip Eastham

unread,
Jan 9, 2009, 3:23:29 PM1/9/09
to

Yes, you could save the bulk of the memory by doing
something like:

partitionCount(N,P,_) :-
    retractall(myCounter(_)),
    assert(myCounter(0)),
    partition(N,P),
    partitionIncFail.

partitionCount(_,_,X) :-
    myCounter(X).

partitionIncFail :-
    retract(myCounter(I)),
    J is I+1,
    assert(myCounter(J)),
    fail.
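A variation that avoids the dynamic database altogether, by
destructively updating a counter held inside a term - a sketch
assuming a system with nb_setarg/3, such as SWI-Prolog (untested
here):

```prolog
% Failure-driven loop that counts solutions in constant memory:
% the counter lives in a compound term and is updated with
% nb_setarg/3, whose effect survives backtracking.
count_solutions(Goal, Count) :-
    State = count(0),
    (   call(Goal),
        arg(1, State, C0),
        C is C0 + 1,
        nb_setarg(1, State, C),
        fail
    ;   arg(1, State, Count)
    ).

% e.g.  ?- count_solutions(partition(100, _P), N).
```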

Unfortunately this still consumes a lot of time,
despite the space savings. For counting large
numbers of unrestricted (i.e. unordered)
partitions without having to construct any of
them, a recursive formula is more practical:

http://mathworld.wolfram.com/PartitionFunctionP.html

http://en.wikipedia.org/wiki/Partition_(number_theory)

Some Java code for this and variations is presented
by Henry Bottomley here:

http://www.btinternet.com/~se16/js/partitionstest.htm


regards, chip

bart demoen

unread,
Jan 12, 2009, 3:06:23 PM1/12/09
to
On Fri, 09 Jan 2009 12:23:29 -0800, Chip Eastham wrote:

> On Jan 9, 11:59 am, killy971 <killy...@gmail.com> wrote:

[...]
> regards, chip


[What follows is not meant to rebuke Chip, but his/her contribution
sparked it]

To count the number of solutions to something, one strategy is to find
out the recurrence equation that generates that number. From that
recurrence equation, one can very often directly derive a program that
computes it, but often this program is quite inefficient because it
has the wrong complexity. If you have a system with tabling, you are
often in luck, and get an efficient version of this inefficient
program almost for free - just add a declaration and you are done;
someone from the tabling camp can surely show this. I will just try to
make a reasonably efficient program in another way. Say we have the
Fibonacci recurrence equation:

fib(1) = 1 | initial condition 1
fib(2) = 1 | initial condition 2
fib(N) = fib(N-1) + fib(N-2) for N > 2 | inductive definition

then the program that solves this equation is:

fib(N,Out) :-
    ( N =< 2 ->
        Out = 1
    ;
        N1 is N - 1,
        N2 is N - 2,
        fib(N1,Out1),
        fib(N2,Out2),
        Out is Out1 + Out2
    ).

This works, but it is slow, and even for fairly small N it exhausts our patience.

Here is an alternative: it follows a general principle and I will show
later its application to the partition counting problem.

fib(N,Res) :- todo([1*fib(N)],0,Res).

todo([],I,I).
todo([X*fib(N)|R],I,O) :-
    ( N =< 2 ->
        I1 is I + X,
        todo(R,I1,O)
    ;
        N1 is N - 1,
        N2 is N - 2,
        merge([X*fib(N1),X*fib(N2)],R,NewFibs),
        todo(NewFibs,I,O)
    ).

merge(Fibs1,Fibs2,O) :-
    (   Fibs1 == [] ->
        O = Fibs2
    ;   Fibs2 == [] ->
        O = Fibs1
    ;   Fibs1 = [F1|R1],
        Fibs2 = [F2|R2],
        F1 = X1*Fib1,
        F2 = X2*Fib2,
        (   Fib1 @> Fib2 ->
            O = [F1|TailO],
            merge(R1,Fibs2,TailO)
        ;   Fib1 == Fib2 ->
            X is X1 + X2,
            O = [X*Fib1|TailO],
            merge(R1,R2,TailO)
        ;   O = [F2|TailO],
            merge(Fibs1,R2,TailO)
        )
    ).

The todo-predicate has a list of things we need to compute. We keep
this list ordered from most "difficult" to "easier"; the merge
predicate makes sure of that. We also keep a multiplicity for things
that we have detected need to be computed more than once: it is our
poor man's tabling.

The first branch of the todo-predicate uses the base cases of the fib
definition.
The recursive branch of the todo-predicate uses the inductive part of
the fib definition.

Let's look at how the todo-list evolves at the call port of todo/3 for
the query ?- fib(7,X).

[1 * fib(7)]
[1 * fib(6),1 * fib(5)]
[2 * fib(5),1 * fib(4)]
[3 * fib(4),2 * fib(3)]
[5 * fib(3),3 * fib(2)]
[8 * fib(2),5 * fib(1)]
[5 * fib(1)]
[]

See how it never gets long, and note the role of the multiplicities?
Computing fib(10000,X) is no problem now (if your Prolog system has
bigints - if it doesn't, complain to the vendor, or change to an
open-source system :-)

Now apply the same principle to counting the number of partitions.
The recurrence equation is a bit more involved than for fib, but not
unduly so - note that I am not claiming the following is the only way
to attack the partition problem, it is just what came to my mind today.

Let part(N) denote the number of partitions of N, and let
part(N,I) denote the number of partitions of N which start with I.
Here are some equations:

part(N) == part(N+1,1)

part(N,I) == part(N-I,I) + part(N-I,I+1) + part(N-I,I+2) + ...

Note that part(N,I) will be zero if I is too large
and that

part(N,N) == 1


Here is the corresponding program (I think):

count_partition(N,Res) :-
    Accu = 0,
    M is N + 1,
    todo([1*part(M,1)],Accu,Res).

todo([],Res,Res).
todo([X*part(N,I)|Todos],Accu,Res) :-
    ( N == I ->
        Accu1 is Accu + X,
        todo(Todos,Accu1,Res)
    ;
        expand(N,I,X,MoreTodos),
        merge_todos(MoreTodos,Todos,NewTodos),
        todo(NewTodos,Accu,Res)
    ).

expand(N,I,X,MoreTodos) :-
    findall(X*part(NewN,NewI),newpart(N,I,NewN,NewI),MoreTodos).

newpart(N,I,NewN,NewI) :-
    NewN is N - I,
    downto(NewN,I,NewI).

merge_todos(T1,T2,Out) :-
    (   T1 == [] ->
        Out = T2
    ;   T2 == [] ->
        Out = T1
    ;   T1 = [F1|R1],
        T2 = [F2|R2],
        F1 = X1*P1,
        F2 = X2*P2,
        (   P1 @> P2 ->
            Out = [F1|RestOut],
            merge_todos(R1,T2,RestOut)
        ;   P1 == P2 ->
            X is X1 + X2,
            Out = [X*P1|RestOut],
            merge_todos(R1,R2,RestOut)
        ;   Out = [F2|RestOut],
            merge_todos(T1,R2,RestOut)
        )
    ).

downto(J,I,X) :-
    J >= I,
    (   X = J
    ;   J1 is J - 1,
        downto(J1,I,X)
    ).

And now the query ?- count_partition(100,Res). answers in a jiffy :-)

You can see that the ugliest part - the merge - is the same in fib and
in partition.

Why did we keep the todo-list ordered from difficult to easy?
If you can answer that question straight away, you earn a bonus :-)
If not, try changing the merge to do the opposite in fib (make sure to
interchange the fib(N1) and fib(N2) as well) and observe how the
todo-list evolves - it is not a pretty sight, but worth the experience.

So far my Christmas present to you :-)

Cheers

Bart Demoen

ps. sorry for any mistakes above: most of it was tested, but one never
knows

killy971

unread,
Jan 15, 2009, 10:07:51 AM1/15/09
to

Thank you very much for all the answers, I learned a lot!
Still, it was true that implementing an algorithm only for counting
the number of partitions was really faster than creating the
partitions and counting them...

bart demoen

unread,
Jan 15, 2009, 2:30:57 PM1/15/09
to
On Thu, 15 Jan 2009 07:07:51 -0800, killy971 wrote:

> Thank you very much for all the answers, I learned a lot! Still, it was
> true that implementing an algorithm only for counting the number of
> partitions was really faster than creating the partitions and counting
> them...

Sorry, this may be a language thing (me not being a native English
speaker), but I don't understand what you mean.

Cheers

Bart Demoen

fodor...@gmail.com

unread,
Jan 15, 2009, 3:11:32 PM1/15/09
to
> If you have a system with tabling, you
> are often in luck, and get almost for free an efficient version of
> this inefficient program - just add a declaration and done: someone
> from the tabling camp can show this for sure.

:- table(count_partition/2).
count_partition(N,R) :-
    M is N + 1,
    count_partition(M,1,R).    % part(N) = part(N+1,1)

count_partition(N,N,1) :- !.
count_partition(N,I,0) :-
    I > N, !.
count_partition(N,I,R) :-
    N > I,
    N1 is N - I,
    % part(N,I) == part(N-I,I) + part(N-I,I+1) + part(N-I,I+2) + ...
    count_partition_downto(N1,I,0,R).

:- table(count_partition_downto/4).
count_partition_downto(N1,M,R,R) :-
    M > N1, !.
count_partition_downto(N1,I,PartialR,R) :-
    count_partition(N1,I,R1),
    R2 is PartialR + R1,
    I1 is I + 1,
    count_partition_downto(N1,I1,R2,R).

xsb -e "['count_partition.pl'],cputime(T0),count_partition(100,X),cputime(T1),DT is T1-T0,write(X),nl,write(DT),nl."
190569292
0.0720

xsb -e "['count_partition_Bart.pl'],cputime(T0),count_partition(100,X),cputime(T1),DT is T1-T0,write(X),nl,write(DT),nl."
190569292
0.6160

fodor...@gmail.com

unread,
Jan 15, 2009, 3:15:21 PM1/15/09
to
I made a mistake in the previous post: I wrote
:- table(count_partition/2).
where it should have been
:- table(count_partition/3).

xsb -e "['count_partition.pl'],cputime(T0),count_partition(100,X),cputime(T1),DT is T1-T0,write(X),nl,write(DT),nl."
190569292
0.0560

On Jan 15, 3:11 pm, "fodor.p...@gmail.com" <fodor.p...@gmail.com>
wrote:

fodor...@gmail.com

unread,
Jan 15, 2009, 7:57:28 PM1/15/09
to
While comparing the tabled version of the partition predicate with the
non-tabled version, I found that combining difference lists and
tabling produces very bad results. I cannot yet explain why this is.

Example 1: the partition(N, P) predicate that generates partitions for
a given size N.

                    xsb       yap
without_tabling     0.4040    0.2
with_tabling        2.4080    3.536

partition(N,P) :-
    partition(N,N,[],P).

:- table(partition/4).    % comment out for without_tabling
partition(0,_,P,P).
partition(N,Max,P,R) :-
    downto(I,N,1),
    I =< Max,
    N2 is N - I,
    partition(N2,I,[I|P],R).

:- table(downto/3).
downto(I,I,J) :- I >= J.
downto(K,I,J) :-
    I > J,
    I1 is I - 1,
    downto(K,I1,J).

?- partition(5, P).
P = [5] ;
P = [1, 4] ;
P = [2, 3] ;
P = [1, 1, 3] ;
P = [1, 2, 2] ;
P = [1, 1, 1, 2] ;
P = [1, 1, 1, 1, 1] ;
fail.

xsb -e "['p_01.P'],cputime(T0),(partition(45,P),fail;write(finish)),cputime(T1),DT is T1-T0,nl,write(DT),nl,halt."
# 0.4040

yap -g "['p_01.P'],T0 is cputime,(partition(45,P),fail;write(finish)),T1 is cputime,DT is T1-T0,nl,write(DT),nl,halt."
# 0.2

xsb -e "['p_02.P'],cputime(T0),(partition(45,P),fail;write(finish)),cputime(T1),DT is T1-T0,nl,write(DT),nl,halt."
# 2.4080

yap -g "['p_02.P'],T0 is cputime,(partition(45,P),fail;write(finish)),T1 is cputime,DT is T1-T0,nl,write(DT),nl,halt."
# 3.536

More examples:
********************************************************************************
Example 2: Pre-order tree parsing with difference lists

:- table(pre_ord_dif/3).
pre_ord_dif(nil,L,L).
pre_ord_dif(t(Left,Nod,Right),Lp,Lu) :-
    pre_ord_dif(Left,Lpst,Lust),
    pre_ord_dif(Right,Lpdr,Ludr),
    Lp = [Nod|Lpst],
    Lust = Lpdr,
    Ludr = Lu.

numTree(I,J,Tree) :-
    ( I =< J ->
        Tree = t(nil,I,Rest),
        I1 is I + 1,
        numTree(I1,J,Rest)
    ;
        Tree = nil
    ).

xsb -e "['p_07.P'],numTree(1,400,T),cputime(T0),pre_ord_dif(T,Y,[]),cputime(T1),DT is T1-T0,nl,write(DT),nl,halt."
# 0

yap -g "['p_07.P'],numTree(1,400,T),T0 is cputime,pre_ord_dif(T,Y,[]),T1 is cputime,DT is T1-T0,nl,write(DT),nl,halt."
# 0

xsb -e "['p_08.P'],numTree(1,400,T),cputime(T0),pre_ord_dif(T,Y,[]),cputime(T1),DT is T1-T0,nl,write(DT),nl,halt."
# 0.012

yap -g "['p_08.P'],numTree(1,400,T),T0 is cputime,pre_ord_dif(T,Y,[]),T1 is cputime,DT is T1-T0,nl,write(DT),nl,halt."
# 0.016


********************************************************************************
Example 3: A cartesian product of two lists

:- table(cart_prod/4).
cart_prod([],_L,L1,L1).
cart_prod([H1|T1],L1,L,R) :-
    insertElem(H1,L1,L,R1),
    cart_prod(T1,L1,R1,R).

:- table(insertElem/4).
insertElem(X,L,L1,R1) :-
    insertElem(X,L,R),
    append(L1,R,R1).

:- table(insertElem/3).
insertElem(_X,[],[]) :- !.
insertElem(X,[H|T],[[X|H]|R]) :-
    insertElem(X,T,R).

:- table(append/3).
append([],L,L).
append([H|T],L,[H|R]) :-
    append(T,L,R).

numlist(I,J,List) :-
    ( I =< J ->
        List = [I|Rest],
        I1 is I + 1,
        numlist(I1,J,Rest)
    ;
        List = []
    ).

xsb -e "['p_05.P'],numlist(1,200,L),cputime(T0),cart_prod(L,L,[],Y),cputime(T1),DT is T1-T0,nl,write(DT),nl,halt."
# 0.4300

yap -g "['p_05.P'],numlist(1,200,L),T0 is cputime,cart_prod(L,L,[],Y),T1 is cputime,DT is T1-T0,nl,write(DT),nl,halt."
# 0.21

xsb -e "['p_06.P'],numlist(1,200,L),cputime(T0),cart_prod(L,L,[],Y),cputime(T1),DT is T1-T0,nl,write(DT),nl,halt."
# stack overflow on my machine

yap -g "['p_06.P'],numlist(1,200,L),T0 is cputime,cart_prod(L,L,[],Y),T1 is cputime,DT is T1-T0,nl,write(DT),nl,halt."
# stack overflow on my machine
********************************************************************************
Example 4: Flattening a list with difference lists.

flatList(X,Y) :- flatList(X,Y,[]).

:- table(flatList/3).
flatList([],N,N).
flatList(H,[H|T],T) :- atomic(H).
flatList([H|T],Np,Nu) :-
    flatList(H,Np,Nn),
    flatList(T,Nn,Nu).

Bart Demoen

unread,
Jan 16, 2009, 3:20:39 AM1/16/09
to
On Thu, 15 Jan 2009 16:57:28 -0800, fodor...@gmail.com wrote:

> While comparing the tabled version of the partition predicate with the
> non-tabled version, I found that combining difference lists and tabling
> produces very bad results. I cannot yet explain why is this.

Just about the partition problem: don't table downto - it is cheaper
to recompute. But that is not the big problem, of course: change the
tail call

    partition(N2,I,[I|P],R).

into

    partition(N2,I,X,R), X = [I|P].

and observe the speedup.
Also compare the table space with and without tabling and/or the latter
change.
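For reference (untested here), with that change the tabled worker from
the earlier post would read:

```prolog
:- table(partition/4).
partition(0,_,P,P).
partition(N,Max,P,R) :-
    downto(I,N,1),
    I =< Max,
    N2 is N - I,
    partition(N2,I,X,R),   % call with a fresh variable ...
    X = [I|P].             % ... and unify only afterwards
```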

Now generalize and tell us about it.

Cheers

Bart Demoen

Bart Demoen

unread,
Jan 16, 2009, 7:37:30 AM1/16/09
to
On Thu, 15 Jan 2009 16:57:28 -0800, fodor...@gmail.com wrote:


********************************************************************************
> Example 2: Pre-order tree parsing with difference lists
>
> :- table(pre_ord_dif/3).
> pre_ord_dif(nil,L,L).
> pre_ord_dif(t(Left,Nod,Right),Lp,Lu):-
> pre_ord_dif(Left,Lpst,Lust),
> pre_ord_dif(Right,Lpdr,Ludr),
> Lp=[Nod|Lpst],
> Lust=Lpdr,
> Ludr=Lu.
>
> numTree(I,J,Tree) :-
> ( I =< J ->
> Tree = t(nil,I,Rest),
> I1 is I + 1,
> numTree(I1,J,Rest)
> ;
> Tree = nil
> ).

Tabling during one query in which all left and right subtrees are
different is expected to be slower, because it only adds overhead: at
no point can you reuse the result of an earlier computation. The
difference list accounts for about 40% of the space overhead (putting
the answers in the tries), but for little of the time overhead, as you
can check by deleting the last two arguments of pre_ord_dif.

So, a different reason than with the accumulating parameter :-)

Cheers

Bart Demoen

fodor...@gmail.com

unread,
Jan 19, 2009, 2:45:48 PM1/19/09
to
On Jan 16, 3:20 am, Bart Demoen <b...@cs.kuleuven.be> wrote:

Hi Bart,
You are right, the execution is much faster because there are fewer
entries in the call-answer table (since there are fewer call variants).
In general, tabling predicates that take lists as arguments often
gives very bad results, because many calls (with possibly long lists)
have to be stored in and looked up from memory.
Regards,
Paul.

bart demoen

unread,
Jan 19, 2009, 3:01:29 PM1/19/09
to
On Mon, 19 Jan 2009 11:45:48 -0800, fodor...@gmail.com wrote:


> Hi Bart,
> You are right, the execution is much faster because there are less
> entries in the call-answer table (since less call variants). In general,
> I can say that tabling predicates taking lists as arguments has often
> very bad results because many calls (of possible long list sizes) are
> saved/searched into memory. Regards,
> Paul.

Yes - and sometimes one can do something about it. For instance, tabling
plain calls/answers for reverse/2 can result in very bad behaviour, but
tabling the most general answers to reverse/2 is a good thing. E.g.,
remembering that

the goal ?- reverse([1,2,3],L). yields L = [3,2,1]

is practically worthless, but remembering that

the goal ?- reverse([A,B,C],L). yields L = [C,B,A]

is more generally applicable and can improve efficiency.
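As a sketch of that idea (untested, wrapper names hypothetical;
same_length/2 as in library(lists)): build a skeleton list of fresh
variables of the same length, run the tabled predicate on the
skeleton, and only then unify with the concrete input, so the table
stores one general answer per list length:

```prolog
:- table(reverse_gen/2).
reverse_gen([], []).
reverse_gen([X|Xs], R) :-
    reverse_gen(Xs, R0),
    append(R0, [X], R).

% Wrapper: the tabled call sees only the most general goal for each
% length, e.g. reverse_gen([A,B,C],L), yielding L = [C,B,A].
reverse_memo(List, Rev) :-
    same_length(List, Skel),
    reverse_gen(Skel, SkelRev),
    List = Skel,
    Rev = SkelRev.
```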

I vaguely remember having done something like that in a paper with Paul
Tarau ... On Delphi lemmas and other memoing techniques for deterministic
logic programs, also with Koen De Bosschere; J. Logic Programming 30 (2),
pp. 145-164, Februari, 1997.

Cheers

Bart Demoen

Chip Eastham

unread,
Jan 27, 2009, 1:33:21 PM1/27/09
to
On Jan 12, 3:06 pm, bart demoen <b...@cs.kuleuven.be> wrote
[in part]:

> To count the number of solutions to something,
> one strategy is to find out what the recurrence
> equation is that generates that number. From
> that recurrence equation, one can very often
> directly derive a program that computes it, and
> often this program is quite inefficient because

> it has the wrong complexity...

I wrote some code that tries to reduce the
complexity from O(n^2) to O(n^1.5). Euler's
recurrence relation expresses the number of
unrestricted partitions p(n) of a number n in
terms of O(sqrt(n)) earlier values p(m) with
0 < m < n. Its nonzero coefficients are all
+/-1, so we can get by with just addition and
subtraction for the arithmetic.

I checked that p(10) = 42 and p(100) = 190569292
are correct. Actually the output lists p(m) for
all m = 1 to n. But I haven't fitted enough
points to suggest a time complexity yet.

regards, chip

/*
Compute the count of unrestricted partitions of a
number, p(n), using Euler's recurrence relation:

  p(n) = -SUM_{k>0} (-1)^k [p(n-k(3k-1)/2) + p(n-k(3k+1)/2)]

for n > 0, subject to the initial conditions:

  p(0) = 1  and  p(n) = 0 for n < 0.
*/

partEuler(N,PartList) :-
    rowEuler(N,1,state(1,1,1,2),RowEuler),
    toeplitz(RowEuler,RowEuler,PartList).

rowEuler(N,I,_,[]) :- N < I, !.
rowEuler(N,I,state(J,U,M,P),[H|T]) :-
    K is I+1,
    ( I=P -> nextState(state(J,U,M,P),State)
    ; State = state(J,U,M,P)
    ),
    ( (I=M;I=P) -> H = U
    ; H = 0
    ),
    !,
    rowEuler(N,K,State,T).

nextState(state(J,U,M,P),state(O,V,E,R)) :-
    C is J + J + J,
    O is J + 1,
    V is -U,
    E is M + C + 1,
    R is P + C + 2.

toeplitz([],_,[]).
toeplitz([A|P],X,[A|Q]) :-
    row_AXPY(A,X,P,Y),
    toeplitz(Y,X,Q).

row_AXPY(_,_,[],[]) :- !.
row_AXPY(A,[0|X],[H|P],[H|Y]) :-
    !,
    row_AXPY(A,X,P,Y).
row_AXPY(A,[1|X],[H|P],[K|Y]) :-
    K is H+A,
    !,
    row_AXPY(A,X,P,Y).
row_AXPY(A,[-1|X],[H|P],[K|Y]) :-
    K is H-A,
    !,
    row_AXPY(A,X,P,Y).

