how to run swi-prolog on multi-core machine


Kuniaki Mukai

unread,
Jan 28, 2018, 1:19:20 AM1/28/18
to SWI-Prolog
Hi,

Is there a simple way to run a SWI-Prolog program on a multi-core machine (e.g. an iMac with 4 cores)?
Or am I asking a difficult question that should not be asked so casually here?

Kuniaki Mukai




Jan Wielemaker

unread,
Jan 28, 2018, 4:14:53 AM1/28/18
to Kuniaki Mukai, SWI-Prolog
Not really sure what you mean. SWI-Prolog runs fine on multi-core
hardware, as you probably know. If you actually want to profit from it,
you have to use the thread predicates. These create multiple solvers
on a shared program. This model is not very useful for computing on a
single large data structure. It mostly aims at servers, or at the need
to execute many small and (mostly) independent tasks.
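
A minimal sketch of that model (illustrative, not part of the original mail; slow_task/2 is a stand-in for real CPU-bound work):

:- use_module(library(thread)).

slow_task(N, R) :- R is N*N.

% Run one goal per list element on a pool of worker threads:
?- numlist(1, 1000, Ns),
   concurrent_maplist(slow_task, Ns, Rs).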

Cheers --- Jan

Kuniaki Mukai

unread,
Jan 28, 2018, 5:42:59 AM1/28/18
to Jan Wielemaker, SWI-Prolog
Hi Jan,

Thanks for the reply. It is interesting that thread programming can make
use of multiple cores.

I have been running my sat_count code on the "notable" benchmark problem(8,_) for ROZDDs of Jan Burse
for almost a full day, after slightly modifying the hashing. It is still running… Now I am
afraid it will keep running until the next Olympic Games in Tokyo in 2020. According to
Activity Monitor, it always uses only one core out of 4; the other three are idle. That is why I asked
the question. Unfortunately, multi-core programming seems inappropriate for my problem, because it is
difficult to divide the code into independent parts.

Kuniaki Mukai

> On Jan 28, 2018, at 18:14, Jan Wielemaker <j...@swi-prolog.org> wrote:
>
> On 28/01/18 07:19, Kuniaki Mukai wrote:
>> Hi,
>> Is there a simple way to run a SWI-Prolog program on multi-core
>> machine (e.g. iMac with 4 cores). Am I asking a difficult question
>> which should not be asked easily here ?
>
> Not really sure what you mean. SWI-Prolog runs fine on multi-core
> hardware as you probably know. If you actually want to profit, you
> have to use the thread predicates. These create multiple solvers

Harry Stoteles

unread,
Jan 28, 2018, 3:26:10 PM1/28/18
to SWI-Prolog
Well you could use the SWI menu item "run | new thread". It will
then show a new window with something like:

SWI-Prolog console for thread 3
?-

And then you could open 4 such windows, and start the same
problem, but with different labeling:

Console 1: label([0,0|...])

Console 2: label([0,1|...])

Console 3: label([1,0|...])

Console 4: label([1,1|...])

If you see library(clpb) or SWI crashing, you might need to
file a bug report, which could delay the Tokyo 2020 goal.

Harry Stoteles

unread,
Jan 28, 2018, 3:29:05 PM1/28/18
to SWI-Prolog
Oops, you don't use CLP(B); maybe doing something like this somewhere:

    Console 1: X=0, Y=0

    Console 2: X=0, Y=1

    Console 3: X=1, Y=0

    Console 4: X=1, Y=1

Might slice up the search space.
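
A sketch of the same split without manual consoles (illustrative; count_with_prefix/2 is a hypothetical stand-in for running the problem under one partial labeling): library(thread) can run the four parts on separate cores and sum the counts.

:- use_module(library(thread)).

par_count(Total) :-
    findall([X,Y], (member(X, [0,1]), member(Y, [0,1])), Prefixes),
    concurrent_maplist(count_with_prefix, Prefixes, Counts),
    sum_list(Counts, Total).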

Kuniaki Mukai

unread,
Jan 30, 2018, 2:11:45 AM1/30/18
to Harry Stoteles, SWI-Prolog

I am surprised to find that my sat_count terminates on problem(8, R) well before Tokyo 2020.

For now, I only hope the result is correct.

%@ R = 185794560.


% ?- problem(8, P), profile(time(solc(P, R))).
%@ % 55,679,808,628 inferences, 203331.156 CPU in 204389.336 seconds (99% CPU, 273838 Lips)
%@ P =  (#(*)=:=true)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*(#(*)=:=false)))))))*((#(*)=:=true)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*(#(*)=:=false))))))*((#(*)=:=true)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*(#(*)=:=false)))))*((#(*)=:=true)*((#(*)=:=false)*((#(*)=:=false)*((#(*)=:=false)*(#(*)=:=false))))*((#(*)=:=true)*((#(*)=:=false)*((#(*)=:=false)*(#(*)=:=false)))*((#(*)=:=true)*((#(*)=:=false)*(#(*)=:=false))*((#(*)=:=true)*(#(*)=:=false)*(#(*)=:=true))))))),
%@ R = 185794560.

My code feels somewhat ad hoc in its hashing. I will read up on
the recent assoc library of SWI-Prolog and replace the ad-hoc hashing with
assoc if it is better for my purpose.

Kuniaki Mukai.





Kuniaki Mukai

unread,
Jan 30, 2018, 11:47:21 AM1/30/18
to SWI-Prolog
I made a quick comparison between hash and assoc.
The result shows hash to be roughly two to three times faster than assoc.

[assoc]
% ?- profile((problem(6, P), time(solc(P, R)))).
%@ % 35,183,129 inferences, 5.803 CPU in 5.836 seconds (99% CPU, 6062986 Lips)
%@ R = 23040.

[hash]
%@ % 12,924,098 inferences, 1.880 CPU in 1.907 seconds (99% CPU, 6875749 Lips)
%@ R = 23040.

[assoc]
% ?- profile((problem(7, P), time(solc(P, R)))).
%@ % 801,611,011 inferences, 144.847 CPU in 145.890 seconds (99% CPU, 5534193 Lips)
%@ R = 1451520.

[hash]
% ?- profile((problem(7, P), time(solc(P, R)))).
%@ % 213,707,830 inferences, 71.280 CPU in 72.431 seconds (98% CPU, 2998141 Lips)
%@ R = 1451520.

library(assoc) is easy to use, so I like it. On the other hand, with the hash approach
it is not easy for me to find a suitable size for the hash table and a suitable depth parameter
for term_hash/4. For now I feel like Buridan's ass, unable to decide which way to go.

Kuniaki Mukai




Harry Stoteles

unread,
Jan 30, 2018, 1:17:50 PM1/30/18
to SWI-Prolog
Hash is generally faster: hash lookup is O(1) while trees are O(log n).

So far I have only found trees useful when I need some sorted
result. But I guess for the cache you are using it is irrelevant
whether the items are sorted; you only want to get a hit or not.

Harry Stoteles

unread,
Jan 30, 2018, 1:23:08 PM1/30/18
to SWI-Prolog
The only thing that can slow down a hash is the hash
computation function. If it is not fast, then you might
need to fall back to trees, namely when comparing items
is cheaper than computing a hash value. In clause
indexing I use a hash so far, but there are other factors
that can make it slow: just-in-time selection of the hashes
and copying of the clause, to mention a few issues.

Something else that might prevent you from using a hash
is the ratio between insert/delete (modification) and
get (access). If there are few accesses and a lot of modifications,
the amortization of resizing the hash table when it
reaches some threshold might not work out.

What kind of hash table library do you use? Something
home grown, or something from SWI-Prolog?

Harry Stoteles

unread,
Jan 30, 2018, 1:29:12 PM1/30/18
to SWI-Prolog
What's also prohibitive for a hash is a degenerate
hash function. For example Float.hashCode() in
Java gives you something that is most of the time
zero in the lower bits, which has to do with the IEEE float
representation, so you need to apply a
scrambler. Murmur is very good:

http://jekejeke.ch/idatab/doclet/blog/en/docs/05_run/01_kernel/data/HashScrambler.html

You could make a histogram of your hash values,
look at it, and see whether the hash values
are evenly distributed. If they are not, maybe work
on some additional scrambling. But what works
for one example might not work for another example...
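
A minimal SWI-Prolog sketch of such a histogram (my code; it assumes the table is a compound whose arguments are the bucket lists, as in the code posted later in this thread):

% bucket_histogram(+Table, -Pairs): Pairs is a list of Len-Count
% pairs, telling how many buckets have each length.
bucket_histogram(Table, Pairs) :-
    Table =.. [_|Buckets],
    maplist(length, Buckets, Lens),
    msort(Lens, Sorted),
    count_runs(Sorted, Pairs).

count_runs([], []).
count_runs([X|Xs], [X-N|Pairs]) :-
    count_run(Xs, X, 1, N, Rest),
    count_runs(Rest, Pairs).

count_run([X|Xs], X, N0, N, Rest) :- !,
    N1 is N0 + 1,
    count_run(Xs, X, N1, N, Rest).
count_run(Xs, _, N, N, Xs).

A flat distribution shows up as most buckets having length 0 or 1.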

Kuniaki Mukai

unread,
Jan 31, 2018, 2:45:23 AM1/31/18
to Harry Stoteles, SWI-Prolog

Hi  Harry,

Thank you for the clarification, which has made clear to me
what I had only vaguely in mind about the difference between
hash and assoc. Many thanks.

This is my current use of term_hash/4 for ROZDD building.

%
init_hash(N, Hash):-
    length(L, N),
    maplist(=([]), L),
    Hash =.. [(#)|L].
%
get_bdd_hash(X, V):-
    b_getval(rozdd, R),
    arg(2, R, HashTbl),
    functor(HashTbl, _, S),
    term_hash(X, 3, S, I0),
    succ(I0, I),
    arg(I, HashTbl, U),
    (   memberchk(X-V, U) -> true
    ;   setarg(I, HashTbl, [X-V| U])
    ).

Possible input keys X are limited to the following three simple term shapes:

1.    if(X, I, J)
2.    meet(I, J)
3.    sum(I, J)

where X, I, J are positive integers with 1 =< X =< 2000 and
1 =< I, J =< 10^7.

I think there may be a hash function well suited
to this purpose, but I have no idea which.

The comparison I posted uses the following replacement of get_bdd_hash/2,
based on library(assoc).

init_hash(_, Hash):- empty_assoc(Hash).


get_bdd_hash(X, V):-
    b_getval(rozdd, R),
    arg(2, R, HashTbl),
    (   get_assoc(X, HashTbl, V) -> true
    ;   put_assoc(X, HashTbl, V, HashTbl0),
        setarg(2, R, HashTbl0)
    ).


Kuniaki Mukai

Jan Wielemaker

unread,
Jan 31, 2018, 3:08:15 AM1/31/18
to Kuniaki Mukai, Harry Stoteles, SWI-Prolog
On 31/01/18 08:45, Kuniaki Mukai wrote:
>
> Hi  Harry,
>
> Thank you for clarification, which has made me clear
> on what I had in mind  vaguely on difference between
> hash and assoc. Many thanks.
>
> This is my current use of term_hash/4 for ROZDD building.
>
> %
> init_hash(N, Hash):-
>  length(L, N),
>  maplist(=([]), L),
>  Hash=..[(#)|L].
> %
> get_bdd_hash(X, V):-
> b_getval(rozdd, R),
> arg(2, R, HashTbl),
> functor(HashTbl, _, S),
> term_hash(X, 3, S, I0),
> succ(I0, I),
> arg(I, HashTbl, U),
> (memberchk(X-V, U) -> true
> ;setarg(I, HashTbl, [X-V| U])
> ).
>
> Possible input key X is limited to only the following three simple key
> terms.
>
> 1.    if(X, I, J)
> 2.    meet(I, J)
> 3.    sum(I, J)
>
>  where X, I, J are positive integers supposed to be  1 =< X =< 2000, and
>    1 =< I, J =< 10^7.
>
> I think there may be a suitable hash function
> for such  purpose, but for which I have no idea.

The code above is pretty much ok. I'd typically represent the hash
table with a main term hash(Entries, Size) and update Size. If size
exceeds the size of Entries, allocate a new Entries of twice the
size and migrate all the values you have. That makes your insert
a little slower but avoids the need to estimate the required number
of entries in the table. term_hash/4 is quite suitable as a hash
function. Its operation is based on the MurmurHash function that
is used for all internal hashing.
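
A minimal sketch of that scheme (illustrative naming, not Jan's actual code; keys are assumed ground, and ht_put/3 assumes the key is not yet present):

% ht(Buckets, Count): Buckets is a b/N compound of bucket lists,
% Count the number of stored entries.
ht_new(ht(Buckets, 0)) :-
    length(L, 4),                       % a small initial size is fine
    maplist(=([]), L),
    Buckets =.. [b|L].

ht_put(HT, Key, Value) :-
    HT = ht(Buckets0, C0),
    functor(Buckets0, _, S0),
    (   C0 >= S0                        % table too crowded: double it
    ->  ht_grow(HT),
        arg(1, HT, Buckets)
    ;   Buckets = Buckets0
    ),
    functor(Buckets, _, S),
    term_hash(Key, 3, S, I0),
    I is I0 + 1,
    arg(I, Buckets, Bucket),
    setarg(I, Buckets, [Key-Value|Bucket]),
    C is C0 + 1,
    setarg(2, HT, C).

ht_grow(HT) :-
    arg(1, HT, Old),
    functor(Old, _, S0),
    S is S0*2,
    length(L, S),
    maplist(=([]), L),
    New =.. [b|L],
    Old =.. [_|Buckets],
    migrate_buckets(Buckets, New, S),
    setarg(1, HT, New).

migrate_buckets([], _, _).
migrate_buckets([B|Bs], New, S) :-
    migrate_pairs(B, New, S),
    migrate_buckets(Bs, New, S).

migrate_pairs([], _, _).
migrate_pairs([K-V|Ps], New, S) :-
    term_hash(K, 3, S, I0),
    I is I0 + 1,
    arg(I, New, B0),
    setarg(I, New, [K-V|B0]),
    migrate_pairs(Ps, New, S).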

> The  posted comparison of mine  is on the following replacement of the
> get_bdd_hash/2
> with using library(assoc).
>
> init_hash(_, Hash):- empty_assoc(Hash).
>
>
> get_bdd_hash(X, V):-
> b_getval(rozdd, R),
> arg(2, R, HashTbl),
> (get_assoc(X, HashTbl, V) -> true
> ;put_assoc(X, HashTbl, V, HashTbl0),
> setarg(2, R, HashTbl0)
> ).

That will always lose against the above, both in terms of performance
and memory usage. If you use the latest version from git, get_assoc/3
relies on a C-based implementation, which may change things somewhat.

Cheers --- Jan


Kuniaki Mukai

unread,
Jan 31, 2018, 5:09:51 AM1/31/18
to Jan Wielemaker, Harry Stoteles, SWI-Prolog
Hi Jan,

One question: are you saying that you call term_hash/4 with
a fixed big number (e.g. 2147483647, the maximum range allowed) as the hash range?
If so, I can understand your idea; it is similar to the one I use for the ROZDD array,
which starts with size 0.

Otherwise, I am lost.

Kuniaki Mukai

Jan Wielemaker

unread,
Jan 31, 2018, 5:18:13 AM1/31/18
to Kuniaki Mukai, Harry Stoteles, SWI-Prolog
On 01/31/2018 11:09 AM, Kuniaki Mukai wrote:
> Hi Jan,
>
> One question:  Are you saying that  you call term_hash/4 with
> a fixed big number (e.g. 2147483647 ; the max range allowed) for hash
> range  ?
> If so I could  understand  your idea, similar idea to which I use for
> array for ROZDD starting
> with size 0.
>
> Otherwise, I am lost.

I'm lost a little as well. The idea of an insert is:

    if number of elements in the hash > number of buckets
        resize the table
    insert into the table

You call term_hash/4 with a range that equals the number of buckets
in your table (the arity of your #/N term).
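
For example (an illustrative query; I0 lies in 0..7, so I0+1 is a valid argument index of the #/8 term):

?- functor(Tbl, #, 8),
   term_hash(meet(3,7), 3, 8, I0),
   I is I0 + 1,
   arg(I, Tbl, _Bucket).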

Cheers --- Jan


>
> Kuniaki Mukai
>
>> On Jan 31, 2018, at 17:08, Jan Wielemaker <j...@swi-prolog.org
>>>>>> <http://xlog.ch/>>
>>>>>> toswi-prolog+...@googlegroups.com <http://googlegroups.com/>.
>>>>>>            Visit this group
>>>>>> athttps://groups.google.com/group/swi-prolog
>>>>>>            <https://groups.google.com/group/swi-prolog>.
>>>>>>            For more options,
>>>>>> visithttps://groups.google.com/d/optout
>>>>>>            <https://groups.google.com/d/optout>.
>>>>>
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "SWI-Prolog" group.
>>>> To unsubscribe from this group and stop receiving emails from it,
>>>> send an email toswi-prolog...@googlegroups.com
>>>> <mailto:swi-prolog+...@googlegroups.com><mailto:swi-prolog+...@googlegroups.com>.
>>>> Visit this group athttps://groups.google.com/group/swi-prolog.
>>>> For more options, visithttps://groups.google.com/d/optout.
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "SWI-Prolog" group.
>>> To unsubscribe from this group and stop receiving emails from it,
>>> send an email toswi-prolog...@googlegroups.com
>>> <mailto:swi-prolog+...@googlegroups.com><mailto:swi-prolog+...@googlegroups.com>.
>>> Visit this group athttps://groups.google.com/group/swi-prolog.
>>> For more options, visithttps://groups.google.com/d/optout.

Harry Stoteles

unread,
Jan 31, 2018, 5:32:43 AM1/31/18
to SWI-Prolog
The usual approach (in Java) is not to provide a predicate
term_hash/4 with a range parameter, but to return a
large hash value, which is then moded onto the range via:

      InRange is HashValue mod Range

Or, if the Range is a power of 2, you can also do:

     InRange is HashValue /\ (Range - 1)

But it's also possible to have a predicate term_hash/4.
It depends a little on whether you also store the hash value
inside the hash table, and whether you store the moded
or the large hash value. If you store the large hash value
in the table, then you don't need to recalculate it
each time you resize the hash table, but the table
then obviously needs more space. Otherwise you
simply recalculate during a resize.

One further advantage of not providing a predicate
term_hash/4 and doing the modding later is that you
can easily add some extra scrambling. I effectively do:

     InRange is murmur(HashValue) /\ (Range - 1)

That's a little separation of concerns. But if your
term_hash/4 already scrambles enough, you
might not need such an extra step.
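
Sketched in Prolog (illustrative naming, not an existing library): store the large hash value with each entry, so a resize can re-bucket without hashing again.

% Insert with the large hash kept in the entry.
put_with_hash(Buckets, Key, Value) :-
    term_hash(Key, 3, 2147483647, H),   % large-range hash, moded below
    functor(Buckets, _, S),
    I is H mod S + 1,
    arg(I, Buckets, Old),
    setarg(I, Buckets, [e(H,Key,Value)|Old]).

% On resize, reuse the stored H instead of recomputing it.
rebucket(e(H, Key, Value), NewBuckets) :-
    functor(NewBuckets, _, S),
    I is H mod S + 1,
    arg(I, NewBuckets, Old),
    setarg(I, NewBuckets, [e(H,Key,Value)|Old]).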

Kuniaki Mukai

unread,
Jan 31, 2018, 5:40:35 AM1/31/18
to Jan Wielemaker, Harry Stoteles, SWI-Prolog

Hi Jan,

I am sorry but I am lost.

> On Jan 31, 2018, at 19:18, Jan Wielemaker <j...@swi-prolog.org> wrote:
>
> On 01/31/2018 11:09 AM, Kuniaki Mukai wrote:
>> Hi Jan,
>> One question: Are you saying that you call term_hash/4 with
>> a fixed big number (e.g. 2147483647 ; the max range allowed) for hash range ?
>> If so I could understand your idea, similar idea to which I use for array for ROZDD starting
>> with size 0.
>> Otherwise, I am lost.
>
> I'm lost a litte as well. The idea of an insert is
> if number of elements in the hash > number of buckets

Aren't the following two conditions equivalent?

1. number of elements in the hash > number of buckets

2. at least one bucket has two different elements.

I am reading a Wikipedia page on hashing.

Kuniaki Mukai

Jan Wielemaker

unread,
Jan 31, 2018, 5:47:33 AM1/31/18
to Kuniaki Mukai, Harry Stoteles, SWI-Prolog
On 01/31/2018 11:40 AM, Kuniaki Mukai wrote:
>> I'm lost a litte as well. The idea of an insert is
>> if number of elements in the hash > number of buckets
> Aren’t the the following two conditions equivalent ?
>
> 1. number of elements in the hash > number of buckets
>
> 2. At least one bucket has two different elements.

Nope. Only if the distribution were perfect. Normally, if the number
of objects in the table equals the number of buckets, you'll have some empty
buckets and some holding a list of a couple of elements. Anyway, you do
not want to walk along all buckets on every insert to see whether the table
is too crowded, so you compare the number of elements to the
number of buckets and hope the distribution is fairly ok. The average
number of objects per bucket that triggers a resize of the table is a
design choice between performance and space.

Cheers --- Jan

Harry Stoteles

unread,
Jan 31, 2018, 5:48:21 AM1/31/18
to SWI-Prolog
Doing it as below is a little faster: you don't murmur all
the time. You use some cheap algorithm to get a large hash value,
and the murmur then only works on a word or a double word, that's all.
Java also uses some more tricks; the cheap large hash value is
cached inside a Java String:

/** Cache the hash code for the string */
private int hash; // Default to 0

So rehashing strings is especially cheap, since you readily
have the large hash value. For BigInteger this is not done in
Java. In a Prolog system one could do this for compounds,
maybe imposing some groundness constraints, fully ground or ground
to some level. But I didn't do this; I don't have a clear picture
of a solution yet for how the constraints would be handled. What exactly do you
hash, Kuniaki Mukai? I didn't pay attention: do you use
term_hash/4 on the tree or on some signature of it?

Ok, you wrote:


Kuniaki Mukai wrote:
> Possible input key X is limited to only the following three simple key terms.
> 1.    if(X, I, J)
> 2.    meet(I, J)
> 3.    sum(I, J)

So I and J are trees? Hm, is there some literature about hashing
trees that represent boolean functions? That is a strange problem,
somehow related to the SAT solving problem itself. Imagine you had
a hash with hash(I) = 0 if I is unsatisfiable; this would
save a lot of time. He he. The ROZDD paper contains some
histograms of the ROZDD size. What would be interesting would
be some histograms of the hash values; that would help the
discussion, I guess.

Harry Stoteles

unread,
Jan 31, 2018, 5:54:14 AM1/31/18
to SWI-Prolog
If you double the hash table size, i.e. go from Range to
Range' = 2*Range, the histogram changes. Usually the threshold
for this move is not that the hash table is already completely
full, i.e. has Range many entries; usually something lower
is used, because this increases the probability that the hash table
doesn't contain collisions.

This is the fill factor of a hash table. Java uses something like 70%,
I use around 75%, which is implemented as follows:

size++;
if (size > table.length * 3 / 4)
    resize(table.length * 2);

If you don't resize, but initialize a hash table with a fixed size, you should maybe
also respect some fill factor, to lower the probability of a collision. So
if you are expecting 1000 elements, maybe use a hash table of size
1333 = 1000*4/3.

Harry Stoteles

unread,
Jan 31, 2018, 6:00:09 AM1/31/18
to SWI-Prolog
You handle collisions here:


Kuniaki Mukai wrote:
> arg(I, HashTbl, U),
> ( memberchk(X-V, U) -> true
> ; setarg(I, HashTbl, [X-V| U])
> ).

A collision happens when the same InRange hash value,
your variable I, is found for two different elements.
So if you have Element1 and Element2 with:

    InRange1 = InRange2

This is a bad situation, because then you will need
memberchk, which is a linear search, i.e. O(n).
If you have a lot of collisions, your
hashtable degrades from O(1) to O(n). There are two
ways to avoid collisions:
- some fill factor < 100% for the hash table
- good scrambling properties of the hash function

If the fill factor is near 100%, then you have the
Tokyo train effect: the probability that you will
stand on some other passenger's feet is much
higher when the train is crowded.

Kuniaki Mukai

unread,
Jan 31, 2018, 6:07:41 AM1/31/18
to Jan Wielemaker, Harry Stoteles, SWI-Prolog
Hi Jan,

Thanks. I got it.

I will try your idea.

Kuniaki Mukai

Harry Stoteles

unread,
Jan 31, 2018, 6:10:40 AM1/31/18
to SWI-Prolog
Or maybe examine the bouquet size, i.e. the length
of the list U, to decide when to resize your hashtable.
But then you might need to resize indefinitely when
the hash function is degenerate.

Another approach is to use a tree for the
bouquets instead of a list; then you would no longer
use memberchk to search the bouquet, but
library(assoc), so to speak. This is done in the
newest version of Java's HashMap, since they cannot
avoid the problem of a degenerate hash function.
It is a problem of the end-user of the HashMap, who
might have keys with degenerate hash values.

In the newest Java HashMap the decision whether to use
memberchk or library(assoc), so to speak, is made based
on the bouquet size. This is controlled by
two constants:
static final int TREEIFY_THRESHOLD = 8;
static final int UNTREEIFY_THRESHOLD = 6;

Quite complex code meanwhile:
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/HashMap.java/

Harry Stoteles

unread,
Jan 31, 2018, 6:14:02 AM1/31/18
to SWI-Prolog
Corr.:
Java uses the following fill factor:

static final float DEFAULT_LOAD_FACTOR = 0.75f;

But I like the integer code from my previous post, since it can be
implemented with arithmetic shifts. The x * 3 / 4 is just:

     ((x << 1) + x) >> 2

No floats needed.

Kuniaki Mukai

unread,
Jan 31, 2018, 6:14:16 AM1/31/18
to Harry Stoteles, SWI-Prolog
Hi Harry,

Kuniaki Mukai wrote:
> Possible input key X is limited to only the following three simple key terms.
> 1.    if(X, I, J)
> 2.    meet(I, J)
> 3.    sum(I, J)

So I and J are trees?

No. X, I, J are not trees but integers, as I wrote.


Kuniaki Mukai


Harry Stoteles

unread,
Jan 31, 2018, 6:23:18 AM1/31/18
to SWI-Prolog
Are they generated just as you go? How are these integers
generated? Maybe you don't even need term_hash/4;

just make up some function over the given term. For example,
here is a full-value hash code generator:

my_hash_code(if(X, I, J), K) :- H is X*31+I, K is H*31+J.
my_hash_code(meet(I, J), K) :- K is I*31+J.
my_hash_code(sum(I, J), K) :- K is I*31+J.

Maybe this is faster... You still need to mod it,
and/or murmur it.

Kuniaki Mukai

unread,
Jan 31, 2018, 6:26:14 AM1/31/18
to Harry Stoteles, SWI-Prolog
Hi Harry,

Thanks. Great. I will also try your idea, together with Jan's.

Kuniaki Mukai



Harry Stoteles

unread,
Jan 31, 2018, 6:29:06 AM1/31/18
to SWI-Prolog
The factor is interesting, since it can again be implemented
with bit shifts; x*31 is the same as:

     (x << 4) + (x << 3) + (x << 2) + (x << 1) + x

And you see that no bits get lost, so the lower bits of the argument
stay in the hash value.

But I guess it was chosen more for ASCII values. Another
suggestion for combining hash values is to XOR them, so

you would compute:

    my_hash_code(sum(I, J), K) :- K is xor(I, J).
    https://stackoverflow.com/q/5889238/502187

xor/2 is part of the ISO standard:

     https://www.complang.tuwien.ac.at/ulrich/iso-prolog/dtc2#xor

With my_hash_code/2 you are free to experiment.

Harry Stoteles

unread,
Jan 31, 2018, 6:32:29 AM1/31/18
to SWI-Prolog
> 2.    meet(I, J)
> 3.    sum(I, J)

Maybe order the arguments: only compute meet or sum
when I < J, so you don't need to compute both meet(I, J) and meet(J, I).

Not sure whether this is relevant, but it would halve the
size of your hashtable.
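
A one-clause-per-shape sketch of that normalization (illustrative naming), applied to a key before any table lookup:

norm_key(meet(I, J), meet(A, B)) :- !, msort([I, J], [A, B]).
norm_key(sum(I, J),  sum(A, B))  :- !, msort([I, J], [A, B]).
norm_key(Key, Key).

msort/2 keeps duplicates, so meet(7,3) and meet(3,7) both normalize to meet(3,7), while meet(5,5) stays as it is.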

Kuniaki Mukai

unread,
Jan 31, 2018, 6:35:15 AM1/31/18
to Harry Stoteles, SWI-Prolog

On Jan 31, 2018, at 20:32, Harry Stoteles <in...@xlog.ch> wrote:

> 2.    meet(I, J)
> 3.    sum(I, J)

Maybe order arguments. Only compute meet or sum,
when I<J. Don't need to compute meet(I, J) and meet(J, I).


Yes.  In fact, I took care of that commutativity.  

Kuniaki Mukai






Harry Stoteles

unread,
Jan 31, 2018, 6:45:33 AM1/31/18
to SWI-Prolog
:-) I fear, with all these tricks, you will soon
compute problem(9, P) before Tokyo 2020.

Kuniaki Mukai

unread,
Jan 31, 2018, 10:42:29 AM1/31/18
to Jan Wielemaker, Harry Stoteles, SWI-Prolog
Hi Jan,

Iterated extension of the hash table seems to work for my ROZDD builder, too.
It runs roughly twice as fast as the previous version, whose hash table had a fixed size.

[2018/02/01]
% ?- profile((problem(6, P), time(solc(P, R)))).
%@ % 17,883,718 inferences, 2.406 CPU in 2.461 seconds (98% CPU, 7431481 Lips)
%@ R = 23040.

% ?- profile((problem(7, P), time(solc(P, R)))).
%@ % 336,967,427 inferences, 49.877 CPU in 50.751 seconds (98% CPU, 6755987 Lips)
%@ R = 1451520.

I am not sure I understand exactly what you suggest, but I set
the trigger condition for hash table extension as follows, without much confidence:

number of entries > 2 * number of buckets.

Now I need not care about the initial hash table size, which I like very much.


get_bdd_hash(X, V):-
    b_getval(rozdd, R),
    arg(2, R, HashTbl),
    functor(HashTbl, _, S),
    term_hash(X, 3, S, I0),
    succ(I0, I),
    arg(I, HashTbl, U),
    (   memberchk(X-V, U) -> true
    ;   setarg(I, HashTbl, [X-V| U]),
        b_getval(card_of_hash_entries, C),
        succ(C, C0),
        b_setval(card_of_hash_entries, C0),
        (   U = []
        ->  b_getval(card_of_hash_buckets, B),
            succ(B, B0),
            b_setval(card_of_hash_buckets, B0)
        ;   check_and_rehash
        )
    ).

check_and_rehash:-
    b_getval(rozdd, R),
    arg(2, R, HashTbl),
    b_getval(card_of_hash_buckets, B),
    b_getval(card_of_hash_entries, C),
    (   C > B+B
    ->  iterated_hash_extending(HashTbl, HashTbl0, C0, B0),
        setarg(2, R, HashTbl0),
        b_setval(card_of_hash_entries, C0),
        b_setval(card_of_hash_buckets, B0)
    ;   true
    ).

%
iterated_hash_extending(H, H0, CE, CB):-
    extend_hash(H, H1, CE0, CB0),
    (   CE0 < CB0 + CB0 -> CE = CE0, CB = CB0, H0 = H1
    ;   iterated_hash_extending(H1, H0, CE, CB)
    ).
%
extend_hash(H, H0, CE, CB):-
    functor(H, F, N),
    plus(N, N, N0),
    length(L, N0),
    maplist(=([]), L),
    H0 =.. [F|L],
    migrate_hash(H, H0, CE, CB).
%
migrate_hash(H, H0, CE, CB):-
    functor(H, _, N),
    migrate_hash(N, H, H0, 0, CE, 0, CB).
%
migrate_hash(0, _, _, CE, CE, CB, CB):- !.
migrate_hash(I, H, H0, CE, CE0, CB, CB0):-
    arg(I, H, B),
    migrate_bucket(B, H0, CE, CE1, CB, CB1),
    succ(I0, I),
    migrate_hash(I0, H, H0, CE1, CE0, CB1, CB0).
%
migrate_bucket([], _, CE, CE, CB, CB).
migrate_bucket([X-V|U], H0, CE, CE0, CB, CB0):-
    functor(H0, _, S),
    term_hash(X, 3, S, K),
    succ(K, K0),
    arg(K0, H0, D),
    setarg(K0, H0, [X-V|D]),
    succ(CE, CE1),
    (   D = [] -> succ(CB, CB1) ; CB1 = CB ),
    migrate_bucket(U, H0, CE1, CE0, CB1, CB0).

Thank you for your iterative deepening wisdom !!

Kuniaki Mukai



Jan Wielemaker

unread,
Jan 31, 2018, 10:56:19 AM1/31/18
to Kuniaki Mukai, Harry Stoteles, SWI-Prolog
On 01/31/2018 04:42 PM, Kuniaki Mukai wrote:
> Hi Jan,
>
> Iterated extending hash tables seems to work also for my rozdd builder.
> Roughly it runs twice faster than the previous version, whose hash table has a fixed size.

Which means it was too small.

> [2018/02/01]
> % ?- profile((problem(6, P), time(solc(P, R)))).
> %@ % 17,883,718 inferences, 2.406 CPU in 2.461 seconds (98% CPU, 7431481 Lips)
> %@ R = 23040.
>
> % ?- profile((problem(7, P), time(solc(P, R)))).
> %@ % 336,967,427 inferences, 49.877 CPU in 50.751 seconds (98% CPU, 6755987 Lips)
> %@ R = 1451520.
>
> I am not sure I understand exactly what you suggests, but I set
> the trigger condition of hash table extension as follows without confidence.
>
> number of entries > 2 * number of buckets.

The factor is a tradeoff between speed and memory. I'd experiment a
little to see when performance starts dropping.

> Now I need not care of hash table initial size, which I like it very much.

That is the idea. Even using a really small value has little impact as
you have at most lg(N) resizes.

> check_and_rehash:- b_getval(rozdd, R),
> arg(2, R, HashTbl),
> b_getval(card_of_hash_buckets, B),
> b_getval(card_of_hash_entries, C),

I'd put these two values in the hash table compound term and use
arg/3 and setarg/3 on them. That keeps things local while
also being faster.
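
A tiny sketch of that layout (names are illustrative): the counters become extra arguments of the table term, so no global variables are needed.

% Table = ht(Buckets, NEntries, NNonEmptyBuckets)
bump_counter(Table, ArgIndex) :-
    arg(ArgIndex, Table, C0),
    C is C0 + 1,
    setarg(ArgIndex, Table, C).

% e.g. bump_counter(Table, 2) after adding an entry,
% and bump_counter(Table, 3) when a bucket first becomes non-empty.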

Cheers --- Jan

Harry Stoteles

unread,
Jan 31, 2018, 11:55:40 AM1/31/18
to SWI-Prolog
If somebody would remove me from some blacklist, I could post
again under my normal name, Jan Burse, and my posts wouldn't
get automatically deleted. If I assume number of entries = size (from
my code) and number of buckets = table.length (from my
code), then your trigger below corresponds to a fill factor of 200%.
But usually, perhaps paradoxically, fill factors < 100% are used, since
there is always a slight chance of collisions; for a fill
factor of 75% you would use this condition:

     number of entries > number of buckets * 3 / 4.

You can also use a low and a high water mark, so that if you
also delete from the hashtable, you can resize it to shrink.
The usual java.util.HashMap doesn't do shrinking, so pretty
soon one starts developing one's own data structures...

Harry Stoteles

unread,
Jan 31, 2018, 12:12:33 PM1/31/18
to SWI-Prolog
Synonymous with "fill factor" is "load factor". You don't give
away much memory if you use a load factor < 100%;
it's only the table, not the non-empty bouquet lists
themselves. So a fill factor < 100% doesn't cost much,
but you will certainly get fewer collisions, and thus most
probably fewer bouquet lists with more than one item.

The goal is to have each bouquet list either empty or
holding at most one item. If a bouquet list already has two
items, a small search is involved. With a fill factor of
200% you have many bouquet lists that already have
two items or more. But it won't cost you much to reduce
the fill factor and use a bigger table; on the other hand
you will gain some speed.

The best would be to make a histogram and look at the
distribution, and experiment that way. A histogram would
just show you each bouquet and its length; the
bins would be the hash codes:
https://www.mathsisfun.com/data/histograms.html

Peter Ludemann

unread,
Jan 31, 2018, 12:37:23 PM1/31/18
to SWI-Prolog
This thread appears to be turning into a tutorial on hash tables. For that, you might want to look at Python's implementation of "dictionaries" (or "dicts"), because Python uses dicts everywhere, including the local and global symbol tables. (When I gave a quick description of Python's implementation to a fellow Prolog-er, he remarked that "it's dicts all the way down".)
Here's a slightly out-of-date tutorial on Python's implementation of dicts: https://www.laurentluce.com/posts/python-dictionary-implementation/ and there are probably other tutorials which you can find with your favourite search engine. Also, there have been some recent improvements to the algorithms, for example keeping track of insertion order so that iteration is deterministic.


Jan Wielemaker

unread,
Jan 31, 2018, 12:57:15 PM1/31/18
to Peter Ludemann, SWI-Prolog
On 31/01/18 18:36, Peter Ludemann wrote:
> ​This thread appears to be turning into a tutorial on hash tables. For
> that, you might want to look at Python's implementation of
> "dictionaries" (or "dicts"), because Python uses dicts everywhere,
> including the local and global symbol tables (when I gave a quick
> description of Python's implementation to a fellow Prolog-er, he
> remarked that "it's dicts all the way down")
> Here​'s a slightly out-of-date tutorial on Python's implementation of
> dicts:
> https://www.laurentluce.com/posts/python-dictionary-implementation/ and
> there are probably other tutorials which you can find with your
> favourite search engine. Also, there have been some recent improvements
> to the algorithms, for example keeping track of insertion order so that
> iteration is deterministic.

I think it is ok to have a discussion here on how to implement hash tables
in (SWI-)Prolog. Hash tables are not compatible with pure logical
data structures, and therefore their implementation in Prolog is a
little different from what one finds in the literature. It would be
nice if we could restrict this to those differences though, instead
of endless stuff about Java hash functions.

Cheers --- Jan

Jan Wielemaker

unread,
Jan 31, 2018, 1:00:41 PM1/31/18
to Harry Stoteles, SWI-Prolog
On 31/01/18 17:55, Harry Stoteles wrote:
> If somebody would remove me from some blacklist, I could post
> again under my normal name, Jan Burse, and my posts wouldn't
> get automatically deleted.

I didn't blacklist anyone, but apparently Google decided to do
so. Dunno why. Possibly too many self-replies :)

Cheers --- Jan

Peter Ludemann

unread,
Jan 31, 2018, 1:16:46 PM1/31/18
to Jan Wielemaker, SWI-Prolog
Oh, I wasn't trying to stop discussion ... just wanted to point out a place where a fair amount of thought has already gone into the implementation. If O(1) algorithms can be used, that would remove a big objection to using Prolog, where N log(N) algorithms often seem to be the best available.

Isn't SWI-Prolog's "dict" compatible with pure logic programming? Presumably its implementation could change to something more efficient if needed.
Which raises the question: how important is it for this particular problem that the assoc (or hash) be fast? A bit of profiling might reveal that it's a minor detail.
- p 

 



Jan Wielemaker

unread,
Jan 31, 2018, 1:28:47 PM1/31/18
to Peter Ludemann, SWI-Prolog
I think Kuniaki already went through the steps verifying that this was the
bottleneck. If you want to stay in the pure logical paradigm, where the
only possible change to a data structure is instantiating it (further),
you are bound to O(log(N)) (I think). Modern Prolog systems have
backtrackable and non-backtrackable assignment, which allows implementing
data structures that are O(1), at the price of destructive change to the
data structure.

Dicts stay within the normal Prolog world and are in fact completely
copied on an insert or change, i.e., O(N). Their lookup is O(log(N)).
The way they are implemented, though, comes with a low constant factor:
insertion finds the insertion point (O(log(N))) and then requires a
memcpy() (actually two, copying before and after the insertion point),
and the tree search compares simple integers during its descent.

It seems inevitable that Prolog comes with more choices than
other high-level languages when it comes to data structures :(

Cheers --- Jan

Harry Stoteles

unread,
Jan 31, 2018, 1:35:20 PM1/31/18
to SWI-Prolog
Dict doesn't imply hash table.

You see this in the SWI-Prolog implementation: there it's an array
of pairs, a key-sorted array packed into a compound. Dict is a
hypernym; it doesn't say more than "key-value pairs". And there are
a dozen data structures that implement key-value pairs:

- Trees
- Hash tables
- Arrays (as in SWI-Prolog)
- Etc., what else

Arrays make perfect sense for small dicts. But when you have
a couple of thousand entries, as in the case of Kuniaki Mukai, you
need something more than plain lists. There is a library(assoc)
in SWI-Prolog, which provides tree-based dicts, AVL trees:

A.3 library(assoc): Association lists

http://www.swi-prolog.org/pldoc/man?section=assoc

Other Prolog systems natively have
hash tables; here is an example:

ECLiPSe Prolog - library(hash)
http://www.eclipseclp.org/doc/bips/lib/hash/index.html

It seems that SWI-Prolog doesn't have such a library; at least I don't
find a predicate hash_add/3. Maybe there is some pack somewhere.
Now Kuniaki Mukai is rolling his own hash table, directly in Prolog.
Done in Prolog by use of the predicate setarg/3, as Kuniaki Mukai is
doing it, the data structure becomes backtrackable. He might also
think about doing it non-backtrackably, but backtrackable is safer,
since variable numbering in his solver is probably also backtrackable.
The ECLiPSe hash table is not backtrackable; at least the ECLiPSe
Prolog documentation says:

Chapter 10
Non-logical Storage and References
-  ECLiPSe provides several facilities to store information across backtracking.
http://www.eclipseclp.org/doc/userman.pdf

But using setarg is only seemingly safer. It is less safe than the AVL tree
implementation. The AVL tree implementation is the cleanest and most
logical implementation, since after an operation a new term is returned.
For a hash table this is not really feasible, so the setarg solution is a
compromise; as a result one cannot do the same things as with
AVL trees, for example use multiple versions of the same dict
in the same Prolog continuation.

Harry Stoteles

unread,
Jan 31, 2018, 1:46:17 PM1/31/18
to SWI-Prolog
Given your explanation of instantiation-only data structures, I wonder
whether Kuniaki could use this data structure instead:

      table = compound(X1,...,Xn)

with X1,..,Xn variables, and

      bouquet = [E1,...,En|R]

So a bouquet could grow by instantiating R, and a new non-empty
bouquet could be added to the table by instantiating Xi.
But this doesn't give a multi-version data structure. For a multi-version
data structure, as the AVL tree is, you would need to make copies
of the table. I was once thinking of just using a two-level table:

     table = compound(compound(X11,..,X1n),..,compound(Xn1,..,Xnm))

So when "setting" a bouquet change, you would only need to copy
one sub-compound and the main compound. "Insert" would get more
expensive, but it would be completely logical and multi-version capable.
Instantiation-only data structures, on the other hand, are not really
multi-version capable: if you instantiate a variable, you get a side effect
in the continuation, so you need to copy something to get multiple versions.

Harry Stoteles

unread,
Jan 31, 2018, 1:51:19 PM1/31/18
to SWI-Prolog
Ok, there is already a pack:

Title: Pure and impure hash tables
Author: Gergö Barany

This library provides two implementations of hash tables in Prolog.
Hash tables store key-value pairs and provide predicates to add,
look up, remove entries as well as conversions to and from lists
and higher order predicates for mapping, folding, and iterating
over hash tables. The hash table variants differ in their approach to
purity, i.e., what kind of side effects their implementation uses.

http://www.swi-prolog.org/pack/list?p=hashtbl

Harry Stoteles

unread,
Jan 31, 2018, 7:59:10 PM1/31/18
to SWI-Prolog
The Python link describes an interesting hash table variant,
"open addressing". I guess it doesn't make the hash table faster;
it only reduces the memory consumption, in that the bouquets
are created inside the table itself rather than as extra lists, by
spreading from the hash value to other hash values. It is also
described here, where a much lower load factor of ca. 50% is
recommended for open addressing:

Hashing – Collision Resolution
https://www.cs.auckland.ac.nz/courses/compsci105s2c/lectures/Damir/L26_Hashing2_4SPS.pdf

Harry Stoteles

unread,
Jan 31, 2018, 8:04:17 PM1/31/18
to SWI-Prolog
What you mean by "keeping track of insertion order so that
iteration is deterministic", I don't understand completely.
There are hash table variants with an extra list overlaid,
so that the input order is preserved; in Java it's called LinkedHashMap:

https://docs.oracle.com/javase/8/docs/api/java/util/LinkedHashMap.html

If you then iterate over the input-order overlay, you
get the input order back. Without such an overlay,
a hash table spills out the elements in a kind of garbage
order, that's true. If you want a different order, you
can use a tree with some comparator, or the linked
hash table, which will give you back the input order.

Kuniaki Mukai

unread,
Jan 31, 2018, 10:37:52 PM1/31/18
to SWI-Prolog

Hi Jan W.,

I have applied the rehashing to problem(8, _) with the trigger condition

 number of entries > 5 * (number of buckets)

and an initial hash table size of 512.

% ?- profile(time((problem(8, P), solc(P, R)))).
%@ hashtble_size=2048
%@ hashtble_size=8192
%@ hashtble_size=32768
%@ hashtble_size=131072
%@ hashtble_size=524288
%@ hashtble_size=2097152
%@ hashtble_size=8388608
%@ % 2,149,145,475 inferences, 760.609 CPU in 766.331 seconds (99% CPU, 2825558 Lips)
%@ ERROR: Out of global stack
%@ ^  Exception: (16) setup_call_catcher_cleanup(system:true, prolog_statistics:catch(bdd:(problem(8, _20), solc(_20, _22)), _134,  (report(t(1517453624.372568, 0.912051, 7337942), 10), throw(_134))), _444, prolog_statistics:(_156=true)) ? creep
%@ ^  Call: (18) call(prolog_statistics:(_156=true)) ? abort
%@ % Execution Aborted

I take this quick "Out of global stack" rather as a strong symptom that hashing is somehow essential for
ROZDD building. I find it difficult to catch up with the discussions on such advanced hashing topics,
but at the same time I feel I have now arrived at the central issue of building large ROZDDs in SWI-Prolog.
I will take time to think it over at my own pace, hoping to be in time for "Tokyo 2020".

Kuniaki Mukai


Harry Stoteles

unread,
Feb 1, 2018, 5:45:14 AM2/1/18
to SWI-Prolog
That's because you use a backtrackable data structure:
you accumulate all the hash tables in memory, and they will
probably not go away via the trail.

As soon as you do setarg, the compound is locked
and possibly can no longer be garbage collected.
That depends on the setarg implementation
and on whether the rest of your code is deterministic.

Harry Stoteles

unread,
Feb 1, 2018, 5:49:58 AM2/1/18
to SWI-Prolog
One solution would be to use nb_setarg for the hash ops, and
only one global backtrack when you enter solc.

So when you enter solc, you could establish an empty hashtable
via backtracking, but then you would need some ultra-fast
destructive hash table à la lib(hash) from ECLiPSe Prolog.
Resizes would then hopefully be garbage collected.

Jan Wielemaker

unread,
Feb 1, 2018, 5:55:03 AM2/1/18
to Harry Stoteles, SWI-Prolog
On 01/02/18 11:45, Harry Stoteles wrote:
> Thats because you use a backtrackable datastructure,
> you will commulate all the hashtable in memory, they will
> probably not go away on the trail.
>
> As soon as you do setarg, the compund will be locked
> and possibly not anymore be garbage collected.
> Depends on the setarg implementation
>
> and whether the rest of your code is derministic.

setarg/3 (and b_setval/2) are subject to garbage collection:
if multiple assignments happen between two choice points, only
the oldest is kept, and if the oldest predates the creation of
the structure it is deleted entirely.

It is a good idea to run ?- prolog_ide(thread_monitor). if you
have graphics, so you can follow the stack sizes graphically.

Cheers --- Jan

Harry Stoteles

unread,
Feb 1, 2018, 6:16:51 AM2/1/18
to SWI-Prolog
But you might not detect the situation when something is
already trailed and then a choice point is removed.

I read something in the SWI-Prolog documentation:

"
If multiple TrailAssignment() calls happen on the same address within a
choicepoint we only need to keep the first. Therefore we scan the trail
for this choicepoint from the mark to the top and mark (using the FIRST
mark) the (global stack) addresses trailed. If we find one marked we can
delete the trail entry. To avoid a second scan we store the marked
addresses on the argument stack.
"

This is the TrailAssignment() operation. But if you cut a choice point,
new opportunities to merge trail assignments open up.

I don't know whether this applies to Kuniaki Mukai's code; it depends
on where he puts his cuts. Best, I guess, is to use neck cuts, and not
to allow any code at all to run with a new choice point.


Harry Stoteles

unread,
Feb 1, 2018, 6:22:54 AM2/1/18
to SWI-Prolog
I am not sure whether if-then-else is a problem. Usually it introduces
a small local choice point, which is cut away; I am not sure

what exactly SWI-Prolog does. Kuniaki Mukai uses a lot of
if-then-else; if there were a setarg in the condition part of an

if-then-else, an ugly situation might already arise. The
condition part could contain an innocent-looking hash lookup...

Kuniaki Mukai

unread,
Feb 1, 2018, 6:28:47 AM2/1/18
to Harry Stoteles, SWI-Prolog

On Feb 1, 2018, at 19:49, Harry Stoteles <in...@xlog.ch> wrote:

On solution would be to use nb_setarg for the hash ops, and
only a global backtracking once, when you enter solc.



I am afraid that nb_setarg/3 is not applicable, because my hash entries may have
non-ground values, though I may be missing something in your comments.

?- X = f(A),  nb_setarg(1, X, h(C)), C=hello.
%@ X = f(h(_58)),
%@ C = hello.

Currently I am thinking about something to be called a "multiple-layered" assoc as
the hash table for ROZDDs. I am not sure it will work as well as expected.

Kuniaki Mukai

Harry Stoteles

unread,
Feb 1, 2018, 6:29:14 AM2/1/18
to SWI-Prolog
That's actually the standard pattern for how a hash is used
in an imperative programming language:

     value = hash.get(key);
     if (value == null) {
           value = expensive_computation_from_key(key);
           hash.put(key, value);
     }

Usually get() is read-only, without any assignment.
But Kuniaki Mukai uses setarg already during the get:


get_bdd_hash(X, V):-
    b_getval(rozdd, R),
    arg(2, R, HashTbl),
    functor(HashTbl, _, S),
    term_hash(X, 3, S, I0),
    succ(I0, I),
    arg(I, HashTbl, U),
    (   memberchk(X-V, U) -> true
    ;   setarg(I, HashTbl, [X-V| U])
    ).

Now it depends on how he does the "value == null" check:
does he simply check var(V)? Or is get_bdd_hash, which
already uses a setarg, called inside some choice point?

Harry Stoteles

unread,
Feb 1, 2018, 6:35:39 AM2/1/18
to SWI-Prolog
The usual imperative programming pattern is not
optimal, since the hash is searched twice, during get
and during put. Therefore new APIs use a more
functional programming pattern, where the

"expensive_computation_from_key" can be given as
a closure, and the hash is searched only once.

This is seen here:

public V computeIfAbsent(K key,
                         Function<? super K,? extends V> mappingFunction)
If the specified key is not already associated with a
value (or is mapped to null), attempts to compute its
value using the given mapping function and enters
it into this map unless null.

https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#computeIfAbsent-K-java.util.function.Function-
Using this pattern would save some space, since
it would safe an uninstantiated variable, in the
case of Kuniaki Mukai. Returning uninstantiated
variables can add up space.
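
A hedged Prolog analogue (lookup_or_compute/4 is my invention): the bucket is searched once, and Goal computes the value only on a miss.

lookup_or_compute(Buckets, Key, Value, Goal) :-
    functor(Buckets, _, S),
    term_hash(Key, 3, S, I0),
    I is I0 + 1,
    arg(I, Buckets, Bucket),
    (   memberchk(Key-Value, Bucket)
    ->  true
    ;   call(Goal, Key, Value),          % compute before storing
        setarg(I, Buckets, [Key-Value|Bucket])
    ).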

Harry Stoteles

unread,
Feb 1, 2018, 9:45:59 AM2/1/18
to SWI-Prolog
There might be a problem with the computation of the
key figures that are needed for load-factor balancing. You
use the following code:


get_bdd_hash(X, V):-
    b_getval(rozdd, R),
    arg(2, R, HashTbl),
    functor(HashTbl, _, S),
    term_hash(X, 3, S, I0),
    succ(I0, I),
    arg(I, HashTbl, U),
    (   memberchk(X-V, U) -> true
    ;   setarg(I, HashTbl, [X-V| U]),
        b_getval(card_of_hash_entries, C),
        succ(C, C0),
        b_setval(card_of_hash_entries, C0),
        (   U = []
        ->  b_getval(card_of_hash_buckets, B),
            succ(B, B0),
            b_setval(card_of_hash_buckets, B0)
        ;   check_and_rehash
        )
    ).

But usually this is done much more simply.
card_of_hash_entries (= size in my code) is computed correctly.
But card_of_hash_buckets is not table.length; for table.length you
don't need to track anything, you simply get it via:

     functor(HashTbl, _, S)

And you can then check (for a fill factor of 75%):

    C0 > S*3 div 4

I am not sure what you are computing with card_of_hash_buckets.
It will be something smaller than S, i.e. card_of_hash_buckets < S,
which explains why you used C0 > 5*card_of_hash_buckets (originally
C0 > card_of_hash_buckets + card_of_hash_buckets).

If you use S for resizing, you spare one variable, card_of_hash_buckets.
The usual hash table implementations use S for resizing. It might
be that card_of_hash_buckets is used in some implementations,
but card_of_hash_buckets works exactly the other way around
from what we want in a resizing criterion: the more clustering, the
lower card_of_hash_buckets will be. Check for yourself:

High clustering:

        A
        B [E1, E2, E3]
        C

       card_of_hash_buckets = 1

Low clustering:

        A [E1]
        B [E2]
        C [E3]

       card_of_hash_buckets = 3

But we want resizing on high clustering, not on low
clustering; you cannot trigger resizing on high clustering with a
condition based on card_of_hash_buckets, as you did.

What a fill-factor condition will do, on the other hand, is
avoid high clustering irrespective of whether there
really is any, namely by using this condition:

     C0 > S*3 div 4

If you want to use card_of_hash_buckets in a condition,
you could use:

     C0 > (S - card_of_hash_buckets) * factor

I don't know what the exact effect of using the key figures
the way you did was in the error. Looking at a hash table
graphically via a histogram would help you.

When you look at a hash table graphically as a histogram,
for example by feeding the data into an Excel sheet
and using the bar chart function there, you
would see everything, like high or low clustering.

Kuniaki Mukai

unread,
Feb 1, 2018, 10:46:21 AM2/1/18
to Harry Stoteles, SWI-Prolog

Hi Harry,

Thank you for looking into my code and commenting. I will
read your comments after I finish my current simple-minded
attempt at a hybrid use of a hash table with a very light-weight assoc.

Thanks.

Kuniaki Mukai

Kuniaki Mukai

unread,
Feb 1, 2018, 12:58:13 PM2/1/18
to Harry Stoteles, SWI-Prolog

I am new to the rehashing suggested by Jan W., as well as to the other discussions on hashing.
In fact I have coded the rehashing without knowing exactly what I am doing, and surely I agree that there are
strange points in my posted code. However, the posted code shows the best performance so far,
at least for Jan B.'s problem(K, _) with K = 5, 6, 7 (for 8 it hits the global stack error, but my previous
sat_count terminates after three days of running, as I posted).

Also I have made a quick try according to your comments. Strangely enough, the result
is only several times slower than the posted code. Maybe I am missing something.

Anyway, I need to take time to study rehashing for myself, so that I am confident about what I am doing.

Thank you for your interest, and sorry for the confusion.

Kuniaki Mukai



Kuniaki Mukai

unread,
Feb 2, 2018, 1:40:46 AM2/2/18
to SWI-Prolog
Hi, 

To avoid the rapid growth of the hash table, I tried
a hybrid use of hash and assoc for building large ROZDDs in SWI-Prolog. It turned out
not to work well, as others pointed out in this thread. In fact the "hybrid" version seems
not to terminate even for Jan B.'s benchmark problem(7, _), at least not within half an hour.

The "hybrid" here means that a bucket is an assoc tree, which is handled by
calling these lite predicates:

empty_assoc(-).

assoc(-, X, Y, t(X, Y, -, -)):- !.
assoc(t(X, Y, L, R), X, Y, t(X, Y, L, R)):- !.
assoc(t(U, V, L, R), X, Y, t(U, V, L0, R0)):-
    (   X @< U
    ->  R0 = R,
        assoc(L, X, Y, L0)
    ;   L0 = L,
        assoc(R, X, Y, R0)
    ).

The failure of this quick test has pushed me to take the rehashing route,
though it will take some time for a hobbyist like me.

Kuniaki Mukai



Harry Stoteles

unread,
Feb 2, 2018, 5:44:52 AM2/2/18
to SWI-Prolog
Probably you don't need a cache search at all. Did you ever measure the
number of cache hits and the number of cache misses?

If the number of cache hits is low, then you don't need a cache; you
only need a key-value table.

I suspect that the problem, which you notably attribute to me, and which
was indeed devised by me, doesn't generate any cache hits at all.

I wrote this already somewhere else: the problem has a lot of variables,
but very few common expressions. Very, very few; practically none.

Check for yourself...

Harry Stoteles

unread,
Feb 2, 2018, 5:55:59 AM2/2/18
to SWI-Prolog
What's the difference? Well, difficult to say. Where do you generate
the keys if there is a hash miss? I don't see the code for that.

Here is the difference:

Cache: I think this is basically your code, except that you don't
really have put and get, only one get that returns a variable:

     value_and_id = hash.get(key);
     if (value_and_id == null) {
           value = expensive_computation(key);
           id = gensym();
           value_and_id = tuple(value, id);
           hash.put(key, value_and_id);
     }

Table: this would be more direct; you could simply expand a
compound, no need to search:

    value = expensive_computation(key);
    id = gensym();
    value_and_id = tuple(value, id);
    hash.put(key, value_and_id);

But if your cache is only a table, and only used for gensym,
then you could eliminate it completely. You would only
need the cache/gensym for variables; there you might
have cache hits. But for sum/meet, you wouldn't need these anymore:

    keys sum(I, J)  /* not needed anymore */
    keys meet(I, J) /* not needed anymore */

Just compute sum/meet directly. Only use the cache for variable
indexes, not for the operations. This would fit my problem;
it won't fit other problems. But for the Tokyo 2020 challenge anything
goes, and maybe it would give you some empirical data...

That there are no common subexpressions (sum/meet) is
only an observation about my problem, i.e. problem(7, P) etc.

Harry Stoteles

unread,
Feb 2, 2018, 6:49:14 AM2/2/18
to SWI-Prolog
But gensym for variables is usually done more simply; you don't
need a table for that, only attributed variables. The gensym
then works as follows:

      id = var.getattr(gensym_key);
      if (id == null) {
          id = gensym();
          var.putattr(gensym_key, id);
      }
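
In SWI-Prolog the same pattern could look as follows — a sketch only;
the module/attribute name rozdd_id is hypothetical, and the unify hook
is left permissive:

    :- module(rozdd_id, [var_id/2]).

    :- nb_setval(id_counter, 0).

    % var_id(+Var, -Id)
    % Give each fresh variable a stable integer id stored as an
    % attribute; a second call on the same variable returns the same id.
    var_id(Var, Id) :-
        (   get_attr(Var, rozdd_id, Id)
        ->  true
        ;   nb_getval(id_counter, Id0),
            Id is Id0 + 1,
            nb_setval(id_counter, Id),
            put_attr(Var, rozdd_id, Id)
        ).

    % Permit unification; merging ids is left to the application.
    attr_unify_hook(_, _).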

That's all it takes to generate keys for variables. You find this in
Markus Triska's CLP(B) code and also in my CLP(B) code. So you
could use such a gensym, and try to measure:

   - With cache
   - Without cache

Given my short-term memory, I can't find such a comparison
of yours. Am I right that you switched to the algebraic treatment
of ROZDD together with introducing a cache?

I bet that without a cache it is faster than with a cache for my problem.
I would bet a Swiss chocolate on that. You know:

You cant do it better with Ovo, but longer
https://www.youtube.com/watch?v=94x5C4yIXVE

Kuniaki Mukai

unread,
Feb 2, 2018, 7:27:59 AM2/2/18
to Harry Stoteles, SWI-Prolog

On Feb 2, 2018, at 19:44, Harry Stoteles <in...@xlog.ch> wrote:

Probably you don't need a cache search at all. Did you ever measure the
number of cache hits and the number of cache misses?

I am not sure what you mean by a cache search. In fact,
recent versions of mine don't have a cache search for common subexpressions of the input.
I removed it because it seemed to overlap with the hash on meet/sum operations on BDDs.

Kuniaki Mukai

Harry Stoteles

unread,
Feb 2, 2018, 7:31:49 AM2/2/18
to SWI-Prolog
Searching for the keys meet(I, J) and sum(I, J) is the same as
searching for common subexpressions. You don't explicitly search
for the same form of a common subexpression, since I and J already
represent normalized BDD or ROZDD trees. So you search for common
BDD or ROZDD trees. But in effect this also covers common
subexpressions, and a little bit more.

My speculation for problem(7, P):

There are no common BDD or ROZDD trees. I made some tests
myself. I didn't find any.

Harry Stoteles

unread,
Feb 2, 2018, 7:34:12 AM2/2/18
to SWI-Prolog
Let's say it's a hybrid form of searching for common BDD or ROZDD
trees and common subexpressions. You search for expressions
A /\ B or A \/ B where A and B are normalized.

Never mind, I don't think this happens in my problem(7, P).

Harry Stoteles

unread,
Feb 2, 2018, 9:40:59 AM2/2/18
to SWI-Prolog
Structure sharing, common subexpressions and operator caching are
very important for other SAT problems, for example for the
cardinality constraint as found in SWI-Prolog CLP(B).

So when you write a constraint such as:
     card(3,[X,Y,...,Z])
     http://www.swi-prolog.org/pldoc/man?section=clpb

You can see this if you draw a graph of the resulting
BDD or ROZDD tree. If you remove meet(I, J) and sum(I, J)
caching, you still have if(X, I, J) caching.

This is basically an encoding of a BDD or ROZDD tree
with node numbers, linearized and with sharing.
So, for example, a BDD tree such as:

       if(X, if(Y, _, _), if(Z, _, _))

would be encoded as:

       0: if(Y, _, _)
       1: if(Z, _, _)
       2: if(X, 0, 1)

If you only use if(X, I, J) keys, you already get structure sharing
and common subexpressions. And you could experiment with whether
a problem needs much structure sharing or not.
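
A minimal sketch of such if(X, I, J) hash-consing in SWI-Prolog, with a
dynamic predicate as the node table (the names are mine, not from any
of the posted code):

    :- dynamic node/2.              % node(if(X,I,J), Id) - shared node table
    :- flag(node_counter, _, 2).    % ids 0 and 1 reserved for the two leaves

    % mk_node(+X, +I, +J, -Id)
    % Standard reduction plus hash-consing: equal branches need no
    % node, and an existing if(X,I,J) is shared rather than duplicated.
    mk_node(_, I, I, I) :- !.
    mk_node(X, I, J, Id) :-
        (   node(if(X, I, J), Id)
        ->  true                          % structure sharing happens here
        ;   flag(node_counter, Id, Id + 1),
            assertz(node(if(X, I, J), Id))
        ).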

My hypothesis is that my problem(7, P), in contrast to problems
such as card(3, [X,Y,...,Z]), doesn't need much structure sharing.
I am currently planning to demonstrate this claim.

But first I have to adapt my SAT solver code. This will take me
a couple of hours. I am additionally planning to make a counting
experiment. So please be patient. I now take it as my own
burden to show that problem(7, P) is free of structure
sharing, i.e. that the BDD or ROZDD tree is a true tree and not
a DAG or mesh.

Harry Stoteles

unread,
Feb 2, 2018, 5:12:11 PM2/2/18
to SWI-Prolog
Hi Kuniaki Mukai,

As promised, I am trying to make strides towards counting without
hashing. It's actually quite difficult to beat your hashing solution. Here
are some first results (without any hashing, without ROZDD); it is only
a little bit slower than your result (with some hashing, with ROZDD):

?- dim(6,6,X), time((basis(X,E), expr_tree(E,T), expr_vars(T, N, _))).
% Up 2,150 ms, GC 166 ms, Thread Cpu 1,954 ms (Current 02/02/18 22:48:40)
N = 23040

?- dim(7,7,X), time((basis(X,E), expr_tree(E,T), expr_vars(T, N, _))).
Error: Execution aborted since memory threshold exceeded.

I want to revise my opinion about hashing a little. Although I
think the likelihood of "meet" or "sum" cache hits is smaller for
my problem (except for internal "meet" and "sum" during normalization),
I want to retract my claim about "if" cache hits. Since the tree operations
re-order the tree, I guess there will be a lot of cache hits for "if". But
still, I will try something else next. In the above I am calculating a single
tree, but I also want to do something with a problem description via
multiple, not necessarily disjoint, trees... work in progress.

The source code is here; I have not yet run it with SWI-Prolog,
it might even be a little bit faster there:
https://gist.github.com/jburse/6bafeca6a4ac95a7d9075af4ee1d072a#gistcomment-2340158

On Tuesday, January 30, 2018 at 17:47:21 UTC+1, Kuniaki Mukai wrote:
I  made a quick comparison between hash and assoc.
The result says hash is about two times faster than assoc.

[assoc]
% ?- profile((problem(6, P), time(solc(P, R)))).
%@ % 35,183,129 inferences, 5.803 CPU in 5.836 seconds (99% CPU, 6062986 Lips)
%@ R = 23040.

[hash]
%@ % 12,924,098 inferences, 1.880 CPU in 1.907 seconds (99% CPU, 6875749 Lips)
%@ R = 23040.

[assoc]
% ?- profile((problem(7, P), time(solc(P, R)))).
%@ % 801,611,011 inferences, 144.847 CPU in 145.890 seconds (99% CPU, 5534193 Lips)
%@ R = 1451520.

[hash]
% ?- profile((problem(7, P), time(solc(P, R)))).
%@ % 213,707,830 inferences, 71.280 CPU in 72.431 seconds (98% CPU, 2998141 Lips)
%@ R = 1451520.

library(assoc) is easy to use, so I like it. On the other hand, with hashing
it is not easy for me to find a suitable size for the hash table and a suitable
depth parameter for term_hash/4. For now I feel like Buridan's ass, not knowing which way to go.

Kuniaki Mukai




I feel my code is somewhat ad hoc about hashing. I will read
about the recent assoc library of SWI and replace the ad-hoc hashing with
"assoc" if it is better for my purpose.

Kuniaki Mukai.





Kuniaki Mukai

unread,
Feb 2, 2018, 11:27:29 PM2/2/18
to Harry Stoteles, SWI-Prolog
Hi Harry,

I am surprised to learn that my rozdd builder is a target of comparison. I have never
compared it with others, but only admire them for their good points.
But anyway, thanks for the information.

I believe the efficiency of the builder, if any, comes from its use of arrays, setarg, and hashing.
I have learned that there are issues with using those features. However, my favorite computation model
for ROZDD is A. Colmerauer's solved form from his theory of rational tree unification; more
formally, the __initial coalgebra__ (not the terminal one) for a simple operation on a set of positive integers.
So the use of arrays is rather natural and essential for the current model.

As for the use of setarg/3, I have edited check_and_rehash/0, for example, following Jan W.'s advice on
keeping locality:

check_and_rehash :-
    b_getval(rozdd, R),
    arg(2, R, #(C, B, HashTbl)),
    (   C > 2*B     % better than 4/3 * B; 10 * B; 5*B.
    ->  iterative_rehash(HashTbl, HashTbl0, C0, B0),
        setarg(2, R, #(C0, B0, HashTbl0))
    ;   true
    ).
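
For readers: iterative_rehash/4 itself is not shown in this thread. Here
is a minimal doubling-rehash sketch, under the assumption that the table
is a compound of list buckets holding Key-Val pairs (the functor name,
depth parameter and helper names are all hypothetical):

    :- use_module(library(apply)).    % foldl/4, maplist/2
    :- use_module(library(terms)).    % term_hash/4

    % rehash(+Tbl0, -Tbl, -Count, -NewSize)
    % Double the bucket array and reinsert every Key-Val pair.
    rehash(Tbl0, Tbl, Count, NewSize) :-
        functor(Tbl0, buckets, Size0),
        NewSize is Size0 * 2,
        length(Empty, NewSize),
        maplist(=([]), Empty),            % all new buckets start empty
        Tbl =.. [buckets|Empty],
        Tbl0 =.. [buckets|Buckets0],
        foldl(rehash_bucket(Tbl, NewSize), Buckets0, 0, Count).

    rehash_bucket(Tbl, Size, Bucket, C0, C) :-
        foldl(rehash_pair(Tbl, Size), Bucket, C0, C).

    rehash_pair(Tbl, Size, Key-Val, C0, C) :-
        term_hash(Key, 3, Size, H),       % depth 3 is an arbitrary choice
        I is H + 1,                       % term_hash yields 0..Size-1
        arg(I, Tbl, Old),
        setarg(I, Tbl, [Key-Val|Old]),
        C is C0 + 1.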

For me, the use of setarg/3 is quite safe. (I don't like the somewhat negative connotation of calling setarg/3 "destructive".)

As for rehashing, no progress, but I am using the following trigger condition for resizing the hash table (rehash):

    the cardinality of the union of all current buckets > 2 * the number of current buckets

The factor 2 works better than other factors, as far as I have tried; I don't know why.


Kuniaki Mukai

Harry Stoteles

unread,
Feb 3, 2018, 6:55:49 AM2/3/18
to SWI-Prolog
Well, the problem is an algorithmic one, not whether to use
setarg/3 or not. I don't have any objections to using setarg/3.
The reason why I am currently researching hash-free counting is
the following. Let #(A) denote the SAT count of A.

Now there is a nice observation for #(A /\ B) when A and B
are independent:

- When A and B are independent, for example when var(A)
   intersect var(B) = empty, then #(A /\ B) = #(A) * #(B).
   Now assume computing #(A) resp. #(B) takes time and space
   n resp. m. Then computing #(A /\ B) by this multiplication
   takes time and space

             n+m

- I call the previous approach the separate tree approach.
  The combined tree approach would be to first compute
   the combined tree meet(A,B), and then read off the count
   there, so we would approach it as #(A /\ B) = #(meet(A,B)).
   Usually meet(A,B) gets very big: assume the variables of B
   are further down in the tree, then we get a lot of copies, and
   the effort would now be

              n*m

- Now with hashing, the effort can be lowered a little. We
  would still have n references inside the new A, but there would
  be only a single copy of B, which is now shared. These
  n references have to be found, though. The question is whether
  you also hash the SAT count of these copies. I am speculating
  that the effort will be a mixture:

            alpha * n*m + beta * (n + m)

So the best algorithm for independent A and B would be not to
compute any meet and to use the first approach. What I am
currently researching is a compromise: when A and B are
dependent, turn them into two independent A' and B' and then
use the first approach. I think Markus Triska's implementation
of CLP(B) uses such an algorithm; I am not 100% sure. Give me
some time, and I will show you new figures once I have
replicated the original Markus Triska idea. I am currently
stuck on the silly problem of efficiently enumerating the
1-bits of a big number, which I use to represent variable index
sets. But I can soon present some results based on the above
algorithm analysis, which says we should avoid meet(A,B). At
least this holds for sure for BDDs. Maybe the situation for
ROZDD is a little different. But coming from BDDs, the above
describes the main problem with computing meet(A,B) when A and B
are independent...
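
To make the separate tree approach concrete, here is a hedged
SWI-Prolog sketch; sat_count/2 stands in for whatever counter is
actually used (e.g. Kuniaki's solc/2), and * is conjunction as in the
problem formulas:

    % count_conj(+A, +B, -N)
    % If A and B share no variables, #(A /\ B) = #(A) * #(B):
    % count the two trees separately (cost n+m) instead of
    % building meet(A,B) (cost up to n*m).
    count_conj(A, B, N) :-
        term_variables(A, VsA),
        term_variables(B, VsB),
        independent(VsA, VsB),
        !,
        sat_count(A, NA),
        sat_count(B, NB),
        N is NA * NB.
    count_conj(A, B, N) :-        % dependent case: fall back to the meet
        sat_count(A*B, N).

    independent(Vs, Ws) :-
        \+ ( member(V, Vs), member(W, Ws), V == W ).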

Harry Stoteles

unread,
Feb 3, 2018, 7:07:34 AM2/3/18
to SWI-Prolog
For example, the unit vectors in my problem are already
independent; they are defined below as the xi.xi = 1
constraints in the problem you, Kuniaki Mukai, are currently
working on, dating back to earlier posts in this Google group.

I don't have the link at hand where I introduced this
problem to this Google group or to you. But it shouldn't
come as a surprise to you that I am comparing with your
ROZDD, since you compare with my orthonormal
basis problem. ;-) Right? Anyway, the problem we are
working on, from different sides, is the orthonormal basis problem:
 
    Given n bit vectors x1,..,xn each of length n.
    The product of two bit vectors xj and xi is
    defined as follows:

          xj.xi := xor_k=1^n (xj[k] and xi[k])

     The problem is to find bit vectors x1,..,xn so
     that the following constraint is satisfied:
                     
                      /  1        i=j
         xj.xi =  <
                      \  0        i<>j
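
As a side note, the product xj.xi is easy to write down as a CLP(B)
formula (* for conjunction, # for xor); a sketch, assuming the vectors
are given as lists of Boolean variables:

    % dot(+Xs, +Ys, -F)
    % F is the formula xor_k (Xs[k] * Ys[k]).
    dot([X], [Y], X*Y) :- !.
    dot([X|Xs], [Y|Ys], (X*Y) # F) :-
        dot(Xs, Ys, F).

    % Example: ?- dot([X1,X2], [Y1,Y2], F), sat(F).   % posts xj.xi = 1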

Now if we look at the constraints for i=j, these will
be n constraints of the form:

      x1.x1 = 1 /\ ... /\ xn.xn = 1

But they are all independent! So, for example, if we were only
counting the above constraints, we could easily
use the separate tree approach. Unfortunately
the other constraints break the independence.

But independence can be reestablished by
partially labeling the problem. So the main idea
of my hash-free algorithm is as follows (a sketch follows the list):

- Partially label the problem, to get independent
  sub problems, and sum over them
- Count the independent sub problems with the
  fast separate tree approach
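
A minimal sketch of this slicing idea, again with sat_count/2 as a
hypothetical stand-in for the counter:

    % label_bits(+Vars): enumerate 0/1 assignments on backtracking.
    label_bits([]).
    label_bits([B|Bs]) :- ( B = 0 ; B = 1 ), label_bits(Bs).

    % count_by_slices(+SliceVars, +Formula, -Total)
    % Sum the counts of the sub problems obtained by partially
    % labeling SliceVars.
    count_by_slices(SliceVars, F, Total) :-
        findall(N, ( label_bits(SliceVars), sat_count(F, N) ), Ns),
        sum_list(Ns, Total).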

Harry Stoteles

unread,
Feb 3, 2018, 3:57:59 PM2/3/18
to SWI-Prolog
I am lost. The idea below doesn't work. Maybe I should try hash
or assoc, but it's against the overall design of having multiple constraints
around. It is very difficult to partially label them; take an

example such as this:

      R(X,Y) S(Y,Z) T(X,Z)

How should such a double join be split into multiple independent
constraints, or partially labeled? The only thing that can be done is,
for example, to partially label X, and we get two sub problems:

      R(0,Y) S(Y,Z) T(0,Z)

      R(1,Y) S(Y,Z) T(1,Z)

But we don't have the meet/join of R,S,T already computed, and
it doesn't scale for the orthonormal basis problem: we would need
to do very much partial labeling to get something independent, or
let's say a set of constraints with a master constraint,
where all the other slave constraints are subsumed, in the sense that
their variable sets are subsets of the master's variable set.

So maybe I should implement something else, maybe use
a hash as Kuniaki Mukai did. Very annoying, these SAT constraints.

P.S.: On the other hand I solved another problem;
meanwhile the card constraint works as expected:

?- card(2,[X,X,Y,Z]), labeling([X,Y,Z]),
   write(X-Y-Z), nl, fail; true.
0-1-1
1-0-0

Yeah!

Harry Stoteles

unread,
Feb 3, 2018, 4:18:46 PM2/3/18
to SWI-Prolog
It's really tempting to also conduct a hash experiment. Kuniaki
Mukai, you say you use the following keys:

       meet(I, J)
       sum(I, J)

I think meet is clear; this is the join/intersection. But what do you
use sum for? Is it union or xor, which of the two?

I guess union wouldn't work; it is not complete enough. You
couldn't do the problem with union, you need xor...

Harry Stoteles

unread,
Feb 3, 2018, 4:32:24 PM2/3/18
to SWI-Prolog
Well, here is a problem for next week then, in case
there is a moment of boredom:

SAT solving - An alternative to brute force bitcoin mining
Jonathan Heusser - 03 February 2013
http://jheusser.github.io/2013/02/03/satcoin.html

Enjoy!

Harry Stoteles

unread,
Feb 4, 2018, 8:39:53 AM2/4/18
to SWI-Prolog
So we have reached a dead end. What could be done next?
One idea I posted elsewhere was to rediscover backtracking.
I made the following experiment:

I added some simple slicing via backtracking. The idea was originally
devised to use multiple cores. And I get twice the time but half the space.
Without slicing I get these timings:

?- dim(6,6,X), time((basis(X,E), expr_tree(E,T))), nl, fail; true.
% Up 1,194 ms, GC 0 ms, Thread Cpu 1,188 ms (Current 02/04/18 14:15:51)

?- dim(6,6,X), time((basis(X,E), expr_tree(E,T), tree_count(T,_,N))).
% Up 2,161 ms, GC 243 ms, Thread Cpu 1,922 ms (Current 02/04/18 14:13:29)
23040

With slicing I get these timings:

?- dim(6,6,X), term_variables(X,[A,B|_]), time((labeling([A,B]),
   basis(X,E), expr_tree(E,T))).
% Up 910 ms, GC 142 ms, Thread Cpu 766 ms (Current 02/04/18 14:19:22)
% Up 962 ms, GC 0 ms, Thread Cpu 969 ms (Current 02/04/18 14:19:23)
% Up 1,071 ms, GC 102 ms, Thread Cpu 968 ms (Current 02/04/18 14:19:24)
% Up 970 ms, GC 0 ms, Thread Cpu 969 ms (Current 02/04/18 14:19:25)
% Up 205 ms, GC 0 ms, Thread Cpu 203 ms (Current 02/04/18 14:19:25)
Yes

?- dim(6,6,X), term_variables(X,[A,B|_]), time((labeling([A,B]),
   basis(X,E), expr_tree(E,T), tree_count(T,_,N))).
% Up 946 ms, GC 0 ms, Thread Cpu 937 ms (Current 02/04/18 14:23:37)
5760
% Up 1,365 ms, GC 227 ms, Thread Cpu 1,157 ms (Current 02/04/18 14:23:39)
5760
% Up 1,359 ms, GC 242 ms, Thread Cpu 1,109 ms (Current 02/04/18 14:23:40)
5760
% Up 1,090 ms, GC 0 ms, Thread Cpu 1,094 ms (Current 02/04/18 14:23:41)
5760
% Up 212 ms, GC 0 ms, Thread Cpu 218 ms (Current 02/04/18 14:23:41)
Yes

The result is correct, we have 5760+5760+5760+5760 = 23040.

But there is a nice observation: the problem is highly symmetric. All the
counts for the different slices are the same. What if we built symmetry
into the cache? For example these symmetry rules:

      xor(A,B) = not(xor(A, not(B)))

      count(A) = 2^n - count(not(A))

So, for example, if we look up a sum(I,J) in the cache, we would also
try other terms that are symmetric as above. Would this speed up the
problem in some way? And maybe lead to a new SAT solver by a
very simple trick?

P.S.: Since my caches are thread_local predicates, these additional
cache lookups would be simple goal sequences written in pure Prolog.
For the counting symmetry above, I would just add new checks:

     cache_not(P,Q), cache_count(Q,L,N), do the 2^N thingy
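
Spelled out, the symmetric lookup could look like this — a sketch using
the cache_not/2 and cache_count/3 names from above (their exact
arguments are my assumption):

    :- thread_local cache_count/3, cache_not/2.

    % count_sym(+P, +Vs, -N)
    % Try the direct cache entry first; otherwise, if the negation Q
    % of P is cached, use count(P) = 2^|Vs| - count(Q).
    count_sym(P, Vs, N) :-
        (   cache_count(P, Vs, N)
        ->  true
        ;   cache_not(P, Q),
            cache_count(Q, Vs, M),
            length(Vs, NVars),
            N is 2^NVars - M
        ).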


Harry Stoteles

unread,
Feb 26, 2018, 6:53:04 PM2/26/18
to SWI-Prolog
Hi Kuniaki Mukai,

I have many questions now. Do you know whether problem(6), problem(7), ...
are NP-complete? Do you think we can reduce them to NAE(X,Y,Z)?

       NAE(X,Y,Z) = (X xor Y) or (Y xor Z) [ or (X xor Z) ]

The last xor is redundant:

       Welcome to SWI-Prolog (threaded, 64 bits, version 7.7.8)
       SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.

       ?- use_module(library(clpb)).
       true.

       ?- sat((X#Y)+(Y#Z)).
       sat(X=\=X*Y#X*Z#Y#Y*Z#Z).

       ?- sat((X#Y)+(Y#Z)+(X#Z)).
       sat(X=\=X*Y#X*Z#Y#Y*Z#Z).
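
The redundancy can also be confirmed by brute-force enumeration over
all eight assignments — a small self-contained check:

    % nae_equiv: verify that (X#Y)+(Y#Z) and (X#Y)+(Y#Z)+(X#Z)
    % agree on all 0/1 assignments, i.e. the third xor is redundant.
    nae_equiv :-
        forall(( member(X, [0,1]), member(Y, [0,1]), member(Z, [0,1]) ),
               (   L is (X xor Y) \/ (Y xor Z),
                   R is (X xor Y) \/ (Y xor Z) \/ (X xor Z),
                   L =:= R
               )).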

Actually, I first wanted to check whether problem(6), problem(7), etc.
can be reduced to Horn clauses, which would show that they are
not NP-complete. But NAE(X,Y,Z) is a nice little bastard which
refuses to be a Horn clause set:

https://stackoverflow.com/a/48998548/502187

https://en.wikipedia.org/wiki/Schaefer's_dichotomy_theorem#Properties_of_Polymorphisms

Last question: since you have worked a lot with ROZDD, do you
think ROZDDs have an affinity to Horn clauses?

Kuniaki Mukai

unread,
Feb 26, 2018, 9:04:03 PM2/26/18
to Harry Stoteles, SWI-Prolog
Hi Harry,

problem(6), problem(7), … are all black boxes for me. I have no insight into them.
In fact, they are just copies from Jan Burse for a benchmark test.
So I am not able to reply to your questions. Sorry for that.

I have not done a lot of work on rozdd; it is just a little hobbyist effort to understand
the notion of rozdd within SWI-Prolog programming.

IMHO, an efficient library for manipulating rozdd should be provided in SWI.
Although we already have a nice and great clpb library, rozdd is hidden there behind
attributed variables and boolean constraints, which of course is an excellent idea of
Markus T.'s, using a familiar boolean form for the interface.

M., Kuniaki

Kuniaki Mukai

unread,
Feb 26, 2018, 11:45:13 PM2/26/18
to Harry Stoteles, SWI-Prolog

I have uploaded a couple of JPEGs of the rozdds for problem(4) and problem(5), to
express thanks to Jan Burse for my use of them, even without knowing what they are.



K. Mukai



Harry Stoteles

unread,
Feb 27, 2018, 4:46:13 AM2/27/18
to SWI-Prolog
What do the numbers on the edges mean?

Kuniaki Mukai

unread,
Feb 27, 2018, 5:40:39 AM2/27/18
to Harry Stoteles, SWI-Prolog

On Feb 27, 2018, at 18:46, Harry Stoteles <in...@xlog.ch> wrote:

What do the numbers on the edges mean?

Integers assigned internally to the linearly ordered boolean variables.

—km



Harry Stoteles

unread,
Feb 27, 2018, 6:55:47 AM2/27/18
to SWI-Prolog
Maybe the solving of problem(6), problem(7), etc. could
be made faster if the problem is cut up into sub problems
of the following types:

- Horn clause subproblem
- dual Horn clause subproblem
- rest of the problem

François Bry and Rainer Manthey did something similar in
their "SATCHMO: a theorem prover implemented in Prolog",
1988; they called the dual Horn clauses negative clauses.

See also:
https://www.researchgate.net/publication/44790956_SATCHMO_a_theorem_prover_implemented_in_Prolog

Both Horn clauses and dual Horn clauses would be of
complexity P, I guess, so if the sub problems are solved
fast, the main problem could also be solved faster.

But this is just an idea in its early infancy.