Automatic differentiation in CHR


Samer Abdallah

Apr 11, 2017, 7:52:42 PM
to SWI-Prolog
Hello all,

I recently noticed that automatic differentiation would be a really neat solution to
a problem I’m working on. Rather than rushing straight into the horrible Pythony arms
of Theano, TensorFlow etc, I thought perhaps I could do it in Prolog using CHR. I tried a 
month or two ago but only got into a mess because I have no clue what I’m doing with
CHR, though I did get a sense that my problem was with aggregation, which is apparently 
a known difficulty with CHR (see ‘Aggregates for Constraint Handling Rules’ by Jon Sneyers, 
Peter Van Weert, and Tom Schrijvers).
Anyway, I tried again yesterday and got something that sort of works. Here it is (this is
reverse mode differentiation, which is basically the same as gradient back-propagation
in neural networks):

:- chr_constraint deriv/3, agg/2, acc/1, acc/2, mul/3, add/3, log/2, exp/2, exp1/2, log1/2, mul1/3, add1/3, go/0.

remove_dup  @ deriv(L,X,DX) \ deriv(L,X,DX1) <=> DX=DX1.
identity    @ deriv(L,L,DL) <=> DL=1.
decl_accum  @ deriv(_,_,DX) ==> acc(DX). % declares DX as something that needs to be summed
deriv_log   @ deriv(L,X,DX), log(X,Y)   ==> deriv(L,Y,DY), mul1(1/X,DY,Z), agg(Z,DX).
deriv_exp   @ deriv(L,X,DX), exp(X,Y)   ==> deriv(L,Y,DY), mul1(Y,DY,Z),   agg(Z,DX).
deriv_add_l @ deriv(L,X,DX), add(X,_,Y) ==> deriv(L,Y,DY),                 agg(DY,DX).
deriv_add_r @ deriv(L,X,DX), add(_,X,Y) ==> deriv(L,Y,DY),                 agg(DY,DX).
deriv_mul_l @ deriv(L,X,DX), mul(K,X,Y) ==> deriv(L,Y,DY), mul1(K,DY,Z),   agg(Z,DX).
deriv_mul_r @ deriv(L,X,DX), mul(X,K,Y) ==> deriv(L,Y,DY), mul1(K,DY,Z),   agg(Z,DX).

remove_dup2 @ acc(X) \ acc(X) <=> true.

accum_recur @ acc(S1,X), agg(Z,X) <=> add1(Z,S1,S2), acc(S2,X).
accum_base  @ acc(S,X) <=> S=X.

mul(X,Y,Z) ==> mul1(X,Y,Z).
add(X,Y,Z) ==> add1(X,Y,Z).
log(X,Y)   ==> log1(X,Y).
exp(X,Y)   ==> exp1(X,Y).

log1(X,Y)   <=> ground(X) | Y is log(X).
exp1(X,Y)   <=> ground(X) | Y is exp(X).
mul1(X,Y,Z) <=> ground(X-Y) | Z is X*Y.
add1(X,Y,Z) <=> ground(X-Y) | Z is X+Y.

go \ acc(DX) <=> acc(0,DX).

As you can see, as well as the house-keeping rules and self-explanatory arithmetic, there
are rules implementing the differentiation chain rule for sums, products, log and exp.
The constraint deriv(L,X,DX) means 'the derivative of L with respect to X is DX’.
The constraint agg(Z,DX) means ‘Z is an additive contribution to DX’. The system
of constraints acc/1, acc/2 and go/0 handle the summation of agg/2 constraints: acc(X)
means ‘X will be a sum of values found in agg(_,X) constraints’ and acc(S,X) means ‘X is S plus
the sum of values in any remaining agg(_,X) constraints’.
Here’s an example call:

?- add(X,Y,L), mul(Z,V,X), mul(Z,W,Y), log(L,LL), deriv(LL,Z,DZ), go, V=4, Z=2, W=3.
X = 8,
Y = 6,
L = 14,
Z = 2,
V = 4,
W = 3,
LL = 2.6390573296152584,
DZ = 0.5,
deriv(2.6390573296152584, 8, 1/14),
deriv(2.6390573296152584, 14, 1/14),
deriv(2.6390573296152584, 6, 1/14),
deriv(2.6390573296152584, 2, 0.5),
mul(2, 3, 6),
mul(2, 4, 8),
add(8, 6, 14),
log(14, 2.6390573296152584),
go.

It works, but it’s less than ideal for the following reasons:
1. The handling of deriv/3 and acc/{1,2} is only correct before the various variables are instantiated. 
    This means the whole thing can go wrong if the operands receive their values before go/0 is called,
    even for values which are logically constants, e.g. foldl(mul, [X1,X2], P0, P1), deriv(P1,X1,DX1)
    doesn’t work if P0 is replaced with the constant 1.
2. The constraint go/0 has to be invoked to trigger the accumulation of sums. I
   couldn’t think of any way to make the process completely automatic.

Problem 2 is probably liveable with (and could conceivably be unavoidable), but Problem
1 points to a logical failing (I suspect each node in the computation graph needs some sort
of ground unique identifier). So, is there anyone here with more CLP/CHR-fu than me that would 
be interested in helping to make it better? Automatic differentiation is all the rage in machine 
learning at the moment, and I don’t see why those Python programmers should have all 
the fun…
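For what it's worth, the 'ground unique identifier' idea can be sketched outside CHR. The following Python sketch (all names invented for illustration, not the CHR code above) gives each graph node an object identity, so derivatives attach to nodes rather than to possibly-instantiated values; it reproduces DZ = 0.5 from the example above.

```python
import math

class Node:
    """One node of the computation graph; its object identity serves as the
    'ground unique identifier' for that node."""
    def __init__(self, value, parents=()):
        self.value = value        # forward value
        self.parents = parents    # (parent_node, local_derivative) pairs
        self.grad = 0.0           # accumulated reverse-mode derivative

def const(v):  return Node(v)
def add(x, y): return Node(x.value + y.value, ((x, 1.0), (y, 1.0)))
def mul(x, y): return Node(x.value * y.value, ((x, y.value), (y, x.value)))
def log(x):    return Node(math.log(x.value), ((x, 1.0 / x.value),))

def backprop(out):
    """Accumulate d(out)/d(node) into node.grad for every reachable node."""
    order, seen = [], set()
    def visit(n):                 # post-order DFS: a node's parents come first
        if id(n) not in seen:
            seen.add(id(n))
            for p, _ in n.parents:
                visit(p)
            order.append(n)
    visit(out)
    out.grad = 1.0
    for n in reversed(order):     # propagate from the output backwards
        for p, d in n.parents:
            p.grad += n.grad * d

# The example goal above: LL = log(Z*V + Z*W) with Z=2, V=4, W=3
z, v, w = const(2.0), const(4.0), const(3.0)
ll = log(add(mul(z, v), mul(z, w)))
backprop(ll)                      # z.grad is now 0.5, as in DZ = 0.5
```

Because z is visited once by object identity, the two mul/3 uses of it accumulate correctly even before any value is known at graph-construction time.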

cheers,
Samer




Jan Burse

Apr 15, 2017, 5:57:09 AM
to SWI-Prolog
deriv was part of Warren's thesis, to show that Prolog can
be used for the same things as LISP; the very original
deriv dates back to:

Warren, D.H.D. (1978): Applied Logic - Its Use and
Implementation as a Programming Tool,

@Samer Abdallah If you add subst and eval, you can also do taylor
expansion. I understand that you try to have eval via CHR instantiation.
The example you show indicates this.

But somehow taylor expansion is a use case where this might
not work. There are two conflicting requirements in taylor expansion
which break the CHR instantiation view.

a) Requirement 1: For each summand of the taylor expansion
    we need the n-th derivative. So typically, when we have computed
    a derivative, we will keep it and not instantiate it, to be in a position
    to compute the next derivative.

b) Requirement 2: For each summand of the taylor expansion,
     or for simplicity look at the Maclaurin series where a=0, we need
     to evaluate the derivative at the point x=0. This calls for instantiation,
     but maybe it's in conflict with Requirement 1.

Maybe you are lucky and don't have this conflict. But many
differentiation rules leave some subexpressions unchanged. For
example the rule for F*G is:

     (F*G)'  = F'*G + F*G'

So you have the terms F and G unchanged in the result, so that
indeed an instantiation conflict might arise. The workaround is to
have a subst predicate, which does substitution and creates a copy.

If the subst predicate is applied to an expression that is later
free of variables, it could do evaluation at the same time, so that
the attributed-variable instantiation mechanism of CHR doesn't come
into play. But for more advanced taylor expansion, where parameters
are involved, the subst predicate would operate on a symbolic level
and return a new expression.
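This subst-and-copy workaround is easy to picture with a toy sketch (Python tuples standing in for Prolog terms; all names here are invented for illustration): substitution builds a fresh copy, so the original expression survives for further differentiation, and a ground copy can be evaluated on the spot.

```python
import math

def subst(expr, var, val):
    """Substitute val for var in a tuple-encoded expression,
    building a fresh copy and leaving the original untouched."""
    if expr == var:
        return val
    if isinstance(expr, tuple):
        op, *args = expr
        return (op, *[subst(a, var, val) for a in args])
    return expr

def eval_expr(expr):
    """Evaluate a fully ground expression tree."""
    if isinstance(expr, tuple):
        op, *args = expr
        vals = [eval_expr(a) for a in args]
        if op == '+':   return vals[0] + vals[1]
        if op == '*':   return vals[0] * vals[1]
        if op == 'log': return math.log(vals[0])
    return expr

# F*G with F = x and G = x + 1: substitute x = 2, then evaluate
e = ('*', 'x', ('+', 'x', 1))
v = eval_expr(subst(e, 'x', 2))   # 6; e itself is still symbolic
```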

I have already implemented deriv and subst in my December 2016
prototypes, but I didn't do taylor yet, since it seems that taylor
also needs limits (limes). Evaluating the derivatives at the point
x = 0 sometimes doesn't work so easily, and it seems that existing
CAS systems such as Wolfram Alpha or Symbolab take the limit at
x = 0 rather than doing simple evaluation, one more issue that is
probably not easily solved by the CHR variable-instantiation view...

Jan Burse

Apr 15, 2017, 10:35:08 AM
to SWI-Prolog
OK, I made a prototype, since I couldn't resist. For taylor it goes recursively
down, computes all the derivatives, and then, while leaving the recursion,
it applies a Horner schema. Here is an example run:

?- X is 1/(1+A), Y is taylor(X, A, 5).
X is 1/(1+A),
Y is 1-A+A^2-A^3+A^4-A^5

But it's a little fragile; somehow the derivatives get quite large, so that
I finally end up with a memory error for a large number of requested
summands:

?- X is 1/(1+A), Y is taylor(X, A, 50).
Error: Execution aborted since memory threshold exceeded.
    functor/3
    is_abnormal/1
    (\+)/1
    is/2
    sys_poly_add/3

I dunno yet what can be done against this. I thought the CHR code
could be a nice device to experiment with evaluation strategies, in
understanding them and in further developing them. There seems
to be a forward and a backward mode of computing derivatives,
which I do not yet understand.

Also dual numbers, already used by the German mathematician
Eduard Study, have a certain appeal to me, since they can combine
derive and subst in one step, but I didn't figure out yet how to
effectively do the n-th derivative.

For more introductory information see for example here:

Automatic differentiation - Håvard Berland 2016
http://www.robots.ox.ac.uk/~tvg/publications/talks/autodiff.pdf

Jan Burse

Apr 15, 2017, 10:40:42 AM
to SWI-Prolog
The prototype code is open source on gist and it depends on a
further polynomial gist and the Jekejeke wrapped attribute variables:
https://gist.github.com/jburse/d8be109a24bdd52218472ecb7e68b28a#file-series-p

Samer Abdallah

Apr 15, 2017, 5:30:59 PM
to Jan Burse, SWI-Prolog
On 15 Apr 2017, at 10:57, Jan Burse <burs...@gmail.com> wrote:

deriv was part of the thesis of Warren to show that Prolog can
be used for the same things as LISP, since the very original
deriv dates back to

Warren, D.H.D. (1978): Applied Logic - Its Use and
Implementation as a Programming Tool,

Interesting - I hadn’t seen Warren’s thesis before, from back when
Prolog was brand new! It looks like his ‘deriv’ is essentially
symbolic differentiation, as it builds a big unevaluated expression
for the entire derivative.

When I started looking into AD, I didn’t fully appreciate the difference
between symbolic differentiation (which I had experimented with in OCaml,
writing a little test program that derives the equations of motion for a gravity 
simulation by differentiating the system Hamiltonian) and AD, which is more 
of a composition of numerical operations, avoiding the building of a big 
symbolic expression of the whole derivative, as described in those PDF slides 
you linked. Of the two main approaches to AD, reverse mode is more suitable 
for my application than forward mode with dual numbers - dual numbers are 
neat, and can be implemented by overloading the arithmetic operators, but it’s
more computationally expensive than reverse mode when you want the gradient 
of a scalar valued function with respect to many parameters. Of course, reverse mode 
is recognisable as a generalisation of back-propagation.


@Samer Abdallah If you add subst and eval, you can also do taylor
expansion. I understand that you try to have eval via CHR instantiation.
The example you show indicates this.

I’ve not tried doing more than one derivative as I don’t need them just yet,
but I think other reverse mode AD libraries let you do multiple derivatives,
eg to get the Hessian for numerical optimisation.

I should probably explain the other trick I’m using to avoid doing an
explicit subst operation - the result of my CHR autodiff is a set of 
attributed variables with delayed goals like a data flow graph, 
which is defined once but used many times by (the horror) copying the whole 
lot before unifying the ‘inputs’ and reading off the ‘outputs’, e.g. taking my 
previous example:

?- add(X,Y,L), mul(Z,V,X), mul(Z,W,Y), log(L,LL), deriv(LL,Z,DZ), go, 
   T = t(V,S,W,LL,DZ),
   copy_term(T, t(4,2,3,LL2,DZ1)),
   copy_term(T, t(2,4,5,LL2,DZ2)),
   copy_term(T, t(3,6,7,LL3,DZ3)).

This does the differentiation once, creates a load of delayed additions
and multiplications (essentially, the CHR part ‘compiles' the differentiation 
down to a graph of primitive numerical operations) and then applies them three 
times to three different lots of values -  this is just what you need for an iterative 
optimisation. There is no subst because the variables are simply instantiated, and 
there is no eval because numerical operations are already there waiting to be 
triggered. The CHR part is rather slow, but the repeated evaluations are not that 
much slower than normal arithmetic in SWI Prolog.

I’ll have to see if I can get it to work with multiple derivatives as you describe
below.


Samer.
limes?



Samer Abdallah

Apr 19, 2017, 5:47:26 PM
to Jan Burse, SWI-Prolog
Hi Jan,

I thought a bit more about what you said about multiple derivatives,
eg for a Taylor series, and tried modifying my code to support it.
I started with multiple derivatives of X^K, with K an integer constant.
I made it so that the derivative propagation rules expanded the computation
graph using the same mul/3, add/3, pow/3 etc. operators as the top-level
interface, so that the results of these added operations could
themselves be differentiated. This worked for positive powers,
since this created a recursion which stopped at K=0, since X^0 = 1 for all
X. For negative powers, it entered an infinite loop: X^(-1) triggers computation
of X^(-2), which triggers X^(-3) etc. Hence, more generally, this approach
is not going to work.

The problem was that, although the algorithm was notionally reverse mode
differentiation, a request for dY/dX using deriv(Y,X,DYDX) was requesting
dY/dZ for *all* nodes Z that X is an input to, including nodes Z created by
the request for dY/dX, even when the forward path through Z did not reach Y.
The CHR execution was thus rippling *forwards* through the computation
graph, not backwards, a bit like forward chaining in logic programming.

So I tried another approach, where the API includes back/1, such that
back(Y) declares that dY/dY = 1, thus triggering the computation of dY/dZ for 
all nodes Z reached by travelling backwards through the computation graph. 
The user’s request for, e.g., deriv(Y,X,DYDX) simply ‘catches’ the result for dY/dX
that was or will be produced as the deriv/3 recursion ripples back through the graph.

Requesting second or higher derivatives in this framework is not a problem
and I was able to get the Taylor series coefficients without any problem.

Bearing in mind that I still don’t really know what I’m doing with CHR, some code
(enough to replicate the Taylor series example; the full gist also has log/2 and exp/2)
is below. It’s a bit sensitive to the order in which certain constraints are invoked.
CHR is a bit of a strange beast in this regard: in one way it’s very high-level code,
but in another, it feels quite imperative in the way that the constraint store has
to be massaged by doing things in the right order.
:- module(autodiff, [mul/3, add/3, pow/3, deriv/3, back/1, go/0, compile/0, derivs/3, taylor/4]).
:- use_module(library(chr)).
:- chr_constraint add(?,?,-), mul(?,?,-), pow(+,-,-), deriv(?,-,?), agg(?,-), acc(?,-), acc(-), go, back(-), compile.

% operations interface with simplifications
mul(0.0,_,Y) <=> Y=0.0.
mul(_,0.0,Y) <=> Y=0.0.
mul(1.0,X,Y) <=> Y=X.
mul(X,1.0,Y) <=> Y=X.
mul(X,Y,Z1) \ mul(X,Y,Z2) <=> Z1=Z2.
pow(1,X,Y) <=> Y=X.
pow(0,_,Y) <=> Y=1.
add(0.0,X,Y) <=> Y=X.
add(X,0.0,Y) <=> Y=X.
add(X,Y,Z1) \ add(X,Y,Z2) <=> Z1=Z2.

% initiate back-propagation starting from Y
back(Y) <=> var(Y) -> deriv(Y,Y,1.0); true.

% propagate derivatives through unary and binary operators
deriv(L,X,DX) \ deriv(L,X,DX1) <=> DX=DX1.
deriv(L,_,DX) <=> ground(L) | DX=0.0.
deriv(_,_,DX) ==> var(DX) | acc(DX).
deriv(L,Y,DY), pow(K,X,Y)   ==> deriv(L,X,DX), pow_contrib(K,X,DY,Z), agg(Z,DX).
deriv(L,Y,DY), add(X1,X2,Y) ==> maplist(add_contrib(L,DY),[X1,X2]).
deriv(L,Y,DY), mul(X1,X2,Y) ==> maplist(mul_contrib(L,DY),[X1,X2],[X2,X1]).
deriv(L,Y,DY), agg(X1,Y)    ==> add_contrib(L,DY,X1). % allows multiple derivs before go/0 is called.

pow_contrib(K,X,DY,Z)   :- K1 is K - 1, KK is float(K), pow(K1,X,XpowK1), mul(KK,XpowK1,W), mul(DY,W,Z).
mul_contrib(L,DY,X1,X2) :- var(X1) -> deriv(L,X1,DX1), mul(X2,DY,T1), agg(T1,DX1); true.
add_contrib(L,DY,X1)    :- var(X1) -> deriv(L,X1,DX1), agg(DY,DX1); true.

acc(X) \ acc(X) <=> true.

% reduce deriv/3, agg/2 and acc/{1,2} to set of add/3 constraints
go \ deriv(_,_,_) <=> true.
go \ acc(DX) <=> acc(0.0,DX).
go <=> true.

acc(S1,X), agg(Z,X) <=> add(Z,S1,S2), acc(S2,X).
acc(S,X) <=> S=X.

% convert arithmetic constraints to frozen goals.
compile \ add(X,Y,Z) <=> delay(X+Y,Z).
compile \ mul(X,Y,Z) <=> delay(X*Y,Z).
compile \ pow(K,X,Y) <=> delay(X**K,Y).
compile <=> true.

delay(Expr,Res) :- when(ground(Expr), Res is Expr).

% ------------ multiple derivatives and Taylor series ----------

derivs(X,Y,[1.0|Ds]) :- foldl(d(X),Ds,Y,_).
d(X,DYDX,Y,DYDX) :- back(Y), deriv(Y,X,DYDX), go.

taylor(N,X,Y,Cs) :-
   length(Ds,N), derivs(X,Y,Ds),
   compile,                 % ack! this destroys the computation graph!
   copy_term(X-Ds,0.0-Ds0), % NB. like a lambda call but needs attributes to be copied too
   numlist(1,N,Is),
   foldl(nth_deriv_coeff,Is,Ds0,Cs,1.0,_). 

nth_deriv_coeff(I,D,C,P1,P2) :- P2 is P1*I, C is D/P1.

With this loaded in SWI Prolog, try these (the X=0.0 in the second goal is not 
necessary, it is only there to get rid of a load of frozen goals which would otherwise
obscure the result when printed):

?- add(1.0,X,X1), pow(-1,X1,Y), time(taylor(5,X,Y,Cs)).
?- add(1.0,X,X1), pow(-1,X1,Y), time(taylor(50,X,Y,Cs)), X=0.0, writeln(Cs).

As you can see, there is no problem with 50 or more terms in the expansion!
If you run only the first two CHR goals, you will see the CHR representation
of the computation Y = (1+X)^(-1). Unfortunately, the use of compile/0 in taylor/4 
destroys this computation graph, so it cannot be used for any more differentiations
afterwards. I suppose the simpagation rules for compile could be replaced with
propagation rules, but I don’t have a good sense of how best to manage the
accumulation of constraints in the store that would result from this.
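As an independent cross-check of those Taylor coefficients, here is a sketch in Python rather than CHR (the series helpers are invented for illustration), computing with truncated power series directly: the reciprocal of 1 + x comes out as the same alternating coefficients the Prolog goals above produce.

```python
def series_add(a, b):
    # coefficient-wise sum of two truncated series
    return [x + y for x, y in zip(a, b)]

def series_recip(a):
    # reciprocal of a truncated series via the convolution recurrence;
    # requires a[0] != 0
    n = len(a)
    c = [0.0] * n
    c[0] = 1.0 / a[0]
    for k in range(1, n):
        c[k] = -sum(a[j] * c[k - j] for j in range(1, k + 1)) / a[0]
    return c

N = 6
x   = [0.0, 1.0] + [0.0] * (N - 2)         # the series of the variable x
one = [1.0] + [0.0] * (N - 1)
coeffs = series_recip(series_add(one, x))  # Taylor coefficients of 1/(1+x)
# coeffs == [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
```

Like the CHR version after compile/0, this never builds a symbolic expression for the derivative; it only manipulates numbers.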

If there are any CHR experts listening, I would love to get your take on this!

cheers,
Samer

ps. [below]

On 15 Apr 2017, at 15:35, Jan Burse <burs...@gmail.com> wrote:

understanding them and in further developing them. There seems
to be a forward and backward mode of computing derivations,
which I do not yet understand.

Forward-mode differentiation computes d <Everything in the graph> / d <selected inputs>,
and can be done by overloading ordinary arithmetic operators to work with dual-numbers,
without creating a data structure to represent the computation graph
The derivatives are computed exactly in parallel with the forward computation.

Reverse mode computes d <selected outputs> / d<Everything in the graph>.
It requires a representation of the computation graph in order to collect together,
for each node, all the 'downstream’ nodes that depend on it directly.

Forward mode is generally cheaper when the number of inputs is less than the number
of outputs, while reverse mode is cheaper when we want the derivative of a small number
of outputs wrt many inputs, and hence is good for optimising loss functions with many
parameters.
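To make the forward-mode half concrete, here is a minimal dual-number sketch in Python (a hypothetical class, invented for illustration): seeding the input with derivative part 1 makes the derivative ride along with the value, and for log(z*4 + z*3) at z = 2 it also yields 0.5, agreeing with the reverse-mode DZ above.

```python
import math

class Dual:
    """Dual number a + b*eps with eps^2 = 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule, built into the overloaded operator
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

def dlog(x):
    return Dual(math.log(x.a), x.b / x.a)

# derivative of log(z*4 + z*3) at z = 2: seed the input with b = 1
z = Dual(2.0, 1.0)
out = dlog(z * 4 + z * 3)
# out.b is 0.5
```

Note there is no graph anywhere: one forward pass gives the derivative with respect to the one seeded input, which is exactly why it scales badly when there are many inputs.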


Samer Abdallah

Apr 19, 2017, 6:32:22 PM
to Jan Burse, SWI-Prolog
p.s.

There was a bug in my code that was masked in the example Y = 1/(1+X).
The zeroth order derivative of Y is Y, not 1.0:

-derivs(X,Y,[1.0|Ds]) :- foldl(d(X),Ds,Y,_).
+derivs(X,Y,[Y|Ds]) :- foldl(d(X),Ds,Y,_).

Samer

Jan Burse

Apr 19, 2017, 7:15:41 PM
to SWI-Prolog
Hi,

Cool!

In case I would use CLP(*), the same code wouldn't work for
my system, since I have implemented copy_term/2 without
copying the attribute variables.

Your approach isn't more expensive than a dedicated subst,
which anyway also does a copy. So basically there is proof
that taylor can also be done with CLP(*).

Bye

BTW: I fixed the performance problem; I introduced some
structure sharing into the d(U/V)/dx rule. You don't have
such a rule, you seem to use U^(-1).

The interesting thing about this structure sharing is that it
is different from the structure sharing for d(U*V)/dx, where
we would typically share U and V.

Here I am sharing the whole U/V; I'm not sure how this would
translate to U^(-1), something like -(U^(-1))^2 maybe?
So I have implemented

     d (U/V)/dx = (d U/dx - d V/dx *(U/V)) / V

Since the weekend it's on GitHub, with API docs as well:
https://github.com/jburse/jekejeke-devel/blob/master/jekmin/headless/jekmin/frequent/leibniz/deriv.p#L139

Samer Abdallah

Apr 20, 2017, 8:37:41 AM
to Jan Burse, SWI-Prolog
On 20 Apr 2017, at 00:15, Jan Burse <burs...@gmail.com> wrote:

BTW: I fixed the performance problem, I introduce some
structure sharing into the d (U/V)/dx rule. You don't have
such a rule, you seem to use U^(-1).

So that I have implemented

     d (U/V)/dx = (d U/dx - d V/dx *(U/V)) / V

I tried adding a div/3 constraint with a deriv/3 rule something like this:

deriv(L,Y,DY), div(X1,X2,Y) ==> % ie Y = X1/X2, dY/dX1 = 1/X2, dY/dX2 = -X1/X2^2
   (var(X1) -> deriv(L,X1,DX1), div(DY,X2,T1), agg(T1,DX1); true),
   (var(X2) -> deriv(L,X2,DX2), mul(X2,X2,X2sq), mul(-1.0,X2sq,NX2sq),
               mul(X1,DY,T0), div(T0,NX2sq,T2), agg(T2,DX2)
   ; true).

but it requires a bunch of extra rules to handle div/3 everywhere else, and
also suggests adding more rules for sub/3 (subtraction)
and possibly neg/2 for negation. Seeing as pow/3 is quite useful anyway, sticking 
with just add/3, mul/3,  and pow/3 seems like a more minimal but complete set 
of primitives. Sharing of subterms is not such a big issue because the
nodes in the computation are just variables, eventually instantiated to numbers. 
Negative powers resulting from differentiation are indeed computed (by is/2) as (X**K) for
negative K, rather than by repeated division (Numerator/X/X/X …) but I don’t think
that’s a bad thing.

Samer.

p.s. have also added documentation to the gist here:


Jan Burse

Apr 20, 2017, 9:31:08 AM
to SWI-Prolog
I have a next challenge. It's not really differentiation, but more
dealing with symbolic expressions in general, yet still relevant. I
dunno how much has been done in Prolog already.

Let's look at algebraic numbers: they are zeros of a polynomial
p(x). For a polynomial p(x) of degree n, we can define a new
one p*(x) as follows:

    p*(x) = x^n * p(1/x)

If a is a root of p(x), then 1/a is a root of p*(x). So for algebraic
numbers, p*(x) represents the multiplicative inverse of the number
represented by p(x). I get p*(x) via substitution(*), namely with this code:

      star(P, X, N, Q) :- Q is X^N*subst(P,X,1/X).

For the golden ratio, phi = 1.618033..., which has the polynomial
p(x) = x^2-x-1, I get the polynomial p*(x) = -x^2-x+1, which has
the positive root Phi = 0.618033...:

      ?- P is X^2-X-1, Q is star(P,X,2).
      P is - 1-X+X^2,
      Q is 1-X-X^2

We see indeed Phi = 1/phi. Can we extend this to do arithmetic
over non-constant radicals? I.e. phi = 1.618033... = (1 + sqrt(5))/2
is a constant radical, but maybe we can then compute with
expressions such as a+sqrt(b) symbolically? (**)

(*)
The clever observer might notice that we can simply reverse
the coefficient list. But maybe we need other methods to scale
up when coefficients are symbolic etc...
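The reversal trick is easy to verify numerically; a small Python sketch (helper names invented for illustration), using the golden-ratio example above:

```python
import math

def star(p):
    """p as ascending coefficients [c0, c1, ..., cn] of a degree-n polynomial;
    p*(x) = x^n * p(1/x) is just the reversed coefficient list."""
    return list(reversed(p))

def eval_poly(p, x):
    return sum(c * x**i for i, c in enumerate(p))

p = [-1.0, -1.0, 1.0]               # x^2 - x - 1, with root phi
q = star(p)                         # [1.0, -1.0, -1.0], i.e. 1 - x - x^2
phi = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio 1.618033...
# 1/phi = 0.618033... is then a root of p*
```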

(**)
I would like to do the taylor expansion of sqrt(1+x). But I haven't
introduced radicals yet. A kludge might be to view it as (1+x)^(1/2),
which could already work with Samer Abdallah's approach.

Jan Burse

Apr 20, 2017, 9:49:08 AM
to SWI-Prolog
In one of his works Isaac Newton did a series development
symbolically for things like sqrt(a^2+x^2) and some such.
This would be the next step... to have this working...

Method of Fluxions - Sir Isaac Newton 1671

http://gymportalen.dk/sites/lru.dk/files/lru/2_1_a_treatise.pdf

You find the example sqrt(a^2+x^2) on page 12, where it
says "Examples of Reduction by Extraction of Roots". One
must get the same result with taylor, since taylor expansions
are unique. So can Prolog 2017 be as intelligent as
Sir Isaac Newton 1671?



Jan Burse

Apr 20, 2017, 10:01:47 AM
to SWI-Prolog
Disclaimer: Hopefully this also holds for symbolic expansions,
besides the problem of comparing coefficients symbolically.
We have infinite series, and if we develop them up to n summands,
we are still not sure what the rest of the summands is. So to
complete the exercise, we might also need an infinite-sum
notation, and reasoning about it.

Problems, problems, ...

Samer Abdallah

Apr 21, 2017, 4:58:44 PM
to Jan Burse, SWI-Prolog
On 20 Apr 2017, at 14:31, Jan Burse <burs...@gmail.com> wrote:

I have a next challenge. But its not really differentiation, but more 
dealing with symbolic expressions in general, but still relevant. I 
dunno how much has done in Prolog already.

Lets look at algebraic numbers, they are zeros of a polynomial
p(x). For a polynomial p(x) of degree n, we can define a new 
one p*(x) as follows:


I might have to leave this till later as I have to work on my application now!
BTW, for an entertaining comparison, take a look at this:


(**)
I would like to do taylor expansion of sqrt(1+x). But I didn't
introduce radicals already. A cludge might be to view it as (1+x)^(1/2),
which could already work for Samer Abdallah approach.

Indeed it does!

?- add(1.0,X,X1), pow(-0.5,X1,Y), time(taylor(50,X,Y,Cs)), X=0.0.
% 793,170 inferences, 0.233 CPU in 0.236 seconds (99% CPU, 3405186 Lips)
X = 0.0,
X1 = Y, Y = 1.0,
Cs = [1.0, -0.5, 0.375, -0.3125, 0.2734375, -0.24609375, 0.2255859375, -0.20947265625, 0.196380615234375|...].

After making changes to avoid conversions to floats (replacing floating point constants in the code 
with integers, plus changing two other lines) we get:

?- add(1,X,X1), pow(-1 rdiv 2,X1,Y), time(taylor(50,X,Y,Cs)), X=0.
% 971,174 inferences, 0.196 CPU in 0.198 seconds (99% CPU, 4966118 Lips)
X = 0,
X1 = Y, Y = 1,
Cs = [1, -1 rdiv 2, 3 rdiv 8, -5 rdiv 16, 35 rdiv 128, -63 rdiv 256, 231 rdiv 1024, -429 rdiv 2048, ...rdiv...|…].

Samer




Jan Burse

Apr 29, 2017, 9:52:56 AM
to SWI-Prolog
In the past there was probably a primacy of numerical methods. I don't
yet have a complete grip on the advantages of symbolic methods, especially
in the domain of deep learning.

But here is a little sample of some advantage of symbolic methods
I discovered yesterday. Take the problem of generating a two-dimensional
plot of a function g(x,y). We just want to assign color values to g(x,y)
in a grid. Here are two methods; assume the function is given by a
symbolic expression F and two variables A,B:

1) Without currying:
    At each grid cell evaluate g(x,y), in my system this is done via
    subst(subst(F,A,X),B,Y). I guess there is also a CHR/CLP(*) variant
    to do this?

2) With currying:
    At each row pre-evaluate h= lambda x.g(x,y), in my system this is
    done via H is subst(F,B,Y). Then in each row for each column
    evaluate h(x), in my system this is done via subst(H,A,X). I guess
    there is also a CHR/CLP(*) variant to do this?

I made a little test and plotted some functions this way, and measured
the time taken to generate an SVG file. Here are the results:

Without currying: plot(X^2+Y^2-1,X,Y) --> ca 2 1/2 secs
With currying: plot(X^2+Y^2-1,X,Y) --> ca 1 secs

This is a nice benefit! Can this be reproduced in CHR/CLP(*)?

See also: What do have Currying and Plotting in Common?
https://plus.google.com/+JekejekeCh/posts/PiM9EfkTniQ

Jan Burse

May 14, 2017, 11:27:02 AM
to SWI-Prolog
During my work on implicit plotting, I stumbled over a new
challenge. I wanted to make a contour plot of sqrt(X^2+Y^2)
and noticed I cannot yet compute exactly with radical
expressions. Happily, sums of square roots work now.
We also solved comparison of radical expressions; a
challenging example is this pair:

    a = sqrt(98428513)+sqrt(101596577)

    b = sqrt(400025090)

Be careful: a simple pocket calculator won't help you!
Which led me to this post:

Where Computer Algebra Systems beat Constraint Logic Programming
http://stackoverflow.com/a/43965110/502187

But it's a little bit unfair to say CAS beats CLP; the above is
the Twitter headline and I had only 150 chars available.
More precisely, it's a case where CAS is better than CLP(FD).
But the CHR demonstrations here also show that

CAS and CHR are not necessarily disjoint...
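The pocket-calculator remark can be made precise: in double precision the two values are indistinguishable, but squaring twice reduces the comparison to exact integer arithmetic. A Python sketch (the helper is invented for illustration; it assumes positive integer arguments):

```python
import math

def cmp_sqrt_sum(m, n, r):
    """Compare sqrt(m)+sqrt(n) against sqrt(r) exactly, for positive ints.
    Squaring gives m + n + 2*sqrt(m*n) vs r, so for r > m + n the question
    reduces to the integer comparison 4*m*n vs (r - m - n)^2."""
    s = m + n
    if r <= s:
        return 1                   # sqrt(m)+sqrt(n) > sqrt(m+n) >= sqrt(r)
    lhs, rhs = 4 * m * n, (r - s) ** 2
    return (lhs > rhs) - (lhs < rhs)

verdict = cmp_sqrt_sum(98428513, 101596577, 400025090)   # 1, so a > b
# the trap: 98428513 * 101596577 = 10**16 + 1, a whisker above (10**8)**2,
# so in double precision the two sides are practically indistinguishable:
fa = math.sqrt(98428513) + math.sqrt(101596577)
fb = math.sqrt(400025090)
```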

Samer Abdallah

Jun 5, 2017, 7:17:00 AM
to SWI-Prolog, Jan Burse
Hello all,
I uploaded a technical note on this differentiation work to arXiv:

cheers,
Samer

Samer Abdallah

Jun 5, 2017, 8:00:57 AM
to Jan Burse, SWI-Prolog
Hi Jan,
I would say this is orthogonal to the problem of automatic differentiation,
but I would guess that this kind of partial evaluation is perfectly doable
in a CLP-based numeric system. In my autodiff system, we build what is
essentially a single-use push-driven data flow graph by setting up delayed
goals like

when(ground(X-Y), Z is X+Y), etc.

Alternatively one could use CLP(R) or CLP(Q). I’m using when(ground, _) because
of numerical accuracy problems with CLP(R).

To use the graph multiple times, we must copy all the attributed variables,
before unifying the inputs with ground values, eg

copy_term(t(X,Y,Z), t(X1,Y1,Z1)) , X1=2, Y1=3.
% Z1=5

or just
copy_term(t(X,Y,Z), t(2,3,Z1)).
% Z1=5

Note that, if we had a lambda library that used copy_term/2 and not
copy_term_nat/2, then these semantics would be captured as a lambda term
with an empty body, since the copy and unify would do the whole job:

call(\X^Y^Z^true, 2, 3, Z1).
% Z1=5
To use partial evaluation, just unify one of the variables. This will trigger
any computations that can be done immediately and, if using CLP(Q/R),
might result in a simplification of the remaining computation graph.
Then use copy_term/2 with the remaining variables to evaluate at different
values of the other argument.

copy_term(t(X,Y,Z), t(2,Y1,Z1)),   % partial eval at X=2
copy_term(t(Y1,Z1), t(1,Z2_1)),  % complete eval at Y=1
copy_term(t(Y1,Z1), t(2,Z2_2)),  % … Y=2
copy_term(t(Y1,Z1), t(3,Z2_3)). % … Y=3

Using an attribute-copying lambda, this would be

call(\X^Y^Z^true, 2, Y1, Z1),
maplist(\Y1^Z1^true, [1,2,3], Zs)

This kind of partial application could conceivably be built into the lambda library -
term structures of \X^Y^Z^true and \Y^Z^true look like they are vaguely compatible
with the idea, since \X^Y^Z^true unifies with \X^Rest, leaving the partially applied lambda
as just \Rest. For normal predicates, this is no help, because you can’t call the predicate
until all the arguments are ready, but with CLP and attributed variables, things can 
happen just by unifying the first argument with something.
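A closure-based analogue of this partial evaluation (a Python sketch with invented names, mirroring Jan's currying example g(x,y) = x^2 + y^2 - 1): fixing one argument does the shared work once, and the residual function is then applied many times.

```python
def g(x, y):
    return x**2 + y**2 - 1

def curry_row(y):
    """Partially evaluate g at a fixed y: the y**2 - 1 part is computed
    once per row and then reused for every x in that row."""
    y_part = y**2 - 1
    return lambda x: x**2 + y_part

row = curry_row(3.0)
values = [row(x) for x in (0.0, 1.0, 2.0)]
# values == [8.0, 9.0, 12.0], the same as [g(x, 3.0) for x in ...]
```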

.. All of this leads me to a question which I had on my mind a while ago: why do
both lambda libraries (Ulrich's lambda and the newer yall) use copy_term_nat
and not just copy_term? What's wrong with copying the attributes? If there aren't
any, it won't make any difference, and if there are, then arguably it makes more sense
to keep them.

Samer

Jan Burse

Jun 5, 2017, 9:29:07 AM
to SWI-Prolog
+1

Jan Burse

Jun 5, 2017, 6:14:56 PM
to SWI-Prolog
I wanted to try your gist with SWI-Prolog 7.5.5, but it's not working.
When I had fixed the singleton warning for Cs0, I still got no result:

?- add(1.0,X,X1), pow(-1,X1,Y), time(taylor(5,X,Y,Cs)), X=0.0.
% 2,235 inferences, 0.000 CPU in 0.001 seconds (0% CPU, Infinite Lips)

X = 0.0,
X1 = Y, Y = 1.0,
Cs = [1.0, 0.0, 0.0, 0.0, 0.0].

The correct result should be [1.0, -1.0, 1.0, -1.0, 1.0], I guess.
What's going wrong?

https://gist.github.com/samer--/3bc9d4785bbafcbf24bb15e1c3f0efe4

Jan Burse

unread,
Jun 5, 2017, 10:37:52 PM6/5/17
to SWI-Prolog
So I made my own version, with a copying subst, see source
at end of my post. Seems to work and doesn't use copy_term:

?- add(1,X,H), pow(H,-1,J), taylor(J,X,5,R).

R = [1, -1, 1, -1, 1, -1],

/* lots of constraints */


?- add(1,X,H), pow(H,0.5,J), taylor(J,X,5,R).

R = [1.0, 0.5, -0.125, 0.0625, -0.0390625, 0.02734375],

/* lots of constraints */


I don't know yet how to get rid of all the leftover constraints. I was
thinking of using a signalling constraint dispose/1,
but it would be unsafe: it could also remove subterms
used in a deriv result.

Then I thought CHR should do its own GC, like freeze
can. But there is a caveat, which is also a problem during
copy_term:
    - Constraints are not oriented: for example, in
      the code below we use add(A,B,C) to indicate a
      term with root C and leaves A and B.

    - CHR reachability doesn't use orientation: for example,
      if we have the two constraints add(A,B,C) and add(A,D,E),
      then copying add(A,B,C) also copies add(A,D,E), although
      it belongs to a different root E.

    - Same problem for a GC: if C is reachable, and
      the constraints add(A,B,C) and add(A,D,E) are in the store,
      then the system would think that the different root E
      is also still in use.

    - To get an idea of the problem, just look at some of
      the lots of constraints below. Everything is connected, although
      we long ago calculated the taylor expansion.
      Without orientation information, no useful GC:

?- add(1,X,H), pow(H,0.5,J), taylor(J,X,3,R).

R = [1.0, 0.5, -0.125, 0.0625],

add(1, X, H),

mul(0.5, _G22806, _G22807),

mul(-0.5, _G22819, _G22806),

mul(-1.5, _G22832, _G22819),

mul(0.5, _G22845, _G22846),

mul(-0.5, _G22858, _G22845),

mul(0.5, _G22871, _G22872),

pow(H, -2.5, _G22832),

pow(H, -1.5, _G22858),

pow(H, -0.5, _G22871),

pow(H, 0.5, J).
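To see the orientation point outside CHR, here is a hypothetical Python sketch (all names invented here, nothing from the CHR code): when each node records only its children, the expression graph is an oriented DAG, so reachability from a chosen root is well defined, and an unrelated root E that merely shares a leaf is not retained.

```python
# Oriented expression DAG: each node points only at its children.
class Node:
    def __init__(self, op, *args):
        self.op, self.args = op, args

def reachable(root):
    """Collect ids of everything reachable from root via child links."""
    seen, stack = set(), [root]
    while stack:
        n = stack.pop()
        if id(n) in seen:
            continue
        seen.add(id(n))
        if isinstance(n, Node):
            stack.extend(n.args)
    return seen

x = "x"
h = Node("add", 1, x)   # shared subterm H
c = Node("add", h, 2)   # root C, uses H
e = Node("add", h, 5)   # a different root E, also uses H

live = reachable(c)
# H is kept (it is below C), but E is not, despite sharing H.
```

With orientation, a GC from root C keeps H but can drop E; an undirected store, as in the CHR constraint store, cannot make that distinction.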


So I guess CHR is nevertheless pretty dead for CAS. It seems to me
now that it just lacks a useful notion of an AST. Via
OOP I can also code ASTs that do sub-expression sharing,
and I can even make them immutable; additionally, in most
OOP languages such as Java, I also have the benefit of
GC. In Prolog I also have immutability and ASTs:

the ASTs are just the ordinary Prolog terms. Many Prolog
systems also have pretty good GC for Prolog terms.
It's quite an important feature for good tail recursion;

it's sometimes called environment trimming.

-------------------------------- source code ----------------------------------

:- use_module(library(chr)).

:- chr_constraint add/3, mul/3, pow/3, subst/4, deriv/3.

/* partial evaluation */
add(X,Y,Z) <=> number(X), number(Y) | Z is X+Y.
mul(X,Y,Z) <=> number(X), number(Y) | Z is X*Y.
pow(X,Y,Z) <=> number(X) | Z is X^Y.

/* simplification */
add(0,X,Y) <=> Y=X.
add(X,0,Y) <=> Y=X.
add(X,X,Y) <=> mul(2,X,Y).
mul(0,_,Y) <=> Y=0.
mul(_,0,Y) <=> Y=0.
mul(1,X,Y) <=> Y=X.
mul(X,1,Y) <=> Y=X.
mul(X,X,Y) <=> pow(X,2,Y).
pow(_,0,Y) <=> Y=1.
pow(X,1,Y) <=> Y=X.

/* substitution */
subst(C,_,_,R) <=> number(C) | R = C.
add(A,B,C) \ subst(C,X,Y,R) <=>
   subst(A,X,Y,S), subst(B,X,Y,T), add(S,T,R).
mul(A,B,C) \ subst(C,X,Y,R) <=>
   subst(A,X,Y,S), subst(B,X,Y,T), mul(S,T,R).
pow(A,B,C) \ subst(C,X,Y,R) <=>
    subst(A,X,Y,S), pow(S,B,R).
subst(C,X,Y,R) <=> C == X | R = Y.
subst(C,X,_,R) <=> C \== X | R = C.

/* derivation */
deriv(C,_,R) <=> number(C) | R = 0.
add(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).
mul(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), deriv(B,X,T), mul(A,T,H), mul(S,B,J), add(H,J,R).
pow(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), D is B-1, pow(A,D,H), mul(B,H,J), mul(J,S,R).
deriv(C,X,R) <=> C == X | R = 1.
deriv(C,X,R) <=> C \== X | R = C.

/* taylor */
taylor(P, X, N, R) :-
   horner_schema(P, X, 0, N, R).

horner_schema(P, X, N, N, [R]) :- !,
   subst(P, X, 0, R).
horner_schema(P, X, N, M, [B|J]) :-
   deriv(P, X, Q),
   K is N+1,
   horner_schema(Q, X, K, M, H),
   div_coeff(H, K, J),
   subst(P, X, 0, B).

div_coeff([X|L], K, [Y|R]) :-
   Y is X/K,
   div_coeff(L, K, R).
div_coeff([],_,[]).

https://gist.github.com/jburse/e279d80c6882296ad074926999f6987f#file-diffchr-p
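As a cross-check on the two results above (not using the CHR code, just the generalized binomial series for (1+x)**p; the function name here is made up), a small Python sketch reproduces both coefficient lists:

```python
from fractions import Fraction

def taylor_coeffs(p, n):
    """Coefficients of (1+x)**p around x=0, using the generalized
    binomial recurrence c_{k+1} = c_k * (p - k) / (k + 1)."""
    c, out = Fraction(1), []
    for k in range(n + 1):
        out.append(float(c))
        c = c * (Fraction(p) - k) / (k + 1)
    return out

print(taylor_coeffs(-1, 5))
# [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(taylor_coeffs(Fraction(1, 2), 5))
# [1.0, 0.5, -0.125, 0.0625, -0.0390625, 0.02734375]
```

Both lists match the R values printed by the CHR queries above.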

Jan Burse

unread,
Jun 5, 2017, 10:56:14 PM6/5/17
to SWI-Prolog
Since the subst solution doesn't use copy_term and leaves
a lot of garbage behind, it is even a tick faster than everything
else so far, and I am using my old slow mac:

?- time((between(1,1000,_), add(1,X,H), pow(H,-1,J), taylor(J,X,8,R), fail; true)).

% 24,492,000 inferences, 4.179 CPU in 4.281 seconds (98% CPU, 5860286 Lips)

true.


42 ?- time((between(1,1000,_), add(1,X,H), pow(H,0.5,J), taylor(J,X,8,R), fail; true)).

% 24,492,000 inferences, 4.091 CPU in 4.272 seconds (96% CPU, 5987529 Lips)

true.


This gives, if we divide by the 1000 iterations, around 0.004 s for
the taylor expansion. In your paper you show 0.006 s, and maybe
your mac is similar to mine?

(P.S.: Evidence that the subst is faster may already be given by the
number of inferences: in your paper I see 26,038 inferences,
and the above has 24,492 inferences per iteration.)

Jan Burse

unread,
Jun 6, 2017, 3:00:53 AM6/6/17
to SWI-Prolog
What I don't understand is why I can write:

add(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).

And don't need to write:

add(A,B,C) \ deriv(D,X,R) <=> C == D |
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).

Have to check!

Samer Abdallah

unread,
Jun 6, 2017, 4:40:24 AM6/6/17
to Jan Burse, SWI-Prolog
On 5 Jun 2017, at 23:14, Jan Burse <burs...@gmail.com> wrote:

I wanted to try your gist with SWI-Prolog 7.5.5, but it's not working.
Even after fixing the singleton warning for Cs0, I still don't get the right result:

?- add(1.0,X,X1), pow(-1,X1,Y), time(taylor(5,X,Y,Cs)), X=0.0.
% 2,235 inferences, 0.000 CPU in 0.001 seconds (0% CPU, Infinite Lips)
X = 0.0,
X1 = Y, Y = 1.0,
Cs = [1.0, 0.0, 0.0, 0.0, 0.0].

The correct result should be [1.0, -1.0, 1.0, -1.0, 1.0], I guess.
What's going wrong?

Thanks for spotting these - there was a mistake in the definition of 
derivs/3 - now fixed.

Samer.



Samer Abdallah

unread,
Jun 6, 2017, 5:02:04 AM6/6/17
to Jan Burse, SWI-Prolog
Hi Jan,

On 6 Jun 2017, at 03:37, Jan Burse <burs...@gmail.com> wrote:

So I made my own version, with a copying subst, see source
at end of my post. Seems to work and doesn't use copy_term:

?- add(1,X,H), pow(H,-1,J), taylor(J,X,5,R).
R = [1, -1, 1, -1, 1, -1],
/* lots of constraints */

?- add(1,X,H), pow(H,0.5,J), taylor(J,X,5,R).
R = [1.0, 0.5, -0.125, 0.0625, -0.0390625, 0.02734375],
/* lots of constraints */

I dunno yet how to get rid of the lots of constraint, I was
thinking of using a signally constraint dispose/1,
but it would be unsafe, could also remove subterms
used in a deriv result.

This looks interesting. It's true, in my system I have to clean
up the constraints manually or they will build up, or worse, get in
the way and interfere with subsequent operations. It's curious
how CHR feels very high level in some respects but low level in
others.

Some initial comments below, but I’ll have to think about your code
some more.


/* partial evaluation */
add(X,Y,Z) <=> number(X), number(Y) | Z is X+Y.
mul(X,Y,Z) <=> number(X), number(Y) | Z is X*Y.
pow(X,Y,Z) <=> number(X) | Z is X^Y.

/* simplification */
add(0,X,Y) <=> Y=X.
add(X,0,Y) <=> Y=X.
add(X,X,Y) <=> mul(2,X,Y).
mul(0,_,Y) <=> Y=0.
mul(_,0,Y) <=> Y=0.
mul(1,X,Y) <=> Y=X.
mul(X,1,Y) <=> Y=X.
mul(X,X,Y) <=> pow(X,2,Y).
pow(_,0,Y) <=> Y=1.
pow(X,1,Y) <=> Y=X.

This much is similar to what CLP(Q/R) would do anyway, and since
these are simplification constraints, there is no problem with garbage
building up.


/* derivation */
deriv(C,_,R) <=> number(C) | R = 0.
add(A,B,C) \ deriv(C,X,R) <=> 
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).

This is forward mode automatic differentiation, but with the deriv/3
constraint propagating backwards through the computation graph.
It's forward mode because it is propagating the computation of
d<everything>/d<input> through the graph for a fixed <input>,
whereas reverse mode AD propagates the computation of
d<output>/d<everything> for a fixed <output>. In my paper, I discuss
how my first solution computes reverse mode AD but by propagating
deriv/3 forwards, whereas my second solution computes reverse mode
AD with back-propagation of constraints, and that this seems to fit the
problem better. Perhaps your forward mode AD will work better with
forward propagation of constraints, e.g. with rules like

add(A,B,C), deriv(A,X,DA), deriv(B,X,DB) ==> add(DA,DB,DC), deriv(C,X,DC).

This is now very close to an implementation of forward mode using
dual numbers, since for each node A, B, we have a differential DA, DB, etc.,
and each forward computation on the nodes is matched by a forward
computation on the differentials. The added weight of CHR is not really
needed for forward mode AD, since we don't need the same ability to
examine the computation graph that CHR gives us.
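For comparison, the dual-number idea mentioned above can be sketched in a few lines of Python (class and field names are invented for illustration; this is not anyone's library code):

```python
# Forward-mode AD with dual numbers: each value carries its
# derivative (the "dot") with respect to one chosen input.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * o.val,
                    self.val * o.dot + self.dot * o.val)
    __rmul__ = __mul__

    def __pow__(self, k):
        # constant exponent only: (u**k)' = k * u**(k-1) * u'
        return Dual(self.val ** k, k * self.val ** (k - 1) * self.dot)

x = Dual(1.0, 1.0)      # seed dx/dx = 1
y = (1 + x) ** -1       # f(x) = 1/(1+x), as in the taylor examples
# f(1) = 0.5, f'(x) = -(1+x)**-2, so f'(1) = -0.25
```

Each operation on the value is matched by one on the differential, exactly the DA/DB pairing above, with no constraint store or graph inspection needed.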


cheers,
Samer

Samer Abdallah

unread,
Jun 6, 2017, 5:05:18 AM6/6/17
to Jan Burse, SWI-Prolog
On 6 Jun 2017, at 08:00, Jan Burse <burs...@gmail.com> wrote:

What I don't understand why I can write:

add(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).

And don't need to write:

add(A,B,C) \ deriv(D,X,R) <=> C == D |
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).


I think this is because, when doing rule selection
with variables shared between the heads, CHR already matches on
the basis of ==/2 equality.

It occurred to me that perhaps we should be having this discussion
on the CHR mailing list, which I just joined the other day.

Samer


Jan Burse

unread,
Jun 6, 2017, 7:08:47 AM6/6/17
to SWI-Prolog, burs...@gmail.com
Thanks for the teaching! I guess I will upload a new
gist version as soon as I have time,

where I will use this CHR feature; hopefully it applies
as well to joins inside literals, i.e. for subst and deriv.

I will try to replace these rules:


subst(C,X,Y,R) <=> C == X | R = Y.
deriv(C,X,R) <=> C == X | R = 1.

and do some new measurements. BTW, I am/was on some CHR
mailing list, but it doesn't show much traffic. Why not a

CHR google group? Everybody can create a google group.

Carlo Capelli

unread,
Jun 6, 2017, 11:13:02 AM6/6/17
to Samer Abdallah, Jan Burse, SWI-Prolog
Samer, Jan

your thread is fascinating... please don't leave now :)
Jan, I find the idea of a Google group very nice (also because the official (?) CHR homepage is a bit outdated).

Thanks, Carlo



--
You received this message because you are subscribed to the Google Groups "SWI-Prolog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to swi-prolog+unsubscribe@googlegroups.com.

Jan Burse

unread,
Jun 7, 2017, 3:06:04 PM6/7/17
to SWI-Prolog
What also comes to mind is realizing rules that reduce the
amount of work, like for example:

subst(A,X,Y,B) \ subst(A,X,Y,C) <=> B = C.
deriv(A,X,B) \ deriv(A,X,C) <=> B = C.

Which makes me think about whether I should do a CHR port
to Jekejeke Prolog. I would just use the existing delta engine,

but on the surface replace my own rule and query language
by an additional module that supports the CHR rule and
query language.

I have a rule set that can do something else, namely
a rule set for a calculus of constructions. It's quite fun!
This way I could run this rule set both in SWI-Prolog

and in Jekejeke Prolog, same with the autodiff stuff.
I didn't make up my mind yet how to translate the CHR
rules, because of the ==/2 semantics, which is

a little unicorn in the whole CHR venture: ==/2 is
not a logical predicate, it's a meta-logical predicate. So
we have a more complex declarativity for CHR rules

than for ordinary logic programs based
on Horn clauses or rules with negation as failure.

Samer Abdallah

unread,
Jun 7, 2017, 6:46:50 PM6/7/17
to Carlo Capelli, Jan Burse, SWI-Prolog
On 6 Jun 2017, at 16:13, Carlo Capelli <cc.car...@gmail.com> wrote:

Samer, Jan

your thread is passionating... please don't leave now :)

If you want to see something really crazy, have a look at this:


I tried it in OCaml first but the compiler couldn’t handle the
types. It’s kind of nuts that Haskell can make sense of it.

Samer