Automatic differentiation in CHR


Samer Abdallah

Apr 11, 2017, 7:52:42 PM
to SWI-Prolog
Hello all,

I recently noticed that automatic differentiation would be a really neat solution to
a problem I’m working on. Rather than rushing straight into the horrible Pythony arms
of Theano, TensorFlow etc, I thought perhaps I could do it in Prolog using CHR. I tried a 
month or two ago but only got into a mess because I have no clue what I’m doing with
CHR, though I did get a sense that my problem was with aggregation, which is apparently 
a known difficulty with CHR (see ‘Aggregates for Constraint Handling Rules’ by Jon Sneyers, 
Peter Van Weert, and Tom Schrijvers).
Anyway, I tried again yesterday and got something that sort of works. Here it is (this is
reverse mode differentiation, which is basically the same as gradient back-propagation
in neural networks):

:- chr_constraint deriv/3, agg/2, acc/1, acc/2, mul/3, add/3, log/2, exp/2, exp1/2, log1/2, mul1/3, add1/3, go/0.

remove_dup  @ deriv(L,X,DX) \ deriv(L,X,DX1) <=> DX=DX1.
identity    @ deriv(L,L,DL) <=> DL=1.
decl_accum  @ deriv(_,_,DX) ==> acc(DX). % declares DX as something that needs to be summed
deriv_log   @ deriv(L,X,DX), log(X,Y)   ==> deriv(L,Y,DY), mul1(1/X,DY,Z), agg(Z,DX).
deriv_exp   @ deriv(L,X,DX), exp(X,Y)   ==> deriv(L,Y,DY), mul1(Y,DY,Z),   agg(Z,DX).
deriv_add_l @ deriv(L,X,DX), add(X,_,Y) ==> deriv(L,Y,DY),                 agg(DY,DX).
deriv_add_r @ deriv(L,X,DX), add(_,X,Y) ==> deriv(L,Y,DY),                 agg(DY,DX).
deriv_mul_l @ deriv(L,X,DX), mul(K,X,Y) ==> deriv(L,Y,DY), mul1(K,DY,Z),   agg(Z,DX).
deriv_mul_r @ deriv(L,X,DX), mul(X,K,Y) ==> deriv(L,Y,DY), mul1(K,DY,Z),   agg(Z,DX).

remove_dup2 @ acc(X) \ acc(X) <=> true.

accum_recur @ acc(S1,X), agg(Z,X) <=> add1(Z,S1,S2), acc(S2,X).
accum_base  @ acc(S,X) <=> S=X.

mul(X,Y,Z) ==> mul1(X,Y,Z).
add(X,Y,Z) ==> add1(X,Y,Z).
log(X,Y)   ==> log1(X,Y).
exp(X,Y)   ==> exp1(X,Y).

log1(X,Y)   <=> ground(X) | Y is log(X).
exp1(X,Y)   <=> ground(X) | Y is exp(X).
mul1(X,Y,Z) <=> ground(X-Y) | Z is X*Y.
add1(X,Y,Z) <=> ground(X-Y) | Z is X+Y.

go \ acc(DX) <=> acc(0,DX).

As you can see, as well as the house-keeping rules and self-explanatory arithmetic, there
are rules implementing the differentiation chain rule for sums, products, log and exp.
The constraint deriv(L,X,DX) means 'the derivative of L with respect to X is DX’.
The constraint agg(Z,DX) means ‘Z is an additive contribution to DX’. The system
of constraints acc/1, acc/2 and go/0 handle the summation of agg/2 constraints: acc(X)
means ‘X will be a sum of values found in agg(_,X) constraints’ and acc(S,X) means ‘X is S plus
the sum of values in any remaining agg(_,X) constraints’.
Here’s an example call:

?- add(X,Y,L), mul(Z,V,X), mul(Z,W,Y), log(L,LL), deriv(LL,Z,DZ), go, V=4, Z=2, W=3.
X = 8,
Y = 6,
L = 14,
Z = 2,
V = 4,
W = 3,
LL = 2.6390573296152584,
DZ = 0.5,
deriv(2.6390573296152584, 8, 1/14),
deriv(2.6390573296152584, 14, 1/14),
deriv(2.6390573296152584, 6, 1/14),
deriv(2.6390573296152584, 2, 0.5),
mul(2, 3, 6),
mul(2, 4, 8),
add(8, 6, 14),
log(14, 2.6390573296152584),
go.

It works, but it’s less than ideal for the following reasons:
1. The handling of deriv/3 and acc/{1,2} is only correct before the various variables are instantiated. 
    This means the whole thing can go wrong if the operands receive their values before go/0 is called,
    even for values which are logically constants, e.g. foldl(mul, [X1,X2], P0, P1), deriv(P1,X1,DX1)
    doesn’t work if P0 is replaced with the constant 1.
2. The constraint go/0 has to be invoked to trigger the accumulation of sums. I
   couldn’t think of any way to make the process completely automatic.

Problem 2 is probably liveable with (and could conceivably be unavoidable), but Problem
1 points to a logical failing (I suspect each node in the computation graph needs some sort
of ground unique identifier). So, is there anyone here with more CLP/CHR-fu than me that would 
be interested in helping to make it better? Automatic differentiation is all the rage in machine 
learning at the moment, and I don’t see why those Python programmers should have all 
the fun…
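For what it's worth, the 'ground unique identifier' idea can be sketched outside CHR. The following Python sketch (all names invented for illustration, not the CHR code above) gives each graph node an object identity, so derivatives attach to nodes rather than to possibly-instantiated values; it reproduces DZ = 0.5 from the example above.

```python
import math

class Node:
    """One node of the computation graph; its object identity serves as the
    'ground unique identifier' for that node."""
    def __init__(self, value, parents=()):
        self.value = value        # forward value
        self.parents = parents    # (parent_node, local_derivative) pairs
        self.grad = 0.0           # accumulated reverse-mode derivative

def const(v):  return Node(v)
def add(x, y): return Node(x.value + y.value, ((x, 1.0), (y, 1.0)))
def mul(x, y): return Node(x.value * y.value, ((x, y.value), (y, x.value)))
def log(x):    return Node(math.log(x.value), ((x, 1.0 / x.value),))

def backprop(out):
    """Accumulate d(out)/d(node) into node.grad for every reachable node."""
    order, seen = [], set()
    def visit(n):                 # post-order DFS: a node's parents come first
        if id(n) not in seen:
            seen.add(id(n))
            for p, _ in n.parents:
                visit(p)
            order.append(n)
    visit(out)
    out.grad = 1.0
    for n in reversed(order):     # propagate from the output backwards
        for p, d in n.parents:
            p.grad += n.grad * d

# The example goal above: LL = log(Z*V + Z*W) with Z=2, V=4, W=3
z, v, w = const(2.0), const(4.0), const(3.0)
ll = log(add(mul(z, v), mul(z, w)))
backprop(ll)                      # z.grad is now 0.5, as in DZ = 0.5
```

Because z is visited once by object identity, the two mul/3 uses of it accumulate correctly even before any value is known at graph-construction time.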

cheers,
Samer




Jan Burse

Apr 15, 2017, 5:57:09 AM
to SWI-Prolog
deriv was part of Warren's thesis, to show that Prolog can
be used for the same things as LISP; the very original
deriv dates back to:

Warren, D.H.D. (1978): Applied Logic - Its Use and
Implementation as a Programming Tool,

@Samer Abdallah If you add subst and eval, you can also do taylor
expansion. I understand that you try to have eval via CHR instantiation.
The example you show indicates this.

But somehow taylor expansion is a use case where this might
not work. There are two conflicting requirements in taylor expansion
which break the CHR instantiation view.

a) Requirement 1: For each summand of the taylor expansion
    we need the n-th derivative. So typically, when we have computed
    a derivative, we will keep it and not instantiate it, to be in a position
    to compute the next derivative.

b) Requirement 2: For each summand of the taylor expansion,
     or for simplicity look at the Maclaurin series where a=0, we need
     to evaluate the derivative at the point x=0. This calls for instantiation,
     but maybe it's in conflict with Requirement 1.

Maybe you are lucky and don't have this conflict. But many
differentiation rules leave some subexpressions unchanged. For
example the rule for F*G is:

     (F*G)'  = F'*G + F*G'

So you have the terms F and G unchanged in the result, so that
indeed an instantiation conflict might arise. The workaround is to
have a subst predicate, which does substitution and creates a copy.

If the subst predicate is applied to an expression that is later
free of variables, it could do evaluation at the same time, so that
the attributed-variable instantiation mechanism of CHR doesn't come
into play. But for more advanced taylor expansion, where parameters
are involved, the subst predicate would operate on a symbolic level
and return a new expression.
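This subst-and-copy workaround is easy to picture with a toy sketch (Python tuples standing in for Prolog terms; all names here are invented for illustration): substitution builds a fresh copy, so the original expression survives for further differentiation, and a ground copy can be evaluated on the spot.

```python
import math

def subst(expr, var, val):
    """Substitute val for var in a tuple-encoded expression,
    building a fresh copy and leaving the original untouched."""
    if expr == var:
        return val
    if isinstance(expr, tuple):
        op, *args = expr
        return (op, *[subst(a, var, val) for a in args])
    return expr

def eval_expr(expr):
    """Evaluate a fully ground expression tree."""
    if isinstance(expr, tuple):
        op, *args = expr
        vals = [eval_expr(a) for a in args]
        if op == '+':   return vals[0] + vals[1]
        if op == '*':   return vals[0] * vals[1]
        if op == 'log': return math.log(vals[0])
    return expr

# F*G with F = x and G = x + 1: substitute x = 2, then evaluate
e = ('*', 'x', ('+', 'x', 1))
v = eval_expr(subst(e, 'x', 2))   # 6; e itself is still symbolic
```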

I have already implemented deriv and subst in my December 2016
prototypes, but I didn't do taylor yet, since it seems that taylor
also needs limits (limes). Evaluating the derivatives at the point
x = 0 sometimes doesn't work so easily, and it seems that existing
CAS systems such as Wolfram Alpha or Symbolab take the limit at
x = 0 rather than doing simple evaluation, one more issue that is
probably not easily solved by the CHR variable-instantiation view...

Jan Burse

Apr 15, 2017, 10:35:08 AM
to SWI-Prolog
OK, I made a prototype, since I couldn't resist. For taylor it goes recursively
down, computes all the derivatives, and then, while leaving the recursion,
it applies a Horner schema. Here is an example run:

?- X is 1/(1+A), Y is taylor(X, A, 5).
X is 1/(1+A),
Y is 1-A+A^2-A^3+A^4-A^5

But it's a little fragile; somehow the derivatives get quite large, so that
I finally end up with a memory error for a large number of requested
summands:

?- X is 1/(1+A), Y is taylor(X, A, 50).
Error: Execution aborted since memory threshold exceeded.
    functor/3
    is_abnormal/1
    (\+)/1
    is/2
    sys_poly_add/3

I dunno yet what can be done against this. I thought the CHR code
could be a nice device to experiment with evaluation strategies, in
understanding them and in further developing them. There seems
to be a forward and a backward mode of computing derivatives,
which I do not yet understand.

Also dual numbers, already used by the German mathematician
Eduard Study, have a certain appeal to me, since they can combine
derive and subst in one step, but I didn't figure out yet how to
effectively do the n-th derivative.

For more introductory information see for example here:

Automatic differentiation - Håvard Berland 2016
http://www.robots.ox.ac.uk/~tvg/publications/talks/autodiff.pdf

Jan Burse

Apr 15, 2017, 10:40:42 AM
to SWI-Prolog
The prototype code is open source on gist and it depends on a
further polynomial gist and the Jekejeke wrapped attribute variables:
https://gist.github.com/jburse/d8be109a24bdd52218472ecb7e68b28a#file-series-p

Samer Abdallah

Apr 15, 2017, 5:30:59 PM
to Jan Burse, SWI-Prolog
On 15 Apr 2017, at 10:57, Jan Burse <burs...@gmail.com> wrote:

deriv was part of the thesis of Warren to show that Prolog can
be used for the same things as LISP, since the very original
deriv dates back to

Warren, D.H.D. (1978): Applied Logic - Its Use and
Implementation as a Programming Tool,

Interesting - I hadn’t seen Warren’s thesis before, from back when
Prolog was brand new! It looks like his ‘deriv’ is essentially
symbolic differentiation, as it builds a big unevaluated expression
for the entire derivative.

When I started looking into AD, I didn’t fully appreciate the difference
between symbolic differentiation (which I had experimented with in OCaml,
writing a little test program that derives the equations of motion for a gravity 
simulation by differentiating the system Hamiltonian) and AD, which is more 
of a composition of numerical operations, avoiding the building of a big 
symbolic expression of the whole derivative, as described in those PDF slides 
you linked. Of the two main approaches to AD, reverse mode is more suitable 
for my application than forward mode with dual numbers - dual numbers are 
neat, and can be implemented by overloading the arithmetic operators, but it’s
more computationally expensive than reverse mode when you want the gradient 
of a scalar valued function with respect to many parameters. Of course, reverse mode 
is recognisable as a generalisation of back-propagation.


@Samer Abdallah If you add subst and eval, you can also do taylor
expansion. I understand that you try to have eval via CHR instantiation.
The example you show indicates this.

I’ve not tried doing more than one derivative as I don’t need them just yet,
but I think other reverse mode AD libraries let you do multiple derivatives,
eg to get the Hessian for numerical optimisation.

I should probably explain the other trick I’m using to avoid doing an
explicit subst operation - the result of my CHR autodiff is a set of 
attributed variables with delayed goals like a data flow graph, 
which is defined once but used many times by (the horror) copying the whole 
lot before unifying the ‘inputs’ and reading off the ‘outputs’, e.g. taking my 
previous example:

?- add(X,Y,L), mul(Z,V,X), mul(Z,W,Y), log(L,LL), deriv(LL,Z,DZ), go, 
   T = t(V,S,W,LL,DZ),
   copy_term(T, t(4,2,3,LL2,DZ1)),
   copy_term(T, t(2,4,5,LL2,DZ2)),
   copy_term(T, t(3,6,7,LL3,DZ3)).

This does the differentiation once, creates a load of delayed additions
and multiplications (essentially, the CHR part ‘compiles' the differentiation 
down to a graph of primitive numerical operations) and then applies them three 
times to three different lots of values -  this is just what you need for an iterative 
optimisation. There is no subst because the variables are simply instantiated, and 
there is no eval because numerical operations are already there waiting to be 
triggered. The CHR part is rather slow, but the repeated evaluations are not that 
much slower than normal arithmetic in SWI Prolog.

I’ll have to see if I can get it to work with multiple derivatives as you describe
below.


Samer.
limes?



Samer Abdallah

Apr 19, 2017, 5:47:26 PM
to Jan Burse, SWI-Prolog
Hi Jan,

I thought a bit more about what you said about multiple derivatives,
eg for a Taylor series, and tried modifying my code to support it.
I started with multiple derivatives of X^K, with K an integer constant.
I made it so that the derivative propagation rules expanded the computation
graph using the same mul/3, add/3, pow/3 etc. operators as the top-level
interface, so that the results of these added operations could
themselves be differentiated. This worked for positive powers,
since this created a recursion which stopped at K=0, since X^0 = 1 for all
X. For negative powers, it entered an infinite loop: X^(-1) triggers computation
of X^(-2), which triggers X^(-3) etc. Hence, more generally, this approach
is not going to work.

The problem was that, although the algorithm was notionally reverse mode
differentiation, a request for dY/dX using deriv(Y,X,DYDX) was requesting
dY/dZ for *all* nodes Z that X is an input to, including nodes Z created by
the request for dY/dX, even when the forward path through Z did not reach Y.
The CHR execution was thus rippling *forwards* through the computation
graph, not backwards, a bit like forward chaining in logic programming.

So I tried another approach, where the API includes back/1, such that
back(Y) declares that dY/dY = 1, thus triggering the computation of dY/dZ for 
all nodes Z reached by travelling backwards through the computation graph. 
The user’s request for, e.g., deriv(Y,X,DYDX) simply ‘catches’ the result for dY/dX
that was or will be produced as the deriv/3 recursion ripples back through the graph.

Requesting second or higher derivatives in this framework is not a problem
and I was able to get the Taylor series coefficients without any problem.

Bearing in mind that I still don’t really know what I’m doing with CHR, some code
(enough to replicate the Taylor series example; the full gist also has log/2 and exp/2)
is below. It’s a bit sensitive to the order in which certain constraints are invoked.
CHR is a bit of a strange beast in this regard: in one way it’s very high-level code,
but in another, it feels quite imperative in the way that the constraint store has
to be massaged by doing things in the right order.
:- module(autodiff, [mul/3, add/3, pow/3, deriv/3, back/1, go/0, compile/0, derivs/3, taylor/4]).
:- use_module(library(chr)).
:- chr_constraint add(?,?,-), mul(?,?,-), pow(+,-,-), deriv(?,-,?), agg(?,-), acc(?,-), acc(-), go, back(-), compile.

% operations interface with simplifications
mul(0.0,_,Y) <=> Y=0.0.
mul(_,0.0,Y) <=> Y=0.0.
mul(1.0,X,Y) <=> Y=X.
mul(X,1.0,Y) <=> Y=X.
mul(X,Y,Z1) \ mul(X,Y,Z2) <=> Z1=Z2.
pow(1,X,Y) <=> Y=X.
pow(0,_,Y) <=> Y=1.
add(0.0,X,Y) <=> Y=X.
add(X,0.0,Y) <=> Y=X.
add(X,Y,Z1) \ add(X,Y,Z2) <=> Z1=Z2.

% initiate back-propagation starting from Y
back(Y) <=> var(Y) -> deriv(Y,Y,1.0); true.

% propagate derivatives through unary and binary operators
deriv(L,X,DX) \ deriv(L,X,DX1) <=> DX=DX1.
deriv(L,_,DX) <=> ground(L) | DX=0.0.
deriv(_,_,DX) ==> var(DX) | acc(DX).
deriv(L,Y,DY), pow(K,X,Y)   ==> deriv(L,X,DX), pow_contrib(K,X,DY,Z), agg(Z,DX).
deriv(L,Y,DY), add(X1,X2,Y) ==> maplist(add_contrib(L,DY),[X1,X2]).
deriv(L,Y,DY), mul(X1,X2,Y) ==> maplist(mul_contrib(L,DY),[X1,X2],[X2,X1]).
deriv(L,Y,DY), agg(X1,Y)    ==> add_contrib(L,DY,X1). % allows multiple derivs before go/0 is called.

pow_contrib(K,X,DY,Z)   :- K1 is K - 1, KK is float(K), pow(K1,X,XpowK1), mul(KK,XpowK1,W), mul(DY,W,Z).
mul_contrib(L,DY,X1,X2) :- var(X1) -> deriv(L,X1,DX1), mul(X2,DY,T1), agg(T1,DX1); true.
add_contrib(L,DY,X1)    :- var(X1) -> deriv(L,X1,DX1), agg(DY,DX1); true.

acc(X) \ acc(X) <=> true.

% reduce deriv/3, agg/2 and acc/{1,2} to set of add/3 constraints
go \ deriv(_,_,_) <=> true.
go \ acc(DX) <=> acc(0.0,DX).
go <=> true.

acc(S1,X), agg(Z,X) <=> add(Z,S1,S2), acc(S2,X).
acc(S,X) <=> S=X.

% convert arithmetic constraints to frozen goals.
compile \ add(X,Y,Z) <=> delay(X+Y,Z).
compile \ mul(X,Y,Z) <=> delay(X*Y,Z).
compile \ pow(K,X,Y) <=> delay(X**K,Y).
compile <=> true.

delay(Expr,Res) :- when(ground(Expr), Res is Expr).

% ------------ multiple derivatives and Taylor series ----------

derivs(X,Y,[1.0|Ds]) :- foldl(d(X),Ds,Y,_).
d(X,DYDX,Y,DYDX) :- back(Y), deriv(Y,X,DYDX), go.

taylor(N,X,Y,Cs) :-
   length(Ds,N), derivs(X,Y,Ds),
   compile,                 % ack! this destroys the computation graph!
   copy_term(X-Ds,0.0-Ds0), % NB. like a lambda call but needs attributes to be copied too
   numlist(1,N,Is),
   foldl(nth_deriv_coeff,Is,Ds0,Cs,1.0,_). 

nth_deriv_coeff(I,D,C,P1,P2) :- P2 is P1*I, C is D/P1.

With this loaded in SWI Prolog, try these (the X=0.0 in the second goal is not 
necessary, it is only there to get rid of a load of frozen goals which would otherwise
obscure the result when printed):

?- add(1.0,X,X1), pow(-1,X1,Y), time(taylor(5,X,Y,Cs)).
?- add(1.0,X,X1), pow(-1,X1,Y), time(taylor(50,X,Y,Cs)), X=0.0, writeln(Cs).

As you can see, there is no problem with 50 or more terms in the expansion!
If you run only the first two CHR goals, you will see the CHR representation
of the computation Y = (1+X)^(-1). Unfortunately, the use of compile/0 in taylor/4 
destroys this computation graph, so it cannot be used for any more differentiations
afterwards. I suppose the simpagation rules for compile could be replaced with
propagation rules, but I don’t have a good sense of how best to manage the
accumulation of constraints in the store that would result from this.
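As an independent cross-check of those Taylor coefficients, here is a sketch in Python rather than CHR (the series helpers are invented for illustration), computing with truncated power series directly: the reciprocal of 1 + x comes out as the same alternating coefficients the Prolog goals above produce.

```python
def series_add(a, b):
    # coefficient-wise sum of two truncated series
    return [x + y for x, y in zip(a, b)]

def series_recip(a):
    # reciprocal of a truncated series via the convolution recurrence;
    # requires a[0] != 0
    n = len(a)
    c = [0.0] * n
    c[0] = 1.0 / a[0]
    for k in range(1, n):
        c[k] = -sum(a[j] * c[k - j] for j in range(1, k + 1)) / a[0]
    return c

N = 6
x   = [0.0, 1.0] + [0.0] * (N - 2)         # the series of the variable x
one = [1.0] + [0.0] * (N - 1)
coeffs = series_recip(series_add(one, x))  # Taylor coefficients of 1/(1+x)
# coeffs == [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
```

Like the CHR version after compile/0, this never builds a symbolic expression for the derivative; it only manipulates numbers.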

If there are any CHR experts listening, I would love to get your take on this!

cheers,
Samer

ps. [below]

On 15 Apr 2017, at 15:35, Jan Burse <burs...@gmail.com> wrote:

understanding them and in further developing them. There seems
to be a forward and backward mode of computing derivations,
which I do not yet understand.

Forward-mode differentiation computes d <Everything in the graph> / d <selected inputs>,
and can be done by overloading ordinary arithmetic operators to work with dual-numbers,
without creating a data structure to represent the computation graph
The derivatives are computed exactly in parallel with the forward computation.

Reverse mode computes d <selected outputs> / d<Everything in the graph>.
It requires a representation of the computation graph in order to collect together,
for each node, all the 'downstream’ nodes that depend on it directly.

Forward mode is generally cheaper when the number of inputs is less than the number
of outputs, while reverse mode is cheaper when we want the derivative of a small number
of outputs wrt many inputs, and hence is good for optimising loss functions with many
parameters.
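To make the forward-mode half concrete, here is a minimal dual-number sketch in Python (a hypothetical class, invented for illustration): seeding the input with derivative part 1 makes the derivative ride along with the value, and for log(z*4 + z*3) at z = 2 it also yields 0.5, agreeing with the reverse-mode DZ above.

```python
import math

class Dual:
    """Dual number a + b*eps with eps^2 = 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule, built into the overloaded operator
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

def dlog(x):
    return Dual(math.log(x.a), x.b / x.a)

# derivative of log(z*4 + z*3) at z = 2: seed the input with b = 1
z = Dual(2.0, 1.0)
out = dlog(z * 4 + z * 3)
# out.b is 0.5
```

Note there is no graph anywhere: one forward pass gives the derivative with respect to the one seeded input, which is exactly why it scales badly when there are many inputs.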


Samer Abdallah

Apr 19, 2017, 6:32:22 PM
to Jan Burse, SWI-Prolog
p.s.

There was a bug in my code that was masked in the example Y = 1/(1+X).
The zeroth order derivative of Y is Y, not 1.0:

-derivs(X,Y,[1.0|Ds]) :- foldl(d(X),Ds,Y,_).
+derivs(X,Y,[Y|Ds]) :- foldl(d(X),Ds,Y,_).

Samer

Jan Burse

Apr 19, 2017, 7:15:41 PM
to SWI-Prolog
Hi,

Cool!

In case I would use CLP(*), the same code wouldn't work for
my system, since I have implemented copy_term/2 without
copying the attribute variables.

Your approach isn't more expensive than a dedicated subst,
which anyway also does a copy. So basically there is proof
that taylor can also be done with CLP(*).

Bye

BTW: I fixed the performance problem; I introduced some
structure sharing into the d(U/V)/dx rule. You don't have
such a rule, you seem to use U^(-1).

The interesting thing about this structure sharing is that it
is different from the structure sharing for d(U*V)/dx, where
we would typically share U and V.

Here I am sharing the whole U/V; I'm not sure how this would
translate to U^(-1), something like -(U^(-1))^2 maybe?
So I have implemented

     d (U/V)/dx = (d U/dx - d V/dx *(U/V)) / V

Since the weekend it's on GitHub, with API docs as well:
https://github.com/jburse/jekejeke-devel/blob/master/jekmin/headless/jekmin/frequent/leibniz/deriv.p#L139

Samer Abdallah

Apr 20, 2017, 8:37:41 AM
to Jan Burse, SWI-Prolog
On 20 Apr 2017, at 00:15, Jan Burse <burs...@gmail.com> wrote:

BTW: I fixed the performance problem, I introduce some
structure sharing into the d (U/V)/dx rule. You don't have
such a rule, you seem to use U^(-1).

So that I have implemented

     d (U/V)/dx = (d U/dx - d V/dx *(U/V)) / V

I tried adding a div/3 constraint with a deriv/3 rule something like this:

deriv(L,Y,DY), div(X1,X2,Y) ==> % ie Y = X1/X2, dY/dX1 = 1/X2, dY/dX2 = -X1/X2^2
   (var(X1) -> deriv(L,X1,DX1), div(DY,X2,T1), agg(T1,DX1); true),
   (var(X2) -> deriv(L,X2,DX2), mul(X2,X2,X2sq), mul(-1.0,X2sq,NX2sq),
               mul(X1,DY,T0), div(T0,NX2sq,T2), agg(T2,DX2)
   ; true).

but it requires a bunch of extra rules to handle div/3 everywhere else, and
also suggests adding more rules for sub/3 (subtraction)
and possibly neg/2 for negation. Seeing as pow/3 is quite useful anyway, sticking 
with just add/3, mul/3,  and pow/3 seems like a more minimal but complete set 
of primitives. Sharing of subterms is not such a big issue because the
nodes in the computation are just variables, eventually instantiated to numbers. 
Negative powers resulting from differentiation are indeed computed (by is/2) as (X**K) for
negative K, rather than by repeated division (Numerator/X/X/X …) but I don’t think
that’s a bad thing.

Samer.

p.s. have also added documentation to the gist here:


Jan Burse

Apr 20, 2017, 9:31:08 AM
to SWI-Prolog
I have a next challenge. It's not really differentiation, but more
dealing with symbolic expressions in general, yet still relevant. I
dunno how much has been done in Prolog already.

Let's look at algebraic numbers: they are zeros of a polynomial
p(x). For a polynomial p(x) of degree n, we can define a new
one p*(x) as follows:

    p*(x) = x^n * p(1/x)

If a is a root of p(x), then 1/a is a root of p*(x). So for algebraic
numbers, p*(x) represents the multiplicative inverse of the number
represented by p(x). I get p*(x) via substitution(*), namely with this code:

      star(P, X, N, Q) :- Q is X^N*subst(P,X,1/X).

For the golden ratio, phi = 1.618033..., which has the polynomial
p(x) = x^2-x-1, I get the polynomial p*(x) = -x^2-x+1, which has
the positive root Phi = 0.618033...:

      ?- P is X^2-X-1, Q is star(P,X,2).
      P is - 1-X+X^2,
      Q is 1-X-X^2

We see indeed Phi = 1/phi. Can we extend this to do arithmetic
over non-constant radicals? I.e. phi = 1.618033... = (1 + sqrt(5))/2
is a constant radical, but maybe we can then compute with
expressions such as a+sqrt(b) symbolically? (**)

(*)
The clever observer might notice that we can simply reverse
the coefficient list. But maybe we need other methods to scale
up when coefficients are symbolic etc...
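The reversal trick is easy to verify numerically; a small Python sketch (helper names invented for illustration), using the golden-ratio example above:

```python
import math

def star(p):
    """p as ascending coefficients [c0, c1, ..., cn] of a degree-n polynomial;
    p*(x) = x^n * p(1/x) is just the reversed coefficient list."""
    return list(reversed(p))

def eval_poly(p, x):
    return sum(c * x**i for i, c in enumerate(p))

p = [-1.0, -1.0, 1.0]               # x^2 - x - 1, with root phi
q = star(p)                         # [1.0, -1.0, -1.0], i.e. 1 - x - x^2
phi = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio 1.618033...
# 1/phi = 0.618033... is then a root of p*
```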

(**)
I would like to do the taylor expansion of sqrt(1+x). But I haven't
introduced radicals yet. A kludge might be to view it as (1+x)^(1/2),
which could already work with Samer Abdallah's approach.

Jan Burse

Apr 20, 2017, 9:49:08 AM
to SWI-Prolog
In one of his works Isaac Newton did a series development
symbolically for things like sqrt(a^2+x^2) and some such.
This would be the next step... to have this working...

Method of Fluxions - Sir Isaac Newton 1671

http://gymportalen.dk/sites/lru.dk/files/lru/2_1_a_treatise.pdf

You find the example sqrt(a^2+x^2) on page 12, where it
says "Examples of Reduction by Extraction of Roots". One
must get the same result with taylor, since taylor expansions
are unique. So can Prolog 2017 be as intelligent as
Sir Isaac Newton 1671?



Jan Burse

Apr 20, 2017, 10:01:47 AM
to SWI-Prolog
Disclaimer: Hopefully this also holds for symbolic expansions,
besides the problem of comparing coefficients symbolically.
We have infinite series, and if we develop them up to n summands,
we are still not sure what the rest of the summands is. So to
complete the exercise, we might also need an infinite-sum
notation, and reasoning about it.

Problems, problems, ...

Samer Abdallah

Apr 21, 2017, 4:58:44 PM
to Jan Burse, SWI-Prolog
On 20 Apr 2017, at 14:31, Jan Burse <burs...@gmail.com> wrote:

I have a next challenge. But its not really differentiation, but more 
dealing with symbolic expressions in general, but still relevant. I 
dunno how much has done in Prolog already.

Lets look at algebraic numbers, they are zeros of a polynomial
p(x). For a polynomial p(x) of degree n, we can define a new 
one p*(x) as follows:


I might have to leave this till later as I have to work on my application now!
BTW, for an entertaining comparison, take a look at this:


(**)
I would like to do taylor expansion of sqrt(1+x). But I didn't
introduce radicals already. A cludge might be to view it as (1+x)^(1/2),
which could already work for Samer Abdallah approach.

Indeed it does!

?- add(1.0,X,X1), pow(-0.5,X1,Y), time(taylor(50,X,Y,Cs)), X=0.0.
% 793,170 inferences, 0.233 CPU in 0.236 seconds (99% CPU, 3405186 Lips)
X = 0.0,
X1 = Y, Y = 1.0,
Cs = [1.0, -0.5, 0.375, -0.3125, 0.2734375, -0.24609375, 0.2255859375, -0.20947265625, 0.196380615234375|...].

After making changes to avoid conversions to floats (replacing floating point constants in the code 
with integers, plus changing two other lines) we get:

?- add(1,X,X1), pow(-1 rdiv 2,X1,Y), time(taylor(50,X,Y,Cs)), X=0.
% 971,174 inferences, 0.196 CPU in 0.198 seconds (99% CPU, 4966118 Lips)
X = 0,
X1 = Y, Y = 1,
Cs = [1, -1 rdiv 2, 3 rdiv 8, -5 rdiv 16, 35 rdiv 128, -63 rdiv 256, 231 rdiv 1024, -429 rdiv 2048, ...rdiv...|…].

Samer




Jan Burse

Apr 29, 2017, 9:52:56 AM
to SWI-Prolog
In the past there was probably a primacy of numerical methods. I don't
yet have a complete grip on the advantages of symbolic methods, especially
in the domain of deep learning.

But here is a little sample of some advantage of symbolic methods
I discovered yesterday. Take the problem of generating a two-dimensional
plot of a function g(x,y). We just want to assign color values to g(x,y)
in a grid. Here are two methods; assume the function is given by a
symbolic expression F and two variables A,B:

1) Without currying:
    At each grid cell evaluate g(x,y), in my system this is done via
    subst(subst(F,A,X),B,Y). I guess there is also a CHR/CLP(*) variant
    to do this?

2) With currying:
    At each row pre-evaluate h= lambda x.g(x,y), in my system this is
    done via H is subst(F,B,Y). Then in each row for each column
    evaluate h(x), in my system this is done via subst(H,A,X). I guess
    there is also a CHR/CLP(*) variant to do this?

I made a little test and plotted some functions this way, and measured
the time taken to generate an SVG file. Here are the results:

Without currying: plot(X^2+Y^2-1,X,Y) --> ca 2 1/2 secs
With currying: plot(X^2+Y^2-1,X,Y) --> ca 1 secs

This is a nice benefit! Can this be reproduced in CHR/CLP(*)?

See also: What do have Currying and Plotting in Common?
https://plus.google.com/+JekejekeCh/posts/PiM9EfkTniQ

Jan Burse

May 14, 2017, 11:27:02 AM
to SWI-Prolog
During my work on implicit plotting, I stumbled over a new
challenge. I wanted to make a contour plot of sqrt(X^2+Y^2)
and noticed I cannot yet compute exactly with radical
expressions. Happily, sums of square roots work now.
We also solved comparison of radical expressions; a
challenging example is this pair:

    a = sqrt(98428513)+sqrt(101596577)

    b = sqrt(400025090)

Be careful: a simple pocket calculator won't help you!
Which led me to this post:

Where Computer Algebra Systems beat Constraint Logic Programming
http://stackoverflow.com/a/43965110/502187

But it's a little bit unfair to say CAS beats CLP; the above is
the Twitter headline and I had only 150 chars available.
More precisely, it's a case where CAS is better than CLP(FD).
But the CHR demonstrations here also show that

CAS and CHR are not necessarily disjoint...
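The pocket-calculator remark can be made precise: in double precision the two values are indistinguishable, but squaring twice reduces the comparison to exact integer arithmetic. A Python sketch (the helper is invented for illustration; it assumes positive integer arguments):

```python
import math

def cmp_sqrt_sum(m, n, r):
    """Compare sqrt(m)+sqrt(n) against sqrt(r) exactly, for positive ints.
    Squaring gives m + n + 2*sqrt(m*n) vs r, so for r > m + n the question
    reduces to the integer comparison 4*m*n vs (r - m - n)^2."""
    s = m + n
    if r <= s:
        return 1                   # sqrt(m)+sqrt(n) > sqrt(m+n) >= sqrt(r)
    lhs, rhs = 4 * m * n, (r - s) ** 2
    return (lhs > rhs) - (lhs < rhs)

verdict = cmp_sqrt_sum(98428513, 101596577, 400025090)   # 1, so a > b
# the trap: 98428513 * 101596577 = 10**16 + 1, a whisker above (10**8)**2,
# so in double precision the two sides are practically indistinguishable:
fa = math.sqrt(98428513) + math.sqrt(101596577)
fb = math.sqrt(400025090)
```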

Samer Abdallah

Jun 5, 2017, 7:17:00 AM
to SWI-Prolog, Jan Burse
Hello all,
I uploaded a technical note on this differentiation work to arXiv:

cheers,
Samer

Samer Abdallah

Jun 5, 2017, 8:00:57 AM
to Jan Burse, SWI-Prolog
Hi Jan,
I would say this is orthogonal to the problem of automatic differentiation,
but I would guess that this kind of partial evaluation is perfectly doable
in a CLP-based numeric system. In my autodiff system, we build what is
essentially a single-use push-driven data flow graph by setting up delayed
goals like

when(ground(X-Y), Z is X+Y), etc.

Alternatively one could use CLP(R) or CLP(Q). I’m using when(ground, _) because
of numerical accuracy problems with CLP(R).

To use the graph multiple times, we must copy all the attributed variables,
before unifying the inputs with ground values, eg

copy_term(t(X,Y,Z), t(X1,Y1,Z1)) , X1=2, Y1=3.
% Z1=5

or just
copy_term(t(X,Y,Z), t(2,3,Z1)).
% Z1=5

Note that, if we had a lambda library that used copy_term/2 and not
copy_term_nat/2, then these semantics would be captured as a lambda term
with an empty body, since the copy and unify would do the whole job:

call(\X^Y^Z^true, 2, 3, Z1).
% Z1=5
To use partial evaluation, just unify one of the variables. This will trigger
any computations that can be done immediately and, if using CLP(Q/R),
might result in a simplification of the remaining computation graph.
Then use copy_term/2 with the remaining variables to evaluate at different
values of the other argument.

copy_term(t(X,Y,Z), t(2,Y1,Z1)),   % partial eval at X=2
copy_term(t(Y1,Z1), t(1,Z2_1)),  % complete eval at Y=1
copy_term(t(Y1,Z1), t(2,Z2_2)),  % … Y=2
copy_term(t(Y1,Z1), t(3,Z2_3)). % … Y=3

Using an attribute-copying lambda, this would be

call(\X^Y^Z^true, 2, Y1, Z1),
maplist(\Y1^Z1^true, [1,2,3], Zs)

This kind of partial application could conceivably be built into the lambda library -
term structures of \X^Y^Z^true and \Y^Z^true look like they are vaguely compatible
with the idea, since \X^Y^Z^true unifies with \X^Rest, leaving the partially applied lambda
as just \Rest. For normal predicates, this is no help, because you can’t call the predicate
until all the arguments are ready, but with CLP and attributed variables, things can 
happen just by unifying the first argument with something.
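A closure-based analogue of this partial evaluation (a Python sketch with invented names, mirroring Jan's currying example g(x,y) = x^2 + y^2 - 1): fixing one argument does the shared work once, and the residual function is then applied many times.

```python
def g(x, y):
    return x**2 + y**2 - 1

def curry_row(y):
    """Partially evaluate g at a fixed y: the y**2 - 1 part is computed
    once per row and then reused for every x in that row."""
    y_part = y**2 - 1
    return lambda x: x**2 + y_part

row = curry_row(3.0)
values = [row(x) for x in (0.0, 1.0, 2.0)]
# values == [8.0, 9.0, 12.0], the same as [g(x, 3.0) for x in ...]
```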

.. All of this leads me to a question which I had on my mind a while ago: why do
both lambda libraries (Ulrich's lambda and the newer yall) use copy_term_nat
and not just copy_term? What's wrong with copying the attributes? If there aren't
any, it won't make any difference, and if there are, then arguably it makes more sense
to keep them.

Samer

Jan Burse

Jun 5, 2017, 9:29:07 AM
to SWI-Prolog
+1

Jan Burse

Jun 5, 2017, 6:14:56 PM
to SWI-Prolog
I wanted to try your gist with SWI-Prolog 7.5.5, but it's not working.
When I had fixed the singleton warning for Cs0, I still got no result:

?- add(1.0,X,X1), pow(-1,X1,Y), time(taylor(5,X,Y,Cs)), X=0.0.
% 2,235 inferences, 0.000 CPU in 0.001 seconds (0% CPU, Infinite Lips)

X = 0.0,
X1 = Y, Y = 1.0,
Cs = [1.0, 0.0, 0.0, 0.0, 0.0].

The correct result should be [1.0, -1.0, 1.0, -1.0, 1.0], I guess.
What's going wrong?

https://gist.github.com/samer--/3bc9d4785bbafcbf24bb15e1c3f0efe4

Jan Burse

unread,
Jun 5, 2017, 10:37:52 PM6/5/17
to SWI-Prolog
So I made my own version, with a copying subst, see source
at end of my post. Seems to work and doesn't use copy_term:

?- add(1,X,H), pow(H,-1,J), taylor(J,X,5,R).

R = [1, -1, 1, -1, 1, -1],

/* lots of constraints */


?- add(1,X,H), pow(H,0.5,J), taylor(J,X,5,R).

R = [1.0, 0.5, -0.125, 0.0625, -0.0390625, 0.02734375],

/* lots of constraints */


I don't know yet how to get rid of all the leftover constraints. I was
thinking of using a signalling constraint dispose/1,
but it would be unsafe: it could also remove subterms
used in a deriv result.

Then I thought CHR should do its own GC, like freeze
can. But there is a caveat, which is also a problem during
copy_term:
    - Constraints are not oriented: for example, in
      the code below we use add(A,B,C) to indicate a
      term with root C and leaves A and B.

    - CHR reachability doesn't use orientation: for example,
      if we have the two constraints add(A,B,C) and add(A,D,E),
      then copying add(A,B,C) also copies add(A,D,E), although
      it belongs to a different root E.

    - Same problem for a GC: if C is reachable, and
      the constraints add(A,B,C) and add(A,D,E) are in the store,
      then the system would think that the different root E
      is also still in use.

    - To get an idea of the problem, just look at some of
      the lots of constraints below. Everything is connected, although
      we long ago calculated the taylor expansion.
      Without orientation information, no useful GC:

?- add(1,X,H), pow(H,0.5,J), taylor(J,X,3,R).

R = [1.0, 0.5, -0.125, 0.0625],

add(1, X, H),

mul(0.5, _G22806, _G22807),

mul(-0.5, _G22819, _G22806),

mul(-1.5, _G22832, _G22819),

mul(0.5, _G22845, _G22846),

mul(-0.5, _G22858, _G22845),

mul(0.5, _G22871, _G22872),

pow(H, -2.5, _G22832),

pow(H, -1.5, _G22858),

pow(H, -0.5, _G22871),

pow(H, 0.5, J).
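To see the orientation point outside CHR, here is a hypothetical Python sketch (all names invented here, nothing from the CHR code): when each node records only its children, the expression graph is an oriented DAG, so reachability from a chosen root is well defined, and an unrelated root E that merely shares a leaf is not retained.

```python
# Oriented expression DAG: each node points only at its children.
class Node:
    def __init__(self, op, *args):
        self.op, self.args = op, args

def reachable(root):
    """Collect ids of everything reachable from root via child links."""
    seen, stack = set(), [root]
    while stack:
        n = stack.pop()
        if id(n) in seen:
            continue
        seen.add(id(n))
        if isinstance(n, Node):
            stack.extend(n.args)
    return seen

x = "x"
h = Node("add", 1, x)   # shared subterm H
c = Node("add", h, 2)   # root C, uses H
e = Node("add", h, 5)   # a different root E, also uses H

live = reachable(c)
# H is kept (it is below C), but E is not, despite sharing H.
```

With orientation, a GC from root C keeps H but can drop E; an undirected store, as in the CHR constraint store, cannot make that distinction.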


So I guess CHR is nevertheless pretty dead for CAS. It seems to me
now that it just lacks a useful notion of an AST. Via
OOP I can also code ASTs that do sub-expression sharing,
and I can even make them immutable; additionally, in most
OOP languages such as Java, I also have the benefit of
GC. In Prolog I also have immutability and ASTs:

the ASTs are just the ordinary Prolog terms. Many Prolog
systems also have pretty good GC for Prolog terms.
It's quite an important feature for good tail recursion;

it's sometimes called environment trimming.

-------------------------------- source code ----------------------------------

:- use_module(library(chr)).

:- chr_constraint add/3, mul/3, pow/3, subst/4, deriv/3.

/* partial evaluation */
add(X,Y,Z) <=> number(X), number(Y) | Z is X+Y.
mul(X,Y,Z) <=> number(X), number(Y) | Z is X*Y.
pow(X,Y,Z) <=> number(X) | Z is X^Y.

/* simplification */
add(0,X,Y) <=> Y=X.
add(X,0,Y) <=> Y=X.
add(X,X,Y) <=> mul(2,X,Y).
mul(0,_,Y) <=> Y=0.
mul(_,0,Y) <=> Y=0.
mul(1,X,Y) <=> Y=X.
mul(X,1,Y) <=> Y=X.
mul(X,X,Y) <=> pow(X,2,Y).
pow(_,0,Y) <=> Y=1.
pow(X,1,Y) <=> Y=X.

/* substitution */
subst(C,_,_,R) <=> number(C) | R = C.
add(A,B,C) \ subst(C,X,Y,R) <=>
   subst(A,X,Y,S), subst(B,X,Y,T), add(S,T,R).
mul(A,B,C) \ subst(C,X,Y,R) <=>
   subst(A,X,Y,S), subst(B,X,Y,T), mul(S,T,R).
pow(A,B,C) \ subst(C,X,Y,R) <=>
    subst(A,X,Y,S), pow(S,B,R).
subst(C,X,Y,R) <=> C == X | R = Y.
subst(C,X,_,R) <=> C \== X | R = C.

/* derivation */
deriv(C,_,R) <=> number(C) | R = 0.
add(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).
mul(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), deriv(B,X,T), mul(A,T,H), mul(S,B,J), add(H,J,R).
pow(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), D is B-1, pow(A,D,H), mul(B,H,J), mul(J,S,R).
deriv(C,X,R) <=> C == X | R = 1.
deriv(C,X,R) <=> C \== X | R = C.

/* taylor */
taylor(P, X, N, R) :-
   horner_schema(P, X, 0, N, R).

horner_schema(P, X, N, N, [R]) :- !,
   subst(P, X, 0, R).
horner_schema(P, X, N, M, [B|J]) :-
   deriv(P, X, Q),
   K is N+1,
   horner_schema(Q, X, K, M, H),
   div_coeff(H, K, J),
   subst(P, X, 0, B).

div_coeff([X|L], K, [Y|R]) :-
   Y is X/K,
   div_coeff(L, K, R).
div_coeff([],_,[]).

https://gist.github.com/jburse/e279d80c6882296ad074926999f6987f#file-diffchr-p
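As a cross-check on the two results above (not using the CHR code, just the generalized binomial series for (1+x)**p; the function name here is made up), a small Python sketch reproduces both coefficient lists:

```python
from fractions import Fraction

def taylor_coeffs(p, n):
    """Coefficients of (1+x)**p around x=0, using the generalized
    binomial recurrence c_{k+1} = c_k * (p - k) / (k + 1)."""
    c, out = Fraction(1), []
    for k in range(n + 1):
        out.append(float(c))
        c = c * (Fraction(p) - k) / (k + 1)
    return out

print(taylor_coeffs(-1, 5))
# [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(taylor_coeffs(Fraction(1, 2), 5))
# [1.0, 0.5, -0.125, 0.0625, -0.0390625, 0.02734375]
```

Both lists match the R values printed by the CHR queries above.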

Jan Burse

unread,
Jun 5, 2017, 10:56:14 PM6/5/17
to SWI-Prolog
Since the subst solution doesn't use copy_term and leaves
a lot of garbage behind, it is even a tick faster than everything
else so far, and I am using my old slow mac:

?- time((between(1,1000,_), add(1,X,H), pow(H,-1,J), taylor(J,X,8,R), fail; true)).

% 24,492,000 inferences, 4.179 CPU in 4.281 seconds (98% CPU, 5860286 Lips)

true.


42 ?- time((between(1,1000,_), add(1,X,H), pow(H,0.5,J), taylor(J,X,8,R), fail; true)).

% 24,492,000 inferences, 4.091 CPU in 4.272 seconds (96% CPU, 5987529 Lips)

true.


This gives, if we divide by the 1000 iterations, around 0.004 s for
the taylor expansion. In your paper you show 0.006 s, and maybe
your mac is similar to mine?

(P.S.: Evidence that the subst is faster may already be given by the
number of inferences: in your paper I see 26,038 inferences,
and the above has 24,492 inferences per iteration.)

Jan Burse

unread,
Jun 6, 2017, 3:00:53 AM6/6/17
to SWI-Prolog
What I don't understand is why I can write:

add(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).

And don't need to write:

add(A,B,C) \ deriv(D,X,R) <=> C == D |
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).

Have to check!

Samer Abdallah

unread,
Jun 6, 2017, 4:40:24 AM6/6/17
to Jan Burse, SWI-Prolog
On 5 Jun 2017, at 23:14, Jan Burse <burs...@gmail.com> wrote:

I wanted to try your gist with SWI-Prolog 7.5.5, but it's not working.
Even after fixing the singleton warning for Cs0, I still don't get the right result:

?- add(1.0,X,X1), pow(-1,X1,Y), time(taylor(5,X,Y,Cs)), X=0.0.
% 2,235 inferences, 0.000 CPU in 0.001 seconds (0% CPU, Infinite Lips)
X = 0.0,
X1 = Y, Y = 1.0,
Cs = [1.0, 0.0, 0.0, 0.0, 0.0].

The correct result should be [1.0, -1.0, 1.0, -1.0, 1.0], I guess.
What's going wrong?

Thanks for spotting these - there was a mistake in the definition of 
derivs/3 - now fixed.

Samer.



Samer Abdallah

unread,
Jun 6, 2017, 5:02:04 AM6/6/17
to Jan Burse, SWI-Prolog
Hi Jan,

On 6 Jun 2017, at 03:37, Jan Burse <burs...@gmail.com> wrote:

So I made my own version, with a copying subst, see source
at end of my post. Seems to work and doesn't use copy_term:

?- add(1,X,H), pow(H,-1,J), taylor(J,X,5,R).
R = [1, -1, 1, -1, 1, -1],
/* lots of constraints */

?- add(1,X,H), pow(H,0.5,J), taylor(J,X,5,R).
R = [1.0, 0.5, -0.125, 0.0625, -0.0390625, 0.02734375],
/* lots of constraints */

I dunno yet how to get rid of the lots of constraint, I was
thinking of using a signally constraint dispose/1,
but it would be unsafe, could also remove subterms
used in a deriv result.

This looks interesting. It's true, in my system I have to clean
up the constraints manually or they will build up, or worse, get in
the way and interfere with subsequent operations. It's curious
how CHR feels very high level in some respects but low level in
others.

Some initial comments below, but I’ll have to think about your code
some more.


/* partial evaluation */
add(X,Y,Z) <=> number(X), number(Y) | Z is X+Y.
mul(X,Y,Z) <=> number(X), number(Y) | Z is X*Y.
pow(X,Y,Z) <=> number(X) | Z is X^Y.

/* simplification */
add(0,X,Y) <=> Y=X.
add(X,0,Y) <=> Y=X.
add(X,X,Y) <=> mul(2,X,Y).
mul(0,_,Y) <=> Y=0.
mul(_,0,Y) <=> Y=0.
mul(1,X,Y) <=> Y=X.
mul(X,1,Y) <=> Y=X.
mul(X,X,Y) <=> pow(X,2,Y).
pow(_,0,Y) <=> Y=1.
pow(X,1,Y) <=> Y=X.

This much is similar to what CLP(Q/R) would do anyway, and since
these are simplification constraints, there is no problem with garbage
building up.


/* derivation */
deriv(C,_,R) <=> number(C) | R = 0.
add(A,B,C) \ deriv(C,X,R) <=> 
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).

This is forward mode automatic differentiation, but with the deriv/3
constraint propagating backwards through the computation graph.
It's forward mode because it is propagating the computation of
d<everything>/d<input> through the graph for a fixed <input>,
whereas reverse mode AD propagates the computation of
d<output>/d<everything> for a fixed <output>. In my paper, I discuss
how my first solution computes reverse mode AD but by propagating
deriv/3 forwards, whereas my second solution computes reverse mode
AD with back-propagation of constraints, and that this seems to fit the
problem better. Perhaps your forward mode AD will work better with
forward propagation of constraints, e.g. with rules like

add(A,B,C), deriv(A,X,DA), deriv(B,X,DB) ==> add(DA,DB,DC), deriv(C,X,DC).

This is now very close to an implementation of forward mode using
dual numbers, since for each node A, B, we have a differential DA, DB, etc.,
and each forward computation on the nodes is matched by a forward
computation on the differentials. The added weight of CHR is not really
needed for forward mode AD, since we don't need the same ability to
examine the computation graph that CHR gives us.
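For comparison, the dual-number idea mentioned above can be sketched in a few lines of Python (class and field names are invented for illustration; this is not anyone's library code):

```python
# Forward-mode AD with dual numbers: each value carries its
# derivative (the "dot") with respect to one chosen input.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * o.val,
                    self.val * o.dot + self.dot * o.val)
    __rmul__ = __mul__

    def __pow__(self, k):
        # constant exponent only: (u**k)' = k * u**(k-1) * u'
        return Dual(self.val ** k, k * self.val ** (k - 1) * self.dot)

x = Dual(1.0, 1.0)      # seed dx/dx = 1
y = (1 + x) ** -1       # f(x) = 1/(1+x), as in the taylor examples
# f(1) = 0.5, f'(x) = -(1+x)**-2, so f'(1) = -0.25
```

Each operation on the value is matched by one on the differential, exactly the DA/DB pairing above, with no constraint store or graph inspection needed.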


cheers,
Samer

Samer Abdallah

unread,
Jun 6, 2017, 5:05:18 AM6/6/17
to Jan Burse, SWI-Prolog
On 6 Jun 2017, at 08:00, Jan Burse <burs...@gmail.com> wrote:

What I don't understand why I can write:

add(A,B,C) \ deriv(C,X,R) <=>
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).

And don't need to write:

add(A,B,C) \ deriv(D,X,R) <=> C == D |
    deriv(A,X,S), deriv(B,X,T), add(S,T,R).


I think this is because, when doing rule selection
with variables shared between the heads, CHR already matches on
the basis of ==/2 equality.

It occurred to me that perhaps we should be having this discussion
on the CHR mailing list, which I just joined the other day.

Samer


Jan Burse

unread,
Jun 6, 2017, 7:08:47 AM6/6/17
to SWI-Prolog, burs...@gmail.com
Thanks for the teaching! I guess I will upload a new
gist version as soon as I have time,

where I will use this CHR feature; hopefully it applies
as well to joins inside literals, i.e. for subst and deriv.

I will try to replace these rules:


subst(C,X,Y,R) <=> C == X | R = Y.
deriv(C,X,R) <=> C == X | R = 1.

and do some new measurements. BTW, I am/was on some CHR
mailing list, but it doesn't show much traffic. Why not a

CHR google group? Everybody can create a google group.

Carlo Capelli

unread,
Jun 6, 2017, 11:13:02 AM6/6/17
to Samer Abdallah, Jan Burse, SWI-Prolog
Samer, Jan

your thread is fascinating... please don't leave now :)
Jan, I find the idea of a Google group very nice (also because the official (?) CHR homepage is a bit outdated).

Thanks, Carlo



--
You received this message because you are subscribed to the Google Groups "SWI-Prolog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to swi-prolog+unsubscribe@googlegroups.com.

Jan Burse

unread,
Jun 7, 2017, 3:06:04 PM6/7/17
to SWI-Prolog
What also comes to mind is realizing rules that reduce the
amount of work, like for example:

subst(A,X,Y,B) \ subst(A,X,Y,C) <=> B = C.
deriv(A,X,B) \ deriv(A,X,C) <=> B = C.

Which makes me think about whether I should do a CHR port
to Jekejeke Prolog. I would just use the existing delta engine,

but on the surface replace my own rule and query language
by an additional module that supports the CHR rule and
query language.

I have a rule set that can do something else, namely
a rule set for a calculus of constructions. It's quite fun!
This way I could run this rule set both in SWI-Prolog

and in Jekejeke Prolog, same with the autodiff stuff.
I didn't make up my mind yet how to translate the CHR
rules, because of the ==/2 semantics, which is

a little unicorn in the whole CHR venture: ==/2 is
not a logical predicate, it's a meta-logical predicate. So
we have a more complex declarativity for CHR rules

than for ordinary logic programs based
on Horn clauses or rules with negation as failure.

Samer Abdallah

unread,
Jun 7, 2017, 6:46:50 PM6/7/17
to Carlo Capelli, Jan Burse, SWI-Prolog
On 6 Jun 2017, at 16:13, Carlo Capelli <cc.car...@gmail.com> wrote:

Samer, Jan

your thread is passionating... please don't leave now :)

If you want to see something really crazy, have a look at this:


I tried it in OCaml first but the compiler couldn’t handle the
types. It’s kind of nuts that Haskell can make sense of it.

Samer