Finding if expression can be manipulated to contain certain subexpressions

48 views
Skip to first unread message

Ondřej Čertík

unread,
Aug 22, 2014, 7:38:36 PM8/22/14
to sympy
Hi,

We have the following problem: given a (complicated) expression "e",
and a set of known expressions s1, s2, s3, ..., we would like to
rewrite "e", so that it can be written in terms of any subset of s1,
s2, s3, ..... Some examples:

e = x^2+2*x*y+z^2
s1 = x+y

then we would like to get e == s1^2. Or:

e = a*x+a*y
s1 = x+y

then e == a*s1.

Actually, what we really need is given an integral, like this (much
more complicated in practice):

e = c*Integral(x**2, (x, a, b))
s1 = Integral(y**2, (y, a, b))

then it would give me e==c*s1. Notice the different dummy variable.
Here is another example, it would be able to recognize variable
substitutions:

e = c*Integral(x**2, (x, a, b))
s1 = Integral(sin(y)**2, (y, asin(a), asin(b)))
# I hope I substituted correctly

The goal is to have a large database of known expressions s1, s2, s3,
... s10000, and then if given a new expression, SymPy would be able to
figure out a way to write it in terms of them (I understand that it
might not always succeed).

This came up when discussing a particular application with Wang-Kong,
a colleague of mine at LANL. It would be used in condensed matter
physics, regarding conductivity calculations, that it would be nice to
write expressions for new materials ("e") in terms of known ones ("s1,
s2, ...").

Ondrej

Chris Smith

unread,
Aug 23, 2014, 9:18:33 AM8/23/14
to sy...@googlegroups.com
The first things that come to mind are to see if e.extract_additively(s) or factor(e).extract_multiplicatively(s). There was some discussion about unification of expression at https://groups.google.com/forum/#!searchin/sympy/smichr/sympy/wNJ8gZJv4Yo/slYc5XbnhJUJ, too.

F. B.

unread,
Aug 23, 2014, 12:45:24 PM8/23/14
to sy...@googlegroups.com
I think your problem is a bit general, and it requires a combination of tools.

On Saturday, August 23, 2014 1:38:36 AM UTC+2, Ondřej Čertík wrote:
Some examples:

e = x^2+2*x*y+y^2
s1 = x+y

Maybe this could be done using a solver, the condition e-s1 being true for all x, y?
 
Mathematica has this: http://reference.wolfram.com/language/ref/SolveAlways.html

Is there anything like that in SymPy?
 
e = c*Integral(x**2, (x, a, b))
s1 = Integral(sin(y)**2, (y, asin(a), asin(b)))
# I hope I substituted correctly


Maybe this could be tackled by
  • a structural unification yielding all incompatible subtrees pairs (in this case: [(x**2, sin(y)**2), (a, asin(a)), (b, asin(b)],
  • relabeling the vars from the e expression: [(x_e**2, sin(y)**2), (a_e, asin(a)), (b_e, asin(b)]
  • matching all pairs but something similar to Mathematica's SolveAlways or some other clever usage of a solver.

Aaron Meurer

unread,
Aug 23, 2014, 3:26:55 PM8/23/14
to sy...@googlegroups.com, Matthew Rocklin
The way I'd do this is:

- Write down all possible ways to rewrite expressions. These could be patterns which are unified against the expression, or just functions which are applied to an expression.
- Have a system which goes through all subexpressions and applies these until it finds what it is looking for.
- Since this gets combinatorially large fast, have some way to "hint" which methods are better to try and in what order. 

I think the second part already exists, somewhere in the strategies module. Matthew Rocklin would be able to answer the best. 

But I think this is a very hard problem as it is, because although what I described is simple, without any limitations, it gets combinatorially impossible.  For example, an addition of n terms has 2**n sub-expressions of just taking all possible combinations of terms to make smaller additions (that doesn't even count the subexpressions of the terms themselves). That also doesn't consider that you might want to consider something like x + x as a "subexpression" of 3*x.

For the s1 = x + y example, you can let x = s1 - y and substitute this in the expression, and simplify it to get s1**2. 

Aaron Meurer



--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/9166b421-24f9-45eb-918c-c697b2e2e451%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

F. B.

unread,
Aug 27, 2014, 4:54:46 PM8/27/14
to sy...@googlegroups.com, mroc...@gmail.com


On Saturday, August 23, 2014 9:26:55 PM UTC+2, Aaron Meurer wrote:
For example, an addition of n terms has 2**n sub-expressions of just taking all possible combinations of terms to make smaller additions (that doesn't even count the subexpressions of the terms themselves). That also doesn't consider that you might want to consider something like x + x as a "subexpression" of 3*x.


Maybe if you get some genetic programming algorithm to work, it could avoid the need to explore all the combinations. The problem is to make it converge towards the solution.
Reply all
Reply to author
Forward
0 new messages