e="square(square(x+y)+z)+square(x+w)"
I would like to define a function
def substitute(math_expr,**subs):
#.... something here
result result
such that when I call
print substitute(e, square="x -> x**2")
I obtain
"((x+y)**2+z)**2+(x+w)**2"
The difficult part is that e can contains arbitrarily nested invocations
of square: notice for instance that Mathematica is unable to solve this:
$ math
Mathematica 4.1 for Linux
Copyright 1988-2000 Wolfram Research, Inc.
-- Motif graphics initialized --
In[1]:= square[square[x+y]+z]+square[x+w]/.square[x_] -> x^2
2 2
Out[1]= (w + x) + (z + square[x + y])
________________________________________________________________________
If somebody is idle with nothing to do, he/she could think a little
about that ;-)
Bye,
--
Michele Simionato - Dept. of Physics and Astronomy
210 Allen Hall Pittsburgh PA 15260 U.S.A.
Phone: 001-412-624-9041 Fax: 001-412-624-9163
Home-page: http://www.phyast.pitt.edu/~micheles/
>I saw a recent posting about symbolic manipulation in Python and I have
>understood that it is much better to use Mathematica or Maple.Of course,
>this is not surprising at all, still I am not completely happy with both
>Mathematica and Maple, essentially due to the fact that they do not scale
>well with large projects (at least in my experience) and I would welcome a more
>programming oriented replacement of them (I think the approach of GiNaC is
>promising but I have never used it).
>Nevertheless, for sake of personal illumination, I would like to understand
>what can be done in Python for simple problems. To this aim I propose here
>a toy problem which I think this can be solved by using the parser and/or
>compiler modules (disclaimer: I am not particularly familiar with these
>modules), i.e. the substitution of functions in expressions.
>In order to be specific I will pick up an example. Consider for instance
>the expression
>
>e="square(square(x+y)+z)+square(x+w)"
>
>I would like to define a function
>
>def substitute(math_expr,**subs):
> #.... something here
> result result
>
>such that when I call
>
>print substitute(e, square="x -> x**2")
>
>I obtain
>
>"((x+y)**2+z)**2+(x+w)**2"
[1] moving this down to [2] for comparison
>
>The difficult part is that e can contains arbitrarily nested invocations
>of square: notice for instance that Mathematica is unable to solve this:
^^^^^^--?? (cf. [1],[2])
>
>$ math
>Mathematica 4.1 for Linux
>Copyright 1988-2000 Wolfram Research, Inc.
> -- Motif graphics initialized --
>
>In[1]:= square[square[x+y]+z]+square[x+w]/.square[x_] -> x^2
>
> 2 2
>Out[1]= (w + x) + (z + square[x + y])
^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^
| |
+-----------+ |
| |
vvvvvvvv |
[2] "((x+y)**2+z)**2+(x+w)**2" |
^^^^^^^^^^^^^^^ |
| |
+-------------------+
What am I missing?
Regards,
Bengt Richter
Ok. Try this:
square[square[x+y]+z]+square[x+w] //. square[x_]->x^2
Notice the //. instead of /.
--
CARL BANKS
Michele
>bo...@oz.net (Bengt Richter) wrote in message news:<atasp6$loc$0...@216.39.172.122>...
>> >
>> >In[1]:= square[square[x+y]+z]+square[x+w]/.square[x_] -> x^2
>> >
>> > 2 2
>> >Out[1]= (w + x) + (z + square[x + y])
>> ^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^
>> | |
>> +-----------+ |
>> | |
>> vvvvvvvv |
>> [2] "((x+y)**2+z)**2+(x+w)**2" |
>> ^^^^^^^^^^^^^^^ |
>> | |
>> +-------------------+
>>
>> What am I missing?
>
>The "/." substitution operator of Mathematica does not work
>recursively:
>you see that the result still contain the expression square[x+y] and
D'oh ;-/
>that
>you should apply "/." again to get rid of this second square.
>Of course, as Carl pointed out, you could use the "//." operator, but
>this
>is not the point: I would like to do this in Python. My example
>(perhaps
>not too happily chosen) simply wanted to point out the need for a
>recursive
>algorithm. Still I think the solution should be not too difficult
>using the
>compiler module, but I have problems in understanding this module
>since the documentation is still rather poor and not so clear (to me
>at least).
Well, for a toy problem, parsing according to python grammar is probably
overkill. I.e.,
'square(square(x+y)+z)+square(x+w)'
becomes
['eval_input',
['testlist',
['test',
['and_test',
['not_test',
['comparison',
['expr',
['xor_expr',
['and_expr',
['shift_expr',
['arith_expr',
['term',
['factor',
['power',
['atom', ['NAME', 'square']],
['trailer',
['LPAR', '('],
['arglist',
['argument',
['test',
['and_test',
['not_test',
['comparison',
['expr',
['xor_expr',
['and_expr',
['shift_expr',
['arith_expr',
['term',
['factor',
['power',
['atom',
['NAME', 'square']],
['trailer',
['LPAR', '('],
['arglist',
['argument',
['test',
['and_test',
['not_test',
['comparison',
['expr',
['xor_expr',
['and_expr',
['shift_expr',
['arith_expr',
['term',
['factor',
['power',
['atom',
['NAME',
'x']]]]],
['PLUS',
'+'],
['term',
['factor',
['power',
['atom',
['NAME',
'y']]]]]]]]]]]]]]]],
['RPAR', ')']]]]],
['PLUS', '+'],
['term',
['factor',
['power',
['atom',
['NAME', 'z']]]]]]]]]]]]]]]],
['RPAR', ')']]]]],
['PLUS', '+'],
['term',
['factor',
['power',
['atom', ['NAME', 'square']],
['trailer',
['LPAR', '('],
['arglist',
['argument',
['test',
['and_test',
['not_test',
['comparison',
['expr',
['xor_expr',
['and_expr',
['shift_expr',
['arith_expr',
['term',
['factor',
['power',
['atom',
['NAME', 'x']]]]],
['PLUS', '+'],
['term',
['factor',
['power',
['atom',
['NAME', 'w']]]]]]]]]]]]]]]],
['RPAR', ')']]]]]]]]]]]]]]],
['NEWLINE', ''],
['ENDMARKER', '']]
which compiles to
0 SET_LINENO 0
3 LOAD_NAME 0 (square)
6 LOAD_NAME 0 (square)
9 LOAD_NAME 1 (x)
12 LOAD_NAME 2 (y)
15 BINARY_ADD
16 CALL_FUNCTION 1
19 LOAD_NAME 3 (z)
22 BINARY_ADD
23 CALL_FUNCTION 1
26 LOAD_NAME 0 (square)
29 LOAD_NAME 1 (x)
32 LOAD_NAME 4 (w)
35 BINARY_ADD
36 CALL_FUNCTION 1
39 BINARY_ADD
40 RETURN_VALUE
Probably rolling your own parser and using the tokenizer would make sense. E.g., the tokenizer
will get you
>>> import tokenize, StringIO
>>> def prtoks(expr):
... tokens = tokenize.generate_tokens(StringIO.StringIO(expr).readline)
... for t in tokens: print '%10s: %s' %(token.tok_name[t[0]], t[1])
...
>>> e = 'square(square(x+y)+z)+square(x+w)'
>>> prtoks(e)
NAME: square
OP: (
NAME: square
OP: (
NAME: x
OP: +
NAME: y
OP: )
OP: +
NAME: z
OP: )
OP: +
NAME: square
OP: (
NAME: x
OP: +
NAME: w
OP: )
ENDMARKER:
Or maybe just recognize the ops yourself in the tokens:
>>> def tokens(expr):
... tokens = tokenize.generate_tokens(StringIO.StringIO(expr).readline)
... return [t[1] for t in tokens]
...
>>> tokens(e)
['square', '(', 'square', '(', 'x', '+', 'y', ')', '+', 'z', ')', '+', 'square', '(', 'x', '+',
'w', ')', '']
You said in your other post:
______________________________________________
e="square(square(x+y)+z)+square(x+w)"
I would like to define a function
def substitute(math_expr,**subs):
#.... something here
result result
such that when I call
print substitute(e, square="x -> x**2")
I obtain
"((x+y)**2+z)**2+(x+w)**2"
______________________________________________
what does
square="x -> x**2"
mean in general? I gather here it means that
square(something)
should be rewritten as
(something)**2
perhaps without parentheses if something is a single term,
but what other kinds of **subs do you anticipate?
How about this (using tokens() defined above), specifying the
function parameter rewrite via a format string:
>>> from __future__ import generators
>>> def funrew(expr, funname, rewfmt):
... toks = tokens(expr)
... while toks:
... if toks[0]!=funname:
... yield toks.pop(0)
... continue
... toks.pop(0) # dump funname
... # find and process parenthesized call parameter expression
... i = ump = 0
... for t in toks:
... i += 1
... if t =='(': ump += 1
... if t == ')':
... ump -= 1
... if not ump: break
... rest = toks[i:]
... yield (rewfmt % ''.join(funrew(''.join(toks[1:i-1]), funname, rewfmt)))
... toks = rest
...
>>> def substitute(expr, funname, rewfmt):
... return ''.join(funrew(e, funname,rewfmt))
...
>>> e
'square(square(x+y)+z)+square(x+w)'
>>> substitute(e, 'square','(%s)**2')
'((x+y)**2+z)**2+(x+w)**2'
Well, this is not so exciting for functions with multiple parameters. How
might you want to rewrite those? Probably isolate terms and rewrite each term
and then feed that as a tuple to a rewfmt? If you can assume no terms containing
nested commas (probably reasonable for math expressions) then it shouldn't be
too hard. But much beyond that and you may want to use the Python parser after all ;-)
Anyway, I'll leave the rest as an exercise ;-)
Regards,
Bengt Richter
It was such a jellybean project I couldn't resist modding it a it to handle
comma separated function args, and multiple functions:
====< simiosub.py >=======================================
# simiosub.py
# a program to rewrite function calls in an expression
from __future__ import generators
import tokenize, StringIO, token
def tokens(expr):
tokens = tokenize.generate_tokens(StringIO.StringIO(expr).readline)
return [t[1] for t in tokens]
def substitute(expr, **funs):
def funrew(expr, funs):
if isinstance(expr, str): toks = tokens(expr)
else: toks = expr
while toks:
if toks[0] not in funs:
yield toks.pop(0)
continue
currfun = toks.pop(0)
# find and process parenthesized call parameter expression
i = ump = 0
for t in toks:
i += 1
if t =='(': ump += 1
if t == ')':
ump -= 1
if not ump: break
rest = toks[i:]
parenexp = toks[1:i-1]
ixcommas = [t[1] for t in zip(parenexp,range(len(parenexp)))
if t[0]==',']+[len(parenexp)]
j = 0; paramlist= []
for k in ixcommas:
paramlist.append(''.join(funrew(parenexp[j:k], funs)))
j = k+1
yield (funs[currfun] % tuple(paramlist))
toks = rest
return ''.join(funrew(expr, funs))
e = 'square(square(x+y)+z)+square(x+w)'
# substitute(e, 'square','(%s)**2')
if __name__ == '__main__':
import sys
args = sys.argv[1:]
if not args: args = ['square(square(x+y)+z)+square(x+w)', 'square=(%s)**2']
# should return '((x+y)**2+z)**2+(x+w)**2'
e = args.pop(0); funs = dict(map(lambda x: tuple(x.split('=')),args))
print 'before:', e
for sub in args: print ' ',sub
print ' after:', substitute(e, **funs)
==========================================================
Default test:
[21:44] C:\pywk\clp>simiosub.py
before: square(square(x+y)+z)+square(x+w)
square=(%s)**2
after: ((x+y)**2+z)**2+(x+w)**2
A little more complex:
[21:44] C:\pywk\clp>simiosub.py foo(x*y)+zeep(1,foo(2),3) "foo=(foo %s)" "zeep=(zeep %s %s %s)"
before: foo(x*y)+zeep(1,foo(2),3)
foo=(foo %s)
zeep=(zeep %s %s %s)
after: (foo x*y)+(zeep 1 (foo 2) 3)
Regards,
Bengt Richter
>On 14 Dec 2002 00:05:19 GMT, bo...@oz.net (Bengt Richter) wrote:
>[ ... first version substitute routine ...]
>> 'square(square(x+y)+z)+square(x+w)'
>> >>> substitute(e, 'square','(%s)**2')
>> '((x+y)**2+z)**2+(x+w)**2'
>>
>>Well, this is not so exciting for functions with multiple parameters. How
>>might you want to rewrite those? Probably isolate terms and rewrite each term
>>and then feed that as a tuple to a rewfmt? If you can assume no terms containing
>>nested commas (probably reasonable for math expressions) then it shouldn't be
>>too hard. But much beyond that and you may want to use the Python parser after all ;-)
>>
>>Anyway, I'll leave the rest as an exercise ;-)
>
>It was such a jellybean project I couldn't resist modding it a it to handle
>comma separated function args, and multiple functions:
>
One more time, allowing rewrite specs in terms of function parameters
specified as $1, $2, etc in the rewrite format.
====< simiosub.py >=======================================
# simiosub.py
# a program to rewrite function calls in an expression
from __future__ import generators
import tokenize, StringIO
def tokens(expr):
tokens = tokenize.generate_tokens(StringIO.StringIO(expr).readline)
return [t[1] for t in tokens]
def substitute(expr, **funs):
def funrew(expr, funs):
if isinstance(expr, str): toks = tokens(expr)
else: toks = expr
while toks:
if toks[0] not in funs:
yield toks.pop(0)
continue
currfun = toks.pop(0)
# find and process parenthesized call parameter expression
i = ump = 0
for t in toks:
i += 1
if t =='(': ump += 1
if t == ')':
ump -= 1
if not ump: break
rest = toks[i:]
parenexp = toks[1:i-1]
ixcommas = [t[1] for t in zip(parenexp,range(len(parenexp)))
if t[0]==',']+[len(parenexp)]
i = j = 0; paramdir= {}
for k in ixcommas:
i += 1 # for $1, $2, etc
paramdir[str(i)] = ''.join(funrew(parenexp[j:k], funs))
j = k+1
yield (funs[currfun] % paramdir)
toks = rest
# change $x to %(x)s in formats
sfuns = {}
for k, v in funs.items():
sl = v.split('$')
for i in range(1,len(sl)): sl[i] = '%%(%s)s%s' % (sl[i][0],sl[i][1:])
sfuns[k] = ''.join(sl)
return ''.join(funrew(expr, sfuns))
if __name__ == '__main__':
import sys
args = sys.argv[1:]
if not args: args = ['square(square(x+y)+z)+square(x+w)', 'square=($1)**2']
# should return '((x+y)**2+z)**2+(x+w)**2'
e = args.pop(0); funs = dict(map(lambda x: tuple(x.split('=')),args))
print 'before:', e
for sub in args: print ' ',sub
print ' after:', substitute(e, **funs)
==========================================================
Default test:
[22:42] C:\pywk\clp>simiosub.py
before: square(square(x+y)+z)+square(x+w)
square=($1)**2
after: ((x+y)**2+z)**2+(x+w)**2
A little more complex, changing to infix function calls:
[22:42] C:\pywk\clp>simiosub.py foo(x*y)+zeep(1,foo(2),3) "foo=(foo $1)" "zeep=(zeep $1 $2 $3)"
before: foo(x*y)+zeep(1,foo(2),3)
foo=(foo $1)
zeep=(zeep $1 $2 $3)
after: (foo x*y)+(zeep 1 (foo 2) 3)
Using the reusability of parameters in the rewrite format:
[22:42] C:\pywk\clp>simiosub.py square(x+y) "square=($1)*($1)"
before: square(x+y)
square=($1)*($1)
after: (x+y)*(x+y)
And on the original test:
[22:45] C:\pywk\clp>simiosub.py square(square(x+y)+z)+square(x+w) "square=($1)*($1)"
before: square(square(x+y)+z)+square(x+w)
square=($1)*($1)
after: ((x+y)*(x+y)+z)*((x+y)*(x+y)+z)+(x+w)*(x+w)
Kind of fun, but that's enough following myself up ;-)
Regards,
Bengt Richter
Unfortunately, I cannot assume that. That's the reason why
I said one needs the parser or compile module.
Anyway, I learn always a lot by reading your code, for instance
I didn't realize .join() works on iterators, not I was familiar with
other neat tricks of which ISTM you have an illimited provision... ;-)
Bye,
Michele
>bo...@oz.net (Bengt Richter) wrote in message news:<atdsjv$cka$0...@216.39.172.122>...
>> If you can assume no terms containing
>> nested commas (probably reasonable for math expressions) then it shouldn't be
>> too hard. But much beyond that and you may want to use the Python parser
>> after all ;-)
>
>Unfortunately, I cannot assume that. That's the reason why
>I said one needs the parser or compile module.
Well, a little state machine wouldn't be that big a deal.
Do you have a grammar for the language you need to handle?
Is it all math expressions?
What would be an example showing all the syntax that you have to handle?
And what other operations will you want to do besides rewriting function calls?
That might make a big difference for what parsing info you
Also, I'm a bit curious what this is actually for. I.e., what kind
of tool you're building, and why ;-)
Regards,
Bengt Richter
> Well, for a toy problem, parsing according to python grammar is probably
> overkill. I.e.,
>
> 'square(square(x+y)+z)+square(x+w)'
>
> becomes
>
> ['eval_input',
> ['testlist',
> ['test',
> ['and_test',
> ['not_test',
> ['comparison',
> ['expr',
> ['xor_expr',
[... and on and on and on ...]
Bengt probably knows this, but you can use the compiler package to
make this very much pleasanter...
>>> compiler.transformer.parse('square(square(x+y)+z)+square(x+w)')
Module(None, Stmt([Discard(Add((CallFunc(Name('square'),
[Add((CallFunc(Name('square'), [Add((Name('x'), Name('y')))], None,
None), Name('z')))], None, None), CallFunc(Name('square'),
[Add((Name('x'), Name('w')))], None, None))))]))
>>> _.node.nodes[0].expr
Add((CallFunc(Name('square'), [Add((CallFunc(Name('square'),
[Add((Name('x'), Name('y')))], None, None), Name('z')))], None, None),
CallFunc(Name('square'), [Add((Name('x'), Name('w')))], None, None)))
Hmm, well the fact that pprint.pprint doesn't know how to deal with
these thigns doesn't help, but these really are nicer to work with :)
Cheers,
M.
--
Famous remarks are very seldom quoted correctly.
-- Simeon Strunsky
>bo...@oz.net (Bengt Richter) writes:
>
>> Well, for a toy problem, parsing according to python grammar is probably
>> overkill. I.e.,
>>
>> 'square(square(x+y)+z)+square(x+w)'
>>
>> becomes
>>
>> ['eval_input',
>> ['testlist',
>> ['test',
>> ['and_test',
>> ['not_test',
>> ['comparison',
>> ['expr',
>> ['xor_expr',
>[... and on and on and on ...]
>
>Bengt probably knows this, but you can use the compiler package to
>make this very much pleasanter...
>
Christmas is here early ;-)
>>>> compiler.transformer.parse('square(square(x+y)+z)+square(x+w)')
>Module(None, Stmt([Discard(Add((CallFunc(Name('square'),
>[Add((CallFunc(Name('square'), [Add((Name('x'), Name('y')))], None,
>None), Name('z')))], None, None), CallFunc(Name('square'),
>[Add((Name('x'), Name('w')))], None, None))))]))
>>>> _.node.nodes[0].expr
>Add((CallFunc(Name('square'), [Add((CallFunc(Name('square'),
>[Add((Name('x'), Name('y')))], None, None), Name('z')))], None, None),
>CallFunc(Name('square'), [Add((Name('x'), Name('w')))], None, None)))
>
>Hmm, well the fact that pprint.pprint doesn't know how to deal with
>these thigns doesn't help, but these really are nicer to work with :)
>
How about this as relatively simple prettyprinting for this kind of output?
[ 5:49] C:\pywk\clp>ppcomp.py -h
Usage: python ppcomp.py [-i | -h | -f filename | expression ]
(nothing specified reads stdin, -i prompts, -h prints this, else the obvious)
[ 5:49] C:\pywk\clp>ppcomp.py square(square(x+y)+z)+square(x+w)
Module(
None,
Stmt(
[
Discard(
Add(
(
CallFunc(
| Name(
| | 'square'),
| [
| | Add(
| | (
| | CallFunc(
| | | Name(
| | | | 'square'),
| | | [
| | | | Add(
| | | | (
| | | | Name(
| | | | | 'x'),
| | | | Name(
| | | | 'y')))],
| | | None,
| | | None),
| | Name(
| | 'z')))],
| None,
| None),
CallFunc(
Name(
| 'square'),
[
| Add(
| (
| Name(
| | 'x'),
| Name(
| 'w')))],
None,
None))))]))
[ 5:50] C:\pywk\clp>ppcomp.py -i
Enter expression (or just press Enter to quit):
Expr> a,b,c = 1,2,3
Module(
None,
Stmt(
[
Assign(
[
| AssTuple(
| [
| AssName(
| | 'a',
| | 'OP_ASSIGN'),
| AssName(
| | 'b',
| | 'OP_ASSIGN'),
| AssName(
| 'c',
| 'OP_ASSIGN')])],
Tuple(
[
Const(
| 1),
Const(
| 2),
Const(
3)]))]))
(not very tested, but not too bad? ;-)
====< ppcomp.py >====================================================
# ppcomp.py
# pretty print some compiler output -- bo...@oz.net 2002-12-17
import compiler, re, sys
rxb = re.compile(r'([\[\](),])')
IND = ' '
def ppcomp(s):
out = []; wrout = out.append # sys.stdout.write
sp = [s.strip() for s in rxb.split(`compiler.transformer.parse(s)`)]
nest = 0
for t in sp:
if not t: continue
if t in ['(','[']:
wrout(t)
nest += 1
wrout('\n'); wrout(IND*nest)
elif t in [')',']']:
wrout(t)
nest -= 1
else:
wrout(t)
if t == ',': wrout('\n'); wrout(IND*nest)
# make vertical connecting lines
out = ''.join(out).splitlines()
nl = len(out)
linewd=[]
for y in range(nl):
linewd.append(len(out[y])-len(out[y].lstrip()))
lastwd = 0
for y in range(nl):
w = linewd[y];
if w>=lastwd:
lastwd=w; continue
# draw line upwards if spaces above when indent decreases
ydraw = y-1
while ydraw>=0 and w<linewd[ydraw] and out[ydraw][w]==' ':
s = out[ydraw]
out[ydraw] = s[:w] + '|' + s[w+1:]
ydraw -= 1
return '\n'.join(out)
if __name__ == '__main__':
import sys
s=''
if len(sys.argv)<2:
print 'Enter Python source and end with ^Z'
s = sys.stdin.read()
elif sys.argv[1]=='-f' and len(sys.argv)>2:
f = file(sys.argv[2])
s = f.read()
f.close()
elif sys.argv[1]=='-i':
s='anything'
print 'Enter expression (or just press Enter to quit):'
while s:
s = raw_input('Expr> ').rstrip()
if s: print ppcomp(s)
elif sys.argv[1]=='-h':
print """
Usage: python ppcomp.py [-i | -h | -f filename | expression ]
(nothing specified reads stdin, -i prompts, -h prints this, else the obvious)
"""
else:
s = sys.argv[1]
if s: print ppcomp(s)
=====================================================================
Regards,
Bengt Richter
hey, this begins to look like lisp! :-)
nice utility, btw.
holger