Algorithms on parse trees for simplification and differentiation

Peter Petrov

unread,

Jan 24, 2020, 9:33:04 AM1/24/20

to antlr-discussion

Hello,

I have a simple math expression defining grammar and I generated visitor and parser in JavaScript.

I am able to assign values to variables and to then evaluate expressions (similar to that example in the ANTLR4 book).

Now... I am brainstorming about the following 2 things.

1) Expression simplification.

I have a ANTLR 4 parse tree for say A = 1 + x + x^3 + 2*x + 3. What algorithms are there to simplify this to

B = x^3 + 3*x + 4. I found a read somewhere that I need to traverse the parse tree of A and either modify it in place,

or build a new one by following certain rules. But OK... then two questions come up:

a) how do I know which sub-trees to group together (1 with 3) and (x with 2*x).

b) how do I build that new/resulting parse tree from zero just by a program.

What do I mean? I realized that currently the only way I know of building a parse tree

is through parsing a string-written expression (one like A). But what if I already have a parse tree (from A),

then how do I build the parse tree of B in a programmatic way? Are there other ways? Are there any references or examples?

2) Expression Differentiation.

I have the exact same questions here / a) and b) /.

Also, I know for a person to do 1) is easier than to do 2).

But I start to think that maybe computationally 1) is a harder problem than 2).

Not sure.

In general are there any good references or books or algorithms about this?

And in particular: how do I go about a) and b)?

Many thanks in advance.

Regards,
Peter

Peter Petrov

unread,

Jan 24, 2020, 9:44:35 AM1/24/20

to antlr-discussion

3) I also found that some people say (for my problem) that I need to build

an abstract syntax tree (AST) of the expression A

from the parse tree of A.

And then operate on the AST of A to simplify or differentiate the expression A.

To do that (simplify or differentiate) I will need to generate a new AST for B as I walk the AST of A.

Is that overall strategy correct? And if so... would I ever need

further down the line to generate a parse tree from my AST of the result B.

I guess not.

I guess I should just generate a string representation of the resulting

(simplified or differentiated) expression B from its AST and that's it - problem solved, right?

I just want to see what the forum members' thoughts are on this whole problem,

what are the best practices, strategies, algorithms, etc.

Any references or examples would be mostly welcome.

Many thanks once again.

Marc Jacobi

unread,

Jan 24, 2020, 10:53:21 AM1/24/20

to antlr-discussion

The parse tree for expressions does not process operator precedence for you -think also of ( and ).
I use the shunting-yard algorithm to walk the parse tree of an expression and produce a sorted binary tree. This is what I call my AST.

Seems to me it is easier to apply any algorithm to a tree that has been reduced to the essence and ordered correctly. I cannot help you with the algorithm itself - I'm a math noob.

If you read C++ you can take a look at my code here:
https://github.com/obiwanjacobi/Zlang/blob/master/src/Zlang/zsharp/AST/AstExpressionBuilder.cpp

Here is my grammar:

https://github.com/obiwanjacobi/Zlang/blob/master/src/Zlang/zsharp/grammar/zsharp_parser.g4

Reply all

Reply to author

Forward