Ahas re tree traversals

64 views
Skip to first unread message

Edward K. Ream

unread,
Mar 31, 2022, 4:52:55 PM3/31/22
to leo-editor
I have started work on #2541, which will add support for new ast nodes in Python 3.10.

After looking at leoAst.py with fresh eyes, the following Aha's emerged:

1. Using python generators in the TokenOrderGenerator borders on the absurd. Generators complicate the code patterns while providing no real benefits. Instead, it would be better to use recursive descent, as #2555 suggests. There is little chance of exceeding python's stack limit. Anyway, sys.setrecursionlimit increases that limit.

2. After years of wondering, I have discovered a way to traverse any tree with neither recursion nor generators. Each visitor returns a list of (handler, argument) tuples, which the main loop then executes in the desired order.

This scheme won't take the world by storm. The (handler, argument) tuples correspond to the closures that python creates behind the scenes for generators. Still, it solves a puzzle elegantly.  See the postscript.

Summary

I will soon rewrite much of the code in leoAst.py. The legacy (generator-based) code was particularly stupid because the generators never returned anything!

I may do a full implementation of the iterative tree traversal algorithm, both as a proof of concept and as a benchmark for comparison with other approaches.

Edward

P.S.  Here is a slightly condensed version of the iterative main loop:

def main_loop(self, node):
    func = getattr(self, 'do_' + node.__class__.__name__)
    exec_list = [(func, node)]
    while exec_list:
        func, arg = self.exec_list.pop(0)
        result = func(arg)
        if result:
            # Prepend the result, a list of tuples.
            self.exec_list[:0] = result

For example, here is the visitor for the walrus operator:

# NamedExpr(expr target, expr value)
def do_NamedExpr(self, node):
    return [
        (self.visit, node.target),
        (self.op, ':='),
        (self.visit, node.value),
    ]


Notice: none of the visitors are recursive. They merely represent recursion by returning tuples whose handler is self.visit. The maximum stack depth will be about 5, independent of the ast tree.

EKR

Edward K. Ream

unread,
Mar 31, 2022, 5:00:36 PM3/31/22
to leo-editor
On Thu, Mar 31, 2022 at 3:52 PM Edward K. Ream <edre...@gmail.com> wrote:

P.S.  Here is a slightly condensed version of the iterative main loop:

Oops:  I had intended to replace self.exec_list with exec_list everywhere. Like this:

def main_loop(self, node):
    func = getattr(self, 'do_' + node.__class__.__name__)
    exec_list = [(func, node)]
    while exec_list:
        func, arg = exec_list.pop(0)

        result = func(arg)
        if result:
            # Prepend the result, a list of tuples.
            exec_list[:0] = result

EKR

Edward K. Ream

unread,
Mar 31, 2022, 8:01:34 PM3/31/22
to leo-editor
On Thursday, March 31, 2022 at 3:52:55 PM UTC-5 Edward K. Ream wrote:

> After looking at leoAst.py with fresh eyes, the following Aha's emerged:

There is a third Aha: parent/child (or any other threading links) are almost useless! No amount of links will suffice to drive a tree traversal.

It's not enough to know in what order to visit tree nodes. One must also remember the (coroutine) state as well! Recursive descent, generators and the new iterative scheme all remember the state.  Links do not.  End of story.

My long-time fixation on tree links confused and mislead me for decades. I'm glad to put this delusion to rest. Now I see why ast.walk is so useful: often it doesn't matter how one discovers nodes of a particular type.

For example, Fstringify.visit uses none of the links that were so carefully (and uselessly) created. It only uses the already existing links in an ast.BinOp node whose operator (node.op) is '%'. Otoh, we can't just use ast.walk in the Fstringify class. Fstringify.make_fstring does use the list of tokens associated with the BinOp node.

Edward

tbp1...@gmail.com

unread,
Mar 31, 2022, 8:57:39 PM3/31/22
to leo-editor
Or, to put things another way, when you reach the end of a subtree, you don't automatically know how many levels (or indents, if you think of levels as indents) to return to.  The leaf of a subtree won't have a next link, so you need some state to know where to go next.

Edward K. Ream

unread,
Mar 31, 2022, 9:16:44 PM3/31/22
to leo-editor


On Thursday, March 31, 2022 at 7:01:34 PM UTC-5 Edward K. Ream wrote:

For example, Fstringify.visit uses none of the links that were...created.

Actually, some Fstringify code does use parent links, so I've left them in.

Amazingly, PR #2556 already passes all unit tests!  The only changes were:

- Removing all "yield from" statements.
- Removing "list" statements that were needed to drive the top-level generators.
- Renaming helper functions.

At present, the diffs are huge because I created a new copy of most of the code. I'll probably redo the work to reduce the diffs.

Edward

Edward K. Ream

unread,
Mar 31, 2022, 9:21:01 PM3/31/22
to leo-editor
On Thursday, March 31, 2022 at 7:57:39 PM UTC-5 tbp1...@gmail.com wrote:
Or, to put things another way, when you reach the end of a subtree, you don't automatically know how many levels (or indents, if you think of levels as indents) to return to.  The leaf of a subtree won't have a next link, so you need some state to know where to go next.

I wouldn't put it that way.  When revisiting a visitor, the visitor must be restarted where it left off.  Links simply can't do that. But generators/co-routines can restart visitors where they left off. In effect, so do recursive descent algorithms.

Edward

Edward K. Ream

unread,
Apr 1, 2022, 5:28:31 AM4/1/22
to leo-editor
On Thu, Mar 31, 2022 at 8:16 PM Edward K. Ream <edre...@gmail.com> wrote:

Amazingly, PR #2556 already passes all unit tests!
... 
At present, the diffs are huge because I created a new copy of most of the code. I'll probably redo the work to reduce the diffs.

Done in the ekr-simple-ast branch. This branch also includes the latest work in the ekr-ast-nodes2 branch, which adds support for the new Match* ast nodes.

Edward

Edward K. Ream

unread,
Apr 2, 2022, 9:59:20 AM4/2/22
to leo-editor
On Thursday, March 31, 2022 at 3:52:55 PM UTC-5 Edward K. Ream wrote:

After looking at leoAst.py with fresh eyes, the following Aha's emerged:

I have just completed all planned work on leoAst.py.  The highlights:

- mypy passes leoAst.py, under the requirement that all functions and methods be annotated.  Only one mypy suppression remains. mypy highlighted several improvements. I've become a big fan.

- The TokenOrderGenerator now uses recursive descent instead of generators. This removes tons of blah, blah, blah.

- The FStringify and ReassignTokens classes are now stand-alone classes, a big conceptual simplification.  Both classes now use ast.walk to find nodes of a specific kind, another big simplification.

- The TokenOrderTraverser class is now used nowhere in Leo. It has become an odd curio, nothing more. I'm glad.

This has been a great week for leoAst.py.

Edward
Reply all
Reply to author
Forward
0 new messages