ENB: flattening the TokenOrderTraverser class

38 views
Skip to first unread message

Edward K. Ream

unread,
Nov 13, 2019, 6:09:00 AM11/13/19
to leo-editor
It is not valid to use recursive tree traversals in production code because ast (parse) trees can be arbitrarily deep.  Consider the expression: a = 1 + (2 + (... 500)...).  Recursive traversal of this expression would require more than 1500 entries in python's parse stack!

The tot.visit method is the culprit. It contains the recursion. The visit method must go away.

The TokenOrderTraverser class must become a generator. Here's how:

1. The visit method will be "distributed" into all visitors as follows:

- Each visitor will start by calling tot.start_node. This method will contain everything before the recursive call in visit. 

- Visitors will yield from(z) instead of calling self.visit(z).

- Each visitor will end by calling tot. end_node. This method will contain everything after the recursive call in visit.

2. The put/eat methods will contain the only yield statements.

This works because the "yield from" statements produce the desired results without recursion.

And that should be that.

Edward

P.S.  A note about debugging generators. Many years ago I asked the python community how to "trace" a complex generator.  Nobody had any ideas. Yesterday I saw how:

def seq():
   
"""A test generator."""
   
for i in range(5):
       
yield i

def dump(it, tag):
   
"""A generator that dumps another generator."""
    aList
= list(it)
    g
.printObj(aList, tag=tag)
   
for z in aList:
       
yield z

# For debugging.
for i in dump(seq(), tag='seq'):
   
print(i)

print('done')

This should work for any generator, no matter how complex.

EKR

Edward K. Ream

unread,
Nov 13, 2019, 6:12:32 AM11/13/19
to leo-editor
On Wednesday, November 13, 2019 at 5:09:00 AM UTC-6, Edward K. Ream wrote:

> It is not valid to use recursive tree traversals in production code...
> The TokenOrderTraverser class must become a generator.

There is no great rush to do this.  Debugging lists is a lot easier than debugging generators, despite the recent Aha.

Edward

Edward K. Ream

unread,
Nov 13, 2019, 6:35:37 AM11/13/19
to leo-editor
On Wed, Nov 13, 2019 at 5:09 AM Edward K. Ream <edre...@gmail.com> wrote:
- Each visitor will start by calling tot.start_node.
- Each visitor will end by calling tot. end_node.

Recent revs in "fstring" add these methods, for later use.

Edward

Edward K. Ream

unread,
Nov 13, 2019, 12:16:12 PM11/13/19
to leo-editor
On Wednesday, November 13, 2019 at 5:12:32 AM UTC-6, Edward K. Ream wrote:

> There is no great rush to [use generators].

Generators now work.  I did say that this is an all-consuming project :-)

leoAst.py now contains two new classes:

- class TokenOrderGenerator, the base class for all tree generators.

- class TokenOrderInjector (TokenOrderGenerator).  This class contains a single method, TOI.begin_visitor, that injects links into tokens, then calls super().begin_visitor to get all the default stuff:

def begin_visitor(self, node):
   
# Do this first, *before* updating self.node.
   
self.coverage_set.add(node.__class__.__name__)
    node
.parent = self.node
   
if self.node:
        children
= getattr(self.node, 'children', [])
        children
.append(node)
       
self.node.children = children
   
# *Now* update self.node, etc.
   
super().begin_visitor(node)

> Debugging lists is a lot easier than debugging generators.

Belay that. I have been pleasantly surprised how easy everything has been.  There were a few tweaks to my initial plan, but nothing at all serious.

It's been a good day.

Edward

Edward K. Ream

unread,
Nov 13, 2019, 2:21:27 PM11/13/19
to leo-editor
On Wednesday, November 13, 2019 at 11:16:12 AM UTC-6, Edward K. Ream wrote:

> Generators now work.  I did say that this is an all-consuming project :-)

Another Engineering Notebook entry.  Feel free to ignore.

The new code consists of easy, obvious, clear generator patterns.  There are "yield" and "yield from" statements "everywhere". So subclasses are unlikely ever to need to override the node visitors directly.

Here, I want to emphasize that the code contains no recursion whatever, despite appearances.  For example:

def do_Module(self, node):
   
self.begin_visitor(node)
   
for z in node.body:
       
yield from self.visitor(z)
   
self.end_visitor(node)

Above, "yield from self.visitor()" replaces "visit(z)" in the old recursive code.

Crucially, self.visitor returns a generator. It does not recursively call anything. Here it is:

def visitor(self, node):
   
"""Given an ast node, return a *generator* from its visitor."""
   
# We *do* want to crash if the visitor doesn't exist.
    method
= getattr(self, 'do_' + node.__class__.__name__)
   
# method(node) is a generator, not a recursive call!
   
return method(node)

We can prove that no recursion happens by inserting traces:
   
def visitor(self, node):
    method
= getattr(self, 'do_' + node.__class__.__name__)
   
print('BEFORE', node.__class__.__name__)
    it
= method(node)
   
print('AFTER', node.__class__.__name__,'\n')
   
return it

Nothing ever appears between these two traces, even with traces enabled in the begin/end_visitor methods.

This proves that there are no recursive calls anywhere in the TokenOrderGenerator class.

Edward

P.S.  The entire class actually "yields" nothing at all. Everything happens by way of side effects, namely the properly ordered calls to eat.  Everything (except eat) is dirt simple.  This is the way it is written in The Book.

EKR
Reply all
Reply to author
Forward
0 new messages