This Engineering Notebook post discusses recent studies and thoughts.
I slept all day yesterday, perhaps with a mild case of covid. I've been vaccinated and boosted, so I'm not too worried. I feel better this morning. Thank goodness for immune systems! And now I have an excuse if my writing sounds like ravings :-)
The goal of my thinking is to find my next project. Leo is complete as it is. Nothing urgently needs to be added. But studying other problems might suggest additions to Leo.
Creativity requires a juicy problem. Not necessarily a big problem, just a problem that seems interesting!
Type inference
If I could wave a magic wand, I would wish for a faster pylint. And maybe a faster mypy, though mypy is surprisingly fast. Both pylint and mypy attempt heavy-duty type inference, so I decided (once again) to try to understand Hindley-Milner type inference.
Previous attempts were unsuccessful. The notation is cryptic. This time around, google yielded much better results. I struck paydirt with a great lecture by Adam Doupe at Arizona State University. The second part (of the same lecture) is here.
I enjoyed the way Prof. Doupe would ask questions and patiently wait. It's the style of a relaxed, confident teacher. Very rare, in my experience. Here's what I learned:
First, the algorithm is just common sense! It simply traverses the parse tree (in any order), recording inferences along the way. Nothing could be simpler.
Second, the algorithm is more of a roadmap than a proper algorithm. The possible inferences depend on the language, and the algorithm omits any mention of the myriad housekeeping details that form the bulk of the actual code!
Armed with my new confidence about type inference, I briefly reviewed the source code for mypy and pylint. Mypy and pylint follow this algorithm in very different ways! These programs are large and complex. There is no obvious way to speed them up!
Leonine linting
If improving pylint is out of the question, perhaps we could find a simple way to ensure that Leo's c, g, and p vars always mean what we think they should mean.
The simplest prototype of such an approach is to use cff to search for assignments. I used four cffs with these regexs:
\bg\b\s*= # g =
\bp[0-9]?\b\s*=\s*[A-Z][_\w]+\s*\( # p = Class name
\b[_\w]\s*=\s*(.*?)\bPosition\b # var = Position
\b[_\w]\s*=\s*(.*?)\bCommands\b # var = Commands
The results provide an intuitive view of how Leo defines c, g, and p!
The next step would be to transliterate these text patterns into searches on the parse trees. The transliteration should be straightforward! Indeed, one searches (in any order) for Assign, AugAssign, or AnnAssign nodes. Having found the assignment, one can traverse the subtree looking for c, g, p, Position, etc. Maybe one could even generalize the transliteration, but that seems unnecessary.
A puzzle
I'm relating my recent journey out of order. The first puzzle I set myself was to traverse (parse) trees using neither recursion (including ast.walk) nor generators. Yes, this puzzle seems quixotic, but I do like puzzles :-)
I don't remember exactly why I first wanted to solve this puzzle. I remember telling Rebecca that all the generators in leoAst.py seemed like the Hufflepuff way. I wanted a "Ravenclaw way :-) But this is a bit unfair to myself. There are lots of clever tricks in leoAst.py, as I documented years ago.
Anyway, after several days (!!) of thought, I saw the answer. Indeed, Leo already traverses outlines using neither generators nor recursion. The key is Leo's Position class.
Maybe the original problem was contrived, but the solution is notable. For any traversal, there is a Position class that yields that traversal. For example, Leo's c.all_positions generator is:
def all_positions(self, copy=True):p.moveToThreadNext defines "outline order" iteratively, without generators or recursion. Take a look. So for any desired traversal, we can define a generalized Position class with a new moveToThreadNext method. Do you see?
moveToThreadNext might be complex. For example, for ast trees, moveToThreadNext would examine the type of each tree node, and traverse the tree accordingly. Note: Position's stack eliminates the need for recursion. The generalized stack must contain enough data to move through the entire tree.
moveToThreadNext (roughly) corresponds to all the visitors (generators) in leoAst.py. This approach is, at the least, a revealing change in point of view.
Aside: TokenOrderTraverser.traverse (in leoAst.py) corresponds exactly to p.moveToThreadNext. TOT.traverse manipulates a stack explicitly. So a generalized moveToThreadNext can be a function. Defining a generalized Position class is optional.
Summary
The Hindley-Milner algorithm is common sense dressed up in forbidding notation. One traverses the tree (in any) order, recording inferences.
For most languages, the Hindley-Milner algorithm is merely a roadmap. Bookkeeping is the hard part. Pylint and mypy do the bookkeeping in very different ways.
Straightforward scripts (transliterations of regexs) can probably check the types of all uses of c, g, and p within Leo's code base.
It is straightforward to define any conceivable tree traversal using neither recursion nor generators. Just define a generalized moveToThreadNext function. Optionally, moveToThreadNext can be a member of a generalized Position class.
That's all for today. Stay tuned.
Edward
A puzzle
...The first puzzle I set myself was to traverse (parse) trees using neither recursion (including ast.walk) nor generators.
There has to be an immense body of literature out there on tree traversal, and much if not most of it probably does not depend on generators and recursion.
SummaryAny solution must execute arbitrary actions in some desired order. An elegant solution seems tantalizingly close, but my usual instincts lead me astray.
Thanks for the link. I wasn't clear enough about the problem I am trying to solve. I want to duplicate the TokenOrderGenerator class in leoAst.py without recursion or generators.
Thanks for the link. I wasn't clear enough about the problem I am trying to solve. I want to duplicate the TokenOrderGenerator class in leoAst.py without recursion or generators.I don't think it is possible what you are trying to achieve. To totally avoid any kind of recursion, you must know some facts about the AST in advance, for example how deep it is.
It is O.K. if you just want to have some fun to play around with these ideas, but you should know that you can't avoid recursion because the data is recursive
generators actually produce the fastest code for solving recursive problems in Python.