A snippet from Orange illustrates TOG's power

Edward K. Ream

Jan 18, 2020, 2:31:49 AM
to leo-e...@googlegroups.com
The Orange class is based on tokens. This token-based approach works perfectly, except for the placement of spaces around colons in slices. Here are some examples, as recommended by PEP 8 and as rendered by black, and now by Leo's Orange class:

a[1::]
a[1:2:]
a[1 : 1 + 2]
a[lower:upper:]
a[lower + offset : upper + offset]
a[:: step_fn(x)]
a[: upper_fn(x) : step_fn(x)]
a[: upper_fn(x) : 2 + 1]

As you can see, there should be spaces around all colons, with two constraints:

Constraint 1: Put no spaces around colons if all parts of the slice consist only of names and constants.
Constraint 2: Never put a space between a colon and a bracket.

This is a great example of the value of having access both to the token list and the parse tree. Indeed, constraint 1 requires access to the parse tree. It would be extremely clumsy to deduce this constraint from the token list. Otoh, constraint 2 requires access to the token list. It would be extremely clumsy to deduce this constraint from the parse tree.
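The point can be checked with nothing but the standard library. The sketch below is not Leo code; the helper names colon_count, slice_count and same_tree are made up for illustration. It shows that colon tokens look identical whether or not they belong to a slice, and that the parse tree does not record a trailing colon that the token list still contains.

```python
import ast
import io
import tokenize

def colon_count(source):
    """Count the ':' operator tokens in a piece of Python source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for tok in tokens if tok.type == tokenize.OP and tok.string == ':')

def slice_count(source):
    """Count the ast.Slice nodes in the parse tree of the source."""
    return sum(isinstance(node, ast.Slice) for node in ast.walk(ast.parse(source)))

def same_tree(source1, source2):
    """Return True if both sources produce identical parse trees."""
    return ast.dump(ast.parse(source1)) == ast.dump(ast.parse(source2))

# Tokens alone: a slice colon and a dict colon are the same kind of token.
print(colon_count("a[1:2]"), colon_count("{1: 2}"))
# The tree tells them apart: only the subscript contains a Slice node.
print(slice_count("a[1:2]"), slice_count("{1: 2}"))
# The tree alone: the optional trailing colon in a[1:2:] is not recorded...
print(same_tree("a[1:2]", "a[1:2:]"))
# ...but the token list still has it.
print(colon_count("a[1:2]"), colon_count("a[1:2:]"))
```

Neither view is enough by itself: the tree knows what kind of colon it is, and the tokens know exactly which colons were written.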

Folks, this example was the motivation for the entire TokenOrderTraverser class.

Here is Orange.colon, the code that handles incoming colon tokens:

def colon(self, val):
    """Handle a colon."""
    node = self.token.node
    self.clean('blank')
    if not isinstance(node, ast.Slice):
        self.add_token('op', val)
        self.blank()
        return
    # A slice.
    lower = getattr(node, 'lower', None)
    upper = getattr(node, 'upper', None)
    step = getattr(node, 'step', None)
    expressions = (ast.BinOp, ast.Call, ast.IfExp, ast.UnaryOp)
    if any(isinstance(z, expressions) for z in (lower, upper, step)):
        prev = self.code_list[-1]
        if prev.value not in '[:':
            self.blank()
        self.add_token('op', val)
        self.blank()
    else:
        self.add_token('op-no-blanks', val)

The only things you have to know about are the following:

1. self.token.node is the parse-tree node associated with the incoming colon token.
2. The colon is part of a slice if and only if isinstance(node, ast.Slice).
3. Within a slice, the code puts spaces around colons if and only if any part of the slice is a non-trivial expression.

Getting 3 right is tricky. Suffice it to say that details are handled at the token level, as usual in the Orange class.
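The tree-level half of point 3 can be tried in isolation. This is only a sketch under assumptions: slice_wants_spaces is a hypothetical helper, not part of Leo, and it looks at just the first slice it finds. The tuple of node types is copied from the expressions tuple in Orange.colon above. None of the token-level handling is modelled here.

```python
import ast

# Node types treated as non-trivial expressions, copied from Orange.colon.
NON_TRIVIAL = (ast.BinOp, ast.Call, ast.IfExp, ast.UnaryOp)

def slice_wants_spaces(source):
    """Return True if the first slice found in source has a non-trivial part.

    Hypothetical helper for illustration only.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Slice):
            # Missing parts of a slice are None, which matches no node type.
            parts = (node.lower, node.upper, node.step)
            return any(isinstance(z, NON_TRIVIAL) for z in parts)
    raise ValueError(f"no slice found in {source!r}")

for example in ("a[1::]", "a[lower:upper:]", "a[1 : 1 + 2]", "a[:: step_fn(x)]"):
    print(example, slice_wants_spaces(example))
```

The decision is a single isinstance test per slice part, which is why the parse tree makes constraint 1 so cheap to check.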

Summary

The value of the TOG class lies in its power, not in easy-to-understand snippets. For most people asttokens will suffice. But for ambitious programs like fstringify and orange, the full power of the TOG class simplifies the code enormously.

Since the day I first conceived of token order traversals I have known that the colon method would be the acid test of the Orange class. Until yesterday I had only a vague idea of what Orange.colon would look like. Clearly, it is the simplest thing that could possibly work.

Dozens of individual test cases in TestOrange.test_one_line_pet_peeves cover this crucial method.

Edward

Edward K. Ream

Jan 18, 2020, 6:30:09 AM
to leo-editor
On Saturday, January 18, 2020 at 2:31:49 AM UTC-5, Edward K. Ream wrote:

> Until yesterday I had only a vague idea of what Orange.colon would look like. Clearly, it is the simplest thing that could possibly work.

Notice the dog that isn't barking. o.colon doesn't use the list of tokens assigned to each parse-tree node. It uses only the incoming token and the parse tree itself. This is good news, because even though the TOG class does a pretty good job of assigning tokens to nodes, token assignment is the most dubious part of the TOG class.

Edward

Edward K. Ream

Jan 18, 2020, 7:21:49 AM
to leo-editor
On Saturday, January 18, 2020 at 2:31:49 AM UTC-5, Edward K. Ream wrote:

> I [knew] that the colon method would be the acid test of the Orange class. Until yesterday I had only a vague idea of what o.colon would look like.

The last remaining unfinished part of the Orange class is the code that splits long lines and joins short lines. I want to complete that code now, for two reasons:

1. It's important to compare black and orange as closely as possible.

Splitting and joining lines is one of black's most distinctive and useful features. It's important to mimic it in orange.

2. I want to know, in complete detail, just how useful having access to the parse tree will be for splitting and joining lines.

Just as with the slice logic, I now have only vague ideas about how it all will turn out. I should know a lot more in a day or three.

Edward

Edward K. Ream

Jan 19, 2020, 6:18:10 AM
to leo-editor
On Saturday, January 18, 2020 at 6:30:09 AM UTC-5, Edward K. Ream wrote:

> Notice the dog that isn't barking. o.colon doesn't use the list of tokens assigned to each parse-tree node.

There is another dog that isn't barking: colons are significant tokens. Therefore, within o.colon, self.token.node is the ast node that generated the colon that o.colon is handling.

Token assignment is coming to the fore again as (probably) a crucial part of the logic that splits/joins lines. I'll explain further in a new ENB post.

Edward