ENB: leoTokens.leo progress report

Edward K. Ream

Jan 13, 2024, 4:47:07 AM

to leo-editor

Two days ago, all unit tests passed for Leo's new beautifier in leoTokens.leo. My celebrations were premature. Beautifying Leo's sources revealed unexpected (and unwelcome!) changes.

This Engineering Notebook post summarizes the remaining issues and suggests possible fixes.

Background

Leo's legacy beautifier (in leoAst.py) uses data from parse trees to discover proper spacing around the colon, minus sign, star, and 'import' tokens. leoTokens.py does not have that data, so it must compute new context data to resolve ambiguities.
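To illustrate the ambiguity, here is a minimal sketch using only Python's standard tokenize and ast modules. It is not taken from leoAst.py or leoTokens.py, and the helper name minus_tokens is hypothetical. It shows that the tokenizer reports a binary minus and a unary minus as the same token, while a parse tree tells them apart:

```python
import ast
import io
import tokenize

def minus_tokens(source):
    """Return (token type name, token string) for every '-' operator token."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.string == '-'
    ]

# The token stream alone gives identical data for both uses of '-'.
binary_minus = minus_tokens("x = a - b\n")
unary_minus = minus_tokens("x = -b\n")

# The parse tree distinguishes them: BinOp versus UnaryOp.
binary_node = ast.parse("x = a - b\n").body[0].value
unary_node = ast.parse("x = -b\n").body[0].value
```

A token-based beautifier therefore has to infer the role of each such token from its neighbors, which is the job of the context scanners.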

Both beautifiers contain visitors and generators. Visitors handle input tokens; generators create output tokens. The new generator adds scanners that discover context.

Unexpected scanning problems

The new beautifier fails because several ad-hoc scanners can scan past the end of the current statement. The unit tests didn't catch such situations because they focused on GvR's single-line pet peeves.

Solutions

I spent yesterday noodling potential solutions. As I went to bed, I saw that the existing scanners are parts of a recursive-descent parser. The new beautifier needs a good-enough parser. This morning, the details became clear:

- New scan_statements and scan_statement methods will discover statement boundaries. They are the top levels of the parser.

- The problematic scanners will use these boundaries to avoid mistakes.
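The idea of bounding a scanner by statement boundaries can be sketched as follows. This is only an illustration built on the standard tokenize module, not the planned scan_statements or scan_statement code. The names statement_spans and find_in_span are hypothetical, and the sketch treats each logical line (ending in a NEWLINE token) as one statement, which ignores the nesting of compound statements:

```python
import io
import tokenize

# Tokens that never start or end a logical line.
SKIP = {
    tokenize.NL, tokenize.COMMENT,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}

def statement_spans(source):
    """Return (tokens, spans); each span is an inclusive (start, end) index pair."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    spans = []
    start = None
    for i, tok in enumerate(tokens):
        if tok.type in SKIP:
            continue
        if start is None:
            start = i  # First significant token of a logical line.
        if tok.type == tokenize.NEWLINE:
            spans.append((start, i))  # NEWLINE ends the logical line.
            start = None
    return tokens, spans

def find_in_span(tokens, span, value):
    """Return the index of the first token equal to value, never leaving span."""
    start, end = span
    for i in range(start, end + 1):
        if tokens[i].string == value:
            return i
    return None

tokens, spans = statement_spans("a = b\nd = {1: 2}\n")
# An unbounded search for ':' starting in the first statement would wander
# into the second; the bounded search reports that there is none.
first = find_in_span(tokens, spans[0], ':')
second = find_in_span(tokens, spans[1], ':')
```

Because tokenize emits NL rather than NEWLINE inside brackets, a statement that is continued across several physical lines still forms a single span in this sketch.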

Summary

New scanners will provide the necessary context to problematic code. Various questions remain, but the new scanners are unlikely to impact performance significantly.

The new scanners complete a good-enough parser. The final code will look as though the parser was an obvious choice :-)

Edward

Edward K. Ream

Jan 13, 2024, 6:37:52 AM

to leo-editor

On Saturday, January 13, 2024 at 3:47:07 AM UTC-6 Edward K. Ream wrote:

> the new scanners are unlikely to impact performance significantly.

The new beautifier takes about 3.0 sec. to beautify Leo's core. The old takes 4.0 sec.

But recall that the new beautifier will be the foundation of a Nim beautifier. We can reasonably expect a 30x to 100x speedup, so there is no need to obsess about speed!

Edward

Edward K. Ream

Jan 13, 2024, 7:00:15 AM

to leo-editor

On Saturday, January 13, 2024 at 5:37:52 AM UTC-6 Edward K. Ream wrote:

> The new beautifier takes about 3.0 sec. to beautify Leo's core. The old takes 4.0 sec.

Actually, leoTokens.py takes 2.4 seconds, even with a prototype for the good-enough parser that pre-scans every token. So leoTokens.py is much faster than leoAst.py.

Edward
