Design of a super-fast pylint

Edward K. Ream

Apr 5, 2024, 7:12:08 AM

to leo-editor

The previous Engineering Notebook post explained why type inference (and checking) is much easier than I ever imagined. This ENB will discuss a preliminary design of a new (super fast) pylint. Whenever I say "pylint," I'll mean the new pylint.

Ignore import graphs

In the last ENB, I implied that relationships between import statements matter. I spent several hours exploring two implied graphs. Top-level imports create an implicit graph of load-time relationships, while imports inside "if TYPE_CHECKING:" blocks create an implicit graph of mypy-time relationships.

But Aha! None of this analysis is needed! Let's look at a more straightforward design.

A preliminary design

Here are the ground rules:

- Pylint will only run if pyflakes detects no errors.

- Pylint will run after mypy, so pylint can assume all annotations are valid.

- Pylint will not use import graphs.
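
The ground rules amount to a fixed ordering of checkers with an early exit. Here is a minimal sketch of that gating, not an actual implementation. The names run_pipeline and Stage are hypothetical, each stage is a stand-in callable rather than a real call into pyflakes or mypy, and stopping on mypy errors is my assumption (the rules only say that pylint runs after mypy).

```python
from typing import Callable, Sequence

# Hypothetical stage type: takes a file path and returns a list of error messages.
Stage = Callable[[str], list[str]]


def run_pipeline(path: str, stages: Sequence[tuple[str, Stage]]) -> tuple[str, list[str]]:
    """Run the stages in order, stopping at the first one that reports errors.

    Returns the name of the last stage that ran and the errors it reported.
    """
    last_name = ""
    errors: list[str] = []
    for name, stage in stages:
        last_name = name
        errors = stage(path)
        if errors:
            # Later stages (for example, pylint) never see a file that failed earlier.
            break
    return last_name, errors
```

With stages ordered as pyflakes, mypy, pylint, the pylint stage runs only on files that the earlier stages accepted, so it can take valid syntax and valid annotations for granted.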

Here is the plan:

- Pylint will use checking data whose lifetime will end when Leo exits. These data will consist of structural data (parse trees) and semantic data.

- Pylint will recompute checking data for all changed files. The order in which pylint recomputes these data won't matter.

- Pylint will use lazy evaluation to fill in semantic data. Pylint will clear all the semantic data for a file whenever that file changes. The semantic data will gradually accumulate again while pylint checks other files.
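
The plan above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names FileData, CheckingData, update, lookup and function_names are hypothetical, change detection is a plain comparison of source text, and the "semantic data" here is just a dict of lazily computed values keyed by name.

```python
import ast
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class FileData:
    """Hypothetical per-file checking data."""
    source: str
    tree: ast.AST  # structural data: the parse tree
    semantic: dict[str, Any] = field(default_factory=dict)  # filled in lazily


class CheckingData:
    """In-memory store. Nothing is persisted, so it ends with the process."""

    def __init__(self) -> None:
        self.files: dict[str, FileData] = {}

    def update(self, path: str, source: str) -> bool:
        """Recompute the data for path if its source changed. Return True if recomputed."""
        old = self.files.get(path)
        if old is not None and old.source == source:
            return False
        # A fresh FileData starts with empty semantic data, which clears the old values.
        self.files[path] = FileData(source, ast.parse(source, filename=path))
        return True

    def lookup(self, path: str, key: str, compute: Callable[[ast.AST], Any]) -> Any:
        """Return semantic data for path, computing it on first use only."""
        data = self.files[path]
        if key not in data.semantic:
            data.semantic[key] = compute(data.tree)
        return data.semantic[key]


def function_names(tree: ast.AST) -> list[str]:
    """Example of semantic data: names of all (non-async) functions in a file."""
    return sorted(n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
```

Because update handles each file independently, the order in which changed files are processed does not matter. A checker working on one file would call lookup for another file only when it actually needs that file's data, which is how the semantic data would accumulate without any dependency graph.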

Summary

Clearing per-file checking data (whenever a file changes) removes the complexity of more ambitious caching schemes.

Lazy evaluation of data sidesteps the need for dependency graphs.

Pylint will trust the annotations that mypy has already checked. Pylint will do type checking, not type inference.

The new pylint should be about as fast as pyflakes because the computation required is roughly the same.