ENB: Big Aha re type inference

Edward K. Ream

Apr 4, 2024, 6:48:20 AM

to leo-editor

This Engineering Notebook post explains why type inference (and checking) is much easier than I ever imagined.

For over a decade, I believed that (while doing type inference) everything depends on everything else. I thought inferring the types (of objects) in module A required inferring the types in all other imported modules M1, M2, etc. That's just wildly wrong!

Aha! It's dead easy to know the types of names imported from other modules!!!

- "Import" statements only give access to top-level names from other modules.

- Top-level declarations (in other modules) explicitly state the types of imported names.

Let's take a closer look. The top-level statements of any module consist only of import statements, assignment statements, and definitions of classes and functions. Annotations give us the types of the names on the LHS of the assignments. The types of classes and functions are the classes and functions themselves.
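Here is a minimal sketch of the idea above, using only Python's standard `ast` module. The function name `top_level_types` and the string descriptions it returns are assumptions made for illustration, not part of Leo or mypy. The sketch simply skips unannotated assignments, which is part of the hand waving.

```python
import ast

def top_level_types(source: str) -> dict[str, str]:
    """Map each top-level name of a module to a rough description of its type.

    Hypothetical helper: only the statements directly in the module body
    are examined, never the bodies of functions or classes.
    """
    result: dict[str, str] = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            # Annotated assignment: the annotation states the type.
            result[node.target.id] = ast.unparse(node.annotation)
        elif isinstance(node, ast.ClassDef):
            # The type of a class is the class itself.
            result[node.name] = f"class {node.name}"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Likewise for functions.
            result[node.name] = f"function {node.name}"
    return result

example = '''
import os
count: int = 0
names: list[str] = []
class Position: pass
def run(n: int) -> bool: return n > 0
'''
types = top_level_types(example)
```

Nothing here looks inside any other module: the imported names get their types from one pass over the top-level statements of the module that defines them.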

The paragraph above contains some hand waving. Let's look at two difficulties:

The import graph

Import statements create an implicit graph of import dependencies. mypy devotes considerable code to discovering the strongly connected components (SCCs) of this graph.
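To make the graph concrete, here is a rough sketch that collects top-level imports with `ast` and groups modules into SCCs with Tarjan's algorithm. This is not mypy's code. The helper names and the in-memory `sources` dict are assumptions for illustration, and relative imports are ignored.

```python
import ast

def imported_modules(source: str) -> set[str]:
    """Return the modules named by top-level import statements only."""
    found: set[str] = set()
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module)
    return found

def strongly_connected_components(graph: dict[str, set[str]]) -> list[set[str]]:
    """Tarjan's algorithm, written recursively for brevity."""
    index: dict[str, int] = {}
    low: dict[str, int] = {}
    stack: list[str] = []
    on_stack: set[str] = set()
    result: list[set[str]] = []

    def visit(v: str) -> None:
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop the component off the stack.
            component: set[str] = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            result.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return result

# Hypothetical modules: a and b import each other, c stands alone.
sources = {"a": "import b", "b": "import a\nimport c", "c": "import os"}
graph = {
    name: {m for m in imported_modules(src) if m in sources}
    for name, src in sources.items()
}
cycles = [c for c in strongly_connected_components(graph) if len(c) > 1]
```

Any component with more than one module is a circular import: its members must be analyzed together.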

These dependencies matter because changing module A might require reanalyzing modules that import A. OTOH, changing module A only actually matters if the change involves a top-level name in A.
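One way to picture this: reduce each version of a module to its top-level "interface" and compare the two. The sketch below is an assumption-laden illustration, not an existing tool. The names `interface` and `needs_reanalysis` are hypothetical, and a class is reduced to its name only, which ignores changes to its methods.

```python
import ast

def interface(source: str) -> dict[str, str]:
    """Summarize what importers can see: top-level names and their declared types."""
    out: dict[str, str] = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            out[node.target.id] = ast.dump(node.annotation)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.dump omits line numbers by default, so only the
            # signature itself affects the summary.
            returns = ast.dump(node.returns) if node.returns else ""
            out[node.name] = ast.dump(node.args) + " -> " + returns
        elif isinstance(node, ast.ClassDef):
            # Simplification: a real tool would also summarize the class body.
            out[node.name] = "class"
    return out

def needs_reanalysis(old_source: str, new_source: str) -> bool:
    """True if importers of the module might see different types."""
    return interface(old_source) != interface(new_source)

old = "def f(n: int) -> int:\n    return n + 1\n"
body_change = "def f(n: int) -> int:\n    return n + 2\n"
signature_change = "def f(n: str) -> int:\n    return len(n)\n"
```

With these inputs, `needs_reanalysis(old, body_change)` is false and `needs_reanalysis(old, signature_change)` is true: only the second edit touches what importers can see.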

The dependency graph gives a new view of Leo. By design, the following are true:

- leoGlobals.py imports no Leo modules, only Python library modules.

- leoNodes.py imports only leoGlobals.py and signal_manager.py, which does not import any Leo modules.

- leoCommands.py only imports leoNodes.py at the top level. Commands.__init__ imports all of Leo's sub-commanders.

These constraints (and others) ensure that Leo contains no circular import dependencies.

In short, it's non-trivial to compute the dependency graph! A prototype might sidestep such complications--perhaps by using a hand-crafted dependency graph.
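A hand-crafted graph could be as simple as a dict. The sketch below encodes only the three constraints listed above (the real graph has many more modules) and uses the standard library's `graphlib` to get an analysis order and to detect cycles. The names `leo_imports` and `analysis_order` are made up for this example.

```python
from graphlib import CycleError, TopologicalSorter

# Hand-crafted: module -> the Leo modules it imports at the top level.
leo_imports: dict[str, set[str]] = {
    "leoGlobals": set(),
    "signal_manager": set(),
    "leoNodes": {"leoGlobals", "signal_manager"},
    "leoCommands": {"leoNodes"},
}

def analysis_order(graph: dict[str, set[str]]) -> list[str]:
    """Return the modules so that each one follows everything it imports.

    Raises graphlib.CycleError if the graph contains a circular import.
    """
    return list(TopologicalSorter(graph).static_order())

order = analysis_order(leo_imports)

try:
    analysis_order({"a": {"b"}, "b": {"a"}})
    has_cycle = False
except CycleError:
    has_cycle = True
```

Because the graph has no cycles, a checker can process modules in `order` and always find the top-level types of imported names already computed.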

Function arguments

The only remaining difficulty involves inferring the types of function/method arguments. The Hindley-Milner so-called "algorithm" does this. This algorithm takes a context comprising the already-inferred arguments.

I don't understand all the details. Happily, the details don't matter if (as in Leo) all arguments have annotations!
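Whether a module meets this condition is easy to check. The following sketch lists every argument that lacks an annotation. The function name is hypothetical, and skipping `self` and `cls` is a convention assumed here, not something the post specifies.

```python
import ast

def unannotated_arguments(source: str) -> list[str]:
    """Return 'function.argument' for each argument without an annotation."""
    missing: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # Concatenation builds a new list, so the AST is not modified.
            all_args = args.posonlyargs + args.args + args.kwonlyargs
            if args.vararg:
                all_args.append(args.vararg)
            if args.kwarg:
                all_args.append(args.kwarg)
            for arg in all_args:
                # Assumed convention: self and cls never need annotations.
                if arg.annotation is None and arg.arg not in ("self", "cls"):
                    missing.append(f"{node.name}.{arg.arg}")
    return missing

example = (
    "class C:\n"
    "    def m(self, a: int, b, *args, **kw: str) -> None:\n"
    "        pass\n"
)
missing = unannotated_arguments(example)
```

An empty result means no argument types need to be inferred for that module.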

Summary

Type inference and checking remain tricky. But the details involve bookkeeping, not deep theory!

The Hindley-Milner algorithm isn't needed if all arguments have annotations.

These Ahas open the door to a super-fast Leonine pylint, as I'll explain in another ENB. This Leonine pylint should be about as fast as the existing Leonine pyflakes.