If I were writing Leo's colorizer from scratch, I would scrap the way it works. This Engineering Notebook post:
- Discusses ideas that arose from my recent work with @language jupytext.
- Shows how to use these ideas to extend Leo's colorizer safely.
Terminology
Remember this crucial distinction:
- A mode file is a file in the leo/modes directory.
- Leo's colorizer is an instance of the JEditColorizer class in leo/core/colorizer.py.
Tokenizing mode files
When importing a mode file, say x.py, Leo's colorizer will first look for a file named token_x.py. Let's call such files tokenizing mode files. Such files must have top-level init_text and colorize_line functions. Tokenizing mode files may also have other top-level functions, as described below.
When colorizing with a tokenizing mode file, the colorizer will call init_text(p.b) whenever p changes.
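The lookup described above might be sketched as follows. This is only an illustration under stated assumptions: the helper name load_tokenizing_mode and its arguments are hypothetical, and the real change would live inside JEditColorizer.init_mode.

```python
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Optional

# The two top-level functions every tokenizing mode file must define.
REQUIRED_FUNCTIONS = ('init_text', 'colorize_line')

def load_tokenizing_mode(modes_dir: str, language: str) -> Optional[ModuleType]:
    """
    Hypothetical helper: return the module for token_<language>.py,
    or None if the file is missing or lacks a required function.
    """
    path = Path(modes_dir) / f"token_{language}.py"
    if not path.is_file():
        return None  # The caller falls back to the existing mode file.
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        return None
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Accept the module only if both required functions are callable.
    if all(callable(getattr(module, name, None)) for name in REQUIRED_FUNCTIONS):
        return module
    return None
```

Returning None in every failure case keeps the fallback simple: the caller loads the existing mode file exactly as it does today.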
To colorize each line, the colorizer will use this new main loop:
This new main loop should be much faster than the existing main loop! It colorizes by lines, not characters!
Design of tokenizing mode files
Tokenizing mode files will keep track of their own state.
Most mode files will be line-oriented tokenizers. token_python.py might even use Python's tokenize module! The speedup should be significant.
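As a rough illustration of what Python's standard tokenize module provides, the sketch below tokenizes a whole body once and groups the tokens by line. The function name tokens_by_line and the choice to skip multi-line tokens are assumptions made for this example, not part of any planned token_python.py.

```python
import io
import tokenize
from collections import defaultdict

# Layout tokens that carry no text worth coloring.
SKIPPED = (
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.ENDMARKER,
)

def tokens_by_line(text: str) -> dict[int, list[tuple[int, int, str]]]:
    """Map each 1-based line number to a list of (start_col, end_col, kind)."""
    result: dict[int, list[tuple[int, int, str]]] = defaultdict(list)
    try:
        for tok in tokenize.generate_tokens(io.StringIO(text).readline):
            (start_row, start_col), (end_row, end_col) = tok.start, tok.end
            if tok.type in SKIPPED or start_row != end_row:
                continue  # This sketch ignores layout and multi-line tokens.
            kind = tokenize.tok_name[tok.type]
            result[start_row].append((start_col, end_col, kind))
    except (tokenize.TokenError, SyntaxError):
        # Incomplete code is normal while editing: keep the tokens found so far.
        pass
    return dict(result)
```

A real tokenizing mode file would also have to color multi-line strings and recover more gracefully from incomplete code, both of which this sketch sidesteps.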
The top-level colorize_line(colorizer, s, i) function will call colorizer.colorRangeWithTag to do the actual coloring. Tokenizing mode files will know nothing else about the colorizer! No more calls to the colorizer's weird pattern matchers! No more interface dictionaries!
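A minimal sketch of such a colorize_line function follows. It assumes that s is the text of the line, that i is the index at which scanning starts, and that colorRangeWithTag takes (s, i, j, tag). The regex, the tag names, and the RecordingColorizer stand-in are all hypothetical and exist only to make the example runnable.

```python
import re

# Toy patterns for illustration only. Each group name doubles as the tag name.
TOKEN_RE = re.compile(
    r"(?P<comment1>#.*)"
    r"|(?P<literal1>\"[^\"]*\"|'[^']*')"
    r"|(?P<keyword1>\b(?:def|return|if|else)\b)"
)

def colorize_line(colorizer, s, i):
    """Color every token in s, starting the scan at index i."""
    for m in TOKEN_RE.finditer(s, i):
        # The only contact with the colorizer: one call per token.
        colorizer.colorRangeWithTag(s, m.start(), m.end(), m.lastgroup)

class RecordingColorizer:
    """Hypothetical stand-in for Leo's colorizer: records calls instead of coloring."""

    def __init__(self):
        self.calls = []

    def colorRangeWithTag(self, s, i, j, tag):
        self.calls.append((s[i:j], tag))
```

Because colorize_line touches the colorizer through a single method, it can be exercised with a stand-in like the one above, without starting Leo.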
Delegation between tokenizing mode files suddenly becomes straightforward! Tokenizing mode files will call the top-level init_delegated_state function of another tokenizing mode file.
Safety
Experimenting with this scheme carries minimal risk:
- No existing mode files will change! I'll only add new tokenizing mode files.
- Only JEditColorizer.init_mode will change. It will contain new code that loads tokenizing mode files if they exist. All other parts of this method will remain unchanged.
Summary
This scheme is worth investigating:
- Experiments carry no significant risk.
- Colorizing should be much faster.
- Tokenizing mode files will know almost nothing about Leo's colorizer.
- Delegation between tokenizing mode files should collapse in complexity.
Edward
P.S. Tokenizing mode files might call helper modes defined in leo/modes/_helper_modes.py.
Helper modes might handle Leo keywords and decorators. Delegating colorizing to helper modes would provide an early test of the new delegation scheme.
EKR
> To colorize each line, the colorizer will use this new main loop:
I forgot that the new main loop must update the QSyntaxHighlighter state! Like this:
state = self.new_mode_module.colorize_line(s, n)
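To show how that line might fit into a complete per-line step, here is a hedged sketch. It assumes that n is the previous block's state and that the returned state is handed back to the highlighter. StubHighlighter mimics only the previousBlockState and setCurrentBlockState methods of QSyntaxHighlighter. The toy tokenizer, SketchColorizer, the advance method, and states_for are hypothetical.

```python
from types import SimpleNamespace

NORMAL, IN_STRING = 0, 1

def toy_colorize_line(s, n):
    """Toy tokenizer: an odd number of triple quotes flips the in-string state."""
    in_string = (n == IN_STRING)
    if s.count('"""') % 2 == 1:
        in_string = not in_string
    return IN_STRING if in_string else NORMAL

class StubHighlighter:
    """Stand-in for QSyntaxHighlighter; block states start at -1, as in Qt."""

    def __init__(self):
        self.previous = -1
        self.current = -1

    def previousBlockState(self):
        return self.previous

    def setCurrentBlockState(self, state):
        self.current = state

    def advance(self):
        # Hypothetical: move to the next block, as Qt does between lines.
        self.previous, self.current = self.current, -1

class SketchColorizer:
    def __init__(self, mode_module, highlighter):
        self.new_mode_module = mode_module
        self.highlighter = highlighter

    def highlight_line(self, s):
        n = self.highlighter.previousBlockState()
        state = self.new_mode_module.colorize_line(s, n)
        self.highlighter.setCurrentBlockState(state)

def states_for(lines):
    """Run the sketch over several lines and return the state set for each."""
    h = StubHighlighter()
    c = SketchColorizer(SimpleNamespace(colorize_line=toy_colorize_line), h)
    result = []
    for line in lines:
        c.highlight_line(line)
        result.append(h.current)
        h.advance()
    return result
```

For example, states_for(['x = """abc', 'still in string', 'end"""', 'y = 1']) returns [1, 1, 0, 0]: the state carries the open string across the first two lines and clears once the string closes.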