Leonistas have long endured second-rate coloring for html. Leo's colorer renders <script> and <style> elements in a single color! Why has nobody complained?
I noticed this botch when preparing to do #726: @language vue. It looks like the mode file, html.py, correctly specifies what to do, but the jedit class makes a mess of things.
There seem to be two ways forward.
Modify mode files?
The mode files html.py, css.py, and javascript.py might cooperate explicitly to switch between (implicit) @language directives.
jupytext.py does something similar. The mode files python.py and md.py change the effective @language when they see jupytext comments.
However, this scheme looks like a dead end. Changing rule functions is error-prone, ugly, and does not generalize well.
A stack of delegated modes?
The rules in various mode files should work if the jedit class changes how it handles the "delegate" kwarg. The general idea is as follows:
The colorizer would maintain a stack of delegated modes. For example, jedit would switch from "html" mode to "javascript" mode when it saw "delegate"="javascript".
Somehow, the colorer must pop the stack when the javascript mode sees the ending `>`.
Heh. The match_span and match_span_regex pattern matchers (and their helpers) are probably the methods that should pop the stack. This scheme might work!
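Here is a minimal sketch of the stack idea, not Leo's actual code. ModeStack and color_spans are hypothetical names invented for illustration, and the real match_span takes different arguments. The sketch only shows a matcher pushing the delegate when a span begins and popping it when the span ends:

```python
class ModeStack:
    """Hypothetical stack of coloring modes; the bottom entry is the base language."""

    def __init__(self, base_mode):
        self.modes = [base_mode]

    @property
    def current(self):
        return self.modes[-1]

    def push(self, mode):
        self.modes.append(mode)

    def pop(self):
        # Never pop the base mode.
        if len(self.modes) > 1:
            self.modes.pop()


def color_spans(s, stack, begin, end, delegate):
    """Split s into (mode, text) pieces, delegating the text between begin and end.

    A toy stand-in for a span matcher: it handles at most one span per string.
    """
    i = s.find(begin)
    if i == -1:
        return [(stack.current, s)]
    start = i + len(begin)
    j = s.find(end, start)
    if j == -1:
        j = len(s)  # No end delimiter: delegate the rest of the string.
    pieces = [(stack.current, s[:start])]
    stack.push(delegate)  # Enter the delegated mode after the begin delimiter.
    pieces.append((stack.current, s[start:j]))
    stack.pop()  # Leave the delegated mode at the end delimiter.
    pieces.append((stack.current, s[j:]))
    return [(mode, text) for mode, text in pieces if text]


stack = ModeStack('html')
pieces = color_spans('<script>var x = 1;</script>', stack,
                     '<script>', '</script>', 'javascript')
```

In this toy version the span lives on one line. A real colorer would have to keep the stack alive across lines, which is where the difficulty lies.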
Summary
Changing mode files looks like a dead end. It worked well enough for @language jupytext, but this scheme is ugly and error-prone.
Modifying two jedit pattern matchers to support delegated modes would be an elegant and general solution. I'll attempt to make this scheme work, but there is no guarantee of success. Stay tuned.
Edward
Parsing html and xml can be hard. For example, the javascript in a <script> element might build a string that includes "</script>". It's hard to prevent a regex from ending its match at that point.
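A small illustration of the problem, using only Python's standard re module. The sample string is made up for this example. A non-greedy pattern stops at the first "</script>", even when that text sits inside a javascript string literal:

```python
import re

# Made-up sample: the javascript builds a string containing "</script>".
html = '<script>var s = "</script>"; run(s);</script>'

# The non-greedy match ends at the first "</script>" it finds.
m = re.search(r'<script>(.*?)</script>', html, re.DOTALL)
body = m.group(1)
print(repr(body))
```

The captured body is cut off in the middle of the string literal, so everything after it would be colored as html.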
On Tue, Nov 19, 2024 at 9:50 AM Thomas Passin wrote:
Parsing html and xml can be hard. For example, the javascript in a <script> element might build a string that includes "</script>". It's hard to prevent a regex from ending its match at that point.
I'm not concerned about such details now. The present scheme bulldozes/ignores such niceties.
Aha: The mode files can tell the colorizer when to pop the stack. Something like:
colorizer.end_delegated_mode()
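A rough sketch of how a rule in a mode file might make that call. Everything here is an assumption: StubColorizer stands in for the real colorizer, javascript_end_script_rule is a hypothetical rule name, and only end_delegated_mode comes from the proposal above.

```python
class StubColorizer:
    """Hypothetical stand-in for the colorizer, already delegated to javascript."""

    def __init__(self):
        self.mode_stack = ['html', 'javascript']

    def end_delegated_mode(self):
        # Pop the delegated mode, but never the base mode.
        if len(self.mode_stack) > 1:
            self.mode_stack.pop()


def javascript_end_script_rule(colorizer, s, i):
    """Hypothetical javascript rule: leave the delegated mode at '</script>'.

    Returns 0 (no characters matched) so that html rules can color the tag.
    """
    if s.startswith('</script>', i):
        colorizer.end_delegated_mode()
    return 0


colorizer = StubColorizer()
line = 'x = 1;</script>'
javascript_end_script_rule(colorizer, line, 0)  # No tag here: the stack is unchanged.
javascript_end_script_rule(colorizer, line, 6)  # At the tag: pops back to html.
```

With this arrangement the mode file decides when the delegation ends, and the colorizer only has to expose the pop.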