ENB: Fixing delegated syntax coloring


Edward K. Ream

Nov 19, 2024, 8:56:58 AM
to leo-editor

Leonistas have long endured second-rate coloring for html. Leo's colorer renders <script> and <style> elements in a single color! Why has nobody complained?


I noticed this botch when preparing to do #726: @language vue. It looks like the mode file, html.py, correctly specifies what to do, but the jedit class makes a mess of things.


There seem to be two ways forward.


Modify mode files?


The mode files html.py, css.py, and javascript.py might cooperate explicitly to switch between (implicit) @language directives.


jupytext.py does something similar. The mode files python.py and md.py change the effective @language when they see jupytext comments.


However, this scheme looks like a dead end. Changing rule functions is error-prone, ugly, and does not generalize well.


A stack of delegated modes?


The rules in various mode files should work if the jedit class changes how it handles the "delegate" kwarg. The general idea is as follows:


The colorizer would maintain a stack of delegated modes. For example, jedit would switch from "html" mode to "javascript" mode when it saw delegate="javascript".


Somehow, the colorer must pop the stack when the javascript mode sees the ending `>`.


Heh. The match_span and match_span_regex pattern matchers (and their helpers) are probably the methods that should pop the stack. This scheme might work!
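A minimal sketch of the stack idea, assuming a toy line-by-line scanner rather than Leo's real jedit class. The function name modes_per_line and the tag tables are hypothetical; they only stand in for match_span rules that carry a delegate kwarg:

```python
def modes_per_line(lines, base_mode="html"):
    """Return the mode that would color each line of a toy document."""
    # Hypothetical triggers standing in for rules with a delegate kwarg.
    delegates = {"<script>": "javascript", "<style>": "css"}
    closers = {"</script>", "</style>"}
    stack = []  # Delegated modes, innermost last.
    result = []
    for line in lines:
        tag = line.strip()
        if tag in closers and stack:
            stack.pop()  # The closing tag is colored by the outer mode.
        result.append(stack[-1] if stack else base_mode)
        if tag in delegates:
            stack.append(delegates[tag])  # Later lines use the delegate.
    return result


sample = [
    "<p>hi</p>",
    "<script>",
    "let x = 1;",
    "</script>",
    "<style>",
    "p {color: red;}",
    "</style>",
]
result = modes_per_line(sample)
```

Here the script body is assigned to javascript, the style body to css, and every tag line to html.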


Summary


Changing mode files looks like a dead end. It worked well enough for @language jupytext, but this scheme is ugly and error-prone.


Modifying two jedit pattern matchers to support delegated modes would be an elegant and general solution. I'll attempt to make this scheme work, but there is no guarantee of success. Stay tuned.


Edward

Edward K. Ream

Nov 19, 2024, 10:33:18 AM
to leo-e...@googlegroups.com
On Tue, Nov 19, 2024 at 7:57 AM Edward K. Ream <edre...@gmail.com> wrote:

The colorizer would maintain a stack of delegated modes. For example, jedit would switch from "html" mode to "javascript" mode when it saw delegate="javascript".


Somehow, the colorer must pop the stack when the javascript mode sees the ending `>`.


Aha: The mode files can tell the colorizer when to pop the stack. Something like:

colorizer.end_delegated_mode()

end_delegated_mode will pop the mode stack, but only if the mode stack isn't empty.

This scheme is easy and safe!
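Roughly, and only as a sketch: the Colorizer class below is a hypothetical stand-in, not Leo's actual colorizer, and begin_delegated_mode and active_mode are assumed counterparts to end_delegated_mode. The point is that a stray call on an empty stack does nothing:

```python
class Colorizer:
    """Hypothetical stand-in; only the mode stack is modeled."""

    def __init__(self, language: str) -> None:
        self.language = language  # The mode from @language.
        self.mode_stack: list[str] = []  # Delegated modes, innermost last.

    def begin_delegated_mode(self, delegate: str) -> None:
        self.mode_stack.append(delegate)

    def end_delegated_mode(self) -> None:
        # Pop only if a delegation is active; otherwise a safe no-op.
        if self.mode_stack:
            self.mode_stack.pop()

    def active_mode(self) -> str:
        return self.mode_stack[-1] if self.mode_stack else self.language


colorizer = Colorizer("html")
colorizer.end_delegated_mode()  # Stray call: nothing to pop, no error.
before = colorizer.active_mode()
colorizer.begin_delegated_mode("javascript")
during = colorizer.active_mode()
colorizer.end_delegated_mode()
after = colorizer.active_mode()
```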

Edward

Thomas Passin

Nov 19, 2024, 10:50:29 AM
to leo-editor
Parsing html and xml can be hard.  For example, the javascript in a <script> element might build a string that includes "</script>". It's hard to prevent a regex from ending its match at that point.
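A small illustration of that problem, assuming a naive non-greedy pattern (the pattern is made up for this example and is not taken from html.py):

```python
import re

html = '<script>var s = "</script>"; alert(s);</script>'

# The non-greedy match stops at the first "</script>", which here
# sits inside a javascript string literal.
match = re.search(r"<script>(.*?)</script>", html, re.DOTALL)
body = match.group(1)
```

The captured body is cut off in the middle of the string literal, so everything after it would be colored as html.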

Edward K. Ream

Nov 19, 2024, 11:12:06 AM
to leo-e...@googlegroups.com
On Tue, Nov 19, 2024 at 9:50 AM Thomas Passin <tbp1...@gmail.com> wrote:
Parsing html and xml can be hard.  For example, the javascript in a <script> element might build a string that includes "</script>". It's hard to prevent a regex from ending its match at that point.

I'm not concerned about such details now. The present scheme bulldozes/ignores such niceties.

Edward

Edward K. Ream

Nov 19, 2024, 2:33:42 PM
to leo-e...@googlegroups.com
On Tue, Nov 19, 2024 at 10:11 AM Edward K. Ream wrote:
On Tue, Nov 19, 2024 at 9:50 AM Thomas Passin wrote:
Parsing html and xml can be hard.  For example, the javascript in a <script> element might build a string that includes "</script>". It's hard to prevent a regex from ending its match at that point.

I'm not concerned about such details now. The present scheme bulldozes/ignores such niceties.

To clarify my remarks, consider this test snippet:

@language html
<style>
body {background-color: powderblue;}
h1   {color: blue;}
p    {color: red;}
</style>

Everything is colored blue, a direct result of Leo's present delegation scheme.

But change @language html to @language css and you will see a better coloring.

This behavior is exactly what I expect.

A few hours of work on the "nested modes" scheme should make a big difference. All the details are becoming clear to me. Stay tuned.

Edward


Edward K. Ream

Nov 19, 2024, 6:25:52 PM
to leo-editor
On Tuesday, November 19, 2024 at 9:33:18 AM UTC-6 Edward K. Ream wrote:

Aha: The mode files can tell the colorizer when to pop the stack. Something like:
colorizer.end_delegated_mode()

Mode files that contain such calls should have a top-level variable named supports_nested_modes.

The colorizer will call colorizer.begin_delegated_mode only if the mode file contains that top-level variable.
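One way the check might look, as a sketch only. The helper name mode_supports_nesting is hypothetical, and SimpleNamespace objects stand in for imported mode modules such as html.py:

```python
from types import SimpleNamespace


def mode_supports_nesting(mode_module) -> bool:
    """True if the mode module opts in via supports_nested_modes."""
    # Mode files without the variable keep the old behavior.
    return bool(getattr(mode_module, "supports_nested_modes", False))


# Stand-ins for imported mode files (assumptions for this example).
new_style_mode = SimpleNamespace(supports_nested_modes=True)
old_style_mode = SimpleNamespace()

opted_in = mode_supports_nesting(new_style_mode)
opted_out = mode_supports_nesting(old_style_mode)
```

Using getattr with a default means existing mode files need no changes until they opt in.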

In short, all the details seem settled!

Edward

Edward K. Ream

Nov 20, 2024, 6:20:31 AM
to leo-editor
On Tuesday, November 19, 2024 at 5:25:52 PM UTC-6 Edward K. Ream wrote:

> In short, all the details seem settled!

I spoke way too soon. Both main options are still on the table.

My uncertainty has increased after studying the three mode files: html.py, javascript.py and css.py.

As a preliminary step, PR #4202 refactors the mode files. I merged this PR early to reduce the diffs in later work.

Edward