ENB: About diffing scripts

Edward K. Ream

May 9, 2025, 5:11:41 AM
to leo-editor
This Engineering Notebook post explains the hidden background of PR #4351. This PR beautifies any script (or any node) and its subtree.

This ENB compares two ways of beautifying a tree of nodes. This ENB will likely be of interest only to Leo's core devs. Please feel free to ignore it.

Two approaches to beautification

The PR beautifies the script node-by-node using the tbo.beautify_script_node method. This method handles the actual body text of each node. I am satisfied that the node-by-node approach is best. It is also sound, meaning that a crucial hack in beautify_script_node is unlikely ever to cause problems. I'll say no more about this method.

An earlier approach beautified the whole script (computed in c.executeScript) using the tbo.propagate_script_changes method. This method handles the text (including sentinels) produced by g.getScript. The propagate_script_changes method is not part of the final PR, but rev 7313624 contains the last (abandoned) version.

I abandoned propagate_script_changes because I thought that its algorithm had to be the inverse of the (complex!) algorithm in at.putBody. There was no way I was going there. But this morning I realized that the task isn't nearly as daunting as I first supposed. The following section explains why.

Simplifying propagate_script_changes

Recall that the input to this method is the beautified text of the whole script. Unlike the input to the node-by-node approach, this text contains Leo's sentinels.

Here is a sketch of the algorithm:

Pass 1: Using Leo's sentinels, extract the raw body text of each node. This text:

- Starts at an @+node sentinel comment and continues to the next @+node sentinel.
- May contain the expansion of @others, delimited by @+others and @-others sentinels.
- May contain the expansion of section references, delimited by @+<< and @-<< sentinels.

Pass 2: For each raw body text, generate the final body text as follows:

- Replace the lines delimited by the @+others sentinel and the corresponding @-others sentinel by a properly indented @others directive.
- Replace the lines delimited by the @+<< sentinel and the corresponding @-<< sentinel by a properly indented section reference.
- Replace every other sentinel line in the raw body text with the text it stands for. For example, a directive sentinel becomes the directive itself.

That's all! The details hardly matter.
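For the curious, here is a minimal sketch of the scheme above, with both passes folded into one stack-based scan. It is an illustration only, not the code in rev 7313624. It assumes Python-style sentinels of the form #@+node:<gnx>:, #@+others, #@+<< name >> and #@@<directive>. The name split_bodies and the sample gnx values are hypothetical. The sketch skips every sentinel it does not recognize, so it does not handle cases such as @verbatim.

```python
import re

# Assumed sentinel shapes, using '#' as the comment delimiter.
NODE = re.compile(r'^\s*#@\+node:([^:]+):')            # start of a node; group 1 is the gnx
OPEN = re.compile(r'^(\s*)#@\+(others|<<.*>>)\s*$')     # start of an @others or section expansion
CLOSE = re.compile(r'^\s*#@-(others|<<.*>>)\s*$')       # end of that expansion
DIRECTIVE = re.compile(r'^\s*#@(@.*)$')                 # e.g. '#@@language python'
SENTINEL = re.compile(r'^\s*#@')                        # any other sentinel


def split_bodies(script):
    """Return {gnx: list of body lines} for a script that contains sentinels.

    Hypothetical helper: a sketch of the idea, not part of Leo.
    """
    bodies = {}
    current = None   # gnx of the node now receiving lines
    indent = ''      # whitespace contributed by enclosing expansions
    stack = []       # (gnx, indent) of suspended parent nodes
    for line in script.splitlines():
        m = NODE.match(line)
        if m:
            current = m.group(1)
            bodies[current] = []
            continue
        if current is None:
            continue  # skip anything before the first node, such as '#@+leo...'
        m = OPEN.match(line)
        if m:
            lws, kind = m.group(1), m.group(2)
            directive = '@others' if kind == 'others' else kind
            # The parent sees one properly indented directive or reference.
            bodies[current].append(lws[len(indent):] + directive)
            stack.append((current, indent))
            indent = lws
        elif CLOSE.match(line):
            current, indent = stack.pop()  # resume the parent node
        elif DIRECTIVE.match(line):
            bodies[current].append(DIRECTIVE.match(line).group(1))
        elif SENTINEL.match(line):
            pass  # this sketch ignores all other sentinels
        else:
            # Ordinary text: remove the indentation added by the expansions.
            text = line[len(indent):] if line.startswith(indent) else line.lstrip()
            bodies[current].append(text)
    return bodies


SCRIPT = "\n".join([
    "#@+leo-ver=5-thin",
    "#@+node:ekr.1: * @button demo",
    "#@@language python",
    "#@+<< imports >>",
    "#@+node:ekr.2: ** << imports >>",
    "import os",
    "#@-<< imports >>",
    "class A:",
    "    #@+others",
    "    #@+node:ekr.3: ** A.f",
    "    def f(self):",
    "        return os.sep",
    "    #@-others",
    "#@-leo",
])

for gnx, lines in split_bodies(SCRIPT).items():
    print(gnx, lines)
```

The stack is what keeps this simple: an opening sentinel suspends the parent node and a closing sentinel resumes it, so nothing here needs to invert at.putBody.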

Summary

The tbo.beautify_script_node method beautifies each node separately. I am satisfied that it is sound. It is more than good enough.

The tbo.propagate_script_changes method isn't as infeasible as I first thought. A straightforward scheme can reconstruct each node's body text. This scheme might be applicable in other contexts. It's worth keeping in mind.

Edward