The fruitful collaboration with Thomas continues. This Engineering Notebook post discusses adapting Thomas's prototype script to enhance how Leo handles @jupytext nodes. This enhancement:
- will be rock solid.
- can cover edge cases with ease.
- will be invisible when viewed in a Jupyter notebook.
- should be complete in a day or three.
Background
The jupytext library can translate an .ipynb file to pseudo-Python text. Thomas's script:
- starts with an @jupytext node whose body text (root.b) contains all the pseudo-Python code.
- splits root.b into chunks.
- adds a child node for each chunk.
- clears (most of) root.b and adds an @others directive.
The details of this script don't much matter. It's the idea that counts.
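To make the idea concrete, here is a minimal sketch of the splitting step. It is not Thomas's actual script. It assumes that each chunk begins at a line starting with the "# %%" cell marker, and the name split_into_chunks is hypothetical.

```python
def split_into_chunks(text: str) -> list[str]:
    """Split jupytext-style text into chunks, one per '# %%' cell marker.

    Assumption: a chunk begins at each line that starts with '# %%'.
    Any text before the first marker becomes its own leading chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines(keepends=True):
        # A cell marker starts a new chunk, unless nothing has been collected yet.
        if line.startswith('# %%') and current:
            chunks.append(''.join(current))
            current = []
        current.append(line)
    if current:
        chunks.append(''.join(current))
    return chunks


sample = (
    "# %%\n"
    "2 + 666 + 4\n"
    "# %%\n"
    "print('hi')\n"
    "# %% [markdown]\n"
    "# This is a markdown cell\n"
)
chunks = split_into_chunks(sample)
```

Because every line lands in exactly one chunk, joining the chunks gives back the original text.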
Overview of enhanced @jupytext processing
Leo currently reads .ipynb files into the body text of @jupytext nodes, so Thomas's script starts where Leo's @jupytext read code ends. Inspired by the script, the enhanced read code will go on to create child nodes, as described in detail below.
Safety
The enhanced @jupytext processing will work like one of Leo's importers.
Vitalije's great insight is that importers will round-trip correctly, provided the nodes they produce tile the incoming text without gaps. Let's call this the tiling property. The main line of Thomas's script meets this requirement and so will the enhancement.
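The tiling property is easy to state as a check. The sketch below is only an illustration under simplifying assumptions: it treats the node bodies as a flat list in outline order and ignores how Leo expands @others. The name first_tiling_error is hypothetical.

```python
from typing import Optional


def first_tiling_error(text: str, bodies: list[str]) -> Optional[int]:
    """Return the offset in text at which the bodies stop tiling it.

    Return None if the bodies, taken in order, cover text exactly,
    with no gaps, no overlaps and nothing left over.
    """
    offset = 0
    for body in bodies:
        # Each body must match the text exactly where the previous body ended.
        if not text.startswith(body, offset):
            return offset
        offset += len(body)
    # Leftover text after the last body is also a gap.
    return None if offset == len(text) else offset
```

An importer that passes this check for every input can be written back out without losing or reordering text.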
Splitting text into nodes
We can imagine various ways of creating child nodes. Let's not worry about the details just now. All that matters is that the tiling property holds.
I'll pick one way that seems most Leonine. If there are differences of opinion, we might add a new Leo setting.
Creating outline structure
Leo could make all nodes direct children of the root @jupytext node. Users could easily reorder the nodes as they please. However, it should be worthwhile to create nodes that follow the implied hierarchy of markdown sections.
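One way to find the implied level of a chunk is to look for a markdown heading inside a markdown cell. The sketch below is an assumption about how that might be done, not settled design. It assumes markdown cells start with "# %% [markdown]" and that each markdown line carries a "# " comment prefix, so a second-level heading appears as "# ## Title". The name heading_level is hypothetical.

```python
import re

# A commented-out markdown heading: '# ' prefix, 1 to 6 '#' characters, then text.
HEADING_PATTERN = re.compile(r'^# (#{1,6})\s+\S')


def heading_level(chunk: str) -> int:
    """Return the level (1-6) of the first markdown heading in a markdown cell.

    Return 0 for code cells and for markdown cells without a heading.
    """
    lines = chunk.splitlines()
    if not lines or not lines[0].startswith('# %% [markdown]'):
        return 0
    for line in lines[1:]:
        m = HEADING_PATTERN.match(line)
        if m:
            return len(m.group(1))
    return 0
```

A real importer would also have to decide where chunks with level 0 belong. This sketch leaves that choice open.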
Some nodes of the hierarchy may be missing, but that's not a problem. The importer will create organizer nodes for each missing level. This scheme "just works" because organizer nodes are invisible to the .ipynb file.
Proof: @jupytext works like @clean, so Leo never writes headlines to the .ipynb file. Organizer nodes contain no text, so again they contribute nothing to the .ipynb file.
It's easy to create organizer nodes. A stack contains the positions of the last seen node at each level. Initially, the stack contains the root @jupytext level. When adding a new node, the importer will:
- Cut the stack back if the new level is less than the old.
- Replace the top of the stack if the new level remains unchanged.
- Create organizer nodes as needed if the new level is greater than the old.
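The three stack rules above can be sketched as follows. This is a standalone illustration, not Leo code: the Node class is a hypothetical stand-in for Leo's positions, add_node is a hypothetical name, and the sketch does not use Leo's Importer class. It assumes levels start at 1 for direct children of the root.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a Leo outline node."""
    headline: str
    body: str = ''
    children: list['Node'] = field(default_factory=list)


def add_node(stack: list[Node], level: int, headline: str, body: str) -> Node:
    """Add a node at the given level, creating organizer nodes as needed.

    stack[0] is the root; stack[n] is the last node seen at level n.
    Assumption: level >= 1.
    """
    # Cut the stack back so its top is the parent of the new node.
    # This handles both a smaller level and an unchanged level.
    del stack[level:]
    # Create empty organizer nodes for each missing intermediate level.
    while len(stack) < level:
        organizer = Node(headline=f'organizer: level {len(stack)}')
        stack[-1].children.append(organizer)
        stack.append(organizer)
    node = Node(headline, body)
    stack[-1].children.append(node)
    stack.append(node)
    return node


def bodies_in_outline_order(node: Node) -> list[str]:
    """Return all body texts in outline order."""
    result = [node.body]
    for child in node.children:
        result.extend(bodies_in_outline_order(child))
    return result


root = Node('@jupytext demo.ipynb')
stack = [root]
a = add_node(stack, 1, 'A', 'a\n')
n = add_node(stack, 3, 'N', 'n\n')  # Level 2 is missing: an organizer is created.
b = add_node(stack, 1, 'B', 'b\n')
joined = ''.join(bodies_in_outline_order(root))
```

The organizer under A has an empty body, so joining the bodies in outline order yields only the text of A, N and B.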
I've written this kind of code many times. Indeed, the base Importer class defines the i.create_placeholder method. The jupytext importer might use that method! And perhaps others.
Summary
Thomas's script shows how easy it is to split @jupytext text into nodes. Thank you Thomas!
The enhanced code constitutes a new importer. This importer:
- will be part of Leo's @jupytext support. There is no need for a separate command.
- will be rock solid because it will preserve the tiling property.
- will honor the implied hierarchy of markdown nodes, creating organizer nodes as needed.
- should be complete in a day or three. This is not my first rodeo.
Finally, the newly created outline structure will be invisible in Jupyter notebooks. Let the wild rumpus start!
All of your questions and comments are welcome.
Edward
I have just created PR #4138. The first comment of this PR discusses (at length!) how to connect the new importer to Leo's existing @jupytext code. Unless I am mistaken, the connection already works correctly!
The only remaining task is to complete jtm.create_outline. This method will be a straightforward adaptation of Thomas's script, as discussed earlier in this thread.
Edward
I would like to make one more push for eliminating the '#' characters that comment out each and every line of a jupytext line.
On Tue, Oct 29, 2024 at 8:11 AM Thomas Passin <tbp1...@gmail.com> wrote:
I would like to make one more push for eliminating the '#' characters that comment out each and every line of a jupytext line.
Here is a snippet from my test file:
# %%
2 + 666 + 4
# %%
print('hi changed externally')
# %% [markdown]
# This is a markdown cell
The two Python lines do not start with '#'. If we remove comment lines it's likely to be challenging to put them back.