A 21+ hour day working on the markdown importer


Edward K. Ream

Nov 26, 2016, 7:53:31 AM
to leo-editor
Up at 2am.  Down at 11:30pm, with a long walk for a break.

tl;dr: read the summary

This is a long status report.  Feel free to ignore. However, it does discuss an important change, visible to users, in how Leo handles markdown outlines. And it does discuss some unit testing issues that may be of interest to devs.  So you may want to read the summary ;-)

On my walk I was fortunate to stumble upon this year's last practice of the University of Wisconsin marching band. I often listen to the band for an afternoon break.

After the practice, the band members sang to the seniors, to the student leader, and to Mike Leckrone, the band's director since 1969.  As I was walking away, the band broke into a gorgeous, moving rendition of Auld Lang Syne.

I had expected to complete the markdown importer easily.  Instead, I got mired in multiple issues.

Markdown issues

1. The old importer would break if anyone actually changed the outline and saved it! Indeed, the write code depended on uA's set by the read code.  But the uA's would become incomplete the instant anyone added a node!  Brain dead.

Instead, the new code supports the following convention in the outline.

- Headlines beginning with '=' or '-' will generate underlined sections.
- Headlines beginning with '!' will not create a section.
- All other headlines create sections delimited by 1 or more '#' characters.

In practice, this will encourage the preferred form of markdown sections.
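The convention above can be sketched as a small function. This is only an illustration, not the code in leo.importers.writers.markdown: the function name section_lines is hypothetical, and it assumes that the leading '=' or '-' is a marker to be stripped from the headline and that an underline is at least 4 characters long.

```python
def section_lines(headline: str, level: int) -> list[str]:
    """Return the markdown lines that introduce a section for one headline.

    Hypothetical helper: assumes the first character of the headline is the
    convention marker and is not part of the section title.
    """
    if headline.startswith('!'):
        # '!' headlines never create a section.
        return []
    if headline[:1] in ('=', '-'):
        # Underlined section: the title, then a row of '=' or '-'.
        ch = headline[0]
        title = headline[1:].strip()
        return [title, ch * max(len(title), 4)]
    # Everything else becomes a '#' section at the node's level.
    return ['#' * level + ' ' + headline.strip()]
```

For example, under these assumptions a level-2 node with the headline Usage would start with the line "## Usage", while a node with the headline =Title would start with Title followed by a row of '=' characters.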

Note: The importer must generate ! headlines in order to handle lines that precede the first section line.  But it is useless to define a ! node anywhere else.  When re-reading the file, the importers will stick the lines at the end of the preceding node.

This new convention required a change to the write code in leo.importers.writers.markdown.  It was complicated by...

2. Things seemed to be changing for no reason!

At last I realized that there were multiple culprits, all in the base Importer class. The prime offender was i.post_pass.  Doh!  It's supposed to change text.  So the markdown importer disables it by defining a do-nothing override.

Similarly, the markdown importer overrides i.clean_headline and i.v2_create_child_node so they don't strip the headline. This was hard to see at the time, and made life difficult.
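As a rough sketch of the pattern, here is a do-nothing override in a subclass. The Importer class below is a stand-in written for this example, not Leo's real base class, and the signatures of post_pass and clean_headline are assumptions.

```python
class Importer:
    """Stand-in base class: both hooks modify the text they are given."""

    def clean_headline(self, s: str) -> str:
        return s.strip()

    def post_pass(self, lines: list[str]) -> list[str]:
        # Stand-in cleanup: strip trailing whitespace from every line.
        return [line.rstrip() for line in lines]


class MarkdownImporter(Importer):
    """Markdown must round-trip unchanged, so both hooks become no-ops."""

    def clean_headline(self, s: str) -> str:
        return s  # Do not strip the headline.

    def post_pass(self, lines: list[str]) -> list[str]:
        return lines  # Do-nothing override.
```

The overrides are trivial, but without them the inherited methods silently rewrite text, which is exactly the kind of change that is hard to trace back to the base class.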

Similar changes should be made to the otl and org-mode importers.  I'll do this today.

In short, base classes help reduce redundant code, but they are not without their own problems.

3. The parser, md_i.v2_gen_lines, is brand new.  It uses regexes to simplify matters.  However, parsing is still not good, even after all yesterday's work. But we're getting closer.

The present code will delete all input lines consisting only of 4 or more '-' or '=' characters.  This is wrong.  They must be retained unless they immediately follow a non-empty line that is shorter than the underlining itself.
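The corrected rule might look something like the sketch below. It is not the code in md_i.v2_gen_lines: the names UNDERLINE, is_section_underline and filter_lines are hypothetical, and "shorter than" is taken literally, so a title exactly as long as its underlining does not count.

```python
import re

# A line made up only of 4 or more '=' or 4 or more '-' characters.
UNDERLINE = re.compile(r'^(={4,}|-{4,})\s*$')

def is_section_underline(prev_line: str, line: str) -> bool:
    """True if line underlines the (non-empty, shorter) line before it."""
    if not UNDERLINE.match(line):
        return False
    prev = prev_line.rstrip()
    return bool(prev) and len(prev) < len(line.rstrip())

def filter_lines(lines: list[str]) -> list[str]:
    """Drop true underlines only; retain every other line."""
    result = []
    prev = ''
    for line in lines:
        if not is_section_underline(prev, line):
            result.append(line)
        prev = line
    return result
```

With this rule, a row of '-' characters that follows a blank line is retained as ordinary body text instead of being deleted.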

Unit test issues

4. I'm still working on getting imp.reload to work in all situations. This is an essential feature of the new TDD workflow, so it must be as simple and smooth as possible.
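For readers who haven't seen the idea, here is a minimal sketch of reloading a module before a test runs. It is not Leo's actual TDD preamble. It uses importlib.reload, the standard-library replacement for imp.reload (the imp module was removed in Python 3.12), and the helper name fresh_module is hypothetical.

```python
import importlib
import sys

def fresh_module(name: str):
    """Return the named module, re-executing its source if already loaded."""
    if name in sys.modules:
        # Re-run the module's code in place, picking up edits on disk.
        return importlib.reload(sys.modules[name])
    return importlib.import_module(name)
```

A test could call fresh_module on the importer module it exercises before doing anything else. Note that objects created from the old version of a module keep referring to the old classes after a reload, which is one reason reloading is hard to get right in all situations.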

5. The same unit test can generate either '@@auto-markdown' or '@@auto-md' nodes depending on unknown factors. I've got to understand why...

6. I spent a lot of work on the unit tests themselves.  More tests now verify not only that the file imported "perfectly" (writes the identical file, or nearly so), but also that the import generates the expected outline structure. Besides the TDD preamble, there is a lot of other boilerplate involved.  It's possible that an @button script could generate this boilerplate...
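The two kinds of check can be illustrated with a toy importer and writer. Nothing here is Leo's code: import_md and write_md are hypothetical, they handle only '#' sections, and the '!Declarations' headline for leading lines is an assumption made for the example.

```python
import re

HASH = re.compile(r'^(#+)\s*(.*)$')

def import_md(text: str) -> list:
    """Toy importer: return a flat list of [level, headline, body_lines]."""
    # Lines before the first section go in a '!' node (hypothetical name).
    nodes = [[0, '!Declarations', []]]
    for line in text.splitlines():
        m = HASH.match(line)
        if m:
            nodes.append([len(m.group(1)), m.group(2), []])
        else:
            nodes[-1][2].append(line)
    return nodes

def write_md(nodes: list) -> str:
    """Toy writer: regenerate the markdown text from the nodes."""
    out = []
    for level, headline, body in nodes:
        if level > 0:
            out.append('#' * level + ' ' + headline)
        out.extend(body)
    return '\n'.join(out) + '\n'

source = 'intro\n# A\nbody a\n## B\nbody b\n'
nodes = import_md(source)

# Check 1: "perfect import": writing the outline reproduces the file.
assert write_md(nodes) == source

# Check 2: the outline has the expected structure.
assert [(n[0], n[1]) for n in nodes] == [(0, '!Declarations'), (1, 'A'), (2, 'B')]
```

The second assertion is the part that catches an importer that round-trips the text correctly but puts it in the wrong nodes.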

There are several other practical details related to unit tests.  I'll discuss them next, in the summary, so more people will see them.

Summary

It's possible to rerun unit tests in unitTest.leo, without reloading unitTest.leo, provided the tests contain the TDD preamble.  This will be a huge win when glitches relating to imp.reload are more fully resolved. A subject for a future post...

The new test-driven development workflow is still in its infancy. I'll continue to improve it.

When single-stepping through code, it's best to run unit tests from unitTest.leo.  As always, unitTest.leo should be run in a separate invocation of Leo.  unitTest.leo freezes while single-stepping, but leoPy.leo does not, so I can look at the outline version of the code as always.  It's part of the new TDD pattern.

There is a new convention for specifying markdown sections in @auto-md trees.  It replaces a behind-the-scenes scheme that had no chance of working.

It's a good thing Leo is worth any amount of work ;-)

Edward

Edward K. Ream

Dec 3, 2016, 6:30:01 PM
to leo-editor
On Sat, Nov 26, 2016 at 6:53 AM, Edward K. Ream <edre...@gmail.com> wrote:

[The new markdown importer] supports the following convention in the outline.


- Headlines beginning with '=' or '-' will generate underlined sections.
- Headlines beginning with '!' will not create a section.
- All other headlines create sections delimited by 1 or more '#' characters.

This is obsolete.  As stated elsewhere, the markdown importer effectively converts all sections to '#' sections.  This is the simplest, most Leonine way. We cannot let the perfect-import tail wag the dog.

EKR