ENB: The way forward: using positions and vnodes in the npyscreen code

50 views
Skip to first unread message

Edward K. Ream

unread,
May 15, 2017, 11:28:49 AM5/15/17
to leo-editor
In conversation with Rebecca last night at dinner I saw more clearly what the next step is.

tl;dr: We need an interface class that delivers (wrappers to) Leo's position and/or vnode classes instead of some (as yet unknown) npyscreen array.

The problem

The npyscreen code (including subclasses in cursesGui2.py), uses a tree of LeoTreeData nodes.  This tree is completely separate from the tree of vnodes that Leo uses.  Some way must be found to use Leo's positions/vnodes in the npyscreen code.  That is, the npyscreen code must get the (contents of) a vnode when drawing the screen or when changing the outline.

The solution

The solution will be to create an interface that "stands in" for the MultiLine._my_widgets array, or maybe some other array. This is the switcheroo. Ideally, the npyscreen code will know nothing about the switcheroo. The stand-in class will use special (__xxx__) methods (aka dunder methods) to simulate array access.

Complications!

The MLTree tree handlers for insert/delete/expand/contract nodes must call Leo's own tree-handling methods, which will then, when calling c.redraw, cause the npyscreen outline to be updated.

But oh my, the complications involved in the switcheroo itself are mind-boggling. I have been studying the code for days, trying to understand every little wrinkle.

I'm not going to stupefy you, dear reader, with the details. It's not even clear where the switcheroo will happen. It may involve the _contained_widgets ivar (another list) of some class rather than MultiLine._my_widgets list.

Summary

I am pretty sure I know the general way forward, but I have no idea where the complications will take me.

Ideally, the npyscreen code will be unaware of the switcheroo and can be used unchanged, but I'll override npyscreen methods if need be.

Edward

Edward K. Ream

unread,
May 16, 2017, 6:47:27 AM5/16/17
to leo-editor
On Monday, May 15, 2017 at 10:28:49 AM UTC-5, Edward K. Ream wrote:

> tl;dr:
We need an interface class that delivers (wrappers to) Leo's position and/or vnode classes instead of some (as yet unknown) npyscreen array.
...

> But oh my, the complications involved in the switcheroo itself are mind-boggling. I have been studying the code for days, trying to understand every little wrinkle.

It's never a good idea to "predict" difficulties, or to dwell on complications. My previous strategy of studying all the sources was inefficient. There is no need to do that! Instead, tracing the name of a method reveals the method (in whatever superclass!) that will actually be called.

For example, LeoMLTree.delete_line isn't working.  The line appears to be deleted from the screen, but collapsing and expanding the deleted node's parent shows the deleted node again.  This bug is directly related to the _my_widgets and the switcheroo, so I want to fix it asap.

The LeoMLTree.delete_line sources give no indications of the class to which self.display refers.  No problem! Just add g.trace(self.display). The result will be:

    <bound method Widget.display of <leo.plugins.cursesGui2.LeoMLTree object at 0x086E7B10>>

So now we know that Widget.display is being executed, not the display method of any other class. This trick saves a lot of sleuthing. No need for a debugger, which is how I previously discovered the actual method being called.

Edward

Edward K. Ream

unread,
May 17, 2017, 9:58:52 AM5/17/17
to leo-editor
On Monday, May 15, 2017 at 10:28:49 AM UTC-5, Edward K. Ream wrote:

In conversation with Rebecca last night at dinner I saw more clearly what the next step is.

tl;dr: We need an interface class that delivers (wrappers to) Leo's position and/or vnode classes instead of some (as yet unknown) npyscreen array.

The way is now open to create this class.

This post describes a collapse in complexity in the LeoMLTree and LeoTreeData classes. This post also gives a very fast caching scheme for redrawing npyscreen outlines.

This post continues the original Engineering Notebook post.  It will be of interest only to a few. It will be pre-writing for a post to the npyscreen list.

Two non issues

What I will say later might be taken as a criticism of the npyscreen code base. But any such complaints would be silly. It would be extremely dangerous and unwise to make any non-local changes to the npyscreen sources.  Doing so would almost surely break existing npyscreen apps. I have felt free to make minor local simplifications to the sources in the leo/external/npyscreen folder, but that's all.

The quality of the npyscreen docs is also a non-issue. Sources, not words, are the ultimate underlying truth. After brief perusal of the npyscreen docs, I have only read the npyscreen sources.

Collapsing the LeoMLTree class

I am going to speak plainly. Elsewhere I have described the npyscreen sources as a maze.  I have spent days wandering the maze trying to unravel the code.  Much of the maze is irritating, useless, C-like, blah, blah blah.

In particular, the base npyscreen code seems deathly afraid of crashing. There is no reason for this fear in Python. During development, crashes help the app's developer find problems. In the end, it's up to the app's developer to eliminate crashes.

Happily, it's possible for subclasses of MLTree, such as LeoMLTree, to cut through this maze without touching MLTree at all. Now, LeoMLTree is dead simple. Here is how I did it:

1. Eliminated weak references. Weak references infect the npyscreen sources like a rash. In fact, there is not the slightest reason to use them in the tree code, and many good reasons not to do so. For starters, weak references complicate debugging. Imo, removing weak references was an essential first step toward navigating the maze.

2. Simplified the all-important values property. The actual drawing code uses what looks like a values array. It took a long time to see that the values property simulates an array whose values may be cached.  Here are the over-ridden getter and setter methods, as defined in LeoMLTree:

    def _setValues(self, tree):
        self._myFullValues = tree or LeoTreeData()

    def _getValues(self):
        '''
        Return the (possibly cached) list returned by self._myFullValues.get_tree_as_list().
   
        Setting _cached_tree to None invalidates the cache.
        '''
        if getattr(self, '_cached_tree', None):
            return self._cached_tree_as_list
        else:
            self._cached_tree = self._myFullValues
            self._cached_tree_as_list = self._myFullValues.get_tree_as_list()
            return self._cached_tree_as_list

This code is way simpler than the corresponding code in the base MLTree class. But even this simpler code is faux clever, for at least three, reasons:

1. Code in MLTree (or LeoMLTree) must explicitly disable the cache by setting the _cached_tree ivar to None. This makes the client code aware of a nerdy little helper ivar.

2. Instead of setting _cached_tree to None, the client code might as well refresh the cache directly, say by calling self.refresh_values(). So the values property has no real purpose. To anticipate a bit, Leo will need to refresh/invalidate the cache in exactly one place, namely in its c.frame.tree.redraw method.

3. The values property doesn't even do a very good job of caching. Refreshing the cache creates a TreeData node for every visible node in the tree. This could be a problem for large trees. A better caching scheme, given below, creates TreeData (LeoTreeData) nodes only when actually referenced.

Clever caching

At last I can describe how the big switcheroo will work:

- A values class will replace the values property.

- The class will have a __getitem__ method that will return only the requested LeoTreeData item. Something like this (untested):

    def __getitem__(self, n):
        return self.get_data(n)

    def get_data(self, n)
        '''Return a LeoTreeData for the n'th visible position of the outline.'''
        c = self.c
        # Look for n or n-1 in the caches.
        data = self.data_cache.get(n)
        if data:
            return data
        p = self.position_cache.get(max(0,n-1))
        if p:
            p = p.moveToNextVisible()
            if p:
                self.position_cache[n] = p
                self.data_cache[n] = data = LeoTreeData(p)
                return data
            else:
                return None
        # Search the tree, caching the result.
        i, p = 0, c.rootPosition()
        while p:
            if i == n:
                self.position_cache[n] = p
                self.data_cache[n] = data = LeoTreeData(p)
                return data
            else:
                p.moveToNextVisible()
        return None

This is clever code. In practice, it will have close to O(0) running time! Indeed, LeoMLTree._redraw (adapted from MultiLine.update) access lines in numeric order. As a result, a search of the Leo outline will be needed for only the first displayed line. Thereafter, n-1 will exist in the position cache, so only a single call to p.moveToNextVisible() will needed.

Summary

Subclasses of MLTree (and TreeData) can be much simpler than the corresponding base classes. The actual LeoMLTree and LeoTreeData classes contain many more examples than shown here. I hope the actual code will help developers of npyscreen devs navigate the maze.

Significantly better caching of TreeData objects is possible. LeoMLTree.values[i] should be faster than MLTree.values[i], even (especially) for very large Leo outlines. The caching scheme shown above reduces the stress on the GC by greatly reducing the number of allocated LeoTreeData objects.

c.frame.tree.redraw will clear the two caches in a new LeoValues class and then call LeoMLTree.display().

Edward

Edward K. Ream

unread,
May 18, 2017, 10:38:19 AM5/18/17
to leo-editor
On Wednesday, May 17, 2017 at 8:58:52 AM UTC-5, Edward K. Ream wrote:

This post describes a collapse in complexity in the LeoMLTree and LeoTreeData classes. This post also gives a very fast caching scheme for redrawing npyscreen outlines.

The caching code is now working, exactly as described. This is the LeoValues class.

As I write this, I'm not sure any caching is required!  Perhaps the desired range of lines can be gotten in a single function call.  In the meantime, LeoValues is simple enough and good enough.

Notes

Here are some notes about recent work. They will be useful pre-writing for the upcoming post to the npyscreen group.  As I think about that post, I see that it has a single intention, namely to help people to subclass npyscreen.MLTree productively.  Yes, the post will tell people how to avoid pitfalls, but complaining about those pitfalls is not part of the plan ;-)

In any case, these notes relate to the pitfalls, and imo #1 might be quite helpful to others.

1 Connecting the singleton LeoValues object: This required several experiments. Happily, the final code is gratifying simple.  First replace the LeoMLTree.values property with plain ivars:

    if native:
        _myFullValues = LeoTreeData() # Used only during startup.
        values = []
    else:
        @others
        values = property(_getValues, _setValues)

Later, in createCursesTree, set the values ivar:

    if native:
        leo_tree.values = LeoValues(c=c, tree=leo_tree)

2. LeoValues.__len__: As shown above, the LeoValues instance replaces LeoMLTree.values array. The base code asks for len(self.values), so the following LeoValues method is needed:

    def __len__(self):
        '''
        Return the putative length of the values array,
        that is, the number of visible nodes in the outline.
        '''
        c = self.c
        n, p = 0, c.rootPosition()
        while p:
            n += 1
            p.moveToVisNext(c)
        return n

This will be called once per full redraw, so its performance doesn't matter.

3. LeoTreeData getters: Previously, the dummy tree created a list of "linked" LeoTreeData nodes, such that the default getters in the npyscreen.TreeData class worked properly.  Now (when native is True) the LeoTreeData objects are simply wrappers for positions, so it will be necessary to over-ride all the getters.  This will be straightforward.

Summary

I expect to finish this phase of the project in a day or three.  Replacing the getters will take very little time, but I may experiment with removing the LeoValues class completely. This would entail a rewrite/revision of LeoMLTree._redraw.

It may also be possible to eliminate the LeoTreeData class. We shall see. Otoh, the LeoTreeLine class is almost certainly essential.

Edward
Reply all
Reply to author
Forward
0 new messages