ENB: Fixing c.deletePositionsInList

50 views
Skip to first unread message

Edward K. Ream

unread,
Mar 17, 2020, 3:20:31 AM3/17/20
to leo-editor
This is #1539. I assigned this issue to 6.3 because because of this method's long history of failure.

Yesterday I saw a new approach. If this new approach is sound, and I think it is, I am willing to consider putting the new code in 6.2.

Thought experiment

I considered the possibility the following approach might be the only way:

1. Generate a list of positions meeting some criterion.
2. Delete one of the positions.
3. Regenerate the list of positions meeting the same criterion.

This surely must work. However, c.deletePositionsInList is not privy to the criterion, so a practical way forward is required.

Aha: p.position_after_deleting(p2)

The essence of our woes is that deleting any position P in the outline may affect all the other positions in the list passed to c.deletePositionsInList. Rather than guessing, we must compute the effect of deleting P on all positions in this list.

p.position_after_deleting(p2) will return None if p will not exist after deleting p2. Otherwise, it will return the adjusted value of p.

Calculating this adjusted value is non-trivial, but surely it is possible. Something like this:

- If p2 is p or an ancestor of p, p will no longer exist. Return None.
- Otherwise, len(p.stack) will remain unchanged, as will parent.v for parents in p.stack.
- If p2 is a previous sibling of p_i, (p or one of p's parents), then p_i.childIndex will decrease by 1.

A sound algorithm

The following is untested. Imo, it's ideas are most likely sound.

def deletePositionsInList(self, aList):
   
"""Delete all positions in the given list."""
    c
= self
   
while aList:
        # del_p is the next position to be deleted.
        del_p
= aList.pop()
        # Calculate new_positions.
        new_positions
= []
       
for p in aList:
            new_p
= p.position_after_deleting(del_p)
           
if new_p:
                new_positions
.append(new_p)
       
# Delete del_p.
        p
.doDelete()
       
# Check the new positions.
       
for p in new_positions:
           
assert c.positionExists(p)
       
# Update aList.
        aList
= new_positions
   
# Make sure c.hiddenRootNode always has at least one child.
   
if not c.hiddenRootNode.children:
        v
= leoNodes.VNode(context=c)
        v
._addCopiedLink(childIndex=0, parent_v=c.hiddenRootNode)
   
# Redraw.
    c
.selectPosition(c.rootPosition())

Notes:

- aList.pop() ensures that the loop terminates.

- p.position_after_deleting makes it possible to call p.doDelete rather than vnode-based code.

Summary

Since day 1, c.deletePositionsInList has been based on guesswork and wishful thinking. I am completely responsible for this.

p.position_after_deleting promises to put c.deletePositionsInList on a solid foundation. I am willing to consider putting the new code in Leo 6.2, provided a substantial set of unit tests for p.position_after_deleting exists.

I'll attempt p.position_after_deleting later today. It's time for bed.

All comments welcome.

Edward

vitalije

unread,
Mar 17, 2020, 3:56:48 AM3/17/20
to leo-editor
I would suggest another approach. What is the essence of this operation? If we compare the tree before and after this operation, the difference between them is just a bunch of links that were cut. To be able to specify which links are/(or should be) cut we can use the pair of values (childIndex, parent_v) for each link.

For each position in the input list we can produce the pair of values (childIndex, parent_v) that should be broken after this operation.
Now, if we try to delete some children of the cloned nodes (where parent_v is a clone), it is possible that the input list contains two or more different positions that will result in the same link being cut. That is why we need to eliminate duplicates from this list of pairs. Once we have this list of links that needs to be cut, we can sort them from the largest childIndex to the smallest one. Afterwards we can safely cut each of the links in the final list. They all should be cut and they all will exist until their turn to be processed is ready.
 
def p2link(p):
    parent_v
= p.stack[-1][0] if p.stack else c.hiddenRootNode
   
return (p._childIndex, parent_v)

def deletePositionsInList2(aList):
    links_to_be_cut
= sorted(set(p2link(x) for x in aList), reverse=True)
   
for i, v in links_to_be_cut:
        ch
= v.children[i]
        j
= ch.parents.index(v)
       
del ch.parents[j]
       
del v.children[i]


However, I believe it has the same effect as the solution I wrote before:

def deletePositionsInList(self, aList, redraw=True):
    seen
= set()
   
for p in sorted(aList, reverse=True):
        par
= p.stack[-1][0] if p.stack else c.hiddenRootNode
        x
= par, p._childIndex
       
if x not in seen:
            seen
.add(x)
            p
.doDelete()
   
if redraw:
        c
.redraw()
   
return


Although I expect the new solution (deletePositionsInList2) to be faster.

The only requirement for this function to work is that the input list contains only valid positions, but that should be the caller's responsibility.

Vitalije


Edward K. Ream

unread,
Mar 17, 2020, 8:44:05 AM3/17/20
to leo-editor
On Tue, Mar 17, 2020 at 2:56 AM vitalije <vita...@gmail.com> wrote:

I would suggest another approach. What is the essence of this operation? If we compare the tree before and after this operation, the difference between them is just a bunch of links that were cut.

Yes, I think this is correct. Thank you for these comments, they are helpful.

My intuition says that that the computation of the links to be cut must be done anew for every position in the original list of positions. That way we can be sure that deleting positions can be done in any (or all) orders.

In other words, the code you suggest might lead to a simpler way of doing p.position_after_deleting, but I think the main loop of p.deletePositionsInList should stay pretty much as I have shown.

I have just created the delete branch for #1539. This will allow us both to propose code and unit tests. Imo, a necessary, but not sufficient unit test for p.deletePositionsInList will verify that all permutations of the positions in the list lead to identical results.

This issue is presently scheduled for 6.3. I am open to making it part of 6.2 if you and I can agree that the new code is sound.

Edward

vitalije

unread,
Mar 17, 2020, 9:26:38 AM3/17/20
to leo-editor
Apart from your intuition do you see any other reason to recompute the links after each delete? What could possible be changed? Child index? If the deletions are done in the order from largest to smallest one, none of them will change. The other part is vnode instance which can't be changed either. So, what is the reason you think link recalculation is necessary?

The solution I suggested doesn't operate on positions, but directly on vnodes. It is short and clean and I am rather sure it is correct. By all means we should create some tests for it, but I would not hesitate too much to put it in both devel and 6.2. In rather unlikely case that someone reports a bug regarding this it would be trivial to revert the code in its current state.

There is already a unit test in unitTest.leo for this method and my code is passing this test. I have created a test that shows the original code is not working and my code is passing this test too. I am not sure what else to test. The only thing that comes to my mind is to use hypothesis to generate thousands of random trees and try this method on.

Vitalije


Edward K. Ream

unread,
Mar 17, 2020, 9:44:47 AM3/17/20
to leo-editor
On Tue, Mar 17, 2020 at 8:26 AM vitalije <vita...@gmail.com> wrote:

Apart from your intuition do you see any other reason to recompute the links after each delete? What could possible be changed? Child index?

Yes.
If the deletions are done in the order from largest to smallest one, none of them will change.

Because of clones, there is no such total ordering.

My intuition says that child indices change in "unpredictable" ways after each deletion.

Yes, I could be wrong, but I'm not convinced that clever solutions are possible.

Edward

Edward K. Ream

unread,
Mar 17, 2020, 9:50:36 AM3/17/20
to leo-editor
On Tuesday, March 17, 2020 at 8:44:47 AM UTC-5, Edward K. Ream wrote:

> I'm not convinced that clever solutions are possible.

Let's do this. I am going to add c.ekr_deletePositionsInList to the delete branch. Feel free to add your own version, with a unique name. We can then compare the two solutions.

We can also add our own unit tests in unitTest.leo.

Edward

Edward K. Ream

unread,
Mar 17, 2020, 10:03:26 AM3/17/20
to leo-editor
On Tuesday, March 17, 2020 at 8:26:38 AM UTC-5, vitalije wrote:

Apart from your intuition do you see any other reason to recompute the links after each delete?

Here's another part of the problem that can, and will cause problems.

Positions in the original list might no longer exist after a previous delete operation. That's why the code I suggest contains:

# Calculate new_positions.
 new_positions
= []
 
for p in aList:
     new_p
= p.position_after_deleting(del_p)
     
if new_p:
         new_positions
.append(new_p)

Without this kind of check, there is a real danger of adjusting child indices too often.

I can't say for sure whether these precautions are necessary, but I strongly suspect they are. Even if they aren't strictly necessary, they seem like the simplest, most prudent, approach.

Edward

vitalije

unread,
Mar 17, 2020, 10:20:04 AM3/17/20
to leo-editor
Here's another part of the problem that can, and will cause problems.

Positions in the original list might no longer exist after a previous delete operation. That's why the code I suggest contains:


You are thinking in the terms of positions, but they are irrelevant for my code. After initial transformation of input positions into the list of immutable pairs, positions have no influence on the execution of main loop. Links that are requested to be cut, will be cut and no link will be cut twice, which means when it is time to cut it it is not cut yet. The code is so obvious once you shift your mind away from the mutable positions to immutable pairs of integers and vnode instances. Nor the integer index can change, neither the vnode instance can become invalid.

Vitalije

Edward K. Ream

unread,
Mar 17, 2020, 10:59:21 AM3/17/20
to leo-editor
On Tue, Mar 17, 2020 at 9:20 AM vitalije <vita...@gmail.com> wrote:
 
The code is so obvious once you shift your mind away from the mutable positions to immutable pairs of integers and vnode instances.

I'm not convinced. Please push your code to the delete branch.

Edward

vitalije

unread,
Mar 17, 2020, 12:54:26 PM3/17/20
to leo-editor

I'm not convinced. Please push your code to the delete branch.


Done at 7bf61c368c

The implementation is inside c.deletePositionsInList and disabled with the if False:

Unit tests run with enabled or disabled have the same outcome
----------------------------------------------------------------------
Ran 918 tests in 11.925s

FAILED
(failures=3, errors=2, skipped=65)

Failures and errors have nothing to do with c.deletePositionsInList.

Vitalije

Edward K. Ream

unread,
Mar 17, 2020, 4:14:49 PM3/17/20
to leo-editor
Thanks for this. I'll give this another thought soon.

Edward
Reply all
Reply to author
Forward
0 new messages