Qt-prototype report and some thoughts about executeScript command

vitalije

unread,

May 7, 2020, 2:42:24 PM5/7/20

to leo-editor

In last few days I've been working on tests to be sure that all commands in new prototype are working correctly and that no crash will ever occur. I am pretty sure now that the implementation is correct and there are no remaining bugs in the prototype.

First of all I must say that testing with hypothesis is really great way to discover hidden bugs. Several bugs were found in the previous implementation that are very hard to imagine as a possible scenario. Of course most of these bugs were related to clones. Hypothesis did find them really quickly and I had to think hard how to solve them. After several iterations of running hypothesis and solving found bugs, prototype is now able to survive 5000 test sessions. At first I have started each test with the complete LeoPyRef.leo outline, test would choose and select a random position in this outline, and then it would perform a random sequence of commands, checking after each command that both models (v-nodes and tree widget items) are in sync. It used to take several minutes for a series of 500 tests.

To speed up tests, I have changed test to use a smaller outline as starting position. At start it inserts just five ordinary nodes interspersed with five cloned nodes (total of 15 node). Now hypothesis runs 5000 tests in about minute. After several executions no bug has been found.

It is highly unlikely that a new bug will be discovered using this test. That means we can be pretty sure that no matter what operations and in whichever order user executes both models: v-nodes and tree widget items will always remain in sync. The outline represented by each of them is exactly the same.

Now the question of executing scripts

How should we re-synchronize tree widget with the possible changes in the outline made by user script? The more I think about this problem, the more I am sure that my initial plan of using diff algorithm won't work. It would mess up with the undo system. Undo operations rely on the stability of tree widget items. If any of items is destroyed and replaced with the new one, undo blocks containing this item will be broken. But not all hope is lost. It is possible to execute full redraw after executing any script. The prototype contains now the `performance` command. This command measures the time required for full redraw and storing all the information necessary to undo the execute script command. On my computer this takes about 80ms for LeoPyRef.db. On little older computers it may take even 200-300ms. That means that every execute script command will take that much time longer to execute. This is the worst case scenario. Usually outlines are smaller than LeoPyRef.leo and this time delay will be much smaller in most cases. In return we gain something that we didn't have before: the executeScript command now becomes undoable as any other command.

I can't speak for everyone else but I would gladly accept this delay on each executeScript command and in return have an insurance that no error in my script will destroy my outline irrecoverably. I would like to be sure that in case of an error in script I still can undo the whole operation and return to the previous state. I know for certain that on more than one occasion I've missed this undo ability very much.

Your comments please.

Vitalije

Thomas Passin

unread,

May 7, 2020, 2:55:31 PM5/7/20

to leo-editor

Do you think it would be feasible to simply copy the root of the tree to be operated on? Then the undo would entail pointing the tree's original parent to the copy and then redrawing. I don't know about copy performance (I suppose that a deep copy would be needed) on Leo nodes, but as a naive idea it seems simple.

vitalije

unread,

May 7, 2020, 3:27:11 PM5/7/20

to leo-editor

On Thursday, May 7, 2020 at 8:55:31 PM UTC+2, Thomas Passin wrote:

Do you think it would be feasible to simply copy the root of the tree to be operated on? Then the undo would entail pointing the tree's original parent to the copy and then redrawing. I don't know about copy performance (I suppose that a deep copy would be needed) on Leo nodes, but as a naive idea it seems simple.

QTreeWidgetItem class has a clone method which runs quite fast. Cloning entire tree using this method is not problem at all. The problem is that if a script changes vnodes we need to synchronize them after script has finished. This synchronization can drop some items that are required for undoing earlier operations. To avoid this Leo should add to the undo list a new step which contains the copy of the tree items before executing script. All these operations are pretty cheap. The time will be spent in the redrawing tree from scratch after the script execution.

As I write this I've just realize that it would be possible to use diffing algorithm to update tree after the script. The only thing that is necessary is to make execute script undoable. Updating the tree using the diff algorithm will have exactly the same tree items as they were before the script. Nothing would be lost. All changes that must be made can be easily recorded in the undo bead for later undo/redo. This has a chance to make things a lot faster.

Vitalije

Brian Theado

unread,

May 7, 2020, 8:50:03 PM5/7/20

to leo-editor

Vitalije,

On Thu, May 7, 2020 at 2:42 PM vitalije <vita...@gmail.com> wrote:

First of all I must say that testing with hypothesis is really great way to discover hidden bugs. Several bugs were found in the previous implementation that are very hard to imagine as a possible scenario. Of course most of these bugs were related to clones. Hypothesis did find them really quickly and I had to think hard how to solve them. After several iterations of running hypothesis and solving found bugs, prototype is now able to survive 5000 test sessions. At first I have started each test with the complete LeoPyRef.leo outline, test would choose and select a random position in this outline, and then it would perform a random sequence of commands, checking after each command that both models (v-nodes and tree widget items) are in sync. It used to take several minutes for a series of 500 tests.
To speed up tests, I have changed test to use a smaller outline as starting position. At start it inserts just five ordinary nodes interspersed with five cloned nodes (total of 15 node). Now hypothesis runs 5000 tests in about minute. After several executions no bug has been found.

That's awesome. A while back when I read about hypothesis property based testing, I was wondering how to apply it to leo outlines. I knew it would have to be done either by creating a strategy which could generate recursive structures or a strategy which could perform sequences of commands. I didn't see how to proceed. Looks like you nailed it with the second approach. Nice work!

I looked at the code but I don't understand how to run the hypothesis tests. I would love to see it in action. Could you share the steps?

It is highly unlikely that a new bug will be discovered using this test. That means we can be pretty sure that no matter what operations and in whichever order user executes both models: v-nodes and tree widget items will always remain in sync. The outline represented by each of them is exactly the same.

Have you also considered using the property of random operation + undo = original tree widget state? And random operation + undo + redo = 2nd state? I'm not sure that would reveal anything from what you are already testing. Just a thought.

Brian

vitalije

unread,

May 8, 2020, 2:42:12 AM5/8/20

to leo-editor

On Friday, May 8, 2020 at 2:50:03 AM UTC+2, btheado wrote:

I looked at the code but I don't understand how to run the hypothesis tests. I would love to see it in action. Could you share the steps?

I use pytest for running the tests. Just 'pytest myleoqt.py'.

Have you also considered using the property of random operation + undo = original tree widget state? And random operation + undo + redo = 2nd state? I'm not sure that would reveal anything from what you are already testing. Just a thought.

Brian

I haven't explicitly test that after undo outline reverts to what it was before and redo does the opposite. Actually for redo part there is no need to test because all commands are implemented in such a way that command first prepares and creates two functions dosomething and undosomething, and then just calls dosomething(). It means that if undo works correctly it is guaranteed that redo will do correctly as well. However, both of this commands undo and redo are on the list of possible choices for hypothesis. So, the test executes both of them in the combination with all other commands. I think I'll add that at the end of the test full undo stack gets undone and then to check whether the outline has the same shape as it was on the start. Checking every step with undo, redo is possible too.

Vitalije

Edward K. Ream

unread,

May 8, 2020, 7:05:46 AM5/8/20

to leo-editor

On Thursday, May 7, 2020 at 1:42:24 PM UTC-5, vitalije wrote:

First of all I must say that testing with hypothesis is really great way to discover hidden bugs.

I am going to start by studying hypothesis. It looks like a great idea.

I'll defer further work on unit testing until I see what you have done.

The only work I have done so far is to change the `if __name__ == "__main__" code. I created a new "main" function. This function adds one feature: if the file name is "test" it calls unittest.main(). I assume this will be useful to you. If not, feel free to delete it.

Now the question of executing scripts

How should we re-synchronize tree widget with the possible changes in the outline made by user script? The more I think about this problem, the more I am sure that my initial plan of using diff algorithm won't work.

I don't have an opinion about what might work. However, if the tree-refresh branch is still relevant, I would like to see it merged into the mvc-prototype branch. That will make for me (and others) to study the latest code.

I can't speak for everyone else but I would gladly accept this delay on each executeScript command and in return have an insurance that no error in my script will destroy my outline irrecoverably. I would like to be sure that in case of an error in script I still can undo the whole operation and return to the previous state. I know for certain that on more than one occasion I've missed this undo ability very much.

A few added milliseconds is a trivial price to pay for safety. Go for it.

Edward

Edward K. Ream

unread,

May 8, 2020, 7:13:51 AM5/8/20

to leo-editor

On Thu, May 7, 2020 at 2:27 PM vitalije <vita...@gmail.com> wrote:

QTreeWidgetItem class has a clone method which runs quite fast.

Wow. I didn't know that. Sounds like a great fit for the new code.

Cloning entire tree using this method is not problem at all. The problem is that if a script changes vnodes we need to synchronize them after script has finished. This synchronization can drop some items that are required for undoing earlier operations. To avoid this Leo should add to the undo list a new step which contains the copy of the tree items before executing script. All these operations are pretty cheap. The time will be spent in the redrawing tree from scratch after the script execution.

I like the simplicity of this scheme. I'm a bit worried about memory.

In Leo's present undo scheme no data are ever reclaimed because references are never deleted. Leo's present undo scheme minimizes the per-operation memory requirements as much as possible. I worry about creating O(N) data items on every undo operations, where N is the total number of tree nodes.

As I write this I've just realize that it would be possible to use diffing algorithm to update tree after the script. The only thing that is necessary is to make execute script undoable. Updating the tree using the diff algorithm will have exactly the same tree items as they were before the script. Nothing would be lost. All changes that must be made can be easily recorded in the undo bead for later undo/redo. This has a chance to make things a lot faster.

Imo, the big factor concerning speed is python's GC. Could the diff scheme reduce the per-operation cost from O(N) to O(1)?

Edward

Edward K. Ream

unread,

May 9, 2020, 8:15:43 AM5/9/20

to leo-editor

On Friday, May 8, 2020 at 1:42:12 AM UTC-5, vitalije wrote:

> I use pytest for running the tests. Just 'pytest myleoqt.py'.

When I do that a blank python window appears and everything hangs.

I've tried waiting several minutes. What should I do?

Edward

Brian Theado

unread,

May 9, 2020, 9:04:36 AM5/9/20

to leo-editor

You can try making these changes:

-from hypothesis import given, settings
+from hypothesis import given, settings, Verbosity

-@settings(max_examples=5000, deadline=timedelta(seconds=4))
+@settings(max_examples=5000, deadline=timedelta(seconds=4), verbosity=Verbosity.verbose)

And then add the '-s' option to the pytest command line so the stdout will be displayed as the test run. That will at least give you feedback in the console if anything is happening. But it make the total runtime of the tests longer. For me the tests are finishing within two minutes.

I tried the verbose output because I was curious what tests were being run. Really I wanted to run them with screen updates in between each step so I could get a visual feel for what the tests are doing. But I haven't investigated yet to see what I would have to do to force Qt to refresh the screen.

You could also try setting max_examples really low just as a baseline-sanity check.

--
You received this message because you are subscribed to the Google Groups "leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email to leo-editor+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/leo-editor/0f5bb32d-20a9-49ea-ac86-bdc2b845e4f0%40googlegroups.com.

vitalije

unread,

May 9, 2020, 9:37:13 AM5/9/20

to leo-editor

If you want to have visual feedback add the following line in the test_select_and_commands function:

...
meth = getattr(app, name)
meth()
app.processEvents() # <--- add this line
...

Tests wilil run about 3 times slower, so you might want to change max_examples=5000 in the decorator to something less.

Vitalije

Edward K. Ream

unread,

May 9, 2020, 9:48:41 AM5/9/20

to leo-editor

On Sat, May 9, 2020 at 8:37 AM vitalije <vita...@gmail.com> wrote:

If you want to have visual feedback add the following line in the test_select_and_commands function:

... meth = getattr(app, name) meth() app.processEvents() # <--- add this line ...

Tests will run about 3 times slower, so you might want to change max_examples=5000 in the decorator to something less.

Thanks. The test eventually did pass on my machine. It took 538 seconds.

With processEvents enabled tests took 43 sec. to run with a count of 500.

Edward

Brian Theado

unread,

May 9, 2020, 11:34:21 AM5/9/20

to leo-editor

Vitalije

On Thu, May 7, 2020 at 2:42 PM vitalije <vita...@gmail.com> wrote:

In last few days I've been working on tests to be sure that all commands in new prototype are working correctly and that no crash will ever occur. I am pretty sure now that the implementation is correct and there are no remaining bugs in the prototype.

I added set_hoist to the list of methods randomly called and was able to get a crash. For me it was a hard-crash with core dump which makes it harder to find out the scenario causing the crash. It wasn't too hard though and the steps are simple:

1. hoist any node which has a previous sibling

2. move node right

3. crash:

Traceback (most recent call last):
File "leo/extensions/myleoqt.py", line 333, in select_item
v = newitem.data(0, 1024)
AttributeError: 'NoneType' object has no attribute 'data'
Fatal Python error: Aborted

Current thread 0x00007fad45034740 (most recent call first):
File "leo/extensions/myleoqt.py", line 1227 in move_just_treeitem
File "leo/extensions/myleoqt.py", line 1213 in move_treeitem
File "leo/extensions/myleoqt.py", line 430 in domove
File "leo/extensions/myleoqt.py", line 449 in make_undoable_move
File "leo/extensions/myleoqt.py", line 557 in move_node_right
File "leo/extensions/myleoqt.py", line 1594 in main
File "leo/extensions/myleoqt.py", line 1597 in <module>
Aborted (core dumped)

BTW, in order to get the bottom part of the stack trace I added this:

import faulthandler; faulthandler.enable()

vitalije

unread,

May 9, 2020, 4:48:45 PM5/9/20

to leo-editor

Good catch Brian. Revision 8184a2023e5f9 contains the fix for this issue. Now hoist/dehoist commands are also tested using hypothesis.

Vitalije

Brian Theado

unread,

May 9, 2020, 9:02:01 PM5/9/20

to leo-editor

It no longer crashes for me, thanks.

While it doesn't crash, it is still possible to move a node left out of a hoist and the node "disappears". If it is the last node then it disappears and the outline pane becomes empty. The existing leo doesn't behave this way. Another behavior difference. Leo doesn't allow a top-level sibling to be inserted while a node is hoisted. Instead all top-level insert operations are forced to be child insertions.

Maybe the behavior you have is reasonable, but it doesn't match the way leo operates.

I'm just pointing these out to let you know. Probably polishing the prototype is not as important as figuring out how to proceed to see how/whether it can replace the existing qt gui.

On Sat, May 9, 2020 at 4:48 PM vitalije <vita...@gmail.com> wrote:

Good catch Brian. Revision 8184a2023e5f9 contains the fix for this issue. Now hoist/dehoist commands are also tested using hypothesis.

Vitalije

--

You received this message because you are subscribed to the Google Groups "leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email to leo-editor+...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/leo-editor/ede74c56-e1b8-44e7-adb5-85a07b9355e8%40googlegroups.com.

vitalije

unread,

May 10, 2020, 6:14:54 AM5/10/20

to leo-editor

21ead02d970 fixes empty looking tree. It is still possible to add more siblings of hoisted node, but I don't mind this behavior.

On Sunday, May 10, 2020 at 3:02:01 AM UTC+2, btheado wrote:

It no longer crashes for me, thanks.

While it doesn't crash, it is still possible to move a node left out of a hoist and the node "disappears". If it is the last node then it disappears and the outline pane becomes empty. The existing leo doesn't behave this way. Another behavior difference. Leo doesn't allow a top-level sibling to be inserted while a node is hoisted. Instead all top-level insert operations are forced to be child insertions.

Vitalije

Edward K. Ream

unread,

May 10, 2020, 8:53:15 AM5/10/20

to leo-editor

On Saturday, May 9, 2020 at 3:48:45 PM UTC-5, vitalije wrote:

> Good catch Brian. Revision 8184a2023e5f9 contains the fix for this issue. Now hoist/dehoist commands are also tested using hypothesis.

I'm just beginning to see how cool Hypothesis is. All you had to do is add two command names to the list of method names. After that, the line:

data.draw(lists(sampled_from(method_names),...)

will do the rest.

Vitalije, would you say Hypothesis is most useful during development? We don't want to take 300+ seconds for routine unit tests, but Hypothesis did find several unexpected edge cases. Would it make sense to test for these edge cases explicitly in unitTest.leo?

Edward

vitalije

unread,

May 10, 2020, 9:59:21 AM5/10/20

to leo-editor

I didn't find that hypothesis was slowing me down during development. When a bug combination is found, hypothesis checks this combination early. It means that until you fix the bug running hypothesis is almost exactly as running a special hand-written test for just this sequence of commands. If the bug is fixed then you'll have to wait more to see if there is another one.

Almost all serious bugs were found in less than 10s, and each next run until bug was fixed took less than a second. If there is no more bugs you'll have to wait a lot for the test to finish, but for me it is a pleasure waiting and knowing that thousands of tests are passing.

The biggest challenge was not the time hypothesis takes for testing. Many bugs were causing hard crashes so hypothesis could not finish its job. This is because we have to deal with Qt. In pure python development hard crashes never happen.

Vitalije

Edward K. Ream

unread,

May 10, 2020, 10:28:51 AM5/10/20

to leo-editor

On Sun, May 10, 2020 at 8:59 AM vitalije <vita...@gmail.com> wrote:

I didn't find that hypothesis was slowing me down during development. When a bug combination is found, hypothesis checks this combination early. It means that until you fix the bug running hypothesis is almost exactly as running a special hand-written test for just this sequence of commands. If the bug is fixed then you'll have to wait more to see if there is another one.

Almost all serious bugs were found in less than 10s, and each next run until bug was fixed took less than a second. If there is no more bugs you'll have to wait a lot for the test to finish, but for me it is a pleasure waiting and knowing that thousands of tests are passing.