ENB: pyzo syntax coloring: progress and a big mystery

34 views
Skip to first unread message

Edward K. Ream

unread,
Jan 12, 2017, 9:05:03 AM1/12/17
to leo-e...@googlegroups.com
Rev fbe428c shows the results of recent experiments using pyzo for syntax coloring in Leo.

You could call this an Engineering Notebook post.  You could also say that this discussion should be in #365, but I wanted to make this discussion more public because it contains a major mystery. Maybe you can help.

tl:dr; Within pyzo,  syntax coloring is 10x faster than Leo's.  When embedded in Leo,  the same pyzo coloring code is 10x slower than Leo's.

This is the mother of all performance bugs. Despite several hours of investigation, including profiling with cProfile, it remains (to paraphrase 2001, a Space Odyssey) a total mystery.

This is actually very good news, for several reasons:

1. I am now certain that #273 can be fixed. We can get rid of the "Big Text" hack! The proof is simple. Not only does pyzo load (and colorize!) big files quickly, but pyzo delays the display of big files until the loading and colorizing is complete.

There should be no reason (in principle) while the same behavior can't happen in Leo.  In fact, when the pyzo code is enabled, Leo displays the new body text only after it has been fully loaded and colorized! The only problem is that it takes 11 seconds (on a slow laptop) for @edit leoApp.py to become visible!

2. The performance bug that afflicts pyzo might also afflict Leo's present syntax coloring code. This is speculative, but worth serious investigation.

Pyzo loads and colorizes text in about the time it takes Leo to call QTextEdit.setPlainText.  This is an important clue.

Prototype code

The top-level pyzo switch in leoColorizer.py enables pyzo syntax coloring. When this switch is True, all of Leo's legacy coloring code simply returns.

The present pyzo coloring code is intended to be the simplest adaptation possible.  I made the minimum changes to the code.  These changes mostly involved initing color settings.  The hacks all happen when creating a PyzoHighlighter instance.  It's unlikely, but not impossible, that such changes affect performance.

The way forward

Both Leo's existing timing traces and cProfile data clearly indicate all 11 seconds of time are spent in the single call to colorizer.setHighlighter in LeoTree.set_body_text_after_select. This is expected.

When pyzo coloring is enabled, this results in a single call to Highlighter.rehighlight as expected.  Highlighter is the Pyzo highlighter. As I write this I see it should be named PyzoHighlighter within Leo.

There are several possible sources of the performance bug to investigate:

1. How many times does PyzoHighlighter.highlightBlock get called?  It might be called way too many times in Leo, for some obscure reason.

2. Are the proper QDocument and QTextEdit being used? The present prototyping code uses an instance of pyzo's CodeEditorBase as way of instantiating the pyzo's syntax colorer.  Maybe the highlighter must be attached to Leo's actual QTextEdit body widget. The pyzo colorizer uses QDocument to do crucial caching. Two different QTextEdit's or QDocument's might create the performance bug.

3. What Qt events are being generated? Leo's tree and text events are surely different from pyzo's.  Does that make a difference?  Could they be causing additional redraws?

4. Does background processing somehow mess things up? Leo's present coloring code spits long coloring tasks into pieces.  When pyzo is true, all of the legacy coloring code is supposed to be disabled. But perhaps that background code has some pernicious effect.

Summary

Leo's "ecosystem" is very different from pyzo.  Clearly, this makes an enormous difference for the pyzo coloring code. Something outside the pyzo coloring code is likely the culprit. Finding it could require a lot of sleuthing.

I would appreciate your help on this project.  If I get stuck, I might even ask Almar Klein for help...

Solving the performance mystery is worth any amount of work. There will be many happy repercussions, including closing #273. However, this work competes for my attention with other very important work.

Edward

Edward K. Ream

unread,
Jan 12, 2017, 9:37:13 AM1/12/17
to leo-editor
On Thursday, January 12, 2017 at 9:05:03 AM UTC-5, Edward K. Ream wrote:
 
> 1. How many times does PyzoHighlighter.highlightBlock get called?

Adding a counter shows that this method is called once per line of the input file, as expected. So this is not the culprit.

There is another way forward. It's possible to measure and trace pyzo itself.  In fact, I did this in order to understand how pyzo inits syntax settings.

I'd like to trace all of pyzo's events.  This capability would shine light both on pyzo and Leo. It would be a good tool to have.

EKR

Terry Brown

unread,
Jan 12, 2017, 10:17:47 AM1/12/17
to leo-e...@googlegroups.com
2. Are the proper QDocument and QTextEdit being used? The present prototyping code uses an instance of pyzo's CodeEditorBase as way of instantiating the pyzo's syntax colorer.  Maybe the highlighter must be attached to Leo's actual QTextEdit body widget. The pyzo colorizer uses QDocument to do crucial caching. Two different QTextEdit's or QDocument's might create the performance bug.
Can you check that the cache is being hit, by reporting / counting hits?  If it's never being used, or being asked to cache the same values repeatedly, then some communication issue must be effectively disabling the cache.  I don't know how much pyzo relies on it for speed, but a broken cache could certainly slow things down.

Cheers -Terry

Edward K. Ream

unread,
Jan 12, 2017, 11:39:26 AM1/12/17
to leo-editor
On Thu, Jan 12, 2017 at 10:13 AM, 'Terry Brown' via leo-editor <leo-e...@googlegroups.com> wrote:

​> ​
Can you check that the cache is being hit, by reporting / counting hits?  If it's never being used, or being asked to cache the same values repeatedly, then some communication issue must be effectively disabling the cache.  I don't know how much pyzo relies on it for speed, but a broken cache could certainly slow things down.

​Here is the good bits from the checkin log for f0f4b091:

QQQ
The pyzo colorer now is attached directly to Leo's real body pane QTextEdit.
...
Performance stats are as before. All counts are as expected. A real mystery.
QQQ

Actually, Leo's body text is a QTextBrowser.  Could that make a difference??

I also enabled tracing of events.  There aren't nearly enough to make a difference.

Something very weird, and very bad, is happening.  When it gets fixed we will all be much happier.

EKR

john lunzer

unread,
Jan 12, 2017, 12:37:12 PM1/12/17
to leo-editor
If you think you've hit a wall. I would not hesitate to contact Almar, he has been extremely responsive to my questions in the past.

Edward K. Ream

unread,
Jan 12, 2017, 4:22:09 PM1/12/17
to leo-editor
On Thu, Jan 12, 2017 at 12:37 PM, john lunzer <lun...@gmail.com> wrote:
If you think you've hit a wall. I would not hesitate to contact Almar, he has been extremely responsive to my questions in the past.

​Thanks for this.  It's good to know.  But I haven't done my homework yet, and I certainly am not out of ideas.

Something in Leo is the culprit.  Hehe, I just opened a beer, and I plan to remove things until the culprit is caught.  Starting with --plugins.

Hmm. I just found an important clue.  refresh-from-disk appears to be recoloring "instantly".  Otoh, my test.leo isn't loading @edit leo.core.leoApp.py at all! Solving these little mysteries may lead to the big bad wolf.

Maybe a few more beers will be required ;-) I have little doubt that it's just a matter of time...

EKR
Reply all
Reply to author
Forward
0 new messages