Delete performance

3 views
Skip to first unread message

Joe Walker

unread,
Aug 14, 2009, 7:47:20 AM8/14/09
to mobw...@googlegroups.com

Hi,

I think there's an edit case that can cause performance issues that we
can handle better. In Bespin we've chosen to raise the limit on a
single file size, so we've seen issues than might not affect other
deployments.

Our issue is that large deletions (like deleting over 100k) cause all
other browsers to have a fit processing the diff. Commonly all
browsers would hang for tens of seconds, displaying the "slow script"
message, and occasionally never recover. The issue seems roughly
exponential, so deletes of 200k are likely to kill all connected
browsers. Given that some of the mobwrite source files are over 70k
this doesn't seem like a strange side case.

It would seem like there is a possibility of detecting this case and
providing a shortcut.
Thanks,

Joe.

Neil Fraser

unread,
Aug 17, 2009, 8:48:27 PM8/17/09
to mobw...@googlegroups.com, diff-mat...@googlegroups.com
2009/8/14 Joe Walker <jwa...@mozilla.com>:

> Our issue is that large deletions (like deleting over 100k) cause all
> other browsers to have a fit processing the diff.
> [...]

> It would seem like there is a possibility of detecting this case and
> providing a shortcut.

Good point. BTW, this is a diff-match-patch issue, not one specific
to MobWrite. But since I'm the maintainer of both...

There's an emergency fix ready right now:
http://neil.fraser.name/temp/diff_match_patch_uncompressed.js
It passes the unit tests, but hasn't been extensively tested. I'll
translate it to the other languages, run it through code review and
post an official version on Wednesday-ish.

The change is how to handle a large delete. Currently we chop it into
pieces, then match and remove each piece. Looking at the algorithm,
it should scale O(n), not exponential as you indicate, but either way
it's inefficient. The new algorithm matches the start point, the end
point, then kills everything in between. Much more efficient.

This directness has a side effect. If you type one sentence as
someone else deletes the whole paragraph, previously your sentence (or
at least 32 character chunks of it) would likely fall out and be
preserved. Now it gets toasted. To be honest, I think the new
behaviour is better for an editor with syncs every second or two. A
lumbering editor with syncs every 30 seconds or so would want such
preservation.

+ diff-mat...@googlegroups.com

--
Neil Fraser, Programmer & Wizard
http://neil.fraser.name

Joe Walker

unread,
Aug 18, 2009, 7:47:17 AM8/18/09
to mobw...@googlegroups.com
On Tue, Aug 18, 2009 at 1:48 AM, Neil Fraser <ro...@neil.fraser.name> wrote:

2009/8/14 Joe Walker <jwa...@mozilla.com>:
> Our issue is that large deletions (like deleting over 100k) cause all
> other browsers to have a fit processing the diff.
> [...]
> It would seem like there is a possibility of detecting this case and
> providing a shortcut.

Good point.  BTW, this is a diff-match-patch issue, not one specific
to MobWrite.  But since I'm the maintainer of both...

There's an emergency fix ready right now:
 http://neil.fraser.name/temp/diff_match_patch_uncompressed.js
It passes the unit tests, but hasn't been extensively tested.  I'll
translate it to the other languages, run it through code review and
post an official version on Wednesday-ish.

Awesome.
I've just run some tests and it performs well.

The change is how to handle a large delete.  Currently we chop it into
pieces, then match and remove each piece.  Looking at the algorithm,
it should scale O(n), not exponential as you indicate, but either way
it's inefficient.  The new algorithm matches the start point, the end
point, then kills everything in between.  Much more efficient.

To be honest we didn't plot results or anything. That was the assumption based on what seemed like a step change in unresponsiveness after an 8k line delete.

This directness has a side effect.  If you type one sentence as
someone else deletes the whole paragraph, previously your sentence (or
at least 32 character chunks of it) would likely fall out and be
preserved.  Now it gets toasted.  To be honest, I think the new
behaviour is better for an editor with syncs every second or two.  A
lumbering editor with syncs every 30 seconds or so would want such
preservation.

Agreed.
Many thanks,

Joe.

Reply all
Reply to author
Forward
0 new messages