Comment #70 on issue 109587 by
xiaoche...@chromium.org: Pasting lots of
chrome://tracing shows that
SpellChecker::chunkAndMarkAllMisspellingsAndBadGrammar is a major (maybe
not the only) source of the inefficiency.
After some code digging, I found that every time something is pasted, even
if it's just a single character, the above-mentioned method examines the
full text in the textarea. My gut feeling is that we should examine only
the pasted part (and possibly a few characters before/after it) instead.
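To make the suggestion concrete, here is a minimal sketch of computing the
range to check from the pasted range. This is not Blink code: the Range
struct, rangeToCheckAfterPaste, and the kPadding value are hypothetical, and
a real version would presumably snap to word or sentence boundaries instead
of using a fixed character count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical half-open range [start, end) of character offsets.
struct Range {
    std::size_t start;
    std::size_t end;
};

// Assumed amount of context on each side; not a value from the real code.
constexpr std::size_t kPadding = 32;

// Returns the pasted range widened by up to kPadding characters on each
// side, clamped to [0, textLength].
Range rangeToCheckAfterPaste(std::size_t textLength, Range pasted)
{
    assert(pasted.start <= pasted.end && pasted.end <= textLength);
    std::size_t start = pasted.start > kPadding ? pasted.start - kPadding : 0;
    std::size_t end = std::min(textLength, pasted.end + kPadding);
    return Range{start, end};
}
```

With this, pasting one character into the middle of a 1,000,000-character
textarea would examine 65 characters instead of the whole text.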
Even worse, this method runs in O(n^2) time in the worst case, where n is
the length of the full text in the textarea (including the text that was
just pasted). To see why, here is a rough sketch of what the function does:
1. Cut the text to be examined (i.e., everything in the textarea) into
chunks, where each chunk has length kChunkSize = 16384 (hardcoded)
2. Expand each chunk to align with sentence boundaries by calling
expandRangeToSentenceBoundary
3. Call SpellChecker::markAllMisspellingsAndBadGrammarInRanges on each
expanded chunk
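The three steps above can be sketched roughly as follows. This is a
simplified stand-in, not the actual Blink code: it works on a plain
std::string, treats the position after each '.' as a sentence boundary, and
only collects the expanded ranges instead of checking them. Here
startOfSentence and endOfSentence are toy versions, and expandedChunks is a
hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

constexpr std::size_t kChunkSize = 16384;

// Toy boundary search: scan backward until just after a '.' or offset 0.
std::size_t startOfSentence(const std::string& text, std::size_t pos)
{
    while (pos > 0 && text[pos - 1] != '.')
        --pos;
    return pos;
}

// Toy boundary search: scan forward until just after a '.' or end of text.
std::size_t endOfSentence(const std::string& text, std::size_t pos)
{
    while (pos < text.size() && (pos == 0 || text[pos - 1] != '.'))
        ++pos;
    return pos;
}

// Steps 1 and 2: cut the text into fixed-size chunks and expand each one to
// sentence boundaries. Step 3 would spell-check each returned range.
std::vector<std::pair<std::size_t, std::size_t>> expandedChunks(
    const std::string& text, std::size_t chunkSize = kChunkSize)
{
    assert(chunkSize > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t start = 0; start < text.size(); start += chunkSize) {
        std::size_t end = std::min(text.size(), start + chunkSize);
        ranges.emplace_back(startOfSentence(text, start),
                            endOfSentence(text, end));
    }
    return ranges;
}
```

For example, expandedChunks("aa.bbbb.cc", 4) yields the ranges [0, 8),
[3, 8) and [8, 10). With text that contains no '.' at all, every chunk
expands to the whole text.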
As far as I can tell, step 2 is the troublemaker.
expandRangeToSentenceBoundary calls startOfSentence and endOfSentence to
find sentence boundaries, which in turn call previousBoundary and
nextBoundary, respectively. previousBoundary copies a text fragment that
ends at the starting position of the chunk into a temporary buffer, and
then scans the buffer with a TextBreakIterator to find a sentence boundary.
In the worst case, the text fragment starts at the beginning of the full
text. nextBoundary does the symmetric thing from the end of the chunk.
As a result, in the worst case, step 2 needs to process everything in the
textarea for every chunk, in which case the chunking trick is completely
useless: there are about n / kChunkSize chunks and each one costs O(n),
hence the O(n^2) total. And the worst case is very easy to hit: just paste
some very long text into an empty textarea, and then the two boundary
functions together copy the full text into their buffers every time step 2
is run.
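To get a feel for the growth, here is a small cost model of the worst case
described above. It is an assumption-laden sketch, not a measurement: it
assumes that for each chunk previousBoundary copies everything before the
chunk and nextBoundary copies everything after it, and worstCaseCharsCopied
is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Counts the characters copied into temporary buffers by step 2 under the
// worst-case assumption, for a text of length n cut into chunkSize pieces.
std::uint64_t worstCaseCharsCopied(std::uint64_t n, std::uint64_t chunkSize)
{
    assert(chunkSize > 0);
    std::uint64_t total = 0;
    for (std::uint64_t start = 0; start < n; start += chunkSize) {
        std::uint64_t end = std::min(n, start + chunkSize);
        total += start;    // previousBoundary: fragment [0, start)
        total += n - end;  // nextBoundary: fragment [end, n)
    }
    return total;
}
```

With chunkSize = 16384, a text of 65536 characters gives 196608 copied
characters, while doubling the text to 131072 characters gives 917504,
which is more than four times as many. When n is a multiple of chunkSize
the model works out to (n / chunkSize) * (n - chunkSize), i.e. quadratic
in n.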