rich text support/closure-library integration

218 views
Skip to first unread message

Andy Hochhaus

unread,
Oct 27, 2010, 12:52:44 PM10/27/10
to diff-mat...@googlegroups.com
Hi,

Thanks for the great library Neil.

I'm am interested synchronizing rich text between multiple clients
using the google-closure editor [1].

I believe that google-diff-match-patch is intended/optimized for plain
text. At the end of your Tech Talk [2] you mention work on a
tree-based diff algorithm to support rich text editors. Does a plan
exist to open source the tree-based diff either as part of
google-diff-match-patch or closure-library?

Can you comment/suggest readings on the reasons a tree based diff is
necessary for rich text? From my very primitive understanding, it
seems that the unmodified diff-match-patch algorithm could be used to
sync the HTML generated by the rich text editors (i.e. HTML itself is
"plain" text).

After the time that diff-match-patch was released several other google
utilities have been open sourced including closure-compiler,
closure-library, closure-linter and the google javascript style guide.
I see that you are now compressing this library with closure-compiler
[3]. Do you have plans or would you be willing to accept patches
making this library comply with the google javascript style guide and
pass closure-linter's checks in strict mode? Lastly, what are your
thoughts on adding (the javascript portion) of this library into
closure-library? Adding diff-match-patch to closure-library seems like
it would open the door to better integration with many of Google's
other javascript widgets.

Thanks for your feedback,
Andy

[1] http://closure-library.googlecode.com/svn/docs/class_goog_editor_Field.html

[2] http://www.youtube.com/watch?v=S2Hp_1jqpY8

[3] http://groups.google.com/group/diff-match-patch/browse_thread/thread/ab451ce13a64b87b

Neil Fraser

unread,
Oct 27, 2010, 9:24:02 PM10/27/10
to Diff Match Patch
On Oct 27, 9:52 am, Andy Hochhaus <ahochh...@samegoal.com> wrote:
> I'm am interested synchronizing rich text between multiple clients
> using the google-closure editor [1].

This question is asked fairly frequently, so here's my standard
response:
http://code.google.com/p/google-diff-match-patch/wiki/Plaintext

> I believe that google-diff-match-patch is intended/optimized for plain
> text. At the end of your Tech Talk [2] you mention work on a
> tree-based diff algorithm to support rich text editors. Does a plan
> exist to open source the tree-based diff either as part of
> google-diff-match-patch or closure-library?

I cannot comment on future Google products. However, you might be
interested in this article:
"Efficient Change Control of XML Documents"
Sebastian Rönnau, DocEng 2009
This contains all the algorithms required to create tree-based diffs.
Sebastian might be willing to share code, I dont know.

> Can you comment/suggest readings on the reasons a tree based diff is
> necessary for rich text? From my very primitive understanding, it
> seems that the unmodified diff-match-patch algorithm could be used to
> sync the HTML generated by the rich text editors (i.e. HTML itself is
> "plain" text).

Agreed, but it is dangerous since there are no guarantees that the
output is legal HTML. Mangle a </table> tag and things go downhill
very quickly.

> After the time that diff-match-patch was released several other google
> utilities have been open sourced including closure-compiler,
> closure-library, closure-linter and the google javascript style guide.
> I see that you are now compressing this library with closure-compiler
> [3]. Do you have plans or would you be willing to accept patches
> making this library comply with the google javascript style guide and
> pass closure-linter's checks in strict mode?

Definitely. Yes, it whines about "found: (number|string); required:
string". I've checked each one and they are phantom warnings, so I've
ignored them. If you know how to shut them up that would be great.

> Lastly, what are your
> thoughts on adding (the javascript portion) of this library into
> closure-library? Adding diff-match-patch to closure-library seems like
> it would open the door to better integration with many of Google's
> other javascript widgets.

That's a tricky one. Since DMP was started four years ago outside of
Google, it doesn't conform to the Google style guide. I've dealt with
the Closure team (they are awesome) and do not believe they'd touch
DMP as it is. Rewriting it to comply with Google's style would be
fairly painless. However, there would be a steep price to pay in
terms of maintenance. First, it wouldn't be backwards compatible with
the existing version. So either I'd have to abandon the current
install base, or I'd have to support two versions indefinitely.
Second it wouldn't match the implementations of the C++, C#, Python,
Java, Lua (and in the pipeline, ActionScript and PHP) versions. This
would make maintenance of the library even more painful than it
already is.

The only way I could see this happening is if I draw a line in the
sand, and come out with an incompatible version 2 in all languages.
That's a fairly heavy decision. Though it would allow me to correct a
few design flaws.

Neil Fraser

unread,
Nov 17, 2010, 9:28:24 PM11/17/10
to Diff Match Patch
On 27 October 2010 18:24, Neil Fraser <neil....@gmail.com> wrote:
> On Oct 27, 9:52 am, Andy Hochhaus <ahochh...@samegoal.com> wrote:
>> After the time that diff-match-patch was released several other google
>> utilities have been open sourced including closure-compiler,
>> closure-library, closure-linter and the google javascript style guide.
>> I see that you are now compressing this library with closure-compiler
>> [3]. Do you have plans or would you be willing to accept patches
>> making this library comply with the google javascript style guide and
>> pass closure-linter's checks in strict mode?
>
> Definitely.  Yes, it whines about "found: (number|string); required:
> string".  I've checked each one and they are phantom warnings, so I've
> ignored them.  If you know how to shut them up that would be great.

Moments ago I pushed a new version of the DMP library which produces 0
warnings in the Closure compiler.
http://code.google.com/p/google-diff-match-patch/source/detail?r=75
The new version is in the Subversion repository, but not in the
downloadable archive yet.

Thanks for highlighting this issue.
--
Neil Fraser, Programmer & Wizard
http://neil.fraser.name

Hochhaus, Andrew

unread,
Jan 6, 2011, 2:42:02 PM1/6/11
to diff-mat...@googlegroups.com
Hi Neil,

Happy New Year.

On Wed, Oct 27, 2010 at 8:24 PM, Neil Fraser <neil....@gmail.com> wrote:
> you might be
> interested in this article:
> "Efficient Change Control of XML Documents"
> Sebastian Rönnau, DocEng 2009
> This contains all the algorithms required to create tree-based diffs.
> Sebastian might be willing to share code, I dont know.

Thanks for the pointers. Sebastian's papers are indeed very interesting.

For those interested, his (java) implementation of XCC is located here:

https://launchpad.net/xcc

>> After the time that diff-match-patch was released several other google
>> utilities have been open sourced including closure-compiler,
>> closure-library, closure-linter and the google javascript style guide.

Would you consider accepting this small patch to make the js
implementation pass closure-linter --strict?

For your reference, on a clean r81 client gjslint outputs:

$ gjslint.py --strict javascript/diff_match_patch_uncompressed.js
----- FILE : javascript/diff_match_patch_uncompressed.js -----
Line 56, E:0240: @return descriptions must end with valid punctuation
such as a period.
Line 172, E:0222: Member "diff_match_patch.prototype.diff_compute"
must not have @private JsDoc
Line 284, E:0006: (New error) Wrong indentation: expected any of {6,
48} but got 5
Line 302, E:0222: Member "diff_match_patch.prototype.diff_bisect" must
not have @private JsDoc
Line 307, E:0002: Missing space after ":"
Line 308, E:0002: Missing space after ":"
Line 400, E:0222: Member
"diff_match_patch.prototype.diff_linesToChars" must not have @private
JsDoc
Line 458, E:0222: Member
"diff_match_patch.prototype.diff_charsToLines" must not have @private
JsDoc
Line 511, E:0006: (New error) Wrong indentation: expected any of {6,
10, 49, 53, 57} but got 26
Line 610, E:0240: @param descriptions must end with valid punctuation
such as a period.
Line 714, E:0006: (New error) Wrong indentation: expected any of {10,
14, 18, 43, 56, 60, 64, 68} but got 11
Line 1005, E:0006: (New error) Wrong indentation: expected any of {18,
36, 65} but got 20
Line 1356, E:0222: Member "diff_match_patch.prototype.match_bitap"
must not have @private JsDoc
Line 1471, E:0222: Member "diff_match_patch.prototype.match_alphabet"
must not have @private JsDoc
Line 1493, E:0222: Member
"diff_match_patch.prototype.patch_addContext" must not have @private
JsDoc
Line 1551, E:0110: Line too long (81 characters).
Line 1907, E:0006: (New error) Wrong indentation: expected any of {16,
20, 44, 48, 56, 60, 64} but got 47
Found 17 errors, including 5 new errors, in 1 files (0 files OK).

Finally, would you be willing to share the command line that you use
to generate the compressed version of the library?

Thanks,
Andy

gjslinter.patch

Neil Fraser

unread,
Jan 7, 2011, 3:18:49 AM1/7/11
to Diff Match Patch
On Jan 6, 11:42 am, "Hochhaus, Andrew" <ahochh...@samegoal.com> wrote:
> Would you consider accepting this small patch to make the js
> implementation pass closure-linter --strict?

Thanks. Wow, that feels just like an internal code review at Google.
I'll definitely roll that into next week's release. :)

> Finally, would you be willing to share the command line that you use
> to generate the compressed version of the library?

I've never used the publicly released version of the Closure
compiler. Internally it's all generated by an automated build
system. However I suspect our build system's configuration flags
share most of the same flags as the public command line version.
Here's our build directive, let me know if this is useful:

js_binary(name = 'diff_match_patch_bin',
srcs = [':diff_match_patch'],
compile = 1,
externs_list = ['//javascript:externs.js'],
defs = ['--variable_renaming=LOCAL',
'--property_renaming=OFF',
'--remove_unused_prototype_props=False',
'--output_wrapper="(function(){%output%})()"'])

Hochhaus, Andrew

unread,
Jan 7, 2011, 3:48:49 PM1/7/11
to diff-mat...@googlegroups.com
Hi Neil,

Thanks for the build rule and gjslinter patch consideration.


I have started working on a re-styling of the javascript code with three goals:

1) Allow DMP to be easily compressed with the publicly released
closure-compiler (see comment at top of
diff_match_patch_uncompressed.js).

Some flags from the internal build description (eg:
remove_unused_prototype_props) do not appear to be accessible by
default in the public version of the compiler. However, attaching the
entirety of the public interface for DMP to "window" works too. I want
to remove all such "internal only" build dependencies.


2) Allow DMP to be easily compiled directly alongside closure-library
code instead of only built as a stand-alone library.

The current code does not lend itself well to be compiled directly
with closure-library style code (eg: no goog.provides, global
namespace pollution -- missing goog top level namespace, forcefully
attaches itself to window, etc). I want to refactor DMP so that it can
easily be built as a stand-alone library (without a closure-library
dependency) but can also easily be included in a closure-library based
project.


3) General readability improvements. Try to match the style guide a bit more.

Use enums more (instead of hard coded values), only compute
Match_MaxBits once (instead of once/object), fix a few warnings (eg:
casts) when compiling with all warnings turned on, etc.


Do you think that these goals are general purpose enough to be
considered for inclusion? I have attached a very preliminary patch
(still needs much work) to give you an idea of what I am proposing. If
you think these ideas have general value I'll finish the patch and
submit something for your feedback.


Thanks,
Andy

dmp-closure-friendly.patch

Neil Fraser

unread,
Jan 18, 2011, 7:28:40 PM1/18/11
to Diff Match Patch
On Jan 6, 11:42 am, "Hochhaus, Andrew" <ahochh...@samegoal.com> wrote:
> Would you consider accepting this small patch to make the js
> implementation pass closure-linter --strict?
>
> For your reference, on a clean r81 client gjslint outputs:

Done. Subversion updated. Thanks!

Will look at your second patch next.

Neil Fraser

unread,
Jan 18, 2011, 8:12:18 PM1/18/11
to Diff Match Patch
On Jan 7, 12:48 pm, "Hochhaus, Andrew" <ahochh...@samegoal.com> wrote:
> Do you think that these goals are general purpose enough to be
> considered for inclusion? I have attached a very preliminary patch
> (still needs much work) to give you an idea of what I am proposing. If
> you think these ideas have general value I'll finish the patch and
> submit something for your feedback.

This is tricky. The disadvantage of the Closure version is that it is
unusable for people who do not wish to use Closure within their
project. Thus we would need to support two versions of the code.
That imposes an additional penalty on maintenance, and the current
maintenance costs already make the most trivial change into a multi-
hour ordeal.

Balanced against this negative, are the positives of having a Closure
version. This would mean not needing an externs file. I'm not seeing
much benefit beyond this.

On the other hand, if DMP were make Closure-compatible and checked
into the Closure code base (something I could do a Google), then
Closure itself could depend on DMP. This would be great. However the
maintenance penalty would go through the roof since all changes to
Closure are very carefully scrutinized (and rightly so). Closure is
also very picky about doing things the Closure way, which means
significant API changes would naturally follow. Now we'd have a
version of DMP that's unlike any of the others, and would be even more
expensive to maintain.

Assuming the above list of advantages and disadvantages is complete,
I'm of the opinion that DMP should remain outside of Closure for the
time being.
Reply all
Reply to author
Forward
0 new messages