Sharejs in Quill

Jason Chen

unread,

Sep 29, 2014, 6:11:11 PM9/29/14

to sha...@googlegroups.com

Hey guys,

Just wanted to share something I've been working on with the guidance of Joseph: https://github.com/ottypes/rich-text. I'd be very interested in any feedback you guys might have.

This is a new ottype for rich text documents compatible with ShareJS. It's also a big step in getting Quill and ShareJS to work together without too many hacks (update on Quill's side). The next step is getting everything to work with ShareJS itself and I'll hopefully have a demo/example ready for that soon.

Dmitry

unread,

Oct 8, 2014, 4:10:51 PM10/8/14

to sha...@googlegroups.com

Hey, that's interesting. I've noticed you don't have invert() operation, how undo operations should be performed?

btw I've almost done my work on the new rich-text (popular topic, heh) OT as well that is basically an extension to easysync protocol from etherpad, but rewritten from scratch. It's compatible with ShareJS as well, have invert(), full support for attributes, works with natural document representation (lines, not linear positions - bye bye complexity and x<->(x,y) [user space]<-> [document space] transformations) very compact representation and 2x+ times faster than Jason's code in the fuzzy test (sorry! :)). But the code is a little bit more complex than this one, so I'm always looking for a new ideas if something could be done in a more simple manner :)

Jason Chen

unread,

Oct 8, 2014, 5:13:58 PM10/8/14

to sha...@googlegroups.com

There’s definitely a lot of design decisions each with tradeoffs so I’d be very interested in seeing your implementation and what choices were made and how, especially how it differs with Etherpad’s easysync.

I’m not surprised rich-text might be slower than other implementations with similar functionality. This is my second rich text OT implementation and this time I opted for simplicity over execution speed. If you look at the source code for most OT libraries, they are a multi-screen mess of indexes and offsets, impossible to reason about without a debugger or pen and paper and very intimidating to an unfamiliar reader. The iterator abstraction is one example of a tradeoff I made where it greatly simplified the implementation (and a reader’s ability to understand the code) but certainly adds a small amount of execution overhead. I think this is a good tradeoff considering modern CPUs and the fact that this type can still run thousands of operations a second but yes execution speed was sacrificed when implementing this type.

I will say that though the fuzzer is not a good test of performance. The complexity of ops that you give the fuzzer will greatly influence execution speed since most of these functions scale linearly to the number of operations. So an easy way to "game" the fuzzer is to give it simple or small operations but then the type won’t be as thoroughly tested.

Rich-text is not an invertible type since it doesn’t store what gets deleted. It’s impossible to invert the “delete 100 characters” instruction since it doesn’t store what those 100 characters were but of course the benefit is the 100 characters doesn’t have to be stored. The decision to not support this was based off the observation that invert is not used very often but adds significant overhead to normal operations. In its most popular use case in an undo manager (the only place Quill might use invert), the same can be accomplished with diff() since `type.invert(delta, originalDoc) === type.diff(originalDoc, type.apply(originalDoc, delta))`.

In any case happy to see more development and discussion in the space!

Dmitry

unread,

Oct 9, 2014, 7:09:42 AM10/9/14

to sha...@googlegroups.com

Yes I agree that fuzzier is not a good performance test as it involves many unknowns, including quality of randomly generated OPs. It was just the quickest way to compare apples with oranges :)

In my implementation I've tried to avoid all "mess of indexes and offsets" type of code and spent a lot of time understanding original library. Currently all I need is to find some time to create a good readme and reorganize the code before publishing, probably in a week or two.

The difference from easysync is that I store deleted text and attributes and introduce "delete attribute" operation, which allows me to quickly perform invert() without having access to the original snapshot. Talking about storage, the compact representation still takes noticeably less than any of the JSON-based OT format. Also second and very important goal was to perform as many checks as possible, because with most of the OT libraries there aren't many checks and they assume client will always send valid operations that does not break state or document. From my practice I can say that's not a good assumption, and if there is a bug in the changes builder code that collects changes from contentediable area, it shouldn't break document on the server.

There is one more function in etherpad that takes advantage of the invert function and which we heavily use in our project – ability to create diffs between 2 revisions, highlighting what was removed, changed or inserted and by whom. General diff-match-patch algorithms are not that accurate comparing to what you can get from the exact modifications history.

Pablo

unread,

Nov 26, 2014, 3:03:11 AM11/26/14

to sha...@googlegroups.com

Is there a demo using sharejs with quill?

Jason Chen

unread,

Nov 28, 2014, 9:55:05 AM11/28/14

to sha...@googlegroups.com

Not yet unfortunately.

Reply all

Reply to author

Forward