CRDT vs OT

1,306 views
Skip to first unread message

Daniel Charbonneau

unread,
Feb 26, 2014, 9:23:03 AM2/26/14
to sha...@googlegroups.com
Would someone expand on why you would want to use OT instead of CRDTs?

Joseph Gentle

unread,
Feb 26, 2014, 3:27:32 PM2/26/14
to sha...@googlegroups.com
Good question. The tradeoff is that CRDTs avoid the need for an
operation log in exchange for putting more stuff in the documents
themselves. And we don't know how to make CRDTs for some data
structures.

For a text document, this takes the form of markers between characters
with unique ids to act as insertion points, and these markers can
never be removed. I think this makes the document at least 4x bigger,
but it depends on the implementation. If inserts & removes keep
happening to a document (even if the document itself doesn't grow),
the amount of data you need to store & send to all clients increases
without bound.

For JSON documents, ... well, I've still never seen a CRDT
implementation that lets you arbitrarily edit JSON structures. I'm not
sure its possible to implement list-move using CRDTs - I certainly
don't know anyone who's done it. Even doing a set (where objects
either exist or don't exist) is extremely hard - you need to store the
number of inserts of a key and the number of removes, and then compare
them every time to see if the removes >= inserts.

On the other hand, an operation log isn't such a bad structure to have
around. Its a combined auditing trail and emergency undo system. The
structure itself is append-only, so you can store it extremely
efficiently on disk and accesses are always in known ranges, so you
can easily implement a high performance version of the structure. It
can grow large if you're doing a lot of edits, but you can always
discard old ops if you want.

The biggest downside is that the oplog has to have a known order - so
scaling is harder. There's solutions - you can shard, or you can use
STM on foundationdb or something like that. And there's OT algorithms
which replace version numbers with hashes which you can use instead if
you want.

-J

On Wed, Feb 26, 2014 at 6:23 AM, Daniel Charbonneau
<novo.char...@gmail.com> wrote:
> Would someone expand on why you would want to use OT instead of CRDTs?
>
> --
> You received this message because you are subscribed to the Google Groups
> "ShareJS" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sharejs+u...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Jason Morrison

unread,
Apr 11, 2014, 12:33:31 AM4/11/14
to sha...@googlegroups.com
An interesting manuscript that discusses some tradeoffs between OT and CRDT and a proposed architecture for taking advantage of each:


from a conference coming up in a few days: http://eventos.fct.unl.pt/papec/

Joseph Gentle

unread,
Apr 11, 2014, 2:01:20 PM4/11/14
to sha...@googlegroups.com

Heh - I discussed that algorithm with a researcher friend of mine a few years ago. Maybe we should have written a paper about it :)

Essentially, you have a crdt system with nodes that aren't full peers. The nodes have lower network overhead (because they use OT), but you end up with the complexity of both systems. Its like a really complicated last leg protocol compression algorithm. You end up with many of the downsides of CRDTs (higher storage requirements, more restrictive operation types - no JSON editing) with all the complexity that ot transform functions bring.

Its a good idea, but I'm holding out for some better tricks before I start reengineering ShareJS.

As for the conference, I knew about it and I'm a little sorry I didn't go. Maybe next year.

-J

For more options, visit https://groups.google.com/d/optout.

Collin Miller

unread,
Apr 12, 2014, 1:49:59 AM4/12/14
to sha...@googlegroups.com
Are you familiar with the riak crdt systems? 

I'm not very familiar with them, but I watched some of the most recent ricon presentations and they made a big fuss about their new crdt types that don't have unbounded storage requirements as they accumulate changes.

I'd be curious to hear what a non-partisan's opinion of them is.
Reply all
Reply to author
Forward
0 new messages