Hi everyone!
I've been working on and off for the past few weeks on the new OT
type. The progress is thus:
- Transform is working (O_o). There aren't any more bugs that I know
about (I have about 30 test cases), although I'm sure the randomizer
will find some new edge cases I didn't think of. Still not implemented
in transform: embedded types (easy), set-null (hard).
- Apply is working, although that was really easy.
- The only hard-ish function left to do is compose, although compose
will probably be a lot simpler than transform. The trick will be
factoring both of them to reuse as much code as possible.
Transform is by far the hardest code in the type. I'd say it's about
65% done in total.
I had a big design review with Nate last week while I was briefly in
SF, and we talked through some things:
Invertibility:
So, I think the only thing that makes operations non-invertible is
removed data. Rather than an op just saying that data is being
removed, I'm going to make operations optionally invertible. I don't
know exactly how this will play out in livedb.
It might also be worth adding a standard getInvert(doc, op) function
for ops, which returns an op's inverse. Formally, given op' =
getInvert(doc, op), doc * op * op' == doc. This would be really easy
to write for all the other OT types too, and it would make OT-level
undo easy to implement.
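As a rough sketch of what getInvert could look like, here's a toy type where the document is a flat object. The op shapes ({key, i} and {key, r}) and the apply function are made up for illustration and are not the new type's format. The point is only that a remove needs the pre-op document to recover what it deleted:

```javascript
// Toy type for illustration only: a document is a flat object, and an op is
// either {key, i: value} (insert) or {key, r: true} (remove). These op
// shapes are hypothetical, not the format of the new OT type.
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

function apply(doc, op) {
  const result = Object.assign({}, doc); // never mutate the input doc
  if (has(op, 'i')) {
    if (has(result, op.key)) throw new Error('key already exists: ' + op.key);
    result[op.key] = op.i;
  } else if (op.r) {
    if (!has(result, op.key)) throw new Error('no such key: ' + op.key);
    delete result[op.key];
  } else {
    throw new Error('unknown op');
  }
  return result;
}

// doc must be the document as it was *before* op was applied.
function getInvert(doc, op) {
  if (has(op, 'i')) return {key: op.key, r: true};
  if (op.r) {
    if (!has(doc, op.key)) throw new Error('no such key: ' + op.key);
    // The op alone doesn't say what was removed, so read it from doc.
    return {key: op.key, i: doc[op.key]};
  }
  throw new Error('unknown op');
}

const doc = {first: 'Nate'};
const op = {key: 'first', r: true};
const inverse = getInvert(doc, op);
const roundTrip = apply(apply(doc, op), inverse); // doc * op * op'
console.log(JSON.stringify(inverse), JSON.stringify(roundTrip));
```

If a remove op carried its removed data itself, getInvert wouldn't need the doc argument for that case at all.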
Initializing data (aka set null):
The ability to use OT to initialize data is actually super useful /
important. It's used to express 'initialize doc to {count:0} and then
increment count'. Without explicit support, it's impossible to express
that operation in a way that transforms correctly.
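To make the problem concrete, here's a small sketch of the outcome we want when two clients concurrently send 'initialize and increment' against an empty doc. It doesn't implement transform at all. It just compares an if-null-then-insert apply with a naive unconditional insert. The op shape and function names are hypothetical:

```javascript
// Hypothetical op shape, for illustration only:
//   {init: {count: 0}, field: 'count', by: 1}
// meaning "set the doc to init, then add `by` to doc[field]".

// If-null-then-insert: only initialize when there is no doc yet.
function applyInitThenIncrement(doc, op) {
  const result = doc == null ? Object.assign({}, op.init) : Object.assign({}, doc);
  result[op.field] += op.by;
  return result;
}

// Naive version: always insert the initial value, clobbering what was there.
function applyOverwriteThenIncrement(doc, op) {
  const result = Object.assign({}, op.init);
  result[op.field] += op.by;
  return result;
}

// Two clients send the same op concurrently against a null doc.
const a = {init: {count: 0}, field: 'count', by: 1};
const b = {init: {count: 0}, field: 'count', by: 1};

const withInit = applyInitThenIncrement(applyInitThenIncrement(null, a), b);
const naive = applyOverwriteThenIncrement(applyOverwriteThenIncrement(null, a), b);
console.log(withInit.count, naive.count); // 2 1
```

With the naive insert, the second client's initialization wipes out the first client's increment, which is the kind of result explicit set-null support is meant to avoid.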
The JSON-patch specification contains arbitrary conditionals (instead
of just if-null-then-insert), but unfortunately I can't think of a way
to capture the semantics of that in a general way through transform.
Format of operations:
We spent a lot of time talking about the awkwardness of the current
operation format. If you want to set doc.name.first = "Nate" then
currently you have to say:
{o:{name:{o:{first:{di:"Nate"}}}}}
Nate (fairly) complained that adding the o's everywhere is awkward,
and that making it work properly for lists requires coercing strings
into numbers in JavaScript and sorting.
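For reference, the nesting in the current format can be sketched as a small helper that wraps a component in one {o: ...} layer per path segment. wrapPath is a hypothetical name, and the 'tags' list example assumes list indices are keyed the same way as object fields:

```javascript
// Hypothetical helper: wrap an op component in the nested {o: {...}} format,
// adding one {o: {key: ...}} layer for each segment of the path.
function wrapPath(path, component) {
  return path.reduceRight((node, key) => ({o: {[key]: node}}), component);
}

const setFirst = wrapPath(['name', 'first'], {di: 'Nate'});
console.log(JSON.stringify(setFirst));
// {"o":{"name":{"o":{"first":{"di":"Nate"}}}}}

// A list index turns into a string once it is used as an object key, so it
// has to be coerced back to a number before it can be used as an index.
const intoList = wrapPath(['tags', 2], {di: 'x'});
const indexKey = Object.keys(intoList.o.tags.o)[0];
console.log(typeof indexKey, Number(indexKey)); // string 2
```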
He suggested instead we use lists-of-lists:
['name','first',{i:'Nate'}]
... Which is a compacted version of this:
['name',[['first',[{i:'Nate'}]]]]
If we wanted to set first and last names:
['name',[['first',{i:'Nate'}], ['last',{i:'Smith'}]]]
I'm not sure what to think about this. The fact that there are multiple
ways to write the same operation is going to either make the OT code
more complicated, or require translating between the expanded and
compact representations at the start and end of all the OT functions.
So ... ugh.
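Here's a minimal sketch of what translating between the two representations might look like. It assumes my reading of the three examples above: in expanded form every node is [key, items], where each item is either an op component (an object) or a child node (an array), and the compact form splices a node's only item into its parent. That reading, and the expand / compact names, are assumptions:

```javascript
const isKey = (x) => typeof x === 'string' || typeof x === 'number';

// Compact -> expanded. Every node becomes [key, items].
function expand(op) {
  const [key, ...rest] = op;
  if (!isKey(key)) throw new Error('expected a key, got ' + JSON.stringify(key));
  let items;
  if (rest.length === 1 && Array.isArray(rest[0])) {
    // An explicit list of children: ['name', [child, child, ...]]
    items = rest[0].map((item) => (Array.isArray(item) ? expand(item) : item));
  } else if (rest.length === 1) {
    // A single op component: ['first', {i: 'Nate'}]
    items = [rest[0]];
  } else {
    // The rest is a spliced-in single child: ['name', 'first', {...}]
    items = [expand(rest)];
  }
  return [key, items];
}

// Expanded -> compact. A node with exactly one item gets that item spliced in.
function compact(op) {
  const [key, items] = op;
  if (items.length === 1) {
    const only = items[0];
    return Array.isArray(only) ? [key, ...compact(only)] : [key, only];
  }
  return [key, items.map((item) => (Array.isArray(item) ? compact(item) : item))];
}

const short = ['name', 'first', {i: 'Nate'}];
const both = ['name', [['first', {i: 'Nate'}], ['last', {i: 'Smith'}]]];

console.log(JSON.stringify(expand(short)));
// ["name",[["first",[{"i":"Nate"}]]]]
console.log(JSON.stringify(compact(expand(both))));
// ["name",[["first",{"i":"Nate"}],["last",{"i":"Smith"}]]]
```

The OT functions could then work only on the expanded form, at the cost of an expand on the way in and a compact on the way out.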
I translated all the examples from the draft spec to Nate's proposed system:
https://gist.github.com/josephg/bd05e9dd240dc0ac7888
In terms of bytes on the wire it's about the same for the operations I
looked at, though if you have super deep objects and simple operations
(super common) then Nate's proposal will start looking a lot better.
I'd love some input - otherwise I'm going to sit on it and think for a
week or two. There might be a compromise solution somewhere, but in
any case I can't make much more progress until we figure this out.
Cheers
-J